memory exhausted errors #20
Comments
Which version of the browscap file do you use (5026 full)? |
The latest version of the Browscap.php file uses the full one. I was hoping to switch to the light version, but a comment in the source says I can't: "(note: light version doesn't work, a fix is on its way)". I assume the PHP script is pulling the latest file, 5027 I believe. |
I have this same issue too. I have been using Browscap (the latest version, with the full file) for 2 months, and this problem first occurred yesterday! Neither increasing the memory limit to 256MB nor switching to the light version of the INI file helps. |
Hi there - we are aware of the increased file size causing memory issues in some cases. The INI file has grown dramatically since the project became more active, simply because many more user agents are now defined in it. There are a couple of potential solutions I see to this:
It is also worth noting that our official recommendation is:
For example:
// When parsing the UA on the site
$browscap = new phpbrowscap\Browscap($cacheDir);
$browscap->doAutoUpdate = false;
$information = $browscap->getBrowser($userAgent);
and then in a background processed script (e.g. on a cron job) that gets run once a day/week:
$browscap = new phpbrowscap\Browscap($cacheDir);
$browscap->updateCache();
This way, it is only the background script that would consume large amounts of memory, not your actual website. That said, we are aware of the issue and it is something we definitely want to fix. Pull requests welcome, as always ;) |
Thanks for the reply.... but..... This issue you have referenced is obviously related.. but to a newbie trying out your project for the first time.. it might appear that it just doesn't work - and an unsupported operand bug doesn't really help. |
I did not say I was going to close this ticket; I acknowledged that this is a problem and that we are considering solutions. Just because the "unsupported operand" issue is closed does not mean it won't appear in search results, and anyone with the same issue will end up here after reading that one. The project is not unusable; you just need to increase the memory limit. Additionally, I have provided a workaround that keeps your production servers minimally affected until we are able to fix this issue in a decent way. Please don't make assumptions based on no evidence - this issue is staying firmly open until we fix it, rest assured :) |
@whiteatom one of the proposals we are making is to re-arrange what is included in each file - any feedback you have would be appreciated over at browscap/browscap#248 :) |
I'm currently working on it. After the first few changes I could reduce the memory usage by 8% (156M -> 143M). Question: What is the minimum PHP version to support? |
@ziege what method are you using? |
@ziege and in answer to your question, currently >= 5.3, we haven't discussed increasing this to 5.5 yet. |
First change: unset all variables as early as possible in updateCache(). For the next changes I first have to look a bit deeper into the code... |
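The effect of releasing variables early can be observed in isolation. The following is only a minimal sketch, not the actual change made to updateCache(): the function name summarizeLargeData() and the dummy data are assumptions made for illustration.

```php
<?php
// Hypothetical helper (not part of browscap-php): builds a large intermediate
// array, derives a small result from it, and releases the array before returning.
function summarizeLargeData(int $count): array
{
    $before = memory_get_usage();

    $rows = [];
    for ($i = 0; $i < $count; $i++) {
        $rows[] = str_repeat('x', 100) . $i;
    }
    $withData = memory_get_usage();

    // Keep only the small result that later steps need.
    $summary = ['count' => count($rows), 'last' => end($rows)];

    // Release the big array as soon as it is no longer needed.
    unset($rows);
    $afterUnset = memory_get_usage();

    return [
        'summary'     => $summary,
        'held_bytes'  => $withData - $before,
        'freed_bytes' => $withData - $afterUnset,
    ];
}

$report = summarizeLargeData(50000);
printf(
    "held %d bytes, freed %d bytes after unset()\n",
    $report['held_bytes'],
    $report['freed_bytes']
);
```

Memory that is freed before the next large allocation lowers the peak; memory that is only freed at the end of the function does not.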
After the second change: 25% reduction (156M -> 117M). More coming soon. |
Good news, I could reduce the memory peak usage by nearly 60%. I'm currently testing on another system (with PHP 5.6.0beta1) and could optimize the usage there from 101.92M to 41.45M. It will be hard to optimize it more, because PHP requires 39.12M for parsing the ini file into the $browsers array, which can only be changed by changing the ini file. What did I do?
I also fixed a caching bug, see pull request. |
Test results with PHP 5.5.11: In PHP 5.5.11 the parsing of the INI file requires much more memory than in PHP 5.6 (61M compared to 39M), which explains the huge difference. I will prepare a pull request now. |
That's fantastic, and welcome news :) |
@ziege nice work! Share it soon! 👍 |
Is there a beta build I can try out? I am now getting memory exhausted errors with PHP set to 256MB. |
@asgrim In response to your message above, I am not trying to be insulting, but anyone on shared hosting won't be able to increase their memory allocation, and cron jobs are probably beyond many people using this project. It is a fantastic effort; I was just pretty put off to have a major bug just closed on me as soon as I reported it. |
I think another fix would be to implement some sort of segmented downloading. It would download the first 4 MB, save it, then the next 4 MB, etc. When it finishes it would combine them all into a single file. The idea of a cron job would be perfect, but not for most people. A simple [...] There is an answer on StackOverflow to solve this problem (Is there a way to use shell_exec without waiting for the command to complete?):
|
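For illustration, segmented copying can be sketched with plain stream functions. This is an assumption-laden sketch and not how browscap-php downloads the file: copyInChunks() is a hypothetical helper, and reading a remote URL this way would additionally require allow_url_fopen to be enabled.

```php
<?php
// Hypothetical helper: copy $source to $target in fixed-size chunks so that
// at most one chunk is held in memory at any time.
function copyInChunks(string $source, string $target, int $chunkSize = 4194304): int
{
    $in = fopen($source, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open source: $source");
    }
    $out = fopen($target, 'wb');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open target: $target");
    }

    $total = 0;
    // fread() returns an empty string at end of file, which ends the loop.
    while (($chunk = fread($in, $chunkSize)) !== false && $chunk !== '') {
        $written = fwrite($out, $chunk);
        if ($written === false) {
            break;
        }
        $total += $written;
    }

    fclose($in);
    fclose($out);

    return $total; // number of bytes written
}
```

Because each chunk is appended to the target as it arrives, no separate "combine" step is needed at the end.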
@whiteatom you could try checking out PR #26 and see if that works for you. @ziege has done a load of work to improve it, but I have not had a chance to look over it yet - any feedback is gratefully received :) @DaAwesomeP I believe PR #26 includes this sort of staged parsing to reduce memory usage, I just need to check it out. Memory improvements are on the way! :) |
@whiteatom also, I'm really not sure where you get the idea that I closed this ticket, if you look through I have never had the intention of closing this issue at all; I acknowledged it was a problem that we need to deal with and I provided a possible workaround for the interim while we fix the issue. |
@DaAwesomeP The INI file is only about 6-7MB and that wasn't a problem in my tests. Parsing it with parse_ini_file was a problem (caused a memory peak of about 40MB). I found a way to reduce it by about 25% (by splitting the file and using parse_ini_string instead), but this was unnecessarily complicated and didn't reduce the memory peak caused by the following steps. |
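A rough sketch of the splitting idea follows. It is not the code that was tried for the pull request; iterateIniSections() and parseSectionBody() are hypothetical names, and the sketch assumes each section header sits on its own line in the form [name].

```php
<?php
// Hypothetical helper: parse the key=value lines of one section.
// INI_SCANNER_RAW keeps values as raw strings instead of interpreting them.
function parseSectionBody(string $body): array
{
    $properties = parse_ini_string($body, false, INI_SCANNER_RAW);
    return $properties === false ? [] : $properties;
}

// Hypothetical helper: read the INI file line by line and yield one section
// at a time, so the whole file is never parsed into a single array.
function iterateIniSections(string $path): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open INI file: $path");
    }

    $name = null;
    $body = '';
    try {
        while (($line = fgets($handle)) !== false) {
            $trimmed = trim($line);
            $isHeader = $trimmed !== ''
                && $trimmed[0] === '['
                && substr($trimmed, -1) === ']';

            if ($isHeader) {
                if ($name !== null) {
                    yield $name => parseSectionBody($body);
                }
                $name = substr($trimmed, 1, -1);
                $body = '';
            } elseif ($name !== null) {
                $body .= $line;
            }
        }
        if ($name !== null) {
            yield $name => parseSectionBody($body);
        }
    } finally {
        fclose($handle);
    }
}
```

As noted above, this only lowers the parsing peak; anything that collects all yielded sections into one array afterwards will still need the memory for that array.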
After my changes the remaining problem is the large array that results from parsing the INI file (and the additional arrays created from it to optimize the array search operations). One way to optimize it for the future would be to process the INI file in multiple steps. I tried it by splitting it into the sections marked by included comments (+ the default settings, which should be available for all sections), but noticed that some "Parent" references link to other sections, so that this didn't work. I think it will be difficult to realize this. Another way would be to download a prepared "cache file" directly instead of the INI file. I'm currently testing it, because some changes are required - but if it works and such a file could be provided by the browscap server, we could reduce the memory peak for the update to about 4MB... |
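The "Parent" problem can be made concrete with a small sketch. It assumes the parsed data is an array of sections keyed by name, each optionally carrying a Parent key; resolveProperties() is a hypothetical helper, not a browscap-php function.

```php
<?php
// Hypothetical helper: merge a section with everything it inherits by
// following its "Parent" chain. All referenced sections must be available.
function resolveProperties(array $sections, string $name): array
{
    $resolved = [];
    $seen = [];
    $current = $name;

    while ($current !== null) {
        if (isset($seen[$current])) {
            throw new RuntimeException("Circular Parent reference at: $current");
        }
        if (!isset($sections[$current])) {
            throw new RuntimeException("Referenced section is not loaded: $current");
        }
        $seen[$current] = true;

        // The + operator keeps keys that are already set, so child values
        // take precedence over inherited ones.
        $resolved += $sections[$current];
        $current = $sections[$current]['Parent'] ?? null;
    }

    return $resolved;
}

$sections = [
    'DefaultProperties' => ['Browser' => 'Default', 'Crawler' => 'false'],
    'Chrome'            => ['Parent' => 'DefaultProperties', 'Browser' => 'Chrome'],
    'Chrome 35.0'       => ['Parent' => 'Chrome', 'Version' => '35.0'],
];
print_r(resolveProperties($sections, 'Chrome 35.0'));
```

If the file is processed in separate parts and a parent lives in a part that is not loaded, the lookup fails, which is the difficulty described above.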
This is my new idea to optimize memory usage:
This solution should work for much more browscap data and only requires some prepared browscap JSON files, structured very similarly to the current cache files (also with the double-encoded values, to optimize performance and memory usage - this can be confusing if used with other clients). Possible improvements:
My questions:
|
I don't want to use another package, what is going to happen to browscap-php?!! |
We are still looking into this. As I've said before, this is an open source project that people contribute to in their free time. If you have the inclination, perhaps you could look into this yourself @ammont ;) |
@asgrim thanks, I will as soon as I get a little free time. |
I've switched now to the class from ziege and I like it (@ziege thank you for that class) |
Hi. I solved my problem with the code below. I tried it with 3 browsers; if you have the time, could you try it with your application and tell me the result? Thanks.
////////////////////////////////////////////////////////////////
[...] browscapObject = $browscap;
            $this->nameFileTempo = session_id(); // temp file is named after the session id
            $this->pathBrowscapIni = $browscap->cacheDir . $browscap->iniFilename;
            $this->pathNewFile = $browscap->cacheDir . "tmp_" . $this->nameFileTempo . ".ini";
            $this->updateCach();
            if ($this->createNewIniFile()) {
                $this->browscapObject->iniFilename = basename($this->pathNewFile);
                return true;
            } else {
                return false;
            }
        } else {
            return false;
        }
    }

    private function updateCach()
    {
        $interval = time() - filemtime($this->pathBrowscapIni);
        if ($interval <= $this->browscapObject->updateInterval) { // before the update interval expires, use the local file
            $this->browscapObject->localFile = $this->pathNewFile;
        }
    }

    private function createNewIniFile()
    {
        $fileIni = fopen($this->pathBrowscapIni, 'r');
        if ($fileIni) {
            $previousBuffer = ""; // last line
            $data = ""; // data to write to the new file
            $endFindParameters = 0; // counts the general parameter sections found
            $inData = false; // true while inside a parameter section
            $inBrowserData = false; // true while inside browser data
            $valueToSearch = array("[GJK_Browscap_Version]", "[DefaultProperties]");
            while (($bufferRaw = fgets($fileIni)) !== false) {
                $buffer = rtrim($bufferRaw); // strip trailing whitespace
                // recover default parameters
                if ((!$inData) && ((stripos("[GJK_Browscap_Version]", $buffer) !== false) || (stripos("[DefaultProperties]", $buffer) !== false))) { // found a parameter section
                    $data = $this->addData($data, $buffer);
                    $endFindParameters++; // one more parameter section found
                    $inData = true;
                    continue;
                }
                if ($inData) { // inside a parameter section
                    $data = $this->addData($data, $buffer); // add data to the result
                    if (preg_match("/[._]/", $buffer)) { // found a new section
                        $inData = false; // stop collecting parameters
                    }
                    if ((stripos("[GJK_Browscap_Version]", $buffer) !== false) || (stripos("[DefaultProperties]", $buffer) !== false)) {
                        $inData = true; // inside another parameter section
                        $endFindParameters++;
                    }
                    continue;
                }
                if (($endFindParameters == 2) && (!$inData) && (!$inBrowserData)) { // searching for the browser pattern
                    if ((preg_match('/Parent="DefaultProperties"/', $buffer)) && (preg_match("/[._]/", $previousBuffer))) { // found a new browser section
                        $pattern1 = str_replace(" ", "/", $previousBuffer); //replace " " by /
                        $pattern = str_replace(".", ".", $pattern1); //replace "." by .
                        if (preg_match($pattern, $_SERVER['HTTP_USER_AGENT'])) { // check whether it matches the user agent
                            $data = $this->addData($data, $previousBuffer);
                            $data = $this->addData($data, $buffer);
                            $inBrowserData = true;
                            continue;
                        }
                    }
                }
                if (($endFindParameters == 2) && (!$inData) && ($inBrowserData)) { // browser found, collect all its information
                    $data = $this->addData($data, $buffer);
                    if ((preg_match("/[.*]/", $previousBuffer)) && (preg_match('/Parent="DefaultProperties"/', $buffer))) { // reached the section of another browser
                        $inBrowserData = false;
                        break;
                    }
                }
                $previousBuffer = $buffer; // store last line
            }
            fclose($fileIni);
            file_put_contents($this->pathNewFile, $data);
            return true;
        } else {
            return false;
        }
    }

    private function addData($oldData, $data)
    {
        $result = $oldData . $data . "\n";
        return $result;
    }

    public function getBrowscapObject()
    {
        return $this->browscapObject;
    }

    public function deleteFile()
    {
        $result1 = unlink($this->pathNewFile); // delete file
        $result2 = unlink($this->browscapObject->cacheDir . $this->browscapObject->cacheFilename);
        return $result1 && $result2;
    }
}
 |
I also had memory problems while updating the cache (even with a cron job on the server). I decided to start using https://github.com/yzalis/UAParser. |
I have the same memory problem. |
@ve3 start using |
@frederikbosch @ve3 or, feel free to submit a PR to make it better :) thanks! |
In fact, by the way the |
The crossjoin browscap parser https://github.com/crossjoin/Browscap is A LOT more memory efficient, and actually quite a bit faster than browscap-php, as outlined in this ticket: https://github.com/crossjoin/Browscap/issues/12 And it still uses browscap data, which is more complete than what ua_parser offers, IMO. |
@jaydiablo is that vs the |
@asgrim The benchmarks in that ticket are just against the latest 2.x version (2.0.5 at the time of that writing), however I have run the browscap tests against the 3.x branch as well, just never posted the results. Here are some results that I just ran on my mac, all are fresh checkouts of the respective master branch. browscap-php dev-master (2.x):
browscap-php 3.x-dev:
crossjoin dev-master
browscap-php 3 does use substantially less memory than browscap 2 in these tests, which is good to see, but it's actually worse in speed. It's not just during the cache generation either, as you can watch the tests run and see that it takes a while on individual tests after the cache building steps. I've also modified the useragent benchmark project a little bit to add browscap 3.x, here are the results (note that the times and memory shown here are to process the user agents in the file. The cache building is done as a separate step prior to running the benchmark). These are run on a linux virtual machine, I only note this because my mac has an SSD drive, the linux machine has magnetic.
One of the modifications that I made to the benchmark utility was to time each useragent lookup individually and log to a CSV file (I mention this in the crossjoin ticket). This allows me to build a comparison CSV to see which useragents each library is faster or slower at. Here's that CSV for the "top-200-last-24.txt" run above: https://docs.google.com/spreadsheets/d/1mtISknvOyC7KurRVeAeNFtq7ZZpuremkjITjHDlCF7I/pub?gid=1776088907&single=true&output=csv Certainly the memory use of Browscap 3 is looking better than 2, but at a cost for performance (as it is in its present state). The crossjoin parser excels at memory use, and speed. I have branches of browscap/browscap to run the tests, a simple clone of these should allow you to replicate the tests in your environment: https://github.com/jaydiablo/browscap The browscap 3.x one is mostly a copy of @mimmi20 's refactor branch, but I changed the formatter that it uses. If you see something incorrect with the setup/config of browscap 3 in that branch, let me know. As far as I can tell it generates the cache files, so I assume it's using them for the useragent parsing. I'd like to contribute back some of the changes I've made to the useragent benchmark utility, but I have to clean it up a little bit. It requires a bit of hoop jumping to get browscap-php and browscap-php 3 running at the same time (composer related) which I don't think can be committed back upstream, but a branch to replace browscap-php with browscap-php 3 could be made. |
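Per-lookup timing of this kind can be sketched as follows. This is not the modified benchmark utility itself; benchmarkLookups() and the CSV column names are assumptions, and the parser is passed in as a plain callable so the sketch does not depend on any particular library.

```php
<?php
// Hypothetical helper: time each user agent lookup individually and write
// one CSV row per user agent, so runs of different parsers can be compared.
function benchmarkLookups(callable $lookup, array $userAgents, string $csvPath): void
{
    $csv = fopen($csvPath, 'wb');
    if ($csv === false) {
        throw new RuntimeException("Cannot open CSV file: $csvPath");
    }

    fputcsv($csv, ['user_agent', 'seconds'], ',', '"', '\\');

    foreach ($userAgents as $userAgent) {
        $start = microtime(true);
        $lookup($userAgent); // result is ignored, only the duration matters
        $elapsed = microtime(true) - $start;

        fputcsv($csv, [$userAgent, sprintf('%.6f', $elapsed)], ',', '"', '\\');
    }

    fclose($csv);
}
```

Running it once per library, with the same list of user agents, gives files that can be joined on the user_agent column to see where each parser is faster or slower.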
I looked around a bit at the 3.x code, and it does look like a lot of it is based on crossjoin, particularly the storage of the patterns/iniparts in separated cache files. I ran a profile of 3.x with blackfire against the slowest useragent it processed in that csv I linked to above: https://blackfire.io/profiles/bdb371a4-11cb-428f-8281-66c9079ff64d/graph Like I was seeing with crossjoin when I first started testing with it, the majority of the time is spent in preg_match, and preg_match is called a lot of times (nearly 800 in this case). I suspect you'll see the most gains in speed by looking at reducing the number of patterns that are stored in the cache files, mainly by "compressing" them (as it's called in browscap-php 2.x's source), i.e., removing the digits from the patterns and replacing them with regex \d, then removing any patterns that are the same. This was the biggest win for me when I was trying to optimize crossjoin: https://github.com/crossjoin/Browscap/commit/2ee9663bd4b971c658d87f33fe780858580f7bd4 (the digit replacement in that commit is a bit overzealous, and causes tests to fail, which was fixed later https://github.com/crossjoin/Browscap/commit/afd9f8eaec3fe506ed864a93a536e2370b570922). I'd love to help, but have spent too much time on browscap related tasks lately. We're VERY happy with crossjoin, but do dislike the idea of there being two (php) parsers out there. I'd love to see browscap 3 be the one true parser, but in order for that to happen I think the performance has to get better (at least for us). |
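To illustrate the "compression" idea, here is a minimal sketch. It is not the code from browscap-php 2.x or from the crossjoin commits linked above; compressPatterns() is a hypothetical helper that assumes browscap-style patterns where * matches any run of characters and ? matches a single character.

```php
<?php
// Hypothetical helper: turn browscap-style patterns into regexes, replace
// every digit with \d, and drop the duplicates that this produces.
function compressPatterns(array $patterns): array
{
    $regexes = [];
    foreach ($patterns as $pattern) {
        // Escape regex metacharacters, then restore the two wildcards.
        $quoted = preg_quote(strtolower($pattern), '@');
        $regex = str_replace(['\*', '\?'], ['.*', '.'], $quoted);

        // Replace each literal digit with the \d character class.
        $regex = preg_replace('/\d/', '\\\\d', $regex);

        // Using the regex as array key removes duplicates.
        $regexes['@^' . $regex . '$@'] = true;
    }
    return array_keys($regexes);
}

$compressed = compressPatterns(['Chrome/35.*', 'Chrome/36.*', 'Firefox/2?.*']);
foreach ($compressed as $regex) {
    echo $regex, "\n";
}
```

In this example the two Chrome patterns collapse into one regex, so one preg_match call replaces two. A real parser still has to work out which concrete version matched afterwards; that bookkeeping is omitted here, and as the later fix mentioned above shows, replacing digits too eagerly can break matching.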
@jaydiablo thanks very much for your feedback. I've created a new issue in #101 to investigate the performance of the FWIW, I don't mind there being multiple parsers, it promotes a healthy ecosystem imo :) however, we'll only "officially" support stuff on github.com/browscap (although we'll do our best for other stuff too!) |
I have been a long-time user of Browscap v1.0, but I decided to upgrade to the newest version. Now I am getting constant out-of-memory errors in the Browscap script.
Fatal error: Allowed memory size of 201326592 bytes exhausted (tried to allocate 32 bytes) in /www/tracker/dev/incs/Browscap.php on line 683
I have had to increase the memory limit 3 times (up to 256MB) to get it to work.
Why is this new script so inefficient with memory? Is there a fix planned? It is not sustainable to run a production server with a single PHP script requiring that much memory.