large image-thumbnailing operation fails without error
  • Hello,
    I'm running this code trying to cache thumbnails for all of my product images:





    set_time_limit(0);

    // fetch every product through the ORM
    $products = \Product\Model::query()->get();

    foreach ($products as $key => $product)
    {
        \Log::debug($product->sku);
        echo $product->sku;

        // generate and cache a thumbnail for each of the product's images
        $images = $product->images();
        foreach ($images as $image)
        {
            $image->thumbnail() . '<br>';
        }

        // try to release the objects we're done with
        unset($products[$key]);
        unset($images);
    }






    After deleting the previously cached thumbnails (for testing purposes), the first time I run it, it fails after about 30 seconds, and caches images for about 150 products. I'm not running in safe-mode, so I don't think it's running into a time limit. The second run only lasts about 10 seconds, and caches about 30 products. The third run lasts about 5 seconds and caches fewer, and on the fifth or sixth run and afterward it only runs about 1 second and doesn't cache anything at all.



    There's no html output whatsoever, and the last lines of the log on the last run, with log level L_ALL, are:

    799 DEBUG - 2013-07-08 12:23:47 --> 514

    800 DEBUG - 2013-07-08 12:23:47 --> 513

    801 DEBUG - 2013-07-08 12:23:48 --> 512




    Any help would be appreciated.
  • Ok, an update, I commented out the caching operation, the "$image->thumbnail()" line, and it still fails in the same way, but makes it 10 rows further.
  • Another update - I changed memory limit to 500M and it fixed the problem.

    I had suspected memory was the problem, and that's why I had the unset() operations in there.  Since that clearly isn't working, how should I go about managing the memory better?
  • Memory limit set back to 64M, as I want this fixed since I'll soon have more images to thumbnail.


    Current code, still failing:



    public function action_test()
    {
        set_time_limit(0);

        // page through the products 10 at a time instead of loading them all at once
        $limit  = 10;
        $offset = 0;

        while ($products = \Product\Model::query()
            ->limit($limit)
            ->offset($offset)
            ->get())
        {
            gc_collect_cycles();

            foreach ($products as $key => $product)
            {
                echo $product->sku . '<br/>';
                \Log::debug($product->sku);

                $images = $product->images();
                foreach ($images as $image)
                {
                    //$image->thumbnail();
                }
            }

            $offset += $limit;
        }

        exit;
    }
  • Harro
    Accepted Answer
    ORM caches all retrieved objects, so it's not really suitable for batch operations.

    Use a standard DB query instead.
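
    Roughly something like this (a sketch only; the product table name and sku column are assumptions based on the code above):

    set_time_limit(0);

    // plain result arrays instead of ORM objects, so nothing ends up in an object cache
    $products = \DB::select('sku')
        ->from('product')        // assumed table name, adjust to your schema
        ->execute()
        ->as_array();

    foreach ($products as $product)
    {
        \Log::debug($product['sku']);
        // fetch and thumbnail this product's images here, again via \DB / \File
        // rather than through an ORM relation
    }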
  • I've added a \Log call to both the Product and Image models' __construct() and __destruct() functions, and __destruct() never seems to get called on a product, while hundreds or thousands of images are constructed before any are ever destroyed.  These models extend the Orm model. I don't understand why the __destruct() functions are never called on the products, and it's even more confusing that it does get called on 10 images at the very end.

    Weirdly, the memory error has started showing up sporadically, also:
    Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 40961 bytes) in /home/zero/ph/fuel/core/vendor/phpquickprofiler/phpquickprofiler.php on line 37


    *sigh*
    Some days this stuff is crazy-making.
  • Ah! Thank you again, Mr. Verton!  A note in the ORM/Introduction/Troubleshooting documentation about this might be appreciated by the next poor sap!
  • Is there any way to clear the ORM cache?
  • No, there isn't, but it's quite simple to add something to your model to clear the cache:

    public static function clear_cache()
    {
        $class = get_called_class();
        static::$_cached_objects[$class] = array();
    }

    will delete all cached objects for the current model.
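
    For example, called once per page of the paged loop above (a sketch; the image model's class name below is a guess, use whatever class $product->images() actually returns):

    foreach ($products as $key => $product)
    {
        // ... process the product and its images ...
    }

    // drop the ORM's cached objects for both models before fetching the next page
    \Product\Model::clear_cache();
    \Product\Model_Image::clear_cache();   // hypothetical class name for the image model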
  • I added that method to both the product and image models, then added a call to it in each of their respective loops in the code above.  With logging, I observed that the objects were having their __destruct() method invoked at the end of each loop.

    This kept the script going longer, but it still ran out of memory in the end.  I'm lost as to why.
  • The only way to find out is to start profiling in detail.
  • I don't even know what to profile. Since the Orm objects seem to be being destroyed, where should I look for the memory bloat?
  • I really don't know.

    You can run it through an xdebug session, and use Derick's tips (http://derickrethans.nl/xdebug-and-tracing-memory-usage.html) to get meaningful data from it?
  • Ok, I'll give it a try, thank you.
  • Those debugging tips are over my head.  I don't grok it.

    Other than the obvious Orm objects, there's never another object created inside of this loop, as I avoid instantiating any \Image or \File_Handler_File objects by commenting out the $image->thumbnail() line.  The only database calls are to these two tables.  I don't believe this is anything specific to my code, but just the result of working with a few thousand Orm queries.

    This is really putting a kink in my development :(.  If the Orm breaks un-fixably because I'm running batch operations, I have to repeat code in so many places, and may as well toss it altogether.
  • I mean - a few thousand isn't even that many.  It's running about 2,500 calls, to be more specific.  Shall I code everything wondering whether or not the Orm will memory-leak it to death, with nothing to do about it but make standard DB calls if it happens?

    I'm sorry for getting emotional over this, but I've spent hours on this, and I'm not really skilled enough to fix it, and am pretty-well stuck and apprehensive about whether this is the final sign that Fuel isn't going to cut it for me.  I need the Orm to be rock-solid, even if I'm running 100,000 calls through it.  The speed from caching is appreciated, but not if it breaks my program unfixably.
  • I don't know what to do about all of this.

    I've already invested so many weeks in learning the framework and developing my application, but this makes it all for naught, because my program relies on the Orm and will inevitably deal with thousands of entries.

    I'm further troubled by the comparatively tiny and very inactive community.  I know that Harro Verton is not the only developer, but he is the only developer who regularly takes the time to provide any support and is more active in the forums than the entire rest of the community combined, and I have to wonder how quickly this framework would fall without him.

    I don't want it to be so, as FuelPHP is very pleasing to my aesthetics and I am just now getting comfortable with it.  Is there any good reason to feel other than hopeless? 
  • An ORM (any ORM) is not suitable for batch operations, period. I already told you that before. There is a lot of magic happening inside an ORM that does not scale well; it's a common downside of ORMs. Doesn't matter which one you pick.

    If you don't want code all over the place, nobody is stopping you from adding a method to your model to abstract your custom DB query. I do it all the time, also for situations where a hand-coded query is more efficient than running a generated query.
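
    For example, something along these lines (a sketch; the method name is illustrative, and the dg_product table comes from the code posted later in this thread):

    // in the Product model: a hand-coded query wrapped in a static method,
    // so callers don't need to know it bypasses the ORM
    public static function skus_for_batch()
    {
        return \DB::select('sku')
            ->from('dg_product')
            ->execute()
            ->as_array();
    }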

    I can't comment on why it runs out of memory without debugging the situation, and currently I don't have time for it. But I will take this with me for the redesign of the ORM for 2.0.

    As to your other comment: different people have different skills, and within a team you try to divide the time available based on those skills. We all have to find time for the project around all our other activities.

    In this case, it's my responsibility to deal with "customer service", I feel it's very important, no matter what product you put in the market. As for Frank and Steve, they feel more comfortable writing code, and that's something that has to be done too.

    FuelPHP is not a single person. We've lost our project lead before, and it didn't really affect us. On the contrary, I think we do better than before. There are several large development agencies that use Fuel for everything they do, including my company. So even if I'm hit by a bus, I'm sure development will go on; too much depends on it.
  • Thank you for that, it helps.  Yes, "customer" service - without it, I'd be lost.  I'm not sure I understand your motivation for helping, but thank you.

    Is there some reason besides efficiency why the Orm is unacceptable for batch operations?  Where is the line between batch operations and normal ones?

    I hadn't before considered that I might run out of memory due to the Orm, and this will require some rethinking of my application.  It seems hard to predict whether some deeply-nested relations will end up outgrowing the Orm model.  I had liked the idea of recursively modeling whole directories and their contents, but now fear hitting a memory ceiling in doing so.  However, I suppose I could give it more than 32M to play with and extend the limits by a lot, or perhaps it will be best to use the DB class.

    Anyway, thank you again.
  • Well, I've quit using the Orm for this batch operation, and I'm still running out of memory.  Could it be because I'm making 3500+ calls to the DB through \DB?
    I don't think the memory leak is on my end, because:
    - I've opted to use the \File class instead of file handlers.
    - My Thumb class extends nothing, and my only calls inside of it are to \DB, \File, \Config, and \Date. (\Image objects are also used, but I'm still avoiding instantiating any for this operation by commenting out the $image->thumbnail() line.)
    - I'm not caching anything inside of the objects, and there shouldn't be anything sticking around inside of my \Thumb class.

    My app is in the development environment, if that matters.  According to the profiler, the script adds about 2 to 4 MB of used memory for every 100 products I run through it, but while it claims to use 12M after 400 products, it hits the 32M limit before 500.




    public function action_test()
    {
        set_time_limit(0);

        $i      = 0;
        $limit  = 900;
        $offset = 0;

        // plain DB query, no ORM objects involved
        $products = \DB::select('sku')
            ->from('dg_product')
            ->limit($limit)
            ->offset($offset)
            ->execute()
            ->as_array();

        foreach ($products as $key => $product)
        {
            echo $product['sku'] . '<br/>';
            //\Log::debug( $product['sku'] );

            $images = \Thumb::load_dir("product/{$product['sku']}/");

            foreach ($images as $image)
            {
                //$image->thumbnail();

                // log a counter every 10 images processed
                $i++;
                if ($i % 10 == 0)
                {
                    \Log::debug($i);
                }
            }
        }

        exit;
    }
  • My motivation is pretty simple: it comes with the job.

    If you run a project and people start using it, you get to a point where the project becomes so large that those people start depending on it. They become "customers". If you're not prepared to support them, don't start a project like this. The same goes for the roadmap and changes: you can't just introduce something radical, like the move from Laravel 3 to 4 (which is essentially a new framework, so none of your existing apps will work). This is why a lot of open source projects fail: no docs, no support, egocentric developers that don't care about their users.

    Only difference is that we don't charge our customers; donations are at their discretion. ;-)

    As to your issue, the two "issues" with the ORM and batch operations are that instantiating an ORM model object for every record is an expensive operation (so the more records you have, the less efficient it's going to be), and the object cache, which makes you run out of memory with large result sets.

    Since you now clear the cache, and instantiation time is clearly not an issue for you (yet), I think it's safe to say that the ORM is not your (only) issue. Since you don't call the Image class at the moment (which is very memory hungry for certain operations), we can rule that out as a cause.

    In development mode, some additional stuff happens that uses memory, such as error collection and collecting debug and profiling information (have you enabled the profiler, and in particular db profiling?). But I can't think of anything that can eat up 4MB. I have large complex apps running in development, and the complete page request doesn't use 4MB...

    Can you add a
    Profiler::mark_memory(false, 'before Thumb::load_dir');
    $images = \Thumb::load_dir("product/{$product['sku']}/");
    Profiler::mark_memory(false, 'after Thumb::load_dir');
    so you can see if that is responsible for the increase in used memory? (if you don't use the profiler, you have to write the value of memory_get_usage() to the Log).
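
    Without the profiler, that could be as simple as this (sketch):

    \Log::debug('before Thumb::load_dir: ' . memory_get_usage(true));
    $images = \Thumb::load_dir("product/{$product['sku']}/");
    \Log::debug('after Thumb::load_dir: ' . memory_get_usage(true));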

    The only other thing I can think of is to install xdebug, run an xdebug trace, and use kcachegrind (linux) or webgrind to analyse the trace. It should tell you exactly which functions or methods are responsible for the memory increase.

    We have someone profiling the framework on a near-daily basis; if it were something structural, I'm sure he would have reported it by now.

  • I tested it using a Task, and it worked!  Dropped the 32M limit to 4M
    and it still worked - processed all ~700 entries with ~3500 images
    without a hiccup.  Tasks and the Cli class are really nice to work with.
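
    For anyone who hits the same thing, the shape of the Task version is roughly this (a sketch; the class/file name "thumbs" is mine, the table and the \Thumb helper are from the code above), run from the command line with "php oil refine thumbs":

    // fuel/app/tasks/thumbs.php
    namespace Fuel\Tasks;

    class Thumbs
    {
        public static function run()
        {
            $products = \DB::select('sku')->from('dg_product')->execute()->as_array();

            foreach ($products as $product)
            {
                \Cli::write($product['sku']);   // progress output via the Cli class

                foreach (\Thumb::load_dir("product/{$product['sku']}/") as $image)
                {
                    $image->thumbnail();
                }
            }
        }
    }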

    Thank you for all of that, it leaves me feeling much better about FuelPHP.  I think there is a place for a fast-moving framework like Laravel, and we can all benefit from their innovations, but I'm glad to be basing my project on something where the devs are more concerned with stability.  Not only do I want my app to continue working, but learning new stuff takes time and energy, and it's kind of disheartening when you're really just relearning how to do old stuff in a new framework.  However, I've heard hearsay that Laravel development was supposed to stabilize after 4.0.

    I checked the xdebug trace in kcachegrind and didn't find much.  I don't know what I'm looking for, but nothing jumped out at me.  It seemed to be showing processing time, not memory usage, and I didn't see a way to show memory usage, so wasn't able to get much relevant info.  I can send or upload my xdebug file, if you like.
  • Found this quote by MaWoe on StackOverflow:
    -------------------------------------------------------------------
    On http://www.xdebug.org/updates.php, for Xdebug 2.0.4 they write in the "removed functions" section: "...Removed support for Memory profiling as that didn't work properly...". Hence xdebug won't be an option
    -------------------------------------------------------------------
    source: http://stackoverflow.com/questions/255941/tools-to-visually-analyze-memory-usage-of-a-php-app
  • I'll discuss this with our profiler, and ask him how he traces memory usage. I haven't used xdebug for that purpose in a long time.
  • Ok. 
    In case it got lost in my mountain of text:
    this same function works as expected (no memory leak) when I use a Task instead of a controller.

    What should I look into disabling?  Or is that what you need to discuss with your profiler?
  • I got that, but I don't understand it.

    The only difference between a task and a browser request is the request setup, which is done only once, and doesn't really store much data. Other than that, the code executed in both cases is the same.

    Can you run an xdebug trace again on the controller code, with xdebug.trace_format set to 1? And email me the resulting .xt file (wanwizard<at>fuelphp.com), so I can have a look at the memory usage?
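
    If it helps, the trace for just this request can be started from inside the controller with xdebug's trace functions (a sketch; the output path is illustrative, and xdebug.trace_format = 1 still needs to be set in php.ini):

    public function action_test()
    {
        xdebug_start_trace('/tmp/thumbs-trace');   // produces /tmp/thumbs-trace.xt

        // ... the product/image loop from above ...

        xdebug_stop_trace();
        exit;
    }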
