Re: Large Queryset Calculation In Background?
Thanks, Nik: that looks very handy. I didn't realize you could renice
a process from within! So now we just have to figure out whether it
blocks database queries at all, since if it does, then the longer it
runs the bigger a problem we have.
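
For anyone following along, here is a minimal sketch of renicing from within the script, assuming a UNIX host. `lower_priority` is a hypothetical helper name, not something from Nik's message; the only real API used is `os.nice`. Note that niceness only affects CPU scheduling, so it says nothing about the database question.

```python
import os


def lower_priority(increment=10):
    """Raise this process's niceness by `increment`; return the new value.

    Returns None on platforms without os.nice (e.g. Windows).
    """
    if not hasattr(os, "nice"):
        return None
    # os.nice adds `increment` to the current niceness and returns the result.
    # An unprivileged process can only raise its niceness, never lower it.
    return os.nice(increment)


# Hypothetical usage at the top of the cron script, before the heavy loop:
#     lower_priority(10)
```

Calling `os.nice(0)` returns the current niceness without changing it, which is handy for logging what the job actually ran at.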
Andre: yes, that's why I'm doing the calculation overnight and caching
it in the DB. But the data it's based on changes almost daily and we
want the calculated version we're working with to be no more than 48
hours old -- hence the nightly recalculation. I'd love tips on how to
get that nightly recalculation not to bring the entire site grinding
to a halt. ;-)
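
One idea I'm considering, sketched below under assumptions: process the rows in small batches and sleep briefly between batches, so the job never holds the CPU for long stretches. `in_batches` and `recalculate_all` are hypothetical names, and the batch size and pause are guesses that would need tuning. Nothing here has been measured against the real dataset.

```python
import time
from itertools import islice


def in_batches(iterable, size):
    """Yield lists of up to `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def recalculate_all(items, recalculate, batch_size=100, pause=0.5):
    """Call `recalculate` on every item, pausing after each batch.

    Returns the number of items processed.
    """
    processed = 0
    for batch in in_batches(items, batch_size):
        for item in batch:
            recalculate(item)
            processed += 1
        # Give the web processes a chance to run between batches.
        time.sleep(pause)
    return processed


# Hypothetical Django usage (not runnable outside the project):
#     def recalc(thing):
#         thing.calculated_data = calculation_on_thing_and_its_otherthings(thing)
#         thing.save()
#     recalculate_all(Thing.objects.all().iterator(), recalc)
```

Saving each row as it is finished, rather than everything at the end, also means a crash partway through loses only the current batch's worth of work.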
On Nov 22, 7:51 pm, Andre Terra <andrete...@gmail.com> wrote:
> You will definitely need to look into caching those results, perhaps
> "permanently" in a database.
>
> My recommendation is redis[1] and possibly tools like sebleier's
> django-redis-cache[2]. Cache invalidation is a pain, I know, but it's
> pretty much the only way to go.
>
> Long term, you will need to profile the bottlenecks and dive into the
> django generated SQL to find if you can tune it by refactoring, and
> possibly switching to either .raw() or .sql() in some cases.
>
> There are plenty of presentations from python/django conferences out there
> that touch on the subject of ORM optimization, so don't be afraid to google.
>
> Good luck!
>
> Cheers,
> AT
>
> [1] http://redis.io
> [2] https://github.com/sebleier/django-redis-cache
>
> On Tue, Nov 22, 2011 at 9:04 PM, Nikolas Stevenson-Molnar <nik.mol...@consbio.org> wrote:
> > I wouldn't expect it to lock the database (though someone with more
> > database expertise should address that). I *would* expect it to consume
> > significant CPU. If you're on UNIX, you could address this issue by making
> > your process 'nice': http://docs.python.org/library/os.html#os.nice The
> > nicer a process (higher the value), the less CPU it will hog. IIRC, nice
> > values default to 0 for processes and range from -20 (biggest CPU usage) to
> > +19 (smallest CPU usage).
>
> > _Nik
>
> > On 11/22/2011 2:37 PM, Nan wrote:
>
> > Hi folks --
>
> > I need to run a fairly CPU-intensive calculation nightly over a
> > dataset that's already large and growing quickly. I'm planning to run
> > this via a cron job, but would like to make sure that it neither eats
> > up the entire CPU nor locks the database, so that my site can continue
> > functioning in the meantime. The rough outline of what it needs to do
> > is as follows:
>
> > class OtherThing(models.Model):
> >     anotherthing = models.ManyToManyField(Whatever)
> >     ...
>
> > class Thing(models.Model):
> >     other_things = models.ManyToManyField(OtherThing,
> >                                           through='SomethingElse')
> >     ...
>
> > for thing in Thing.objects.select_related('other_things',
> >         'other_things__anotherthing__etc'):
> >     # this mainly involves serialization to a great depth
> >     calculated = calculation_on_thing_and_its_otherthings(thing)
> >     thing.calculated_data = calculated
> >     thing.save()
>
> > Will the above approach lock the database for a while or eat tons of
> > CPU? Any suggestions? I'm using Django 1.2, btw.
>
> > Thanks,
> > -Nan
>
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Django users" group.
> > To post to this group, send email to django-users@googlegroups.com.
> > To unsubscribe from this group, send email to
> > django-users+unsubscribe@googlegroups.com.
> > For more options, visit this group at
> > http://groups.google.com/group/django-users?hl=en.