Sunday, March 17, 2013

Re: Performance when retrieving data from related models

You're over thinking this by a large margin. getTendersForContractor should return a QuerySet. That way you can add things like .select_related as necessary in the view. 

Also, I'm a bit worried about the fact you have a function called getTendersForContractor. It really should be Tender.objects.forContractor. Or if the function is just a single filter call, it can be eliminated entirely.

On Mar 15, 2013, at 9:04 PM, Taras_96 wrote:

qs = Tender.objects.sql_related(/*names of keyed relationships here*/).prefetch_related('project', 'project__job_set') (the actual example has many more relations in the prefetch_related). This fixes the problem of hitting the database in loops such as the above, and the complexity can be hidden in a data retrieval layer that the client calls into.

returnedTenders = getTendersForContractor(contractor) # << the actual query with prefetches etc is done in this function

However, there are a few drawbacks to this:
[SNIP]
* Each prefetch_related hits the database once - a lot better than 3k hits, but not as minimal as a sql join which joins all of the tables
Actually, no. If you were to try and do the prefetch_related as a single SQL join, you would return an insane number of rows. A single prefetch_related wouldn't be a problem, but if you were to try and do a select with more than one many field, they would multiply. If you have an example of a join statement that doesn't do that, we'd love to see it.

* Prefetch_related selects related obects via the criteria: 'in (list of ids'), which arguably can be slower
What database are you using where this is true?

* Seems brittle - if you change the data model, you have to change all of the queries in the data retrieval layer. Denomralising the data would reduce this, but we'd still have the problem of retrieving more data than the client may possibly want
Yes, if you change code in one place, you might have to change it someplace else to match it. Most of us just use the ORM as the data retrieval layer. Again, I'm worried you're over thinking this.


Peter of the Norse



0 Comments:

Post a Comment

Subscribe to Post Comments [Atom]

<< Home


Real Estate