Tuesday, October 15, 2013

Re: django cron job - stops after reading some portion of huge file - why is this?

Awesome, then let me try the things you mentioned. I'll let you know how it goes. Thanks a ton for now.

On Tuesday, October 15, 2013 5:04:02 PM UTC+2, ke1g wrote:
Yes, you should split the db activity into sensible transactions, since information about how to roll back is being stored somewhere (though some DBs may not have a problem with this).
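
As a rough illustration of "sensible transactions", here is a minimal sketch that commits every N rows instead of once at the very end. It uses the standard-library sqlite3 module purely as a stand-in so that it runs anywhere; the table `item`, the function `save_in_batches` and the batch size are made up for the example and are not taken from the pastebin code. With Django and PostgreSQL the same idea would be expressed through Django's own transaction handling.

```python
import sqlite3


def save_in_batches(conn, rows, batch_size=1000):
    """Insert rows, committing every batch_size rows instead of once at the end."""
    cur = conn.cursor()
    pending = 0
    commits = 0
    for name, price in rows:
        cur.execute("INSERT INTO item (name, price) VALUES (?, ?)", (name, price))
        pending += 1
        if pending >= batch_size:
            conn.commit()  # close the current transaction, bounding rollback state
            commits += 1
            pending = 0
    if pending:
        conn.commit()  # commit the final, partially filled batch
        commits += 1
    return commits


# Hypothetical data: 2500 generated rows, fed lazily from a generator.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT, price REAL)")
rows = (("item-%d" % i, float(i)) for i in range(2500))
n_commits = save_in_batches(conn, rows, batch_size=1000)
row_count = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
```

Because the rows come from a generator, only one batch of uncommitted work exists at any time. If the job dies partway through, everything up to the last commit is already saved.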

You've added a whole new dimension when you say that this data is not, in fact, a local file that you are reading, but a network request.  There are many more things between you and the data source that could have trouble with the large data size.  I suspect that the most likely is that the server limits the time allowed for the request to complete.  Hopefully a server with such a limit provides for restarting the transfer from other than the beginning.
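
If the server does support restarting a transfer, the usual mechanism over HTTP is a `Range` request header. The sketch below assumes the source is plain HTTP and that the server honours `Range`, which not every server does. The URL and the helper names `build_resume_request` and `resume_download` are hypothetical. It is written for Python 3's `urllib.request`.

```python
import urllib.request


def build_resume_request(url, offset):
    """Build a GET request asking the server for bytes from `offset` onward."""
    return urllib.request.Request(url, headers={"Range": "bytes=%d-" % offset})


def resume_download(url, dest_path, offset, chunk_size=1024 * 1024):
    """Append the remainder of `url` to `dest_path`, starting at byte `offset`."""
    req = build_resume_request(url, offset)
    with urllib.request.urlopen(req, timeout=60) as resp:
        # 206 Partial Content means the server honoured the Range header.
        if resp.status != 206:
            raise RuntimeError("server ignored Range header, status %d" % resp.status)
        with open(dest_path, "ab") as out:
            while True:
                chunk = resp.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)


# Placeholder URL; building the request does not touch the network.
req = build_resume_request("http://example.com/feed.xml", 1024)
```

Downloading to a local file first and parsing it afterwards also separates "the transfer stalled" from "the parser or the database stalled", which makes the freeze easier to locate.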

I'm sorry, but I don't have the spare cycles to debug this for you.  Try instrumenting things to confirm whether it is a read on the source that is hanging or something else.  Since it's hard to get data from a hung process, this requires some imagination.  You could write to a file an indication of the point in the code when you are about to read the source, when the source completes, when you are about to talk to the database, when that completes, etc., but note that you must close the file after each write (and open it anew before the next) since otherwise the write may be buffered in the process when it hangs.  All those opens and closes will be slow, so if you feel adventurous, a write to a piece of shared memory, shared with a monitoring process, might be better.
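
The open, write, close approach described above could look roughly like this. The function name `checkpoint`, the log file name and the messages are placeholders. The `os.fsync` call is an extra precaution beyond simply closing the file.

```python
import os
import tempfile
import time


def checkpoint(log_path, message):
    """Append one timestamped line, then close the file so nothing stays buffered."""
    with open(log_path, "a") as log:
        log.write("%.3f %s\n" % (time.time(), message))
        log.flush()
        os.fsync(log.fileno())  # ask the OS to push the line to disk as well


# Example use around the suspicious steps of the job.
log_path = os.path.join(tempfile.mkdtemp(), "cron_trace.log")
checkpoint(log_path, "about to read source")
checkpoint(log_path, "read complete")

with open(log_path) as f:
    trace_lines = f.read().splitlines()
```

After a hang, the last line in the trace file shows which step was entered but never finished.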

If you find something other than the read on the source not returning, write again and I, or someone else, will think with you some more.



On Tue, Oct 15, 2013 at 9:59 AM, doniyor <doniy...@gmail.com> wrote:
Yes, the db code is doing all these calls in a single transaction. I mean, I am not using transactions; maybe this is the reason?

This is my cron code: http://pastebin.com/Lrym1z8E (I know, it is very ugly code). It is saving at least some objects into the db.


Also, I noticed now that in the db there are objects with some fields not fully filled in, even though the XML file does have that information. That means this is a transaction issue, right?

Could you please take a look at the code? Would transactions solve this issue?


On Tuesday, October 15, 2013 3:40:42 PM UTC+2, ke1g wrote:
One possibility is that your code keeps all that is read (or something derived from it) in memory, and you are running out.
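
To illustrate the memory point: if the input is XML, a streaming parser can hand over one record at a time instead of building the whole tree. This is a minimal sketch using the standard library's `xml.etree.ElementTree.iterparse`. The element names `items`, `item`, `name` and `price` are invented for the example, and `iter_records` is a hypothetical helper.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag):
    """Yield a dict of child tag -> text for each <tag> element, then empty it."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield dict((child.tag, child.text) for child in elem)
            elem.clear()  # drop the element's children, text and attributes


# Tiny in-memory stand-in for the real (much larger) document.
sample = io.BytesIO(
    b"<items><item><name>a</name><price>1</price></item>"
    b"<item><name>b</name><price>2</price></item></items>"
)
records = list(iter_records(sample, "item"))
```

Note that `elem.clear()` empties each processed element but leaves the emptied element attached to its parent, so memory still grows slowly on a very large file. It grows far less than when the whole tree is kept.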

E.g., is your database code trying to do all this in a single transaction?

Another possibility is that something in the file at that spot triggers a bug in your code that contains an infinite loop.

There are other possibilities.  But there's no diagnosing it with the information you've given.

Can you, in python, read through the file, doing nothing with the data?  E.g.:

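    # Read the file in 1 MiB chunks and discard the data, counting bytes.
    n = 0
    with open('your/file/path/here', 'rb') as f:  # binary mode: no newline translation
        while True:
            s = f.read(1024 * 1024)
            if not s:  # empty result means end of file
                break
            n += len(s)
            print(n)
    print('done')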

That should work.  If not, does your O/S not correctly handle files that big?

Bill


On Tue, Oct 15, 2013 at 6:55 AM, doniyor <doniy...@gmail.com> wrote:
I am reading a file from a URL, parsing it, and saving some information from it into the db, using a cron job.

I am testing now in my local dev environment.

The problem is: the job reads the file and saves into the db without any problem, but after some time, since the file is very large (approx. >8 GB), the job stops doing anything and freezes without giving any error.

I am using Django 1.4, Python 2.7 and PostgreSQL. Is there any limit for writing into the db? Why is it freezing?


--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To post to this group, send email to django...@googlegroups.com.

Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/b61fec61-7481-4113-ab8c-31b7143df3f5%40googlegroups.com.

For more options, visit https://groups.google.com/groups/opt_out.

