Sunday, April 14, 2013

Reading xpath value error - Lxml

Hi all,

I am trying to scrape information from this page:

http://muaban.net/mua-ban-nha-quan-thu-duc-l5924-c32/quan-thu-duc-ban-nha1lau-2mt-truoc-sau-dg-ng-cong-tru-p-hiep-phu-q9-dt-4x21-5m--id15946781

and this is the code I am using:

import cgi
import lxml.html

link = "http://muaban.net/mua-ban-nha-quan-thu-duc-l5924-c32/quan-thu-duc-ban-nha1lau-2mt-truoc-sau-dg-ng-cong-tru-p-hiep-phu-q9-dt-4x21-5m--id15946781"
# XPath and namespace as reported by the Firefox XPath Checker add-on
xPath = "id('pC_DV_tableHeader')/x:tbody/x:tr[4]/x:td[3]"
namespace = {'x': 'http://www.w3.org/1999/xhtml'}

# Fetch and parse the page, then select the text nodes of the target cell
tree = lxml.html.parse(link)
arrayContent = tree.xpath(xPath + "/text()", namespaces=namespace)

if arrayContent:
    content = cgi.escape(arrayContent[0].encode("utf-8"))

I used the XPath Checker add-on for Firefox to get the XPath expression and the namespace. However, when I run the code, arrayContent is always empty. How can I solve this?
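
For what it's worth, two things commonly produce an empty result in this situation, although neither has been checked against the live page. First, lxml.html builds elements without a namespace, so an XPath that uses the XHTML "x:" prefix will not match them. Second, Firefox inserts a tbody element into its DOM even when the raw HTML has none, so an XPath copied from the browser can point at a node that lxml never sees. The sketch below shows both effects on a small made-up table (SAMPLE_HTML is a hypothetical stand-in, not the real page, and it assumes the served markup has no tbody):

```python
import lxml.html

# Hypothetical stand-in for the real page. Note that the markup has no <tbody>,
# which is an assumption about what the server actually sends.
SAMPLE_HTML = """
<html><body>
<table id="pC_DV_tableHeader">
  <tr><td>a1</td><td>a2</td><td>a3</td></tr>
  <tr><td>b1</td><td>b2</td><td>b3</td></tr>
  <tr><td>c1</td><td>c2</td><td>c3</td></tr>
  <tr><td>d1</td><td>d2</td><td>target</td></tr>
</table>
</body></html>
"""

root = lxml.html.fromstring(SAMPLE_HTML)

# Browser-style XPath: XHTML namespace prefix plus a tbody step.
# Elements parsed by lxml.html carry no namespace, so nothing matches.
namespaced = root.xpath(
    "//x:table[@id='pC_DV_tableHeader']/x:tbody/x:tr[4]/x:td[3]/text()",
    namespaces={"x": "http://www.w3.org/1999/xhtml"},
)

# Same cell without the prefix; "//tr" finds the rows whether or not
# a tbody sits between the table and its rows.
plain = root.xpath("//table[@id='pC_DV_tableHeader']//tr[4]/td[3]/text()")

print(namespaced)
print(plain)
```

If the real page behaves like this sample, dropping the "x:" prefixes and replacing "/x:tbody/x:tr[4]" with "//tr[4]" in the original expression would be the first thing to try.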

--
You received this message because you are subscribed to the Google Groups "Django users" group.