Skip to main content

Posts

Showing posts from April, 2012

Weekend project

Quick weekend project - a website to search for 'halal' status of products in local market. The data was scraped from JAKIM website. The primary motivation for doing this is to search for halal status from my mobile phone - small feature phone with browser (not android). The JAKIM website can't even being displayed on my phone. It used Django at the backend and the well known Twitter Bootstrap for the frontend page. This is my first use of Bootstrap, it simply work out of the box for mobile browser (mine is the old opera mini, 3.0 something I guess). I try to avoid Django for weekend project but the offer django-haystack has for quick search tool is too tempting so I decided to still use Django for this one. The search backend is Whoosh with the integration mostly done by haystack. Scraping JAKIM website not easy, the data are all in heavily nested html tables with no id or class to identify. I use python lxml lib to parse the html. When user search for keyword, it wi...

Django lxml encode error

Python script using lxml library that work fine on console suddenly throwing out error when importing it from django views module. File "lxml.etree.pyx", line 123, in init lxml.etree (src/lxml/lxml.etree.c:160385) TypeError: encode() argument 1 must be string without null bytes, not unicode" It unlikely problem with the encoding of the content I want to parse because it just importing the module and I'm not calling any function that do the parsing yet. Almost giving up in my search until I found this answer[1] on Stackoverflow. It turn out on console I'm using Python 2.6 while mod_wsgi, which run the django app is compiled against python 2.7. [1]:http://stackoverflow.com/questions/9465248/runtimewarning-compiletime-version-2-6-of-module-lxml-etree-does-not-match-ru