Weekend project

A quick weekend project: a website to search for the 'halal' status of products in the local market. The data was scraped from the JAKIM website. My primary motivation was to be able to check halal status from my mobile phone, a small feature phone with a browser (not Android). The JAKIM website can't even be displayed on my phone.

The site uses Django at the backend and the well-known Twitter Bootstrap for the frontend. This is my first time using Bootstrap, and it simply works out of the box on a mobile browser (mine is the old Opera Mini, 3.0-something I guess). I try to avoid Django for weekend projects, but what django-haystack offers as a quick search tool was too tempting, so I decided to use Django for this one anyway. The search backend is Whoosh, with the integration mostly done by haystack.

Scraping the JAKIM website is not easy: the data is all in heavily nested HTML tables with no id or class to identify them. I use the Python lxml library to parse the HTML.

When a user searches for a keyword, the site looks in the Whoosh index first. If nothing is found, it queries the JAKIM site directly and then redirects the user to the same page with their query parameter. To make the results immediately available, I used haystack's real-time index feature, which automatically updates the index once a new item is inserted into the db.

This site has one advantage over the JAKIM site. The search on the JAKIM site is so naively implemented that you can't even search for multiple keywords. For example, searching for "shokubutsu original" yields no results from JAKIM, while my site returns exactly what I want.
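The lookup flow described above (local index first, upstream site as a fallback, then search again) can be sketched roughly as follows. This is not the site's actual code. `LocalIndex` is a hypothetical in-memory stand-in for the Whoosh index, `fetch_upstream` is a hypothetical callable standing in for the JAKIM scraper, and searching again at the end takes the place of the redirect back to the same page.

```python
from typing import Callable, Dict, List

Product = Dict[str, str]


class LocalIndex:
    """Toy stand-in for the search index (hypothetical, not the Whoosh API)."""

    def __init__(self) -> None:
        self._products: List[Product] = []

    def add(self, product: Product) -> None:
        # On the real site, saving a new row would trigger the real-time
        # index update; here we simply append to a list.
        self._products.append(product)

    def search(self, query: str) -> List[Product]:
        # Every keyword must appear in the product name, in any order.
        keywords = query.lower().split()
        if not keywords:
            return []
        return [
            p for p in self._products
            if all(k in p["name"].lower() for k in keywords)
        ]


def search_with_fallback(
    query: str,
    index: LocalIndex,
    fetch_upstream: Callable[[str], List[Product]],
) -> List[Product]:
    """Search the local index; on a miss, pull from upstream and retry."""
    if not query.strip():
        return []
    results = index.search(query)
    if results:
        return results
    # Nothing local: ask the upstream source and index whatever comes back.
    for product in fetch_upstream(query):
        index.add(product)
    # Equivalent of redirecting the user to the same search page.
    return index.search(query)
```

With a fake upstream that returns one made-up product, the first search populates the index and a later search is answered locally, even with the keywords in a different order:

```python
index = LocalIndex()
fake = lambda q: [{"name": "Shokubutsu Original Body Foam"}]  # made-up sample
search_with_fallback("shokubutsu original", index, fake)
search_with_fallback("original shokubutsu", index, fake)  # served from index
```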
