Friday, December 23, 2016

My Python Workflow

This is basically what I do whenever I start on a new Python project.

mkdir project_name
cd project_name
vim buildout.cfg

Where my buildout.cfg looks like:-

[buildout]
parts = main

[main]
recipe = zc.recipe.egg
eggs =
    python-telegram-bot
interpreter = python

Then I just execute the following to get my new Python environment initialized:-


A Python interpreter that already has access to my package dependencies (in this case python-telegram-bot) is ready in ./bin/python.

I'm a fan of buildout and shall write more about it in a later post.

Friday, September 11, 2015

Trying out Ajenti

Ajenti is a server control panel, in the same space as Webmin. It allows you to manage your Linux server directly via a web interface.

In terms of installation, Ajenti really wins. Everything is available as OS packages, so on Ubuntu or Debian it is just a matter of apt-get. The initial installation is covered by a script that you download from Ajenti's website.

wget -O- | sudo sh

One thing that tripped me up is that I thought the web hosting plugins were in the same packages. It turns out they are in a separate package called ajenti-v (it is clearly shown on the website, so my bad). To get the web hosting packages you need to install the ajenti-v package.

sudo apt-get install ajenti-v ajenti-v-nginx ajenti-v-mysql ajenti-v-php-fpm php5-mysql ajenti-v-python-gunicorn

If you have already installed apache2, you need to remove it, as Ajenti-V uses nginx.

The whole experience of getting a Django website running is still not very smooth. Those who are not familiar with Python deployment might have a hard time trying to fit everything together. This is some room for improvement that I want to explore in Ajenti-V. For this to be really usable, we shouldn't have to touch the terminal at all to get things running.

The first problem I got was:-

supervisor FATAL Exited too quickly (process log may have details).

It turned out that I didn't have gunicorn installed in the virtualenv set for my site. There are a lot of things that need to be set up: the virtualenv, installing gunicorn, and the path to the WSGI script.

One thing that took so much of my time was figuring out why nginx kept doing a 301 redirect for static files. In the end, I had to choose the 'root' method instead of 'alias' in the nginx location.

There's a subtle difference between root and alias.
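
To make the difference concrete, here is a small Python sketch that models how nginx maps a request URI to a file path under each directive. It is a simplified model for plain prefix locations only (no regex locations, no try_files), and the paths and function names are made up for illustration. It is not meant to explain the 301 redirect above, only the path mapping.

```python
def resolve_root(root, uri):
    """Model of `root`: the full request URI is appended to the root path."""
    return root + uri


def resolve_alias(location, alias, uri):
    """Model of `alias`: the matched location prefix is replaced by the alias path."""
    if not uri.startswith(location):
        raise ValueError("URI does not match this location prefix")
    return alias + uri[len(location):]


# Hypothetical paths, for illustration only.
uri = "/static/css/site.css"

# location /static/ { root /srv/site; }
print(resolve_root("/srv/site", uri))

# location /static/ { alias /srv/site/assets/; }
print(resolve_alias("/static/", "/srv/site/assets/", uri))
```

With root, the location prefix (/static/) stays part of the file path. With alias, it is dropped, so a location and alias that disagree on the trailing slash produce a slightly different path than you might expect.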

Thursday, August 21, 2014

AWS: Allowing IAM user to manage their own MFA device

When enabling MFA (Multi-Factor Authentication) on the AWS Web Console, only users with admin privileges can configure the MFA device for each IAM user. This poses a problem if your users are not in the same physical location. To allow each IAM user to configure the device on their own, you must add a specific IAM policy:-

If you're using the default PowerUserAccess, that policy also basically removes access to the whole of IAM, so make sure to change that too. The default policy:-

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "NotAction": "iam:*",
      "Resource": "*"
    }
  ]
}

Change that to:-

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:*",
      "Resource": "arn:aws:ec2:*"
    }
  ]
}

Finally, the user also needs at least read-only access to IAM.
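
As a rough illustration of what a self-service MFA policy can look like, here is a Python sketch that builds one as a dict and prints it as JSON. This is an assumption on my part and not necessarily the exact policy referred to above. The account ID is a placeholder, the function name is made up, and the choice of list actions is mine.

```python
import json


def self_service_mfa_policy(account_id):
    """Build a policy letting each IAM user manage only their own MFA device.

    Assumption: a virtual MFA device named after the user, addressed through
    the ${aws:username} policy variable.
    """
    own_resources = [
        f"arn:aws:iam::{account_id}:mfa/${{aws:username}}",
        f"arn:aws:iam::{account_id}:user/${{aws:username}}",
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Manage the user's own device only.
                "Effect": "Allow",
                "Action": [
                    "iam:CreateVirtualMFADevice",
                    "iam:EnableMFADevice",
                    "iam:ResyncMFADevice",
                    "iam:DeactivateMFADevice",
                    "iam:DeleteVirtualMFADevice",
                ],
                "Resource": own_resources,
            },
            {
                # Read-only listing so the console can show users and devices.
                "Effect": "Allow",
                "Action": [
                    "iam:ListUsers",
                    "iam:ListMFADevices",
                    "iam:ListVirtualMFADevices",
                ],
                "Resource": "*",
            },
        ],
    }


# "123456789012" is a placeholder account ID.
print(json.dumps(self_service_mfa_policy("123456789012"), indent=2))
```

Review the generated JSON against the current AWS documentation before attaching it to a group or user.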

Thursday, June 13, 2013

Apache read raw request data

It's not possible to get the whole raw request data from within PHP. You can get the request body using php://input, but not the headers. Searching around, I found that you can log the request data from Apache using mod_dumpio. It will dump the incoming request data to error.log. As mentioned in the docs, for Apache versions < 2.4 you have to set LogLevel to debug. One catch is to make sure none of your virtualhost configs sets a LogLevel less verbose than debug (such as info or warn), otherwise you'll get no output from this module. Also make sure you haven't set LogLevel to debug only to have another LogLevel directive further down the config set it to something else. Happened to me.

Monday, May 27, 2013

Craft HTTP requests using nc

To do some low-level checks on websites, I'd usually use telnet to compose HTTP requests against the server. The main intention is to talk directly to the server port to make sure the problem we have is not caused by some higher-level application. For example, to connect to a server and issue a GET request:-

telnet 80
Connected to
Escape character is '^]'.
Connection closed by foreign host.

There's always a problem with telnet. In the above example, I can only issue a GET request without having a chance to add other HTTP headers such as Host before the server closes the connection. Some websites also time out very quickly when they don't receive any data after the connection is established. And since the above command runs in an interactive session, it's not repeatable or scriptable. Using nc seems to be much better.

Specify the virtual host:-

echo -en "HEAD / HTTP/1.1\r\nHOST:\r\n\r\n" | nc 80

You'll get the output as:-

HTTP/1.1 200 OK
Content-Type: text/html
Last-Modified: Fri, 12 Apr 2013 23:26:51 GMT
Expires: Sun, 26 May 2013 19:40:06 GMT
Cache-Control: max-age=600
Content-Length: 9991
Accept-Ranges: bytes
Date: Sun, 26 May 2013 19:30:07 GMT
Via: 1.1 varnish
Age: 0
Connection: keep-alive
X-Served-By: cache-s34-SJC2
X-Cache: MISS
X-Cache-Hits: 0
X-Timer: S1369596606.987305880,VS0,VE145
Vary: Accept-Encoding

Without virtualhost:-

echo -en "HEAD / HTTP/1.1\r\n\r\n" | nc 80

And the output:-

HTTP/1.1 400 Bad Request
Content-Type: text/html
Content-Length: 166
Accept-Ranges: bytes
Date: Sun, 26 May 2013 19:32:16 GMT
Via: 1.1 varnish
Age: 0
Connection: keep-alive
X-Served-By: cache-s35-SJC2
X-Cache: MISS
X-Cache-Hits: 0
X-Timer: S1369596736.277374744,VS0,VE72
Vary: Accept-Encoding

It allows us to fully compose the request and then send it through the connection nc opened.
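
The same idea can be sketched in Python with a plain socket, which is handy when the check needs to live inside a script. This is a minimal sketch, not a replacement for nc. The function names are made up, example.com is a placeholder host, and the Connection: close header is my addition so the server closes the connection once it has replied.

```python
import socket


def build_request(method, path, host=None):
    """Compose a raw HTTP/1.1 request as bytes, with an optional Host header."""
    lines = [f"{method} {path} HTTP/1.1"]
    if host:
        lines.append(f"Host: {host}")
    # Added for the sketch: ask the server to close the connection after replying.
    lines.append("Connection: close")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_raw(server, port, payload, timeout=5.0):
    """Send the payload over a TCP connection and return everything received."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(payload)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# Example usage (placeholder host, needs network access):
#   reply = send_raw("example.com", 80, build_request("HEAD", "/", host="example.com"))
#   print(reply.decode("latin-1"))
```

Calling build_request without a host reproduces the second case above, where the request goes out with no virtual host specified.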


Wednesday, April 18, 2012

Weekend project

Quick weekend project - a website to search for the 'halal' status of products in the local market. The data was scraped from the JAKIM website. The primary motivation for doing this is to search for halal status from my mobile phone, a small feature phone with a browser (not Android). The JAKIM website can't even be displayed on my phone.

It uses Django at the backend and the well-known Twitter Bootstrap for the frontend page. This is my first use of Bootstrap, and it simply works out of the box for a mobile browser (mine is the old Opera Mini, 3.0 something I guess). I try to avoid Django for weekend projects, but what django-haystack offers as a quick search tool is too tempting, so I decided to still use Django for this one. The search backend is Whoosh, with the integration mostly done by haystack.

Scraping the JAKIM website is not easy; the data is all in heavily nested HTML tables with no id or class to identify them. I use the Python lxml library to parse the HTML. When a user searches for a keyword, the site looks first in the Whoosh index and, if nothing is found, tries to query the JAKIM site directly and then redirects the user to the same page with their query parameter. To make the result immediately available, I used the real-time index feature of haystack, which automatically updates the index once a new item is inserted into the db.

This site has one advantage over the JAKIM site. The search on the JAKIM site was naively implemented, so you can't even search for multiple keywords. For example, searching for "shokubutsu original" yields no result from JAKIM, while my site returns exactly what I want.
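
The lookup flow (local index first, then the remote site, then the index again) can be sketched roughly like this. It is not the site's actual code: the three callables are hypothetical stand-ins for the haystack query, the JAKIM scraper and the model save, and the redirect is modelled simply as searching the index a second time.

```python
def search_with_fallback(query, search_index, fetch_remote, store):
    """Search the local index first, falling back to the remote site on a miss.

    search_index(query) -> iterable of results from the local index
    fetch_remote(query) -> iterable of items scraped from the remote site
    store(item)         -> saves an item; assumed to update the index right away
    """
    results = list(search_index(query))
    if results:
        return results

    fetched = list(fetch_remote(query))
    if not fetched:
        return []

    for item in fetched:
        store(item)

    # Stands in for redirecting the user back to the same search page.
    return list(search_index(query))
```

Passing the three steps in as functions keeps the sketch independent of Django, so the flow can be tried out with plain lists before wiring in the real index and scraper.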

Sunday, April 15, 2012

Django lxml encode error

A Python script using the lxml library that worked fine on the console suddenly threw an error when imported from a Django views module:-

File "lxml.etree.pyx", line 123, in init lxml.etree (src/lxml/lxml.etree.c:160385)
TypeError: encode() argument 1 must be string without null bytes, not unicode

It is unlikely to be a problem with the encoding of the content I want to parse, because I'm just importing the module and not calling any function that does the parsing yet. I had almost given up on my search until I found this answer[1] on Stack Overflow. It turned out that on the console I was using Python 2.6, while mod_wsgi, which runs the Django app, was compiled against Python 2.7.
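
A quick way to spot this kind of mismatch is to record which interpreter is actually running in each environment and compare the two. A minimal sketch, with a made-up function name: call it once from the console and once from inside the web app (for example, log the result from a view).

```python
import platform
import sys


def interpreter_info():
    """Collect the details that identify the running Python interpreter."""
    return {
        "version": platform.python_version(),
        "major_minor": tuple(sys.version_info[:2]),
        "prefix": sys.prefix,
        # Under an embedded interpreter such as mod_wsgi this may not point
        # at a real python binary, so treat it as a hint only.
        "executable": sys.executable,
    }


print(interpreter_info())
```

If major_minor differs between the console and the web app, compiled extensions such as lxml built for one version should not be expected to load under the other.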