Less is more: colordiff and more or less

In the Unix/Linux/Mac OS X world, less is more. Literally, in that ‘less’ fully emulates ‘more’, and figuratively, as it adds useful functionality such as backwards scrolling. So you really want to use ‘less’ instead of ‘more’ for paging another command’s output, e.g.

cat a_long_document.txt|less

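When used to page the output of colordiff, however, ‘less’ displays a mess of escape codes instead of properly displaying colored output like ‘more’ does.
The trick is to use ‘less’ with either the -r or the -R option. Both make ‘less’ pass the color escape sequences through to the terminal instead of showing them as visible text: -r passes all control characters through unmodified, while -R passes only the ANSI color sequences, which usually keeps the screen layout more reliable. I.e.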

colordiff -u file_old.py file_new.py|less -r

or

colordiff -u file_old.py file_new.py|less -R

(try which one works better with your system and terminal)
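To see why a pager can show a "mess", it helps to look at what colored output actually contains. The following is a minimal Python sketch, not colordiff's own code: the color choices (red for removed lines, green for added lines) and the helper names `colorize` and `strip_colors` are assumptions made for illustration.

```python
import re

# SGR ("Select Graphic Rendition") sequences have the form ESC [ <params> m
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")

RED = "\x1b[31m"    # switch the foreground color to red
GREEN = "\x1b[32m"  # switch the foreground color to green
RESET = "\x1b[0m"   # reset all attributes


def colorize(line):
    """Wrap a unified-diff line in color codes (illustrative color choice)."""
    if line.startswith("-"):
        return RED + line + RESET
    if line.startswith("+"):
        return GREEN + line + RESET
    return line


def strip_colors(text):
    """Remove SGR color sequences, leaving only the plain text."""
    return ANSI_SGR.sub("", text)


colored = colorize("-old line")
print(repr(colored))          # repr() makes the otherwise invisible escape bytes visible
print(strip_colors(colored))  # the plain diff line again
```

Without -r or -R, ‘less’ shows those escape bytes as visible text rather than letting the terminal interpret them as colors.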

tiny tiny rss: A great web-based feed reader!

I’ve just installed tiny tiny rss (tt-rss), an open source web-based news reader/aggregator for Atom, RDF and RSS feeds. Configuring it as the default news reader in Firefox is very easy (just click on the corresponding link at the bottom of the preferences page) and convenient.

The installation is pretty straightforward too, but here are a couple of hints for installing it on a Gentoo box:

1. Download the tt-rss-1.3.3.ebuild file and all other files and directories from http://overlays.gentoo.org/svn/proj/sunrise/reviewed/www-apps/tt-rss/ and place them in the www-apps/tt-rss directory (create it) in your local Portage overlay (usually /usr/local/portage).

2. Rename the file to tt-rss-1.3.4.ebuild (= the most recent version at the time of writing, released on Oct 21, 2009), execute ‘ebuild tt-rss-1.3.4.ebuild digest’, set the flags you need (e.g. for mysql and vhosts) and emerge the ebuild.

3. Follow the post-install instructions on the screen (basically the official tt-rss installation notes).

If you intend to use the default, single-process update daemon, you can use the following init files I created (loosely based on Pierre-Yves Landure’s init script):

/etc/init.d/tt-rss:

#!/sbin/runscript
# Copyright 1999-2009 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2

depend() {
    use net
    use mysql
}

#
# Function that starts the daemon/service
#
start() {
    ebegin "Starting $NAME daemon"
    start-stop-daemon --start --quiet --make-pidfile --background --chdir $DAEMON_DIR --pidfile $PIDFILE --exec $DAEMON -- $DAEMON_ARGS
    eend $?
}

#
# Function that stops the daemon/service
#
stop() {
    ebegin "Stopping $NAME daemon"
    start-stop-daemon --stop --quiet --retry=TERM/1/KILL/5 --pidfile $PIDFILE --name $NAME
    eend $?
}

(replace “mysql” with “postgresql” if you use PostgreSQL)

/etc/conf.d/tt-rss:

# Defaults for the Tiny Tiny RSS update daemon init.d script

# Location of your Tiny Tiny RSS installation.
TTRSS_PATH="/var/www/localhost/htdocs/admin/tt-rss"

DAEMON_SCRIPT="update_daemon.php"

DAEMON=/usr/bin/php
DAEMON_ARGS="$TTRSS_PATH/$DAEMON_SCRIPT"
DAEMON_DIR="$TTRSS_PATH"
PIDFILE=/var/run/tt-rss.pid
NAME=tt-rss

(make sure TTRSS_PATH points to your tt-rss installation)

4. Note that for using the default update method, PHP needs to be compiled with pcntl support. If required, set the pcntl flag and remerge PHP.

5. Have fun!

Kimai – Open Source Time Tracking Tool

So far, I’ve always used “good old” spreadsheets for time tracking on projects. Custom ones I pimped up with some nifty formulae, but still just spreadsheets. Advantage: I can easily adjust them to any special needs anytime – be it the inclusion or exclusion of specific work or just a customization of the sheet’s design or layout. The price for this flexibility is the generally higher effort to track the time “manually” rather than using a specialized time tracking tool – which makes time tracking a tedious task.

Of course I’ve evaluated many proprietary and open source time tracking tools over the years, but so far, none of them managed to fully convince me.

Today, I stumbled upon Kimai – an open source, web-based time tracking tool written in PHP. And so far, Kimai looks promising. Installation is dead easy – just make sure you’ve compiled PDO support into PHP (Gentooers: enable the PDO flag for dev-lang/php and remerge php), else the nice web-based installation wizard will abort without printing any error message.

Once you’ve logged in, you’ll be presented with a very clean, intuitive GUI where you can set up customers, projects and tasks. On the top right there’s a big push-button to start/stop/pause the time tracking.

During my quick evaluation, I haven’t found the functionality yet to export the timesheets, but as far as I know, such functionality will be provided by extensions that can be installed. Let’s see. [Addition 20091009: There’s a stats extension quick-hack for Kimai 0.8.x that can be used to list and print selected reports. To use it, simply download it, extract it in the extensions folder and navigate to {Kimai install folder}/extensions/stats/]

Here’s a screenshot of Kimai 0.8.1.890:

Kimai 0.8.1.890 Screenshot

With the currently still very limited feature-set, Kimai doesn’t compete with full-grown project management solutions (I’ve recently seen a quick demo of a very sophisticated and cool, Django-based project management solution I’m not allowed to tell any details about yet). But it looks like a promising start. I hope the Kimai project will gain momentum, grow and mature as there’s definitely a need for open source time tracking tools – particularly web-based ones.

P.S. I haven’t had the time yet to audit Kimai’s source code, but if the orderly, clean GUI is any indication, it can’t be too bad.

Django custom model field for an unsigned BIGINT data type

Web 2.0 social media platforms tend to think “big”. They hence often use big integers (8 bytes / 64 bits long instead of just 4 bytes / 32 bits like a normal integer) for user IDs (or sometimes message IDs) to be prepared for even the most extreme potential future growth of their user base. Usually, these big integers are unsigned, allowing for up to 18’446’744’073’709’551’615 UIDs to be stored (which is probably enough to register the inhabitants of quite a few other blue planets too ;).

Facebook, which currently has more than 300 million active users, also uses a 64 bit unsigned integer for storing user IDs and expects Facebook applications to be able to handle this. Of course, 300 M user IDs would still easily fit into a 32 bit unsigned integer, but Facebook already goes beyond the 32 bit limit by issuing 15 digit UIDs like 100’000’xxx’xxx’xxx to registered test users (which allows Facebook to better distinguish between test accounts and real accounts).
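The limits mentioned above are easy to check with a few lines of Python. This is just a worked arithmetic sketch; the helper name `fits_unsigned` is made up for illustration.

```python
UINT32_MAX = 2 ** 32 - 1       # largest value of a 4 byte unsigned integer
UINT64_MAX = 2 ** 64 - 1       # largest value of an 8 byte unsigned integer
SMALLEST_15_DIGIT = 10 ** 14   # 100'000'000'000'000, the smallest 15 digit number


def fits_unsigned(value, bits):
    """Return True if value can be stored in an unsigned integer of the given width."""
    return 0 <= value <= 2 ** bits - 1


print(UINT32_MAX)                            # 4294967295
print(UINT64_MAX)                            # 18446744073709551615
print(fits_unsigned(300 * 10 ** 6, 32))      # True: 300 M users fit into 32 bits
print(fits_unsigned(SMALLEST_15_DIGIT, 32))  # False: 15 digit UIDs don't
print(fits_unsigned(SMALLEST_15_DIGIT, 64))  # True
```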

Now if you happen to use Django to build your Facebook application, this fact needs special attention as Django doesn’t support 64 bit integer field types for its ORM models by default. As a Django developer, you could thus resort to using a CharField for storing Facebook UIDs (which would be odd) or, better, define a custom model field you can use in your models instead of IntegerField. Fortunately, Django offers an elegant way to define custom model fields. You can write your custom PositiveBigIntegerField by simply subclassing (extending, inheriting from) models.PositiveIntegerField.

So, in your models.py add the following code:

from django.db import models
from django.db.models.fields import PositiveIntegerField

class PositiveBigIntegerField(PositiveIntegerField):
    """Represents MySQL's unsigned BIGINT data type (works with MySQL only!)"""
    empty_strings_allowed = False

    def get_internal_type(self):
        return "PositiveBigIntegerField"

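    def db_type(self, connection=None):
        # Django 1.2 and later pass the connection; Django 1.1 calls db_type() without arguments
        # This is how MySQL defines 64 bit unsigned integer data types
        return "bigint UNSIGNED"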

class Mytest(models.Model):
    """Just a test model"""

    huge_id = PositiveBigIntegerField()

    def __unicode__(self):
        return u'id: %s, huge_id: %s' % (self.id, self.huge_id)

(NB: The “Mytest” class is just for testing the PositiveBigIntegerField definition; it’s not part of the PositiveBigIntegerField definition.)

Note that this solution only works for MySQL as a database backend (as MySQL supports the “bigint UNSIGNED” data type for columns which isn’t defined in the SQL standard).

For testing, define a “Mytest” model as shown above and execute “python manage.py syncdb” to create a new myapp_mytest table with an unsigned bigint(20) column named huge_id. Register this new model “Mytest” in admin.py, restart runserver and you’ll be able to enter 64 bit integer values through Django’s admin application.

The only minor “issue” is that Django admin’s CSS class (.vIntegerField) used for HTML form input fields representing integer values defines the width as “5em” which is a bit too narrow to display the entire 64 bit integer. This can be adjusted however (e.g. by writing your own ModelForm and telling ModelAdmin to use that, see the Django admin documentation and the Widget.attrs documentation).

P.S. Note that for Django to be able to access and use a “bigint UNSIGNED” data type, you don’t necessarily need to define a PositiveBigIntegerField and adjust your models. Instead, you could simply adjust the column type in MySQL accordingly as a quick fix. However, if you use syncdb (like most Django devs) and want it to create your tables and columns correctly, defining a custom model field as described is the way to go and strongly recommended for consistency and QA.

Gentoo: Mailman 2.1.11 incompatibility with Python 2.6

Mailman 2.1.11 (or earlier) isn’t compatible with Python 2.6. So if you upgrade your box to Python 2.6 without upgrading to Mailman 2.1.12 at the same time, you’ll run into trouble. And if you’re unlucky (e.g. when running some low-volume mailing lists), you might not even notice it.

One sign of such troubles is that messages sent to your Mailman mailing lists aren’t processed anymore. They simply seem to get “swallowed” by your Mailman server: Messages don’t reach the list, don’t get forwarded to subscribers and there’s no bounce or failure notice. For low-traffic lists, this might go unnoticed for several days or even weeks.

Another sign (that can easily slip through under certain, quite common circumstances) is that as long as you’re running Mailman 2.1.11 with Python 2.6, cron will send an error message to mailman@myhost.com every 5 minutes, with a subject similar to

Cron <mailman@myhost> /usr/bin/python -S /usr/lib64/mailman/cron/gate_news

and a content like

/usr/lib64/mailman/Mailman/Utils.py:32: DeprecationWarning: the sha module is deprecated; use the hashlib module instead
  import sha
Traceback (most recent call last):
  File "/usr/lib64/mailman/cron/gate_news", line 44, in <module>
    from Mailman import MailList
  File "/usr/lib64/mailman/Mailman/MailList.py", line 51, in <module>
    from Mailman.Archiver import Archiver
  File "/usr/lib64/mailman/Mailman/Archiver/__init__.py", line 17, in <module>
    from Archiver import *
  File "/usr/lib64/mailman/Mailman/Archiver/Archiver.py", line 32, in <module>
    from Mailman import Mailbox
  File "/usr/lib64/mailman/Mailman/Mailbox.py", line 21, in <module>
    import mailbox
  File "/usr/lib64/python2.6/mailbox.py", line 19, in <module>
    import email.message
ImportError: No module named message

That by itself wouldn’t be a big problem and would probably get noticed quickly if you forwarded mailman@myhost.com to your admin’s e-mail address. The problem arises if you happen to host a mailing list named “mailman” (which is often set up as a mailing list for testing and debugging). In that (not uncommon) case, all these cron error messages are forwarded to the “mailman” mailing list. At one message every 5 minutes, that makes 288 messages per day that are queued in /var/lib/mailman/qfiles/in and won’t be delivered (due to the compatibility problem with Python 2.6).
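The figure of 288 messages per day follows directly from the 5 minute cron interval. A quick sketch of the arithmetic, which also shows how fast the queue grows if the problem goes unnoticed for a while (the function name `queued_messages` is just for illustration):

```python
MINUTES_PER_DAY = 24 * 60
CRON_INTERVAL_MINUTES = 5  # gate_news runs every 5 minutes


def queued_messages(days, interval_minutes=CRON_INTERVAL_MINUTES):
    """Number of cron error messages piling up over the given number of days."""
    return days * MINUTES_PER_DAY // interval_minutes


print(queued_messages(1))   # 288 messages after one day
print(queued_messages(14))  # 4032 messages after two weeks
```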

Once you upgrade to Mailman 2.1.12 (and restart /etc/init.d/mailman), all these messages in /var/lib/mailman/qfiles/in will be queued by your MTA (e.g. postfix). This may seem like a strange loop problem in Mailman (or your MTA configuration), which is quite irritating at first.

Here’s how to solve the problem (you may need to adjust these steps to your settings/system paths):

1) Temporarily stop the Mailman service

/etc/init.d/mailman stop

2) Delete all the queued messages to/for mailman-owner@myhost.mydomain.com in your MTA’s mail queue. For postfix as MTA, the following script may be helpful: mailq by Dan Mick.

3) Delete all the cron-generated error messages in /var/lib/mailman/qfiles/in.

In order to determine these error messages, use Mailman’s show_qfiles command to view the message content, e.g.

/usr/lib64/mailman/bin/show_qfiles /var/lib/mailman/qfiles/in/longnumber.longnumber.pck

The best way to identify these messages is by filtering them according to their file size. Usually they have sizes around 1600 bytes. E.g. for a file size of 1635 bytes, try something similar to:

cd /var/lib/mailman/qfiles/in

find ./ -size 1635c -exec rm {} \;

4) Once you’ve deleted all these cron error messages in Mailman’s in-queue, you can restart Mailman (/etc/init.d/mailman start). Mailman will then start delivering the remaining valid files in its in-queue. Your MTA/postfix queue looks normal again (i.e. there’s no more overflowing)!

5) Finally, you need to manually “discard all messages marked Defer” for the ‘mailman’ mailing list using the web admin interface (usually on https://myhost.mydomain.com/mailman/admindb/mailman or similar). Before discarding these messages, make sure you don’t discard any valid messages.
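If you’d rather review the candidates from step 3 before deleting anything, here is a minimal Python sketch of the same size-based filter with a dry-run mode. The function names are made up for illustration, and the queue path and the size of 1635 bytes are simply the example values from above; adjust them to your system. Unlike find, this sketch doesn’t descend into subdirectories.

```python
import os

QUEUE_DIR = "/var/lib/mailman/qfiles/in"  # example path from above, adjust as needed


def files_with_size(directory, size_in_bytes):
    """Return the sorted paths of regular files in directory with exactly that size."""
    matches = []
    for entry in os.scandir(directory):
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_size == size_in_bytes:
            matches.append(entry.path)
    return sorted(matches)


def delete_files(paths, dry_run=True):
    """Delete the given files; with dry_run=True only print what would be deleted."""
    for path in paths:
        if dry_run:
            print("would delete: %s" % path)
        else:
            os.remove(path)


if os.path.isdir(QUEUE_DIR):
    # Only lists the candidates; pass dry_run=False once the list looks right
    delete_files(files_with_size(QUEUE_DIR, 1635), dry_run=True)
```

Checking a few of the listed files with show_qfiles before switching off the dry run helps to avoid deleting valid messages that happen to have the same size.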

That’s it!

Thanks to Mark Sapiro AKA msapiro from #mailman @ irc.freenode.net for very useful hints and help!

P.S. Some other helpful resources in case of Mailman problems:

How to find running Mailman instances (qrunners)

How to integrate postfix and Mailman

Gentoo, Django and PyFacebook: Updated Ebuild

If you happen to use PyFacebook on Gentoo, you probably noticed that djangofb.py didn’t create/copy any skeleton files for the Facebook application to be created (for an example of how it should work, take a look at the PyFacebook tutorial).

In order to fix this, just download my update of the pyfacebook-9999.ebuild. This updated ebuild reflects PyFacebook’s move from SVN to Git for versioning, hence ensuring that you always use the latest available (‘official’) sources with the latest fixes (among others, some changes to /pyfacebook/setup.py to fix the djangofb problem). I also recommend watching the forks of PyFacebook for recent changes.

Gentoo: How to fix a broken Python installation

If your Python environment seems to be buggy or broken after a recent Python upgrade, it probably is. To fix it, simply execute the following command on your Gentoo box:

# python-updater -v

(note that this process takes quite some time to complete)

Additionally, you might want to make Gentoo check the dependency tree and rebuild broken packages (related to Python or other packages):

# revdep-rebuild
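To get a rough idea whether the Python installation itself works again after these steps, you can try to import a handful of modules and see which ones fail. This is only a quick sanity check, not a replacement for python-updater or revdep-rebuild. The module list is an arbitrary assumption, so put in whatever your own applications depend on.

```python
import importlib


def find_broken_modules(module_names):
    """Try to import each module and return a dict mapping failures to their error."""
    broken = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # ImportError is typical, but a broken build may raise others
            broken[name] = "%s: %s" % (type(exc).__name__, exc)
    return broken


# Arbitrary sample of modules; replace with the ones you care about
MODULES_TO_CHECK = ["email.message", "hashlib", "sqlite3", "ssl"]

for name, error in sorted(find_broken_modules(MODULES_TO_CHECK).items()):
    print("%s -> %s" % (name, error))
```

If the loop prints nothing, all listed modules could be imported.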

P.S. This is more or less a repost of an earlier post about python-updater.