Rich Atkinson

Rich Atkinson's Personal Blog

Archive for September 2009

Installing Cassandra and Thrift on Snow Leopard – A Quick Start Guide

Update March 25 2010: I will soon update this for Cassandra 0.6 (which is currently in beta). Until then, this process still works; just install Cassandra 0.5.1 instead.

I couldn’t find much in the way of an OS X install guide for Cassandra and (particularly) Thrift, so here’s a brief summary of the steps I took to get Cassandra up and running on Snow Leopard.

Requirements: Xcode (provides Java, Ant, g++, autotools etc. for compiling Thrift) and svn.

I used MacPorts to get “boost”, “pkgconfig” and “libevent”, all used by Thrift.

Part 1: JAVA_HOME

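OS X does not set JAVA_HOME for you; it must be done manually.

A simple way is to add the following line to ~/.bashrc (Terminal opens login shells, which read ~/.bash_profile rather than ~/.bashrc, so put the line there instead if your ~/.bash_profile doesn't source ~/.bashrc):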

export JAVA_HOME=$(/usr/libexec/java_home)

After setting JAVA_HOME, you will need to close and reopen Terminal before the change takes effect.
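A quick way to confirm the variable is visible in the new terminal is a small shell function like the one below. This is only a sketch: check_java_home is a name made up for this example, not something OS X or Java provides.

```shell
# Hypothetical helper (not part of OS X or Java): report whether JAVA_HOME
# is set and points at a directory that actually exists.
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -d "$JAVA_HOME" ]; then
        echo "JAVA_HOME points at a missing directory: $JAVA_HOME"
        return 1
    fi
    echo "JAVA_HOME=$JAVA_HOME"
}
```

Paste it into the terminal and run check_java_home; a non-zero exit status means the export line has not been picked up yet.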

Part 2: Installing Cassandra

# I installed Cassandra into /opt and run it as me.
# /opt and /var are owned by root, so creating these directories needs sudo.
sudo mkdir -p /opt/cassandra
sudo chown -R {you} /opt/cassandra

# Create the log directory
sudo mkdir -p /var/log/cassandra
sudo chown -R {you} /var/log/cassandra

# Cassandra created a directory instead of a file for system.log,
# so create the file up front
touch /var/log/cassandra/system.log

# By default Cassandra 0.4.1 uses /var/lib for data, so...
sudo mkdir -p /var/lib/cassandra
sudo chown -R {you} /var/lib/cassandra

# Now let's get the source
svn co https://svn.apache.org/repos/asf/incubator/cassandra/tags/cassandra-0.4.1 /opt/cassandra/cassandra-0.4.1

# If all is well you should be able to build.
cd /opt/cassandra/cassandra-0.4.1
ant

# If that works, you should be able to run cassandra
bin/cassandra -f

All going well, you should be able to open a new terminal tab and run through the CLI tests here.
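If you'd rather rehearse the directory setup from the top of Part 2 before touching /opt and /var, something like this sketch builds the same layout under a scratch directory. prepare_cassandra_dirs and its root argument are hypothetical names for this example only.

```shell
# Hypothetical helper: create the directories used in Part 2 underneath
# an arbitrary root, so the layout can be tried out without root access.
prepare_cassandra_dirs() {
    root="${1:-}"
    for dir in opt/cassandra var/log/cassandra var/lib/cassandra; do
        mkdir -p "$root/$dir" || return 1
    done
    # Pre-create system.log as a file so it does not end up as a directory
    touch "$root/var/log/cassandra/system.log"
}
```

For example, prepare_cassandra_dirs "$(mktemp -d)" builds the tree in a throwaway directory. It doesn't do the chown step, which only matters for the real root-owned paths.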

Part 3: Installing Thrift

First you’ll need boost, pkgconfig and libevent. So (using macports):

sudo port install boost
sudo port install libevent
sudo port install pkgconfig

Now, with a lack of tags and branches in SVN, I just grabbed trunk (please could someone let me know if this is unwise).

svn co http://svn.apache.org/repos/asf/incubator/thrift/trunk/ /opt/cassandra/thrift

The SVN revision at the time of writing was 817923.

Next, we build it…

(Note the /opt/local references; that’s where MacPorts puts its stuff by default.)

cd /opt/cassandra/thrift
./bootstrap.sh
./configure --with-boost=/opt/local --with-libevent=/opt/local --prefix=/opt/local

# If you get the error:
# ./configure: line 16440: syntax error near unexpected token `MONO,'
# ./configure: line 16440: ` PKG_CHECK_MODULES(MONO, mono >= 2.0.0, net_3_5=yes, net_3_5=no)'

# This is a documented problem. To fix it (assuming you installed pkgconfig from MacPorts):
ln -s /opt/local/share/aclocal/pkg.m4 /opt/cassandra/thrift/aclocal/pkg.m4

# once again:
./bootstrap.sh
./configure --with-boost=/opt/local --with-libevent=/opt/local --prefix=/opt/local
sudo make install
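One small wrinkle with the ln -s fix above: it errors out if the link already exists, and it silently creates a dangling link if pkg.m4 isn't where you expect. A slightly more forgiving version might look like this sketch. link_pkg_m4 is a made-up helper name, and both paths are passed in rather than assumed.

```shell
# Hypothetical helper: link pkg.m4 into Thrift's aclocal directory,
# refusing to create a dangling link and tolerating repeat runs.
link_pkg_m4() {
    src="$1"        # e.g. /opt/local/share/aclocal/pkg.m4
    dest_dir="$2"   # e.g. /opt/cassandra/thrift/aclocal
    if [ ! -f "$src" ]; then
        echo "pkg.m4 not found at $src" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    # -f replaces an existing link, so running this twice is harmless
    ln -sf "$src" "$dest_dir/pkg.m4"
}
```

With the paths from this post, that would be link_pkg_m4 /opt/local/share/aclocal/pkg.m4 /opt/cassandra/thrift/aclocal, followed by the bootstrap and configure steps as before.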

Language bindings

The make install should have installed libraries for ruby, perl, python etc.

EDIT: The Python bindings were installed into /usr/lib/python2.6/site-packages. I had to move them to the site-packages of my Python (the default Apple-provided Python 2.6).

If you want to install them into a virtualenv, you will find the setup.py in /opt/cassandra/thrift/lib/py.

Part 4: Generate “cassandra” from thrift.

Navigate to /opt/cassandra/cassandra-0.4.1/interface

and then…
thrift --gen py:new_style cassandra.thrift

This generates the cassandra Python package (thrift writes it under gen-py/ by default), which you can copy to your project.
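Copying the generated package across can be wrapped in a small function so that a failed thrift run doesn't go unnoticed. This is a sketch under assumptions: copy_generated_package is a hypothetical name, and it expects the generated code to sit in a cassandra directory under whatever output directory you pass in.

```shell
# Hypothetical helper: copy the generated "cassandra" package from the
# thrift output directory into a project directory.
copy_generated_package() {
    gen_dir="$1"       # thrift's output directory, e.g. gen-py
    project_dir="$2"   # where your own code lives
    if [ ! -d "$gen_dir/cassandra" ]; then
        echo "no cassandra package under $gen_dir" >&2
        return 1
    fi
    mkdir -p "$project_dir" || return 1
    cp -R "$gen_dir/cassandra" "$project_dir/"
}
```

Run from the interface directory, that would be something like copy_generated_package gen-py ~/myproject, where ~/myproject is a placeholder for your own project path.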

If you have any suggestions for improvement here, please let me know.

Written by Rich Atkinson

September 23, 2009 at 12:36 pm

Posted in Web Tech

Sydney Dust Storm

Today Sydney is enveloped in a quite spectacular dust storm. Everything is an eerie orange; it really feels post-apocalyptic, or like a scene from a Martian wasteland.

Here’s a photo that Debi took through our apartment window at about 6am this morning:

Sydney Dust Storm - September 23rd 2009

According to the weather bureau, it’s likely to stay around all day.

Written by Rich Atkinson

September 23, 2009 at 8:31 am

Posted in Personal

Cassandra DB

Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store. (apache.org)

Originally developed by Facebook, it is now an Apache project and at present has incubator status.

From my point of view, Cassandra stands out from the crowd of non-relational databases for three reasons:

1. Reference Sites: Apart from Facebook, Cassandra recently replaced MySQL for parts of Digg. Rackspace are also doing something secret with it; although I don’t know what it is, my guess would be something along the lines of Amazon’s SimpleDB.

2. True peer clustering: Cassandra does not require a central master. A key feature of Cassandra is that you can write to any node in the cluster, at any time. Writes are never blocked. The trade-off is that you only get consistency eventually. So transactions aren’t strictly ACID, but depending on what you are doing, that might not matter at all.

3. Column Querying: BigTable really popularised (and proved) the concept of the column database for large scale applications. Cassandra really is similar to BigTable from this perspective, but introduces the SuperColumn.

In addition to those things, a couple of other nice features of Cassandra are:

1. It’s JVM based, which makes it nicely portable. It really was a 1 minute job to get it up and running on Snow Leopard.

2. Cross platform API, via a remote Thrift interface.

My intention is to use Cassandra in our current project, where I need a horizontally scalable data store with geographically separate cluster nodes. Fortunately eventual consistency suits this project very well indeed.

Addendum: Here is the best getting started guide I have found so far.

Addendum 2: Eric Flo also sums up Cassandra nicely, although in a slightly different context.

Written by Rich Atkinson

September 21, 2009 at 10:47 pm

Posted in Web Tech

Snakes on the Web, Jacob Kaplan-Moss

I really enjoyed reading this transcript of Jacob’s 2009 PyCon talks. If you’re at all interested in Python, there are lots of neat insights here:

We need to be thinking about scale from day one. This means being incredibly skeptical of our own work, and continually asking ourselves where it’s going to fail. We need to plan for the day that our framework will be phased out.

http://jacobian.org/writing/snakes-on-the-web/

Written by Rich Atkinson

September 6, 2009 at 12:41 am

Posted in Python

libjpeg and Python Imaging (PIL) on Snow Leopard

EDIT: These packages work a treat:

http://old.nabble.com/Re:-building-PIL-in-Snow-Leopard-p28938239.html

Sometimes OS X could learn a trick from Linux; a great example is package management.

MacPorts isn’t bad, but it’s not a patch on Arch Linux’s AUR for simplicity, and Ubuntu is onto a really good thing with APT.

Installing Python Imaging (PIL) with JPEG support on Snow Leopard isn’t obvious. For anyone struggling with it, here’s a solution:

1. Download the source from http://libjpeg.sourceforge.net/

2. Extract, configure, make:

tar zxvf jpegsrc.v6b.tar.gz
cd jpeg-6b
cp /usr/share/libtool/config/config.sub .
cp /usr/share/libtool/config/config.guess .
./configure --enable-shared --enable-static
make

3. You may need to create the following directories:

sudo mkdir -p /usr/local/include
sudo mkdir -p /usr/local/lib
sudo mkdir -p /usr/local/man/man1

4. Now you can install it as usual.

sudo make install

5. If you want FreeType support, install it now.

6. Finally, you can install PIL. Be sure to activate any virtualenv now if you don’t want to install PIL into the system site-packages.

pip install http://effbot.org/downloads/Imaging-1.1.6.tar.gz
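If PIL still ends up without JPEG support, a first thing to check is whether step 4 actually left a libjpeg library where you expect it. The function below is a rough sketch: has_libjpeg is a made-up name, and the /usr/local/lib default is an assumption based on the directories created in step 3.

```shell
# Hypothetical helper: succeed if any libjpeg.* file (static or shared)
# exists in the given directory; defaults to /usr/local/lib.
has_libjpeg() {
    lib_dir="${1:-/usr/local/lib}"
    for f in "$lib_dir"/libjpeg.*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails
        [ -e "$f" ] && return 0
    done
    return 1
}
```

For example: has_libjpeg || echo "libjpeg not found, re-check step 4".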

At least the native Python 2.6 on Snow Leopard works great, and this wasn’t nearly as painful as installing PIL on Cygwin!

Written by Rich Atkinson

September 5, 2009 at 10:05 pm

Posted in Python