Thursday, August 11, 2011

Easy Cassandra deployments with Ubuntu Server and Juju

** This is an updated post reflecting the new name of the project formerly known as Ensemble, now known as Juju **

Cassandra seems to be a very popular database these days, used by many companies and projects.
From their website: 
The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model.
I am by no means an expert on Cassandra, but I have done some medium-size deployments on Amazon's cloud, so I wanted to translate my knowledge of Cassandra "rings" into a Juju charm that uses the peers interface to expand and contract the ring as needed.

For the impatient, the cassandra juju charm is here

For the rest of us, let's move on to the design goals for the charm:

  • It should be simple ( juju deploy cassandra ... nothing more than that )
  • It should work stand-alone
  • It should be expandable via peers interfaces
    • grow the cluster/ring via juju add-unit cassandra
  • Make use of the Cassandra default configuration as much as possible.
  • Extract common variables from the configuration file(s) into the charm so they can be changed in the future.
The steps to install Cassandra can be distilled down to:

  • add repositories
  • install dependency packages
  • install cassandra
  • modify the configuration 
    • /etc/cassandra/cassandra-env.sh 
    • /etc/cassandra/cassandra.yaml
Now that we know the design goals and we have an idea on what's needed to get Cassandra up and running, let's delve into the charm.


metadata.yaml

name: cassandra
revision: 1
summary: distributed storage system for structured data
description: |
  Cassandra is a distributed (peer-to-peer) system for the management and
  storage of structured data.
provides:
  database:
    interface: cassandra
  jmx:
    interface: cassandra
peers:
  cluster:
    interface: cassandra-cluster

hooks/install
#!/bin/bash

set -ux

export LANG=en_US.UTF-8

# Install utility packages
DEBIAN_FRONTEND=noninteractive apt-get -y install python-software-properties

# Add facter and facter-plugins repository
echo deb http://ppa.launchpad.net/facter-plugins/ppa/ubuntu oneiric main  >> /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
echo deb-src http://ppa.launchpad.net/facter-plugins/ppa/ubuntu oneiric main  >> /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys B696B50DD8914A9290A4923D6383E098F7D4BE4B

#apt-add-repository ppa:facter-plugins/ppa

# Install the repositories
echo "deb http://www.apache.org/dist/cassandra/debian unstable main" > /etc/apt/sources.list.d/cassandra.list
echo "deb-src http://www.apache.org/dist/cassandra/debian unstable main" >> /etc/apt/sources.list.d/cassandra.list

# Add the key
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F758CE318D77295D

# Update the repositories
apt-get update

# Install the package
DEBIAN_FRONTEND=noninteractive apt-get install -y openjdk-6-jre-headless jsvc libcommons-daemon-java adduser libjna-java facter facter-customfacts-plugin
cd /tmp
curl -O http://people.canonical.com/~negronjl/cassandra_0.8.3_all.deb
DEBIAN_FRONTEND=noninteractive dpkg -i /tmp/cassandra_0.8.3_all.deb

HOSTNAME=`hostname -f`
IP=`facter ipaddress`
CWD=`dirname $0`
DEFAULT_JMX_PORT=7199
DEFAULT_CLUSTER_PORT=7000
DEFAULT_CLIENT_PORT=9160
DEFAULT_CLUSTER_NAME="Test Cluster"

# Open the necessary ports
if [ -x /usr/bin/open-port ]; then
   open-port ${DEFAULT_JMX_PORT}/TCP
   open-port ${DEFAULT_CLUSTER_PORT}/TCP
   open-port ${DEFAULT_CLIENT_PORT}/TCP
fi

# Persist the data for future use
fact-add cassandra_hostname ${HOSTNAME}
fact-add cassandra_ip ${IP}
fact-add cassandra_default_jmx_port ${DEFAULT_JMX_PORT}
fact-add cassandra_default_cluster_port ${DEFAULT_CLUSTER_PORT}
fact-add cassandra_default_client_port ${DEFAULT_CLIENT_PORT}
# Quote the value: the cluster name contains a space
fact-add cassandra_default_cluster_name "${DEFAULT_CLUSTER_NAME}"

# Update the cassandra environment with the appropriate JMX port
sed -i -e "s/^JMX_PORT=.*/JMX_PORT=\"${DEFAULT_JMX_PORT}\"/" /etc/cassandra/cassandra-env.sh

# Construct the cassandra.yaml file from the appropriate information above
sed -i -e "s/^cluster_name:.*/cluster_name: \'${DEFAULT_CLUSTER_NAME}\'/" \
       -e "s/\- seeds:.*/\- seeds: \"${IP}\"/" \
       -e "s/^storage_port:.*/storage_port: ${DEFAULT_CLUSTER_PORT}/" \
       -e "s/^listen_address:.*/listen_address: ${IP}/" \
       -e "s/^rpc_address:.*/rpc_address: ${IP}/" \
       -e "s/^rpc_port:.*/rpc_port: ${DEFAULT_CLIENT_PORT}/" \
        /etc/cassandra/cassandra.yaml

service cassandra status && service cassandra restart || service cassandra start
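
To see what the sed edits at the end of the install hook do without touching a real node, here is a minimal sketch that applies the same kind of substitutions to a scratch file. The helper name apply_cassandra_settings, the sample file contents and the 10.0.0.5 address are made up for illustration; only the sed expressions mirror the hook.

```shell
#!/bin/bash
# Sketch only: rewrites a scratch file, not /etc/cassandra/cassandra.yaml.

# apply_cassandra_settings FILE IP CLUSTER_PORT CLIENT_PORT (hypothetical helper)
apply_cassandra_settings() {
    local file="$1" ip="$2" cluster_port="$3" client_port="$4"
    # Write to a temporary copy and move it into place instead of using sed -i.
    sed -e "s/- seeds:.*/- seeds: \"${ip}\"/" \
        -e "s/^storage_port:.*/storage_port: ${cluster_port}/" \
        -e "s/^listen_address:.*/listen_address: ${ip}/" \
        -e "s/^rpc_address:.*/rpc_address: ${ip}/" \
        -e "s/^rpc_port:.*/rpc_port: ${client_port}/" \
        "${file}" > "${file}.new" && mv "${file}.new" "${file}"
}

# Made-up sample containing the keys that the install hook edits.
SAMPLE=$(mktemp)
cat > "${SAMPLE}" <<'EOF'
    - seeds: "127.0.0.1"
storage_port: 7000
listen_address: localhost
rpc_address: localhost
rpc_port: 9160
EOF

apply_cassandra_settings "${SAMPLE}" 10.0.0.5 7000 9160
cat "${SAMPLE}"
```

Running it against a copy of the real cassandra.yaml is a cheap way to check the patterns before putting them in a hook.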

Now we should have enough of a charm to deploy a single Cassandra node.  The other hooks in the charm are:
  • jmx-relation-joined ( mainly to advertise our jmx interface )
  • database-relation-joined ( mainly to advertise our database interface )
  • cluster-relation-joined ( persists some values that need to be available to all nodes in the ring )
  • cluster-relation-changed ( uses the data persisted by cluster-relation-joined to reconfigure Cassandra so that it shares data with the other nodes and forms a ring )
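
As a rough idea of what the cluster-relation-joined hook could look like, here is a hypothetical sketch. It is not the charm's actual hook: the key names ( hostname, ip, jmx_port, cluster_port, client_port ) are taken from what cluster-relation-changed reads back, the helper publish_node_settings and the example values are assumptions, and relation-set is the Juju hook tool for publishing settings to the relation.

```shell
#!/bin/bash
# Hypothetical sketch of hooks/cluster-relation-joined; the real hook may differ.

# Outside a Juju hook the relation-set tool is not on PATH, so fall back to
# echoing the command instead of running it.
if command -v relation-set >/dev/null 2>&1; then
    SET_CMD="relation-set"
else
    SET_CMD="echo relation-set"
fi

# publish_node_settings HOSTNAME IP (hypothetical helper)
# Publishes this node's address and the default ports to its peers.
publish_node_settings() {
    local hostname="$1" ip="$2"
    ${SET_CMD} hostname="${hostname}" ip="${ip}" \
        jmx_port=7199 cluster_port=7000 client_port=9160
}

# In a real hook these values would come from `hostname -f` and `facter ipaddress`.
publish_node_settings node1.example.com 10.0.0.5
```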
The most interesting of these hooks is cluster-relation-changed, so I'll show it here:

hooks/cluster-relation-changed
#!/bin/bash

set -x

CWD=`dirname $0`

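# Read the settings published by each peer and build the seed list
for node in `relation-list`
do
   HOSTNAME=`relation-get hostname ${node}`
   # Ask for this peer's IP explicitly, like the other settings
   IP=`relation-get ip ${node}`
   DEFAULT_JMX_PORT=`relation-get jmx_port ${node}`
   DEFAULT_CLUSTER_PORT=`relation-get cluster_port ${node}`
   DEFAULT_CLIENT_PORT=`relation-get client_port ${node}`
   # Append the IP to the comma-separated list of seeds
   [ -z "${TMP_SEEDS}" ] && TMP_SEEDS=${IP} || TMP_SEEDS="${TMP_SEEDS},${IP}"
done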

sed -i -e "s/\- seeds:.*/\- seeds: \"${TMP_SEEDS}\"/" /etc/cassandra/cassandra.yaml

service cassandra status && service cassandra restart || service cassandra start

echo $JUJU_REMOTE_UNIT modified its settings
echo Relation settings:
relation-get
echo Relation members:
relation-list
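
The core of this hook is building the comma-separated seed list. Here is that step on its own as a small sketch; the helper name join_seeds is made up, and the addresses are the ones from the ring shown later in this post.

```shell
#!/bin/bash
# join_seeds IP... (hypothetical helper)
# Prints its arguments joined with commas, the format the seeds setting expects.
join_seeds() {
    local seeds="" ip
    for ip in "$@"; do
        if [ -z "${seeds}" ]; then
            seeds="${ip}"
        else
            seeds="${seeds},${ip}"
        fi
    done
    echo "${seeds}"
}

join_seeds 10.38.33.97 10.99.45.243 10.245.211.95
# prints 10.38.33.97,10.99.45.243,10.245.211.95
```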

Inspection of the other hooks is left as an exercise to the reader :)

Deploying Cassandra

I'll assume that you have followed Juju's Getting Started Documentation and have Juju properly configured and ready to go.

bzr branch the Cassandra charm ( bzr branch lp:~negronjl/+junk/cassandra )
juju bootstrap ( wait a few minutes while the environment is set up )
negronjl@negronjl-laptop:~/src/juju/charms$ juju bootstrap
2011-08-11 20:45:26,976 INFO Bootstrapping environment 'sample' (type: ec2)...
2011-08-11 20:45:37,804 INFO 'bootstrap' command finished successfully
juju status ( to ensure that the environment is up )
negronjl@negronjl-laptop:~/src/juju/charms$ juju status
2011-08-11 20:47:57,196 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
services: {}
2011-08-11 20:48:02,029 INFO 'status' command finished successfully
juju deploy --repository . cassandra ( to deploy the Cassandra charm )

negronjl@negronjl-laptop:~/src/juju/charms$ juju deploy --repository . cassandra
2011-08-11 20:48:41,251 INFO Connecting to environment.
2011-08-11 20:48:48,659 INFO Charm deployed as service: 'cassandra'
2011-08-11 20:48:48,662 INFO 'deploy' command finished successfully
juju status ( to ensure Cassandra deployed properly )

negronjl@negronjl-laptop:~/src/juju/charms$ juju status
2011-08-11 20:49:25,623 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
  1: {dns-name: ec2-50-19-73-31.compute-1.amazonaws.com, instance-id: i-5f62253e}
services:
  cassandra:
    charm: local:cassandra-1
    relations: {cluster: cassandra}
    units:
      cassandra/0:
        machine: 1
        relations: {}
        state: null <---- NOT READY
2011-08-11 20:49:36,141 INFO 'status' command finished successfully


negronjl@negronjl-laptop:~/src/juju/charms$ juju status
2011-08-11 21:02:36,264 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
  1: {dns-name: ec2-50-19-73-31.compute-1.amazonaws.com, instance-id: i-5f62253e}
services:
  cassandra:
    charm: local:cassandra-1
    relations: {cluster: cassandra}
    units:
      cassandra/0:
        machine: 1
        relations:
          cluster: {state: up}
        state: started  <---- NOW IT IS READY
2011-08-11 21:02:42,506 INFO 'status' command finished successfully
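
Instead of re-running juju status by hand until the unit reports started, the check can be scripted. This is a minimal sketch that assumes the status layout shown above ( one indented "state:" line per unit ); the helper names all_units_started and wait_for_units and the 30-second interval are made up for illustration.

```shell
#!/bin/bash
# all_units_started (hypothetical helper): reads `juju status` output on stdin.
# Succeeds only if there is at least one unit "state:" line and every such
# line says "started". Lines like "cluster: {state: up}" are ignored because
# they do not begin with "state:".
all_units_started() {
    awk '/^ *state:/ { total++; if ($2 == "started") ok++ }
         END { exit !(total > 0 && total == ok) }'
}

# wait_for_units (hypothetical helper): poll until every unit is started.
wait_for_units() {
    until juju status 2>/dev/null | all_units_started; do
        sleep 30
    done
}
```

Call wait_for_units after juju deploy or juju add-unit; it returns once every unit has reached the started state.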
juju ssh 1 ( this will ssh into the Cassandra machine )

Once on the Cassandra machine, verify its status by typing:
  • nodetool -h `hostname -f` ring

ubuntu@ip-10-245-211-95:~$ nodetool -h `hostname -f` ring
Address         DC          Rack        Status State   Load            Owns    Token                                      
10.245.211.95   datacenter1 rack1       Up     Normal  6.55 KB         100.00% 124681228764612737621872162332718392045  
Back on your machine ( not the Cassandra one ), type the following to add more Cassandra nodes:
  • juju add-unit cassandra ( repeat as many times as you want )
negronjl@negronjl-laptop:~/src/juju/charms$ juju status
2011-08-11 21:11:40,367 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
  1: {dns-name: ec2-50-19-73-31.compute-1.amazonaws.com, instance-id: i-5f62253e}
  2: {dns-name: ec2-50-17-90-104.compute-1.amazonaws.com, instance-id: i-e1703780}
  3: {dns-name: ec2-174-129-128-232.compute-1.amazonaws.com, instance-id: i-f1703790}
services:
  cassandra:
    charm: local:cassandra-1
    relations: {cluster: cassandra}
    units:
      cassandra/0:
        machine: 1
        relations:
          cluster: {state: up}
        state: started
      cassandra/1:
        machine: 2
        relations:
          cluster: {state: up}
        state: started
      cassandra/2:
        machine: 3
        relations:
          cluster: {state: up}
        state: started
2011-08-11 21:11:54,132 INFO 'status' command finished successfully
After the new nodes have been deployed ( you can follow the deployment by running juju status ), log back in to the Cassandra node ( juju ssh 1 ) and type:
  • nodetool -h `hostname -f` ring ( to see that the new nodes are being added to the ring )
ubuntu@ip-10-245-211-95:~$ nodetool -h `hostname -f` ring
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               124681228764612737621872162332718392045
10.38.33.97     datacenter1 rack1       Up     Normal  11.06 KB        69.21%  72298506053176682474361069083301352072
10.99.45.243    datacenter1 rack1       Up     Normal  15.34 KB        9.26%   88046943828017032654712668424156081726
10.245.211.95   datacenter1 rack1       Up     Normal  11.06 KB        21.53%  124681228764612737621872162332718392045
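
If you would rather have a number than eyeball the table, the ring output can be counted with a one-liner. This is a small sketch that assumes the column layout shown in this post ( Status in the fourth column, State in the fifth ); the helper name count_up_nodes is made up.

```shell
#!/bin/bash
# count_up_nodes (hypothetical helper): reads `nodetool ring` output on stdin
# and prints how many rows have Status "Up" and State "Normal".
count_up_nodes() {
    awk '$4 == "Up" && $5 == "Normal" { n++ } END { print n + 0 }'
}

# Sample rows copied from the ring output in this post.
count_up_nodes <<'EOF'
Address         DC          Rack        Status State   Load            Owns    Token
10.38.33.97     datacenter1 rack1       Up     Normal  11.06 KB        69.21%  72298506053176682474361069083301352072
10.99.45.243    datacenter1 rack1       Up     Normal  15.34 KB        9.26%   88046943828017032654712668424156081726
10.245.211.95   datacenter1 rack1       Up     Normal  11.06 KB        21.53%  124681228764612737621872162332718392045
EOF
# prints 3
```

On a live node the same helper can be fed directly: nodetool -h `hostname -f` ring | count_up_nodes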
As you can see, once you create a charm on Juju, it's pretty easy to share and use.

If you have feedback about this ( or any other charm ), I would love to hear from you.
Drop me a line.

-Juan
