Thursday, November 29, 2012

Let's shard something

It's been a while so, let's get right to it...

Today I'll be talking about sharded clusters in MongoDB using Ubuntu Precise and Juju.

According the the mongodb documentation found on their website, one way of deploying a Shard Cluster is as follows:
  • deploy config servers
  • deploy a mongo shell (mongos)
  • deploy shards
  • connect the config servers to the mongo shell
  • add the shards to the mongo shell
This is a pretty involved process that normally takes several hours to accomplish.  In this post, I will show you a way of deploying a sharded cluster in minutes using UbuntuJuju and the MongoDB charm.

For the impatient, here is a video of the entire deployment:



For the not so impatient, here is a more detailed explanation of the deployment.

A few things to keep in mind when configuring this sharded cluster configuration:
  • Each shard will be a replica set with three nodes each.  As such, each shard must have a different replica-set name so it can be successfully registered as part of the cluster.
  • We will deploy three configuration servers
Bootstrap the environment
juju bootstrap



Mongo Shell
juju deploy mongodb mongos



As described in the MongoDB charm, there are many options that can be configured to suit your needs.  For this deployment, most of the configuration options will work with the exception of the replicaset which will need to be different for each shard.

Fortunately, a simple yaml file with our configuration overrides is all that's needed.

Prepare a configuration file similar to the following:
shard1:
  replicaset: shard1
shard2:
  replicaset: shard2
shard3:
  replicaset: shard3
configsvr:
  replicaset: configsvr

We'll save this one as ~/mongodb-shard.yaml



Config Servers ( we'll deploy 3 of them )
juju deploy mongodb configsvr --config ~/mongodb-shard.yaml -n3




Shards ( We'll deploy three replica-sets )
juju deploy mongodb shard1 --config ~/mongodb-shard.yaml -n3
juju deploy mongodb shard2 --config ~/mongodb-shard.yaml -n3
juju deploy mongodb shard3 --config ~/mongodb-shard.yaml -n3




Connect the Config Servers to the Mongo shell (mongos)
juju add-relation mongos:mongos-cfg configsvr:configsvr




Connect each Shard to the Mongo shell (mongos)
juju add-relation mongos:mongos shard1:database
juju add-relation mongos:mongos shard2:database
juju add-relation mongos:mongos shard3:database




With the above commands, we should now have a three replica-set sharded cluster running.
Using the default configuration, here are some details of our sharded cluster:
  • mongos is running on port 27021
  • configsvr is running on port 27019
  • the shards are running on the default mongodb port of 27017
  • The web admin is turned on by default and accessible with your browser on port 28017 on each of the shards after exposing the service.



After a few minutes, your sharded cluster should be ready.  Let's verify that everything went as planned:
  • Verify your config servers
    • juju expose configsvr
    • juju status configsvr
    • Open your browser to http://<public-address-of-configsvr:28017
  • Verify that each shard
    • juju expose <shard1|shard2|shard3>
    • juju status <shard1|shard2|shard3>
    • Open your browser to http://<public-address-of-shard>:28017
  • Verify that each shard has been successfully register with the cluster:
    • juju expose mongos
    • juju status mongos
    • mongo --host <public-address-of-mongos>:27021
    • Once connected:
      • sh.status()
      • Results should be similar to the following:
MongoDB shell version: 2.0.6
connecting to: ec2-184-169-254-0.us-west-1.compute.amazonaws.com:27021/test
mongos> sh.status()
--- Sharding Status --- 
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
        {  "_id" : "shard1",  "host" : "shard1/ec2-184-169-236-25.us-west-1.compute.amazonaws.com:27017,ec2-54-241-89-206.us-west-1.compute.amazonaws.com:27017,ip-10-170-173-51:27017" }
        {  "_id" : "shard2",  "host" : "shard2/ec2-204-236-159-194.us-west-1.compute.amazonaws.com:27017,ip-10-170-22-104:27017" }
        {  "_id" : "shard3",  "host" : "shard3/ec2-184-169-219-11.us-west-1.compute.amazonaws.com:27017,ec2-184-169-239-214.us-west-1.compute.amazonaws.com:27017,ip-10-170-215-20:27017" }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }

mongos> 
      • In the above status, you should see the hosts associated with each of your shards.

You now have a MongoDB sharded cluster accessible via the public-address of the mongos instance... enjoy ... and don't forget to destroy the environment when you're done with it:

juju destroy-environment




-Juan

5 comments:

  1. Juan

    I executed every step as specified in this blogpost and landed up with an relation error as specified below

    machines:
    1:
    agent-state: running
    dns-name: ec2-50-112-196-51.us-west-2.compute.amazonaws.com
    instance-id: i-4eb1b87c
    instance-state: running
    services:
    mongos:
    charm: cs:precise/mongodb-12
    exposed: true
    relations:
    mongos:
    - shard1
    - shard2
    mongos-cfg:
    - configsvr
    replica-set:
    - mongos
    units:
    mongos/0:
    agent-state: started
    machine: 1
    open-ports:
    - 27019/tcp
    - 27017/tcp
    - 28017/tcp
    public-address: ec2-50-112-196-51.us-west-2.compute.amazonaws.com
    relation-errors:
    mongos:
    - shard1
    - shard2

    I cannot connect to mongos host .. can you please throw some light on this

    ReplyDelete
  2. Hi Chetan:

    Let me review the post and provide updates. The charms, mongodb, Ubuntu and pretty much the entire stack is an ever evolving ecosystem so, it could very well be that the steps detailed in my blog post require some updating.

    Give me some time to review the steps in the post and I will provide you with an update and/or feedback.

    Thanks,

    Juan

    ReplyDelete
  3. I just tried it and it worked just fine ... I'll try a few more times to see if this is is an intermittent issue.

    In the meantime, you can try juju resolved --retry ... This command will retry the failed hooks. Sometimes it helps.

    Thanks,

    Juan

    ReplyDelete
  4. Hi thank you for the tuto, really helping. Careful, there is a small typo mistake in :
    juju add-realtion mongos:mongos shard1:database
    juju add-realtion mongos:mongos shard2:database
    juju add-realtion mongos:mongos shard3:database
    -->
    juju add-relation mongos:mongos shard1:database
    juju add-relation mongos:mongos shard2:database
    juju add-relation mongos:mongos shard3:database

    To be sure to understand, what would be the process if I want to launch sharding instance without using replica-set?
    Just do one: "juju deploy mongodb shardUnique --config ~/mongodb-shard.yaml -n3" ? And if I want to shard on Y instances: "juju deploy mongodb shardUnique --config ~/mongodb-shard.yaml -nY" ?

    Thank you

    ReplyDelete
  5. Hi Francois:

    Thanks! I just fixed it.

    ReplyDelete