Thursday, August 18, 2011

HPCC with Ubuntu Server and Ensemble

** This is an updated post reflecting the new name of the project: formerly known as Ensemble, it is now known as Juju **

Let's start this post with a bit of background on the technologies that I'll be using:
  • What is Ubuntu?
    • Ubuntu is a fast, secure and easy-to-use operating system used by millions of people around the world.  
    • Secure, fast and powerful, Ubuntu Server is transforming IT environments worldwide. Realise the full potential of your infrastructure with a reliable, easy-to-integrate technology platform.
  • What is Juju?
    • Juju is a next generation service orchestration framework. It has been likened to APT for the cloud. With juju, different authors are able to create service charms independently, and make those services coordinate their communication through a simple protocol. Users can then take the product of different authors and very comfortably deploy those services in an environment. The result is multiple machines and components transparently collaborating towards providing the requested service.
  • What is HPCC?
    • HPCC (High Performance Computing Cluster) is a massive parallel-processing computing platform that solves Big Data problems. The platform is now Open Source!
 Now that we are all caught up, let's delve right into it.  I will be discussing the details of my newly created hpcc juju charm.

The hpcc charm has been one of the trickiest ones to date to get working properly, so I want to take some time to explain some of the challenges that I encountered.

hpcc seems to use ssh keys for authentication and a single xml file to hold its configuration.  All nodes that are part of the cluster should have identical keys and an identical xml configuration file.

  • The ssh keys are pretty easy to do ( there is even a script that will do it all for you located at /opt/HPCCSystems/sbin/keygen.sh ).  You can just run: ssh-keygen -f path_where_to_save_keys/id_rsa -N "" -q
  • The configuration file environment.xml is a lot trickier to configure, so I will use cheetah templates to turn this enormous file into a template.
    • According to their website:
      • Cheetah is an open source template engine and code generation tool, written in Python. It can be used standalone or combined with other tools and frameworks. Web development is its principle use, but Cheetah is very flexible and is also being used to generate C++ game code, Java, sql, form emails and even Python code.
With cheetah, I can create self contained templates that can be generated into their intended file by just calling cheetah.  This is because we can embed python code inside the template itself, making the template ( environment.tmpl in our case ) more or less a python program that generates a fully functional environment.xml file ready for hpcc to use.

Another very important reason to use a template engine is the ability to create identical configuration files on each node without having to pass them around.  In other words, each node can create its own configuration file and, since all nodes are using the same methods and data to create the file, the files will all be exactly the same.
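
To make the "same methods and data, same file" idea concrete, here is a minimal Python sketch. It does not use cheetah or juju: render_config and the sample member records are hypothetical stand-ins for the template and the relation data. It only shows that sorting on a shared key makes the output independent of the order in which a node learns about its peers.

```python
import hashlib


def render_config(members):
    """Render a tiny config from (install_time, name, address) records.

    Sorting by install time makes the output independent of the order
    in which a node happened to learn about its peers.
    """
    lines = ["<Hardware>"]
    for install_time, name, address in sorted(members):
        lines.append(' <Computer name="%s" netAddress="%s"/>' % (name, address))
    lines.append("</Hardware>")
    return "\n".join(lines) + "\n"


# Hypothetical data: two nodes see the same members in a different order.
seen_by_node_a = [("1313625600", "node-a.example", "10.0.0.1"),
                  ("1313625700", "node-b.example", "10.0.0.2")]
seen_by_node_b = list(reversed(seen_by_node_a))

digest_a = hashlib.sha256(render_config(seen_by_node_a).encode()).hexdigest()
digest_b = hashlib.sha256(render_config(seen_by_node_b).encode()).hexdigest()
print(digest_a == digest_b)  # True
```

Comparing digests like this is also a cheap way to check, on a real cluster, that every node ended up with the same file.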

The hpcc configuration file is huge, so I'll just talk about some of the interesting bits of it here:
#import random
## subprocess is used to shell out to facter, hostname and the juju relation tools.
#import subprocess
## Start with this node's own facts, keyed by its install time.
#set $rel_structure = { $subprocess.check_output(['facter', 'install_time']).strip() : { 'name' : $subprocess.check_output(['hostname', '-f']).strip(), 'netAddress' : $subprocess.check_output(['facter','ipaddress']).strip(), 'uuid' : $subprocess.check_output(['facter','uuid']).strip()  } }
## Add one entry per remote unit in the relation, keyed the same way.
#for $member in $subprocess.check_output(['relation-list']).strip().split():
   #set $rel_structure[$subprocess.check_output(['relation-get','install_time', $member]).strip()] = { 'name' : $subprocess.check_output(['relation-get','name', $member]).strip(), 'netAddress' : $subprocess.check_output(['relation-get','netAddress', $member]).strip(), 'uuid' : $subprocess.check_output(['relation-get', 'uuid', $member]).strip() }
#end for
## Walk the keys in sorted order so the earliest install comes first.
#set $nodes = []
#for $index in $sorted($rel_structure.keys()):
   $nodes.append(($rel_structure[$index]['netAddress'], $rel_structure[$index]['name']))
#end for
The above piece of code is what I am currently using to populate a list with the FQDN and address of each cluster member sorted by install time.  This puts the "master" of the cluster at the top of the list which will become useful when populating certain parts of the configuration file.
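
For readers who find the template syntax hard to follow, the same logic can be sketched in plain Python. This is only an illustration: build_nodes and the sample dictionaries are hypothetical, and the calls to facter, relation-list and relation-get are replaced with hard-coded data so the snippet runs anywhere.

```python
def build_nodes(local, peers):
    """Return (netAddress, name) pairs ordered by install time.

    `local` and each entry of `peers` are dicts with the keys
    'install_time', 'name' and 'netAddress', mirroring the template.
    """
    # Key every member by install time, as the template's dictionary does.
    rel_structure = {local["install_time"]: local}
    for peer in peers:
        rel_structure[peer["install_time"]] = peer
    # Keys are strings, so they sort lexicographically; this matches
    # chronological order only for fixed-width timestamps (an assumption).
    return [(rel_structure[key]["netAddress"], rel_structure[key]["name"])
            for key in sorted(rel_structure)]


# Hypothetical sample data; the real values come from facter and relation-get.
this_node = {"install_time": "1313625700", "name": "node-b.example",
             "netAddress": "10.0.0.2"}
peer_nodes = [{"install_time": "1313625600", "name": "node-a.example",
               "netAddress": "10.0.0.1"}]

nodes = build_nodes(this_node, peer_nodes)
print(nodes[0])  # ('10.0.0.1', 'node-a.example')
```

One thing the sketch makes visible: because the dictionary is keyed by install time, two members reporting the same install time would overwrite each other, so the scheme relies on those values being unique.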


As we can see by the code above, the main piece of information that we use in this template is the node list.  Here is a sample of how we use it in the environment.tmpl template file:
 #for $netAddress, $name in $nodes:
  <Computer computerType="linuxmachine"
            domain="localdomain"
            name="$name"
            netAddress="$netAddress"/>
#end for
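
To show what that loop expands to, here is a small Python sketch that produces the same Computer elements for a two-node list. The function name computer_elements and the sample addresses are made up for the example, and quoteattr is used instead of plain string substitution so that unusual host names cannot break the XML (the template itself does not do this).

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def computer_elements(nodes):
    """Expand the template's #for loop for a list of (netAddress, name) pairs."""
    blocks = []
    for net_address, name in nodes:
        # quoteattr adds the surrounding quotes and escapes XML special characters.
        blocks.append(
            '  <Computer computerType="linuxmachine"\n'
            '            domain="localdomain"\n'
            "            name=%s\n"
            "            netAddress=%s/>" % (quoteattr(name), quoteattr(net_address))
        )
    return "\n".join(blocks)


# Hypothetical node list, in the same (netAddress, name) order as $nodes.
sample_nodes = [("10.0.0.1", "node-a.example"), ("10.0.0.2", "node-b.example")]
fragment = computer_elements(sample_nodes)
print(fragment)

# Wrap the fragment in a root element to confirm it is well-formed XML.
root = ET.fromstring("<Hardware>\n%s\n</Hardware>" % fragment)
print(len(root.findall("Computer")))  # 2
```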
I encourage you to download the charm here and examine the environment.tmpl file in the templates directory.

Here is the complete environment.tmpl file... I know it's pretty small and you can just download the charm and read the file at your leisure, but I wanted to give you an idea of the size and complexity of hpcc's configuration file.

====

#import random
#import subprocess
#set $rel_structure = { $subprocess.check_output(['facter', 'install_time']).strip() : { 'name' : $subprocess.check_output(['hostname', '-f']).strip(), 'netAddress' : $subprocess.check_output(['facter','ipaddress']).strip(), 'uuid' : $subprocess.check_output(['facter','uuid']).strip()  } }
#for $member in $subprocess.check_output(['relation-list']).strip().split():
   #set $rel_structure[$subprocess.check_output(['relation-get','install_time', $member]).strip()] = { 'name' : $subprocess.check_output(['relation-get','name', $member]).strip(), 'netAddress' : $subprocess.check_output(['relation-get','netAddress', $member]).strip(), 'uuid' : $subprocess.check_output(['relation-get', 'uuid', $member]).strip() }
#end for
#set $nodes = []
#for $index in $sorted($rel_structure.keys()):
   $nodes.append(($rel_structure[$index]['netAddress'], $rel_structure[$index]['name']))
#end for
<?xml version="1.0" encoding="UTF-8"?>
<!-- Edited with ConfigMgr on ip 71.204.190.179 on 2011-08-16T00:39:16 -->
<Environment>
 <EnvSettings>
  <blockname>HPCCSystems</blockname>
  <configs>/etc/HPCCSystems</configs>
  <environment>environment.xml</environment>
  <group>hpcc</group>
  <home>/home</home>
  <interface>eth0</interface>
  <lock>/var/lock/HPCCSystems</lock>
  <log>/var/log/HPCCSystems</log>
  <path>/opt/HPCCSystems</path>
  <pid>/var/run/HPCCSystems</pid>
  <runtime>/var/lib/HPCCSystems</runtime>
  <sourcedir>/etc/HPCCSystems/source</sourcedir>
  <user>hpcc</user>
 </EnvSettings>
 <Hardware>
 #for $netAddress, $name in $nodes:
  <Computer computerType="linuxmachine"
            domain="localdomain"
            name="$name"
            netAddress="$netAddress"/>
#end for
  <ComputerType computerType="linuxmachine"
                manufacturer="unknown"
                name="linuxmachine"
                opSys="linux"/>
  <Domain name="localdomain" password="" username=""/>
  <Switch name="Switch"/>
 </Hardware>
 <Programs>
  <Build name="community_3.0.4" url="/opt/HPCCSystems">
   <BuildSet installSet="deploy_map.xml"
             name="dafilesrv"
             path="componentfiles/dafilesrv"
             processName="DafilesrvProcess"
             schema="dafilesrv.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="dali"
             path="componentfiles/dali"
             processName="DaliServerProcess"
             schema="dali.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="dfuplus"
             path="componentfiles/dfuplus"
             processName="DfuplusProcess"
             schema="dfuplus.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="dfuserver"
             path="componentfiles/dfuserver"
             processName="DfuServerProcess"
             schema="dfuserver.xsd"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="DropZone"
             path="componentfiles/DropZone"
             processName="DropZone"
             schema="dropzone.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="eclagent"
             path="componentfiles/eclagent"
             processName="EclAgentProcess"
             schema="eclagent_config.xsd"/>
   <BuildSet installSet="deploy_map.xml" name="eclminus" path="componentfiles/eclminus"/>
   <BuildSet installSet="deploy_map.xml"
             name="eclplus"
             path="componentfiles/eclplus"
             processName="EclPlusProcess"
             schema="eclplus.xsd"/>
   <BuildSet installSet="eclccserver_deploy_map.xml"
             name="eclccserver"
             path="componentfiles/configxml"
             processName="EclCCServerProcess"
             schema="eclccserver.xsd"/>
   <BuildSet installSet="eclscheduler_deploy_map.xml"
             name="eclscheduler"
             path="componentfiles/configxml"
             processName="EclSchedulerProcess"
             schema="eclscheduler.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="esp"
             path="componentfiles/esp"
             processName="EspProcess"
             schema="esp.xsd"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="espsmc"
             path="componentfiles/espsmc"
             processName="EspService"
             schema="espsmcservice.xsd">
    <Properties defaultPort="8010"
                defaultResourcesBasedn="ou=SMC,ou=EspServices,ou=ecl"
                defaultSecurePort="18010"
                type="WsSMC">
     <Authenticate access="Read"
                   description="Root access to SMC service"
                   path="/"
                   required="Read"
                   resource="SmcAccess"/>
     <AuthenticateFeature description="Access to SMC service"
                          path="SmcAccess"
                          resource="SmcAccess"
                          service="ws_smc"/>
     <AuthenticateFeature description="Access to thor queues"
                          path="ThorQueueAccess"
                          resource="ThorQueueAccess"
                          service="ws_smc"/>
     <AuthenticateFeature description="Access to super computer environment"
                          path="ConfigAccess"
                          resource="ConfigAccess"
                          service="ws_config"/>
     <AuthenticateFeature description="Access to DFU"
                          path="DfuAccess"
                          resource="DfuAccess"
                          service="ws_dfu"/>
     <AuthenticateFeature description="Access to DFU XRef"
                          path="DfuXrefAccess"
                          resource="DfuXrefAccess"
                          service="ws_dfuxref"/>
     <AuthenticateFeature description="Access to machine information"
                          path="MachineInfoAccess"
                          resource="MachineInfoAccess"
                          service="ws_machine"/>
     <AuthenticateFeature description="Access to SNMP metrics information"
                          path="MetricsAccess"
                          resource="MetricsAccess"
                          service="ws_machine"/>
     <AuthenticateFeature description="Access to remote execution"
                          path="ExecuteAccess"
                          resource="ExecuteAccess"
                          service="ws_machine"/>
     <AuthenticateFeature description="Access to DFU workunits"
                          path="DfuWorkunitsAccess"
                          resource="DfuWorkunitsAccess"
                          service="ws_fs"/>
     <AuthenticateFeature description="Access to DFU exceptions"
                          path="DfuExceptionsAccess"
                          resource="DfuExceptions"
                          service="ws_fs"/>
     <AuthenticateFeature description="Access to spraying files"
                          path="FileSprayAccess"
                          resource="FileSprayAccess"
                          service="ws_fs"/>
     <AuthenticateFeature description="Access to despraying of files"
                          path="FileDesprayAccess"
                          resource="FileDesprayAccess"
                          service="ws_fs"/>
     <AuthenticateFeature description="Access to dkcing of key files"
                          path="FileDkcAccess"
                          resource="FileDkcAccess"
                          service="ws_fs"/>
     <AuthenticateFeature description="Access to files in dropzone"
                          path="FileIOAccess"
                          resource="FileIOAccess"
                          service="ws_fileio"/>
     <AuthenticateFeature description="Access to WS ECL service"
                          path="WsEclAccess"
                          resource="WsEclAccess"
                          service="ws_ecl"/>
     <AuthenticateFeature description="Access to Roxie queries and files"
                          path="RoxieQueryAccess"
                          resource="RoxieQueryAccess"
                          service="ws_roxiequery"/>
     <AuthenticateFeature description="Access to cluster topology"
                          path="ClusterTopologyAccess"
                          resource="ClusterTopologyAccess"
                          service="ws_topology"/>
     <AuthenticateFeature description="Access to own workunits"
                          path="OwnWorkunitsAccess"
                          resource="OwnWorkunitsAccess"
                          service="ws_workunits"/>
     <AuthenticateFeature description="Access to others&apos; workunits"
                          path="OthersWorkunitsAccess"
                          resource="OthersWorkunitsAccess"
                          service="ws_workunits"/>
     <AuthenticateFeature description="Access to ECL direct service"
                          path="EclDirectAccess"
                          resource="EclDirectAccess"
                          service="ecldirect"/>
     <ProcessFilters>
      <Platform name="Windows">
       <ProcessFilter name="any">
        <Process name="dafilesrv"/>
       </ProcessFilter>
       <ProcessFilter name="AttrServerProcess">
        <Process name="attrserver"/>
       </ProcessFilter>
       <ProcessFilter name="DaliProcess">
        <Process name="daserver"/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="DfuServerProcess">
        <Process name="dfuserver"/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="EclCCServerProcess">
        <Process name="eclccserver"/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="EspProcess">
        <Process name="esp"/>
        <Process name="dafilesrv" remove="true"/>
       </ProcessFilter>
       <ProcessFilter name="FTSlaveProcess">
        <Process name="ftslave"/>
       </ProcessFilter>
       <ProcessFilter name="RoxieServerProcess">
        <Process name="ccd"/>
       </ProcessFilter>
       <ProcessFilter name="RoxieSlaveProcess">
        <Process name="ccd"/>
       </ProcessFilter>
       <ProcessFilter name="SchedulerProcess">
        <Process name="scheduler"/>
       </ProcessFilter>
       <ProcessFilter name="ThorMasterProcess">
        <Process name="thormaster"/>
       </ProcessFilter>
       <ProcessFilter name="ThorSlaveProcess">
        <Process name="thorslave"/>
       </ProcessFilter>
       <ProcessFilter name="SashaServerProcess">
        <Process name="saserver"/>
       </ProcessFilter>
      </Platform>
      <Platform name="Linux">
       <ProcessFilter name="any">
        <Process name="dafilesrv"/>
       </ProcessFilter>
       <ProcessFilter name="AttrServerProcess">
        <Process name="attrserver"/>
       </ProcessFilter>
       <ProcessFilter name="DaliProcess">
        <Process name="daserver"/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="DfuServerProcess">
        <Process name="."/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="EclCCServerProcess">
        <Process name="."/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="EspProcess">
        <Process name="."/>
        <Process name="dafilesrv" remove="true"/>
       </ProcessFilter>
       <ProcessFilter name="FTSlaveProcess">
        <Process name="ftslave"/>
       </ProcessFilter>
       <ProcessFilter name="GenesisServerProcess">
        <Process name="httpd"/>
        <Process name="atftpd"/>
        <Process name="dhcpd"/>
       </ProcessFilter>
       <ProcessFilter name="RoxieServerProcess">
        <Process name="ccd"/>
       </ProcessFilter>
       <ProcessFilter name="RoxieSlaveProcess">
        <Process name="ccd"/>
       </ProcessFilter>
       <ProcessFilter name="SchedulerProcess">
        <Process name="scheduler"/>
       </ProcessFilter>
       <ProcessFilter name="ThorMasterProcess">
        <Process name="thormaster"/>
       </ProcessFilter>
       <ProcessFilter name="ThorSlaveProcess">
        <Process name="thorslave"/>
       </ProcessFilter>
       <ProcessFilter name="SashaServerProcess">
        <Process name="saserver"/>
       </ProcessFilter>
      </Platform>
     </ProcessFilters>
    </Properties>
   </BuildSet>
   <BuildSet installSet="deploy_map.xml"
             name="ftslave"
             path="componentfiles/ftslave"
             processName="FTSlaveProcess"
             schema="ftslave_linux.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="hqltest"
             path="componentfiles/hqltest"
             processName="HqlTestProcess"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="ldapServer"
             path="componentfiles/ldapServer"
             processName="LDAPServerProcess"
             schema="ldapserver.xsd"/>
   <BuildSet deployable="no"
             installSet="auditlib_deploy_map.xml"
             name="plugins_auditlib"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="debugservices_deploy_map.xml"
             name="plugins_debugservices"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="fileservices_deploy_map.xml"
             name="plugins_fileservices"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="logging_deploy_map.xml"
             name="plugins_logging"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="parselib_deploy_map.xml"
             name="plugins_parselib"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="stringlib_deploy_map.xml"
             name="plugins_stringlib"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="unicodelib_deploy_map.xml"
             name="plugins_unicodelib"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="workunitservices_deploy_map.xml"
             name="plugins_workunitservices"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet installSet="roxie_deploy_map.xml"
             name="roxie"
             path="componentfiles/configxml"
             processName="RoxieCluster"
             schema="roxie.xsd"/>
   <BuildSet installSet="deploy_map.xml" name="roxieconfig" path="componentfiles/roxieconfig"/>
   <BuildSet installSet="deploy_map.xml"
             name="sasha"
             path="componentfiles/sasha"
             processName="SashaServerProcess"
             schema="sasha.xsd"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="SiteCertificate"
             path="componentfiles/SiteCertificate"
             processName="SiteCertificate"
             schema="SiteCertificate.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="soapplus"
             path="componentfiles/soapplus"
             processName="SoapPlusProcess"
             schema="soapplus.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="thor"
             path="componentfiles/thor"
             processName="ThorCluster"
             schema="thor.xsd"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="topology"
             path="componentfiles/topology"
             processName="Topology"
             schema="topology.xsd"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="ws_ecl"
             path="componentfiles/ws_ecl"
             processName="EspService"
             schema="esp_service_wsecl2.xsd">
    <Properties bindingType="ws_eclSoapBinding"
                defaultPort="8002"
                defaultResourcesBasedn="ou=WsEcl,ou=EspServices,ou=ecl"
                defaultSecurePort="18002"
                plugin="ws_ecl"
                type="ws_ecl">
     <Authenticate access="Read"
                   description="Root access to WS ECL service"
                   path="/"
                   required="Read"
                   resource="WsEclAccess"/>
     <AuthenticateFeature description="Access to WS ECL service"
                          path="WsEclAccess"
                          resource="WsEclAccess"
                          service="ws_ecl"/>
    </Properties>
   </BuildSet>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="ecldirect"
             path="componentfiles/ecldirect"
             processName="EspService"
             schema="esp_service_ecldirect.xsd">
    <Properties bindingType="EclDirectSoapBinding"
                defaultPort="8008"
                defaultResourcesBasedn="ou=EclDirectAccess,ou=EspServices,ou=ecl"
                defaultSecurePort="18008"
                plugin="ecldirect"
                type="ecldirect">
     <Authenticate access="Read"
                   description="Root access to ECL Direct service"
                   path="/"
                   required="Read"
                   resource="EclDirectAccess"/>
     <AuthenticateFeature description="Access to ECL Direct service"
                          path="EclDirectAccess"
                          resource="EclDirectAccess"
                          service="ecldirect"/>
    </Properties>
   </BuildSet>
  </Build>
 </Programs>
 <Software>
  <DafilesrvProcess build="community_3.0.4"
                    buildSet="dafilesrv"
                    description="DaFileSrv process"
                    name="mydafilesrv"
                    version="1">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/mydafilesrv"
             name="s_$name"
             netAddress="$netAddress"/>
#end for             
  </DafilesrvProcess>
  <DaliServerProcess build="community_3.0.4"
                     buildSet="dali"
                     environment="/etc/HPCCSystems/environment.xml"
                     name="mydali"
                     recoverFromIncErrors="true">                     
   <Instance computer="$nodes[0][1]"
             directory="/var/lib/HPCCSystems/mydali"
             name="s_$nodes[0][1]"
             netAddress="$nodes[0][0]"
             port="7070"/>
  </DaliServerProcess>
  <DfuServerProcess build="community_3.0.4"
                    buildSet="dfuserver"
                    daliServers="mydali"
                    description="DFU Server"
                    monitorinterval="900"
                    monitorqueue="dfuserver_monitor_queue"
                    name="mydfuserver"
                    queue="dfuserver_queue"
                    transferBufferSize="65536">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/mydfuserver"
             name="s_$name"
             netAddress="$netAddress"/>
#end for    
#raw        
   <SSH SSHidentityfile="$HOME/.ssh/id_rsa"
#end raw
        SSHpassword=""
        SSHretries="3"
        SSHtimeout="0"
        SSHusername="hpcc"/>
  </DfuServerProcess>
  <Directories name="HPCCSystems">
   <Category dir="/var/log/[NAME]/[INST]" name="log"/>
   <Category dir="/var/lib/[NAME]/[INST]" name="run"/>
   <Category dir="/etc/[NAME]/[INST]" name="conf"/>
   <Category dir="/var/lib/[NAME]/[INST]/temp" name="temp"/>
   <Category dir="/var/lib/[NAME]/hpcc-data/[COMPONENT]" name="data"/>
   <Category dir="/var/lib/[NAME]/hpcc-data2/[COMPONENT]" name="data2"/>
   <Category dir="/var/lib/[NAME]/hpcc-data3/[COMPONENT]" name="data3"/>
   <Category dir="/var/lib/[NAME]/hpcc-mirror/[COMPONENT]" name="mirror"/>
   <Category dir="/var/lib/[NAME]/queries/[INST]" name="query"/>
   <Category dir="/var/lock/[NAME]/[INST]" name="lock"/>
  </Directories>
  <DropZone build="community_3.0.4"
            buildSet="DropZone"
            computer="$nodes[0][1]"
            description="DropZone process"
            directory="/var/lib/HPCCSystems/dropzone"
            name="mydropzone"/>
  <EclAgentProcess allowedPipePrograms="*"
                   build="community_3.0.4"
                   buildSet="eclagent"
                   daliServers="mydali"
                   description="EclAgent process"
                   name="myeclagent"
                   pluginDirectory="/opt/HPCCSystems/plugins/"
                   thorConnectTimeout="600"
                   traceLevel="0"
                   wuQueueName="myeclagent_queue">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/myeclagent"
             name="s_$name"
             netAddress="$netAddress"/>
#end for  
  </EclAgentProcess>
  <EclCCServerProcess build="community_3.0.4"
                      buildSet="eclccserver"
                      daliServers="mydali"
                      description="EclCCServer process"
                      enableSysLog="true"
                      maxCompileThreads="4"
                      name="myeclccserver"
                      traceLevel="1">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/myeclccserver"
             name="s_$name"
             netAddress="$netAddress"/>
#end for
  </EclCCServerProcess>
  <EclSchedulerProcess build="community_3.0.4"
                       buildSet="eclscheduler"
                       daliServers="mydali"
                       description="EclScheduler process"
                       name="myeclscheduler">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/myeclscheduler"
             name="s_$name"
             netAddress="$netAddress"/>
#end for
  </EclSchedulerProcess>
  <EspProcess build="community_3.0.4"
              buildSet="esp"
              componentfilesDir="/opt/HPCCSystems/componentfiles"
              daliServers="mydali"
              description="ESP server"
              enableSEHMapping="true"
              formOptionsAccess="false"
              httpConfigAccess="true"
              logLevel="1"
              logRequests="false"
              logResponses="false"
              maxBacklogQueueSize="200"
              maxConcurrentThreads="0"
              maxRequestEntityLength="8000000"
              name="myesp"
              perfReportDelay="60"
              portalurl="http://hpccsystems.com/download">
   <Authentication ldapAuthMethod="kerberos"
                   ldapConnections="10"
                   ldapServer=""
                   method="none"/>
   <EspBinding defaultForPort="true"
               defaultServiceVersion=""
               name="myespsmc"
               port="8010"
               protocol="http"
               resourcesBasedn="ou=SMC,ou=EspServices,ou=ecl"
               service="EclWatch"
               workunitsBasedn="ou=workunits,ou=ecl"
               wsdlServiceAddress="">
    <Authenticate access="Read"
                  description="Root access to SMC service"
                  path="/"
                  required="Read"
                  resource="SmcAccess"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to SMC service"
                         path="SmcAccess"
                         resource="SmcAccess"
                         service="ws_smc"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to thor queues"
                         path="ThorQueueAccess"
                         resource="ThorQueueAccess"
                         service="ws_smc"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to super computer environment"
                         path="ConfigAccess"
                         resource="ConfigAccess"
                         service="ws_config"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to DFU"
                         path="DfuAccess"
                         resource="DfuAccess"
                         service="ws_dfu"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to DFU XRef"
                         path="DfuXrefAccess"
                         resource="DfuXrefAccess"
                         service="ws_dfuxref"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to machine information"
                         path="MachineInfoAccess"
                         resource="MachineInfoAccess"
                         service="ws_machine"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to SNMP metrics information"
                         path="MetricsAccess"
                         resource="MetricsAccess"
                         service="ws_machine"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to remote execution"
                         path="ExecuteAccess"
                         resource="ExecuteAccess"
                         service="ws_machine"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to DFU workunits"
                         path="DfuWorkunitsAccess"
                         resource="DfuWorkunitsAccess"
                         service="ws_fs"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to DFU exceptions"
                         path="DfuExceptionsAccess"
                         resource="DfuExceptions"
                         service="ws_fs"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to spraying files"
                         path="FileSprayAccess"
                         resource="FileSprayAccess"
                         service="ws_fs"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to despraying of files"
                         path="FileDesprayAccess"
                         resource="FileDesprayAccess"
                         service="ws_fs"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to dkcing of key files"
                         path="FileDkcAccess"
                         resource="FileDkcAccess"
                         service="ws_fs"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to files in dropzone"
                         path="FileIOAccess"
                         resource="FileIOAccess"
                         service="ws_fileio"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to WS ECL service"
                         path="WsEclAccess"
                         resource="WsEclAccess"
                         service="ws_ecl"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to Roxie queries and files"
                         path="RoxieQueryAccess"
                         resource="RoxieQueryAccess"
                         service="ws_roxiequery"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to cluster topology"
                         path="ClusterTopologyAccess"
                         resource="ClusterTopologyAccess"
                         service="ws_topology"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to own workunits"
                         path="OwnWorkunitsAccess"
                         resource="OwnWorkunitsAccess"
                         service="ws_workunits"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to others&apos; workunits"
                         path="OthersWorkunitsAccess"
                         resource="OthersWorkunitsAccess"
                         service="ws_workunits"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to ECL direct service"
                         path="EclDirectAccess"
                         resource="EclDirectAccess"
                         service="ecldirect"/>
   </EspBinding>
   <EspBinding defaultForPort="true"
               defaultServiceVersion=""
               name="myws_ecl"
               port="8002"
               protocol="http"
               resourcesBasedn="ou=WsEcl,ou=EspServices,ou=ecl"
               service="myws_ecl"
               workunitsBasedn="ou=workunits,ou=ecl"
               wsdlServiceAddress="">
    <Authenticate access="Read"
                  description="Root access to WS ECL service"
                  path="/"
                  required="Read"
                  resource="WsEclAccess"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to WS ECL service"
                         path="WsEclAccess"
                         resource="WsEclAccess"
                         service="ws_ecl"/>
   </EspBinding>
   <EspBinding defaultForPort="true"
               defaultServiceVersion=""
               name="myecldirect"
               port="8008"
               protocol="http"
               resourcesBasedn="ou=EclDirectAccess,ou=EspServices,ou=ecl"
               service="myecldirect"
               workunitsBasedn="ou=workunits,ou=ecl"
               wsdlServiceAddress="">
    <Authenticate access="Read"
                  description="Root access to ECL Direct service"
                  path="/"
                  required="Read"
                  resource="EclDirectAccess"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to ECL Direct service"
                         path="EclDirectAccess"
                         resource="EclDirectAccess"
                         service="ecldirect"/>
   </EspBinding>
   <HTTPS acceptSelfSigned="true"
          CA_Certificates_Path="ca.pem"
          certificateFileName="certificate.cer"
          city=""
          country="US"
          daysValid="365"
          enableVerification="false"
          organization="Customer of HPCCSystems"
          organizationalUnit=""
          passphrase=""
          privateKeyFileName="privatekey.cer"
          regenerateCredentials="false"
          requireAddressMatch="false"
          state=""
          trustedPeers="anyone"/>
   #for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/myesp"
             FQDN=""
             name="s_$name"
             netAddress="$netAddress"/>
   #end for
   <ProtocolX authTimeout="3"
              defaultTimeout="21"
              idleTimeout="3600"
              maxTimeout="36"
              minTimeout="30"
              threadCount="2"/>
  </EspProcess>
  <EspService allowNewRoxieOnDemandQuery="false"
              AWUsCacheTimeout="15"
              build="community_3.0.4"
              buildSet="espsmc"
              description="ESP services for SMC"
              disableUppercaseTranslation="false"
              eclServer="myeclccserver"
              enableSystemUseRewrite="false"
              excludePartitions="/,/dev*,/sys,/usr,/proc/*"
              monitorDaliFileServer="false"
              name="EclWatch"
              pluginsPath="/opt/HPCCSystems/plugins"
              syntaxCheckQueue=""
              viewTimeout="1000"
              warnIfCpuLoadOver="95"
              warnIfFreeMemoryUnder="5"
              warnIfFreeStorageUnder="5">
   <Properties defaultPort="8010"
               defaultResourcesBasedn="ou=SMC,ou=EspServices,ou=ecl"
               defaultSecurePort="18010"
               type="WsSMC">
    <Authenticate access="Read"
                  description="Root access to SMC service"
                  path="/"
                  required="Read"
                  resource="SmcAccess"/>
    <AuthenticateFeature description="Access to SMC service"
                         path="SmcAccess"
                         resource="SmcAccess"
                         service="ws_smc"/>
    <AuthenticateFeature description="Access to thor queues"
                         path="ThorQueueAccess"
                         resource="ThorQueueAccess"
                         service="ws_smc"/>
    <AuthenticateFeature description="Access to super computer environment"
                         path="ConfigAccess"
                         resource="ConfigAccess"
                         service="ws_config"/>
    <AuthenticateFeature description="Access to DFU"
                         path="DfuAccess"
                         resource="DfuAccess"
                         service="ws_dfu"/>
    <AuthenticateFeature description="Access to DFU XRef"
                         path="DfuXrefAccess"
                         resource="DfuXrefAccess"
                         service="ws_dfuxref"/>
    <AuthenticateFeature description="Access to machine information"
                         path="MachineInfoAccess"
                         resource="MachineInfoAccess"
                         service="ws_machine"/>
    <AuthenticateFeature description="Access to SNMP metrics information"
                         path="MetricsAccess"
                         resource="MetricsAccess"
                         service="ws_machine"/>
    <AuthenticateFeature description="Access to remote execution"
                         path="ExecuteAccess"
                         resource="ExecuteAccess"
                         service="ws_machine"/>
    <AuthenticateFeature description="Access to DFU workunits"
                         path="DfuWorkunitsAccess"
                         resource="DfuWorkunitsAccess"
                         service="ws_fs"/>
    <AuthenticateFeature description="Access to DFU exceptions"
                         path="DfuExceptionsAccess"
                         resource="DfuExceptions"
                         service="ws_fs"/>
    <AuthenticateFeature description="Access to spraying files"
                         path="FileSprayAccess"
                         resource="FileSprayAccess"
                         service="ws_fs"/>
    <AuthenticateFeature description="Access to despraying of files"
                         path="FileDesprayAccess"
                         resource="FileDesprayAccess"
                         service="ws_fs"/>
    <AuthenticateFeature description="Access to dkcing of key files"
                         path="FileDkcAccess"
                         resource="FileDkcAccess"
                         service="ws_fs"/>
    <AuthenticateFeature description="Access to files in dropzone"
                         path="FileIOAccess"
                         resource="FileIOAccess"
                         service="ws_fileio"/>
    <AuthenticateFeature description="Access to WS ECL service"
                         path="WsEclAccess"
                         resource="WsEclAccess"
                         service="ws_ecl"/>
    <AuthenticateFeature description="Access to Roxie queries and files"
                         path="RoxieQueryAccess"
                         resource="RoxieQueryAccess"
                         service="ws_roxiequery"/>
    <AuthenticateFeature description="Access to cluster topology"
                         path="ClusterTopologyAccess"
                         resource="ClusterTopologyAccess"
                         service="ws_topology"/>
    <AuthenticateFeature description="Access to own workunits"
                         path="OwnWorkunitsAccess"
                         resource="OwnWorkunitsAccess"
                         service="ws_workunits"/>
    <AuthenticateFeature description="Access to others&apos; workunits"
                         path="OthersWorkunitsAccess"
                         resource="OthersWorkunitsAccess"
                         service="ws_workunits"/>
    <AuthenticateFeature description="Access to ECL direct service"
                         path="EclDirectAccess"
                         resource="EclDirectAccess"
                         service="ecldirect"/>
    <ProcessFilters>
     <Platform name="Windows">
      <ProcessFilter name="any">
       <Process name="dafilesrv"/>
      </ProcessFilter>
      <ProcessFilter name="AttrServerProcess">
       <Process name="attrserver"/>
      </ProcessFilter>
      <ProcessFilter name="DaliProcess">
       <Process name="daserver"/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="DfuServerProcess">
       <Process name="dfuserver"/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="EclCCServerProcess">
       <Process name="eclccserver"/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="EspProcess">
       <Process name="esp"/>
       <Process name="dafilesrv" remove="true"/>
      </ProcessFilter>
      <ProcessFilter name="FTSlaveProcess">
       <Process name="ftslave"/>
      </ProcessFilter>
      <ProcessFilter name="RoxieServerProcess">
       <Process name="ccd"/>
      </ProcessFilter>
      <ProcessFilter name="RoxieSlaveProcess">
       <Process name="ccd"/>
      </ProcessFilter>
      <ProcessFilter name="SchedulerProcess">
       <Process name="scheduler"/>
      </ProcessFilter>
      <ProcessFilter name="ThorMasterProcess">
       <Process name="thormaster"/>
      </ProcessFilter>
      <ProcessFilter name="ThorSlaveProcess">
       <Process name="thorslave"/>
      </ProcessFilter>
      <ProcessFilter name="SashaServerProcess">
       <Process name="saserver"/>
      </ProcessFilter>
     </Platform>
     <Platform name="Linux">
      <ProcessFilter name="any">
       <Process name="dafilesrv"/>
      </ProcessFilter>
      <ProcessFilter name="AttrServerProcess">
       <Process name="attrserver"/>
      </ProcessFilter>
      <ProcessFilter name="DaliProcess">
       <Process name="daserver"/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="DfuServerProcess">
       <Process name="."/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="EclCCServerProcess">
       <Process name="."/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="EspProcess">
       <Process name="."/>
       <Process name="dafilesrv" remove="true"/>
      </ProcessFilter>
      <ProcessFilter name="FTSlaveProcess">
       <Process name="ftslave"/>
      </ProcessFilter>
      <ProcessFilter name="GenesisServerProcess">
       <Process name="httpd"/>
       <Process name="atftpd"/>
       <Process name="dhcpd"/>
      </ProcessFilter>
      <ProcessFilter name="RoxieServerProcess">
       <Process name="ccd"/>
      </ProcessFilter>
      <ProcessFilter name="RoxieSlaveProcess">
       <Process name="ccd"/>
      </ProcessFilter>
      <ProcessFilter name="SchedulerProcess">
       <Process name="scheduler"/>
      </ProcessFilter>
      <ProcessFilter name="ThorMasterProcess">
       <Process name="thormaster"/>
      </ProcessFilter>
      <ProcessFilter name="ThorSlaveProcess">
       <Process name="thorslave"/>
      </ProcessFilter>
      <ProcessFilter name="SashaServerProcess">
       <Process name="saserver"/>
      </ProcessFilter>
     </Platform>
    </ProcessFilters>
   </Properties>
  </EspService>
  <EspService build="community_3.0.4"
              buildSet="ws_ecl"
              description="WS ECL Service"
              name="myws_ecl">
   <Properties bindingType="ws_eclSoapBinding"
               defaultPort="8002"
               defaultResourcesBasedn="ou=WsEcl,ou=EspServices,ou=ecl"
               defaultSecurePort="18002"
               plugin="ws_ecl"
               type="ws_ecl">
    <Authenticate access="Read"
                  description="Root access to WS ECL service"
                  path="/"
                  required="Read"
                  resource="WsEclAccess"/>
    <AuthenticateFeature description="Access to WS ECL service"
                         path="WsEclAccess"
                         resource="WsEclAccess"
                         service="ws_ecl"/>
   </Properties>
  </EspService>
  <EspService build="community_3.0.4"
              buildSet="ecldirect"
              clusterName="hthor"
              description="ESP service for running raw ECL queries"
              name="myecldirect">
   <Properties bindingType="EclDirectSoapBinding"
               defaultPort="8008"
               defaultResourcesBasedn="ou=EclDirectAccess,ou=EspServices,ou=ecl"
               defaultSecurePort="18008"
               plugin="ecldirect"
               type="ecldirect">
    <Authenticate access="Read"
                  description="Root access to ECL Direct service"
                  path="/"
                  required="Read"
                  resource="EclDirectAccess"/>
    <AuthenticateFeature description="Access to ECL Direct service"
                         path="EclDirectAccess"
                         resource="EclDirectAccess"
                         service="ecldirect"/>
   </Properties>
  </EspService>
  <FTSlaveProcess build="community_3.0.4"
                  buildSet="ftslave"
                  description="FTSlave process"
                  name="myftslave"
                  version="1">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/myftslave"
             name="s_$name"
             netAddress="$netAddress"
             program="/opt/HPCCSystems/bin/ftslave"/>
#end for
  </FTSlaveProcess>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_auditlib"
                 description="plugin process"
                 name="myplugins_auditlib"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_debugservices"
                 description="plugin process"
                 name="myplugins_debugservices"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_fileservices"
                 description="plugin process"
                 name="myplugins_fileservices"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_logging"
                 description="plugin process"
                 name="myplugins_logging"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_parselib"
                 description="plugin process"
                 name="myplugins_parselib"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_stringlib"
                 description="plugin process"
                 name="myplugins_stringlib"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_unicodelib"
                 description="plugin process"
                 name="myplugins_unicodelib"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_workunitservices"
                 description="plugin process"
                 name="myplugins_workunitservices"/>
  <RoxieCluster allowRoxieOnDemand="false"
                baseDataDir="/var/lib/HPCCSystems/hpcc-data/roxie"
                blindLogging="false"
                blobCacheMem="0"
                build="community_3.0.4"
                buildSet="roxie"
                callbackRetries="3"
                callbackTimeout="500"
                channel0onPrimary="true"
                checkCompleted="true"
                checkFileDate="true"
                checkingHeap="0"
                checkPrimaries="true"
                checkState="false"
                checkVersion="true"
                clusterWidth="$len($nodes)"
                copyResources="true"
                crcResources="false"
                cyclicOffset="1"
                dafilesrvLookupTimeout="10000"
                daliServers="mydali"
                debugPermitted="true"
                defaultConcatPreload="0"
                defaultFetchPreload="0"
                defaultFullKeyedJoinPreload="0"
                defaultHighPriorityTimeLimit="0"
                defaultHighPriorityTimeWarning="5000"
                defaultKeyedJoinPreload="0"
                defaultLowPriorityTimeLimit="0"
                defaultLowPriorityTimeWarning="0"
                defaultMemoryLimit="0"
                defaultParallelJoinPreload="0"
                defaultPrefetchProjectPreload="10"
                defaultSLAPriorityTimeLimit="0"
                defaultSLAPriorityTimeWarning="5000"
                defaultStripLeadingWhitespace="1"
                deleteUnneededFiles="false"
                description="Roxie cluster"
                directory="/var/lib/HPCCSystems/myroxie"
                diskReadBufferSize="65536"
                diskReadStable="true"
                doIbytiDelay="true"
                enableForceKeyDiffCopy="false"
                enableHeartBeat="true"
                enableKeyDiff="true"
                enableSNMP="true"
                enableSysLog="true"
                fastLaneQueue="true"
                fieldTranslationEnabled="false"
                flushJHtreeCacheOnOOM="true"
                forceStdLog="false"
                highTimeout="2000"
                ignoreMissingFiles="false"
                indexReadChunkSize="60000"
                indexReadStable="true"
                initIbytiDelay="100"
                jumboFrames="false"
                keyedJoinFlowLimit="1000"
                keyedJoinStable="true"
                lazyOpen="false"
                leafCacheMem="50"
                linuxYield="true"
                localFilesExpire="-1"
                localSlave="false"
                logFullQueries="false"
                logQueueDrop="32"
                logQueueLen="512"
                lowTimeout="10000"
                maxBlockSize="10000000"
                maxLocalFilesOpen="4000"
                maxLockAttempts="5"
                maxRemoteFilesOpen="1000"
                memoryStatsInterval="60"
                memTraceLevel="1"
                memTraceSizeLimit="0"
                minFreeDiskSpace="1073741824"
                minIbytiDelay="0"
                minLocalFilesOpen="2000"
                minRemoteFilesOpen="500"
                miscDebugTraceLevel="0"
                monitorDaliFileServer="false"
                multicastBase="239.1.1.1"
                multicastLast="239.1.254.254"
                name="myroxie"
                nodeCacheMem="100"
                nodeCachePreload="false"
                numChannels="$len($nodes)"
                numDataCopies="2"
                parallelAggregate="0"
                perChannelFlowLimit="10"
                pingInterval="60"
                pluginsPath="/opt/HPCCSystems/plugins"
                preabortIndexReadsThreshold="100"
                preabortKeyedJoinsThreshold="100"
                preferredSubnet=""
                preferredSubnetMask=""
                remoteFilesExpire="3600000"
                resolveFilesInPackage="false"
                roxieMulticastEnabled="true"
                serverSideCacheSize="0"
                serverThreads="30"
                simpleLocalKeyedJoins="true"
                siteCertificate=""
                slaTimeout="2000"
                slaveConfig="cyclic redundancy"
                slaveThreads="30"
                smartSteppingChunkRows="100"
                soapTraceLevel="1"
                socketCheckInterval="5000"
#raw                
                SSHidentityfile="$HOME/.ssh/id_rsa"
#end raw                
                SSHpassword=""
                SSHretries="3"
                SSHtimeout="0"
                SSHusername="hpcc"
                statsExpiryTime="3600"
                syncCluster="false"
                systemMonitorInterval="60000"
                totalMemoryLimit="1073741824"
                traceLevel="1"
                trapTooManyActiveQueries="true"
                udpFlowSocketsSize="131071"
                udpInlineCollation="false"
                udpInlineCollationPacketLimit="50"
                udpLocalWriteSocketSize="131071"
                udpMaxRetryTimedoutReqs="0"
                udpMaxSlotsPerClient="2147483647"
                udpMulticastBufferSize="131071"
                udpOutQsPriority="0"
                udpQueueSize="100"
                udpRequestToSendTimeout="5"
                udpResendEnabled="true"
                udpRetryBusySenders="0"
                udpSendCompletedInData="false"
                udpSendQueueSize="50"
                udpSnifferEnabled="true"
                udpTraceLevel="1"
                useHardLink="false"
                useLogQueue="true"
                useMemoryMappedIndexes="false"
                useRemoteResources="true"
                useTreeCopy="false">
   <RoxieFarmProcess dataDirectory="/var/lib/HPCCSystems/hpcc-data/roxie"
                     listenQueue="200"
                     name="farm1"
                     numThreads="30"
                     port="9876"
                     requestArrayThreads="5">
#for $netAddress, $name in $nodes:                    
    <RoxieServerProcess computer="$name" name="farm1_$name"/>
#end for
   </RoxieFarmProcess>
#for $netAddress, $name in $nodes:                    
   <RoxieServerProcess computer="$name"
                       dataDirectory="/var/lib/HPCCSystems/hpcc-data/roxie"
                       listenQueue="200"
                       name="farm1_s_$name"
                       netAddress="$netAddress"
                       numThreads="30"
                       port="9876"
                       requestArrayThreads="5"/>
#end for
#for $netAddress, $name in $nodes:                    
   <RoxieSlave computer="$name" name="s_$name">
    <RoxieChannel dataDirectory="/var/lib/HPCCSystems/hpcc-data/roxie" number="$random.randint(1,$len($nodes))"/>
    <RoxieChannel dataDirectory="/var/lib/HPCCSystems/hpcc-data2/roxie" number="$random.randint(1,$len($nodes))"/>
   </RoxieSlave>
#end for
#set $channel = 1
#for $netAddress, $name in $nodes:                    
   <RoxieSlaveProcess channel="$channel"
                      computer="$name"
                      dataDirectory="/var/lib/HPCCSystems/hpcc-data/roxie"
                      name="s_$name"
                      netAddress="$netAddress"/>
#set $channel = $channel + 1                      
#end for
  </RoxieCluster>
  <SashaServerProcess autoRestartInterval="0"
                      build="community_3.0.4"
                      buildSet="sasha"
                      cachedWUat="* * * * *"
                      cachedWUinterval="24"
                      cachedWUlimit="100"
                      coalesceAt="* * * * *"
                      coalesceInterval="1"
                      dafsmonAt="* * * * *"
                      dafsmonInterval="0"
                      dafsmonList="*"
                      daliServers="mydali"
                      description="Sasha Server process"
                      DFUrecoveryAt="* * * * *"
                      DFUrecoveryCutoff="4"
                      DFUrecoveryInterval="12"
                      DFUrecoveryLimit="20"
                      DFUWUat="* * * * *"
                      DFUWUcutoff="14"
                      DFUWUduration="0"
                      DFUWUinterval="24"
                      DFUWUlimit="1000"
                      DFUWUthrottle="0"
                      ExpiryAt="* 3 * * *"
                      ExpiryInterval="24"
                      keepResultFiles="false"
                      LDSroot="LDS"
                      logDir="."
                      minDeltaSize="50000"
                      name="mysasha"
                      recoverDeltaErrors="false"
                      thorQMonInterval="1"
                      thorQMonQueues="*"
                      thorQMonSwitchMinTime="0"
                      WUat="* * * * *"
                      WUbackup="0"
                      WUcutoff="8"
                      WUduration="0"
                      WUinterval="6"
                      WUlimit="1000"
                      WUretryinterval="7"
                      WUthrottle="0"
                      xrefAt="* 2 * * *"
                      xrefCutoff="1"
                      xrefEclWatchProvider="true"
                      xrefInterval="0"
                      xrefList="*">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/mysasha"
             name="s_$name"
             netAddress="$netAddress"
             port="8877"/>
#end for             
  </SashaServerProcess>
  <ThorCluster autoCopyBackup="false"
               build="community_3.0.4"
               buildSet="thor"
               computer="$nodes[0][1]"
               daliServers="mydali"
               description="Thor process"
               localThor="false"
               monitorDaliFileServer="true"
               multiSlaves="false"
               name="mythor"
               pluginsPath="/opt/HPCCSystems/plugins/"
               replicateAsync="true"
               replicateOutputs="true"
               slaves="8"
               watchdogEnabled="true"
               watchdogProgressEnabled="true">
   <Debug/>
#raw   
   <SSH SSHidentityfile="$HOME/.ssh/id_rsa"
#end raw   
        SSHpassword=""
        SSHretries="3"
        SSHtimeout="0"
        SSHusername="hpcc"/>
   <Storage/>
   <SwapNode/>
   <ThorMasterProcess computer="$nodes[0][1]" name="m_$nodes[0][1]"/>
#if $len($nodes) > 1:
#for $netAddress, $name in $nodes[1:]:                    
   <ThorSlaveProcess computer="$name" name="s_$name"/>
#end for   
#end if
   <Topology>
    <Node process="m_$nodes[0][1]">
#if $len($nodes) > 1:
#for $netAddress, $name in $nodes[1:]:                    
     <Node process="s_$name"/>
#end for   
#end if
    </Node>
   </Topology>
  </ThorCluster>
  <Topology build="community_3.0.4" buildSet="topology" name="topology">
   <Cluster name="hthor" prefix="hthor">
    <EclAgentProcess process="myeclagent"/>
    <EclCCServerProcess process="myeclccserver"/>
    <EclSchedulerProcess process="myeclscheduler"/>
   </Cluster>
   <Cluster name="thor" prefix="thor">
    <EclAgentProcess process="myeclagent"/>
    <EclCCServerProcess process="myeclccserver"/>
    <EclSchedulerProcess process="myeclscheduler"/>
    <ThorCluster process="mythor"/>
   </Cluster>
   <Cluster name="roxie" prefix="roxie">
    <EclAgentProcess process="myeclagent"/>
    <EclCCServerProcess process="myeclccserver"/>
    <EclSchedulerProcess process="myeclscheduler"/>
    <RoxieCluster process="myroxie"/>
   </Cluster>
  </Topology>
 </Software>
</Environment>
====


Even just scrolling past the file takes a while!  This behemoth of a file was tamed thanks to cheetah; I highly encourage you to read up on it.
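
To make the template's node handling a bit more concrete, here is a minimal plain-Python sketch of what the #for loops over $nodes work out for the Thor and Roxie sections. It assumes, as the template does, that nodes is a list of (netAddress, name) pairs. The sample addresses and names, and the thor_layout / roxie_channels helpers, are hypothetical; they only mirror the template logic and are not part of the charm.

```python
# Plain-Python stand-in for the template's loops; helper names are hypothetical.

def thor_layout(nodes):
    """First node is the Thor master; every remaining node is a slave."""
    master = "m_" + nodes[0][1]
    slaves = ["s_" + name for _addr, name in nodes[1:]]
    return master, slaves


def roxie_channels(nodes):
    """Mirror the '#set $channel' counter: one channel per node, starting at 1."""
    return [(channel, "s_" + name)
            for channel, (_addr, name) in enumerate(nodes, start=1)]


# Hypothetical sample data in the same (netAddress, name) shape the template expects.
nodes = [("10.0.0.1", "node1"), ("10.0.0.2", "node2"), ("10.0.0.3", "node3")]

master, slaves = thor_layout(nodes)
print(master)                 # m_node1
print(slaves)                 # ['s_node2', 's_node3']
print(roxie_channels(nodes))  # [(1, 's_node1'), (2, 's_node2'), (3, 's_node3')]
```

With a single node, the slave list is simply empty, which is the case the template guards with its #if $len($nodes) > 1 check.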

This charm may require some changes to your environment.yaml file in ~/.juju, as hpcc will only run on 64-bit instances.  Make sure that your juju environment has been properly shut down before you edit this file ( juju destroy-environment ).  Here is my environment.yaml file, showing the important parts to check:
juju: environments

environments:
  sample:
    type: ec2
    access-key: ( removed ... get your own :) )
    secret-key: ( removed ... get your own :) )
    control-bucket: juju-fbb790f292e14a0394353bb4b63a3403
    admin-secret: 604d18a77fd24e3f91e1df398fcbe9f2
The emphasized parts are the important ones.  You can just copy them from here and paste them into your ~/.juju/environment.yaml file.
Now, let's take a look at the charm starting with the metadata.yaml file:
name: hpcc
revision: 1
summary: HPCC (High Performance Computing Cluster)
description: |
  HPCC (High Performance Computing Cluster) is a massive 
  parallel-processing computing platform that solves Big Data problems.
provides:
  hpcc:
    interface: hpcc
requires:
  hpcc-thor:
    interface: hpcc-thor
  hpcc-roxie:
    interface: hpcc-roxie
peers:
  hpcc-cluster:
    interface: hpcc-cluster

There are various provides and requires interfaces in this metadata.yaml file but, for now, only the peers interface is being used.  I'll work on the other ones as the charm matures.


Let's look at the hpcc-cluster interface, and more specifically at the hpcc-cluster-relation-changed hook, where the new configuration is created:
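#!/bin/bash
# Directory that contains this hook.
CWD=$(dirname "$0")
# Generate the XML configuration from the cheetah template.
cheetah fill --oext=xml --odir=/etc/HPCCSystems/ "${CWD}/../templates/environment.tmpl"
# Restart hpcc so that it picks up the new configuration.
service hpcc-init restart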
It's pretty simple, isn't it?  Since the "heavy lifting" is being done by the self-contained cheetah template, we don't have much to do here but generate the configuration file and restart hpcc.

The other files in this charm are pretty self-explanatory and simple, so I am leaving the details of them as an exercise for the reader.

All of the complexities in hpcc have been distilled to the following commands:

  • juju bootstrap
  • bzr branch lp:~negronjl/+junk/hpcc
  • juju deploy --repository . hpcc
    • wait a minute or two
  • juju status
    • you should see something similar to this:

negronjl@negronjl-laptop:~/src/juju/charms$ juju status
2011-08-18 16:00:54,413 INFO Connecting to environment.


machines: 
  0: {dns-name: ec2-184-73-109-244.compute-1.amazonaws.com, instance-id: i-6d61460c} 
  1: {dns-name: ec2-50-16-60-94.compute-1.amazonaws.com, instance-id: i-d5694eb4}


services:
  hpcc:
    charm: local:hpcc-1
    relations: {hpcc-cluster: hpcc}
    units:
      hpcc/0:
        machine: 1
        relations: {}
        state: null


2011-08-18 16:00:58,374 INFO 'status' command finished successfully
negronjl@negronjl-laptop:~/src/juju/charms$
The above commands will give you a single node.

You can access the web interface of your node by pointing your browser to http://<FQDN>:8010, where FQDN is the Fully Qualified Domain Name or public IP address of your hpcc instance.  On the left side, there should be a menu; explore the items in the Topology section.  The Target Clusters section should look something similar to this:


To experience the true power of hpcc, you should probably throw some more nodes at it.  Let's do just that with:


  • juju add-unit hpcc 
    • do this as many times as you feel comfortable with
    • Each command will give you a new node in the cluster
    • wait a minute or two and you should see something similar to this:

negronjl@negronjl-laptop:~$ juju status
2011-08-18 16:25:55,739 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-184-73-109-244.compute-1.amazonaws.com, instance-id: i-6d61460c}
  1: {dns-name: ec2-50-16-60-94.compute-1.amazonaws.com, instance-id: i-d5694eb4}
  2: {dns-name: ec2-50-19-181-98.compute-1.amazonaws.com, instance-id: i-a5795ec4}
  3: {dns-name: ec2-184-72-147-67.compute-1.amazonaws.com, instance-id: i-25446344}
services:
  hpcc:
    charm: local:hpcc-1
    relations: {hpcc-cluster: hpcc}
    units:
      hpcc/0:
        machine: 1
        relations:
          hpcc-cluster: {state: up}
        state: started
      hpcc/1:
        machine: 2
        relations:
          hpcc-cluster: {state: up}
        state: started
      hpcc/2:
        machine: 3
        relations:
          hpcc-cluster: {state: up}
        state: started
2011-08-18 16:26:01,837 INFO 'status' command finished successfully
Notice how we now have more hpcc nodes :)  Here is what the web interface could look like:


Again... we have more nodes :)
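
If you want several extra nodes at once, the add-unit step can be wrapped in a small loop. This is only a sketch: the function name add_hpcc_units and the JUJU variable are hypothetical conveniences, and the only juju command it runs is the juju add-unit hpcc call shown earlier.

```shell
#!/bin/bash
# JUJU is a hypothetical override so the loop can be dry-run with another
# command (for example JUJU=echo); it defaults to the real juju client.
JUJU="${JUJU:-juju}"

# Add $1 units (default 1) to the hpcc service, one add-unit call per unit.
# Stops at the first failing call.
add_hpcc_units() {
    local count="${1:-1}" i
    for ((i = 0; i < count; i++)); do
        "$JUJU" add-unit hpcc || return 1
    done
}

# Example (not executed here): add three more nodes to the cluster.
#   add_hpcc_units 3
```

Running it with JUJU=echo prints the commands instead of executing them, which is a cheap way to check the count before spending money on instances.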

Now that we have a working cluster, let's try it.  We'll first do the mandatory Hello World in ECL.  It looks something like this (hello.ecl):
Output('Hello world');
We have to compile our hello.ecl so we can use it.  We do that by logging into one of the nodes (I used juju ssh 1 to log on to the first/master node) and typing the following:
eclcc hello.ecl -ohello
We run the file just like we would any other binary:
./hello
... and the output is:
ubuntu@ip-10-111-19-210:~$ ./hello
Hello world
ubuntu@ip-10-111-19-210:~$ 
There are far more interesting examples in the Learning ECL Documentation here.  I highly encourage you to go and read about it.

That's it for now.  Feedback is always welcome, of course, so let me know how I'm doing.

-Juan

3 comments:

  1. Man this is awesome!
    One little thing, the output from first ensemble status seems squished

  2. Thanks Ahmed for the feedback! I fixed the formatting issue.

    -Juan

  3. Thanks for sharing! HPCC Systems is hosting a meetup on 9/8 in the SF area if you can attend. http://bit.ly/p2aN3f
