Installation/Deployment On Clusters

StreamMine3G stores its configuration in zookeeper and reads it from there. As a result, only a single local configuration file, node.properties, is required to run StreamMine3G, and it contains nothing but the location of the zookeeper nodes.

In addition to the node.properties file, StreamMine3G's logging system, log4cxx, is configured via the log4cxx.properties file. log4cxx is a C++ port of log4j, so please read the log4j documentation to adjust the logging level etc. of StreamMine3G to fit your needs.

StreamMine3G consists of a single binary that is statically linked against all libraries it depends on, so no extra libraries need to be installed on the target system.

To run StreamMine3G, at most two command line parameters are needed: a nodeName and the path to a node.properties file (which contains the list of zookeeper nodes). The second parameter is optional; if it is omitted, StreamMine3G looks for a node.properties file in the current working directory. The file looks like this:

node.properties:

zookeeperHostsList=localhost:2181
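
To illustrate the file format, the following is a minimal Python sketch that reads the zookeeperHostsList entry and splits it into host/port pairs. It is not part of StreamMine3G; the function names are hypothetical, and the sketch assumes that several zookeeper nodes are given as a comma-separated list of host:port entries, as in a regular zookeeper connection string.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        props[key.strip()] = value.strip()
    return props


def zookeeper_hosts(text):
    """Return the zookeeper endpoints as a list of (host, port) tuples."""
    value = parse_properties(text)["zookeeperHostsList"]
    hosts = []
    # Assumption: multiple nodes are comma-separated and each carries a port.
    for entry in value.split(","):
        host, _, port = entry.strip().rpartition(":")
        hosts.append((host, int(port)))
    return hosts


print(zookeeper_hosts("zookeeperHostsList=localhost:2181"))
# prints [('localhost', 2181)]
```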

The nodeName is used to uniquely identify each StreamMine3G node within a StreamMine3G cluster. Ideally, one can use the hostname of the VM as nodeName.

In a StreamMine3G cluster, you should run only a single StreamMine3G process/instance (which we call a node) per physical or virtual machine.

Zookeeper Entries

For each node in a StreamMine3G cluster, there must be a corresponding entry, named after the node's nodeName, under the /streammine3g/nodes path of zookeeper with the following configuration:

host=192.168.1.100
port=10000

threads=8

delay=1000000
managerLibrary=libMyManager.so

The first two entries tell the StreamMine3G node which IP address/network interface and port number its TCP server should bind to, so that other StreamMine3G nodes can "talk" to this one.

Each StreamMine3G node has a thread pool that performs asynchronous reads from the network and processes events. For optimal performance, the number of threads in the thread pool (the threads entry) should match the number of processing cores available on that machine.

The last two entries configure the manager component: delay sets the interval of the manager's onTimer method, and managerLibrary names the library that contains the manager code. These parameters are only used if the node also runs the manager component.
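
As an illustration of the entries described above, here is a small Python sketch that renders such a node entry and, unless told otherwise, sets threads to the number of cores reported by the operating system. The helper node_entry is hypothetical and not shipped with StreamMine3G; the values in the usage line are the ones from the example above.

```python
import os


def node_entry(host, port, delay, manager_library, threads=None):
    """Render the key=value text for one node entry."""
    if not 0 < port < 65536:
        raise ValueError(f"invalid TCP port: {port}")
    if threads is None:
        # Match the thread pool size to the cores visible to this process.
        threads = os.cpu_count() or 1
    lines = [
        f"host={host}",
        f"port={port}",
        f"threads={threads}",
        f"delay={delay}",
        f"managerLibrary={manager_library}",
    ]
    return "\n".join(lines) + "\n"


# Usage: reproduce the example entry with an explicit thread count.
print(node_entry("192.168.1.100", 10000, 1000000, "libMyManager.so", threads=8), end="")
```

Run on the target machine without the threads argument, the sketch picks the local core count, which follows the sizing recommendation given above.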

Import Information on Zookeeper

To import the configuration into zookeeper, create a file containing the entries and use the zookeeperClientConfig tool provided with StreamMine3G. The tool copies the entire file contents into the specified path in zookeeper.
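
For readers who want to see what such an import amounts to, the following Python sketch stores a configuration text verbatim at a node's path. It is not the zookeeperClientConfig tool and makes no claim about how that tool works internally; the function names are hypothetical, and client is assumed to be any object that offers ensure_path(path) and set(path, bytes), which is the interface of a started KazooClient from the third-party kazoo package.

```python
def node_path(node_name):
    """Return the zookeeper path that holds the entry for node_name."""
    if not node_name or "/" in node_name:
        raise ValueError(f"invalid nodeName: {node_name!r}")
    return "/streammine3g/nodes/" + node_name


def import_node_config(client, node_name, config_text):
    """Store config_text verbatim at the node's path and return that path.

    Assumption: client provides ensure_path(path) and set(path, bytes),
    as a started kazoo.client.KazooClient does.
    """
    path = node_path(node_name)
    client.ensure_path(path)                       # create missing parent paths
    client.set(path, config_text.encode("utf-8"))  # overwrite the payload
    return path
```

With kazoo installed, a client created via KazooClient(hosts="localhost:2181") and started with client.start() could be passed in; whether the resulting entry is byte-for-byte identical to what zookeeperClientConfig writes has not been verified here.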

Boot Up Process

What happens during the bootup of StreamMine3G?

1. It will first look up the node.properties file and contact the zookeeper cluster.
2. The nodeName (e.g. nodeXZY.mycloud.com) will be used to look up the configuration for this node, stored at /streammine3g/nodes/nodeXZY.mycloud.com.
3. StreamMine3G will bind its TCP server to the specified network interface and port so that other StreamMine3G nodes can contact that node.
4. It will look up the /streammine3g/manager entry in zookeeper to find out which node (identified by nodeName) is currently assigned to run the manager component.
5. It will either run the manager code itself (if its own name matches the entry) or contact the node running the manager to register itself with the StreamMine3G cluster and to receive "work", i.e., deploy slices and do event processing later on.
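
The decision logic of the boot steps above can be sketched in Python as follows. This is a simulation only, not StreamMine3G code: an in-memory dictionary stands in for zookeeper, the role label "worker" and the function names are hypothetical, and the sketch assumes that the /streammine3g/manager entry contains just the nodeName of the manager.

```python
def parse_entry(text):
    """Turn key=value lines into a dict, ignoring blank lines."""
    return dict(
        line.strip().split("=", 1) for line in text.splitlines() if line.strip()
    )


def boot(node_name, znodes):
    """Walk through the boot steps against an in-memory path -> text mapping."""
    # Look up this node's configuration by its nodeName.
    config = parse_entry(znodes["/streammine3g/nodes/" + node_name])
    # Address and port the TCP server would bind to.
    bind = (config["host"], int(config["port"]))
    # Assumption: the manager entry holds only the manager's nodeName.
    manager = znodes["/streammine3g/manager"].strip()
    if manager == node_name:
        return {"bind": bind, "role": "manager"}
    # Otherwise the node would register with the manager and wait for work.
    return {"bind": bind, "role": "worker", "register_with": manager}


znodes = {
    "/streammine3g/manager": "nodeA",
    "/streammine3g/nodes/nodeA": "host=192.168.1.100\nport=10000\nthreads=8\n",
    "/streammine3g/nodes/nodeB": "host=192.168.1.101\nport=10000\nthreads=8\n",
}
print(boot("nodeA", znodes)["role"])  # prints manager
print(boot("nodeB", znodes)["role"])  # prints worker
```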