CLUSTERER Module

The clusterer module organizes multiple OpenSIPS instances into groups (clusters) whose nodes can communicate with each other in order to replicate data, share information or perform distributed tasks. The distributed logic itself is implemented by the modules that use the clusterer interface (e.g. the dialog module can replicate dialogs/profiles, the ratelimit module can share pipes across multiple instances etc.). The clusterer module only provides an interface to send/receive BIN packets and to get notifications about node availability, which it does by internally learning the cluster topology and the state of the nodes. Nodes are provisioned into a cluster through the database. The node-related information can be inspected, and a reload of it can be triggered, by sending commands over the MI interface.

The topology established by the clusterer module is an overlay of nodes where the "links" represent communication availability at BIN interface level. For this purpose, a probing mechanism is used, consisting of regular pings to all nodes which must receive a reply within a given interval. All nodes in the cluster exchange information about the state of their links with other nodes and compute a "routing table" which gives a next hop for each destination. The metric for the shortest path is the number of hops. When there is no direct link to a destination, the BIN packet sent by a module is transparently routed through the cluster.
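
As an illustration of the next-hop computation described above, the following is a minimal C sketch, not the module's actual code. It assumes a small adjacency matrix of "up" links and uses a breadth-first search, which yields minimum-hop paths. The names topology_t, link_up, compute_next_hops and the MAX_NODES limit are hypothetical and do not exist in OpenSIPS. Ties between equal-length paths are resolved here by iteration order, whereas the module uses the configured node priority.

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES 16 /* hypothetical limit, for this sketch only */

/* adj[a][b] != 0 means the link between node ids a and b is up.
 * Node ids start at 1, so index 0 is unused. */
typedef struct topology {
    int adj[MAX_NODES][MAX_NODES];
} topology_t;

/* Marks the (bidirectional) link between nodes a and b as up. */
void link_up(topology_t *t, int a, int b)
{
    assert(a > 0 && a < MAX_NODES && b > 0 && b < MAX_NODES);
    t->adj[a][b] = 1;
    t->adj[b][a] = 1;
}

/* Breadth-first search starting at src. On return, next_hop[dst] holds the
 * first node on a minimum-hop path from src to dst, or 0 if dst is
 * unreachable (next_hop[src] is also left at 0). */
void compute_next_hops(const topology_t *t, int src, int next_hop[MAX_NODES])
{
    int queue[MAX_NODES];
    int visited[MAX_NODES] = {0};
    int head = 0, tail = 0;

    assert(src > 0 && src < MAX_NODES);
    memset(next_hop, 0, MAX_NODES * sizeof(int));

    visited[src] = 1;
    queue[tail++] = src;

    while (head < tail) {
        int cur = queue[head++];
        for (int n = 1; n < MAX_NODES; n++) {
            if (!t->adj[cur][n] || visited[n])
                continue;
            visited[n] = 1;
            /* a direct neighbour of src is its own next hop; any other
             * node inherits the next hop of the node it was reached from */
            next_hop[n] = (cur == src) ? n : next_hop[cur];
            queue[tail++] = n;
        }
    }
}
```

For the four-node cluster shown in the clusterer_list_topology example (links 1-4, 4-2, 4-3 and 2-3), calling compute_next_hops with src set to 1 gives node 4 as the next hop for destinations 2, 3 and 4, which agrees with the Next_hop column in the clusterer_list example.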

Note that an OpenSIPS instance can belong to multiple clusters, communicating and establishing the topology separately for each one. In order to provision this in the database, each node has a globally unique ID, which can be referenced by each cluster.

While existing nodes can learn about newly added nodes without additional provisioning, the new nodes must be fully aware of the existing components of the cluster they are joining, in order to properly advertise themselves.

1.2. Dependencies

1.2.1. OpenSIPS Modules

The following modules must be loaded before this module:

a database module.
proto_bin module.

1.2.2. External Libraries or Applications

The following libraries or applications must be installed before running OpenSIPS with this module loaded:

None.

1.3. Exported Parameters

1.3.1. `db_url`

The database URL.

Default value is “NULL”.

Example 1.1. Set db_url parameter

...
modparam("clusterer", "db_url",
	"mysql://opensips:opensipsrw@localhost/opensips")
...

1.3.2. `db_table`

The name of the table storing the clustering information.

Default value is “clusterer”.

Example 1.2. Set db_table parameter

...
modparam("clusterer", "db_table", "clusterer")
...

1.3.3. `id_col`

The name of the column storing an id for the table rows.

Default value is “id”.

Example 1.3. Set id_col parameter

...
modparam("clusterer", "id_col", "id")
...

1.3.4. `cluster_id_col`

The name of the column to store the id of a cluster.

Default value is “cluster_id”.

Example 1.4. Set cluster_id_col parameter

...
modparam("clusterer", "cluster_id_col", "cluster_id")
...

1.3.5. `node_id_col`

The name of the column to store the id of an instance. The values must be greater than 0.

Default value is “node_id”.

Example 1.5. Set node_id_col parameter

...
modparam("clusterer", "node_id_col", "node_id")
...

1.3.6. `url_col`

The name of the column containing the instance URL.

Default value is “url”.

Example 1.6. Set url_col parameter

...
modparam("clusterer", "url_col", "url")
...

1.3.7. `state_col`

The name of the column storing the state of the node (enabled/disabled).

Default value is “state”.

Example 1.7. Set state_col parameter

...
modparam("clusterer", "state_col", "state")
...

1.3.8. `ls_seq_no_col`

The name of the column storing the sequence number of the last link state update message sent by the node.

Default value is “ls_seq_no”.

Example 1.8. Set ls_seq_no_col parameter

...
modparam("clusterer", "ls_seq_no_col", "ls_seq_no")
...

1.3.9. `top_seq_no_col`

The name of the column storing the sequence number of the last topology update message sent by the node.

Default value is “top_seq_no”.

Example 1.9. Set top_seq_no_col parameter

...
modparam("clusterer", "top_seq_no_col", "top_seq_no")
...

1.3.10. `no_ping_retries_col`

The name of the column containing the maximum number of ping retries before the link with the neighbour node is considered down.

Default value is “no_ping_retries”.

Example 1.10. Set no_ping_retries_col parameter

...
modparam("clusterer", "no_ping_retries_col", "no_ping_retries")
...

1.3.11. `priority_col`

The name of the column storing the node priority, used to choose the next hop among paths of the same length (number of hops) when rerouting messages.

Default value is “priority”.

Example 1.11. Set priority_col parameter

...
modparam("clusterer", "priority_col", "priority")
...

1.3.12. `sip_addr_col`

The name of the column containing a SIP address for the node.

Default value is “sip_addr”.

Example 1.12. Set sip_addr_col parameter

...
modparam("clusterer", "sip_addr_col", "sip_addr")
...

1.3.13. `description_col`

The name of the column containing a node description.

Default value is “description”.

Example 1.13. Set description_col parameter

...
modparam("clusterer", "description_col", "description")
...

1.3.14. `current_id`

The id of the current instance. This parameter must match one of the node_id values in the database.

No default value. This parameter must be explicitly set to a value greater than zero.

Example 1.14. Set current_id parameter

...
modparam("clusterer", "current_id", 1)
...

1.3.15. `ping_interval`

The interval in seconds between regular pings sent to a neighbour node.

Default value is “4”.

Example 1.15. Set ping_interval parameter

...
modparam("clusterer", "ping_interval", 1)
...

1.3.16. `ping_timeout`

The time in milliseconds to wait for a reply to a previously sent ping before retrying or considering the link with the neighbour node down. This is also the interval between successive retries if the send fails.

Default value is “1000”.

Example 1.16. Set ping_timeout parameter

...
modparam("clusterer", "ping_timeout", 500)
...

1.3.17. `node_timeout`

The time in seconds to wait before pinging is restarted for a failed node.

Default value is “60”.

Example 1.17. Set node_timeout parameter

...
modparam("clusterer", "node_timeout", 10)
...

1.4. Exported Functions

None.

1.5. Exported MI Functions

1.5.1. `clusterer_reload`

Reloads data from the clusterer database. The currently established topology will be lost and the node will rediscover the new topology.

Name: clusterer_reload

Parameters: none

MI FIFO Command Format:

		:clusterer_reload
		_empty_line_

1.5.2. `clusterer_list`

Lists information (node id, URL, link state with that node etc.) about the other nodes in each cluster.

Name: clusterer_list

Parameters: none

MI FIFO Command Format:

		:clusterer_list
		_empty_line_

Example 1.18. clusterer_list usage

$ ./opensipsctl fifo clusterer_list
Cluster:: 1
	Node:: 4 DB_ID=4 URL=bin:127.0.0.4:7774 Enabled=1 Link_state=Up      Next_hop=4 Description=none
	Node:: 3 DB_ID=3 URL=bin:127.0.0.3:7773 Enabled=1 Link_state=Down    Next_hop=4 Description=none
	Node:: 2 DB_ID=2 URL=bin:127.0.0.2:7772 Enabled=1 Link_state=Probe   Next_hop=4 Description=none
Cluster:: 2
	Node:: 5 DB_ID=5 URL=bin:127.0.0.4:7775 Enabled=1 Link_state=Up      Next_hop=5 Description=none

1.5.3. `clusterer_list_topology`

Lists each cluster's topology from the current node's perspective as an adjacency list. A node appears as a neighbour if the link with that node is up.

Note that if a node id appears in multiple clusters, it refers to the same instance that belongs to different clusters, for which it has a different topology.

Name: clusterer_list_topology

Parameters: none

MI FIFO Command Format:

		:clusterer_list_topology
		_empty_line_

Example 1.19. clusterer_list_topology usage

$ ./opensipsctl fifo clusterer_list_topology
Cluster:: 1
	Node:: 1 Neighbours=4
	Node:: 4 Neighbours=1 2 3
	Node:: 3 Neighbours=2 4
	Node:: 2 Neighbours=3 4
Cluster:: 2
	Node:: 1 Neighbours=5
	Node:: 5 Neighbours=1

1.5.4. `clusterer_set_status`

Sets the status (Enabled/Disabled) of the current node in a specified cluster. A disabled node does not send any messages and ignores received ones, thus appearing as a failed node in the topology.

Name: clusterer_set_status

Parameters:

cluster_id - indicates the id of the cluster.
status - indicates the new status (0 - Disabled, 1 - Enabled).

MI FIFO Command Format:

		:clusterer_set_status:
		1
		0
		_empty_line_
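
The command layout above can also be produced programmatically. The following C sketch only formats the text of the command as shown in this section. The helper name format_set_status_cmd is hypothetical, and writing the result to the OpenSIPS FIFO file is left out because the FIFO path depends on the local configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats the clusterer_set_status MI FIFO command into buf:
 *   ":clusterer_set_status:" newline, cluster id, status, empty line.
 * Returns the number of characters written (excluding the terminating NUL),
 * or -1 if the arguments are invalid or the buffer is too small. */
int format_set_status_cmd(char *buf, size_t len, int cluster_id, int status)
{
    int n;

    assert(buf != NULL);

    /* cluster ids must be greater than 0; status is 0 (Disabled) or 1 (Enabled) */
    if (cluster_id <= 0 || (status != 0 && status != 1))
        return -1;

    n = snprintf(buf, len, ":clusterer_set_status:\n%d\n%d\n\n",
                 cluster_id, status);
    if (n < 0 || (size_t)n >= len)
        return -1; /* output error or truncated output */

    return n;
}
```

Calling format_set_status_cmd(buf, sizeof(buf), 1, 0) fills buf with the same text as the command format above, which disables the current node in cluster 1.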

1.6. Usage Example

This section provides a usage example for replicating ratelimit pipes between two OpenSIPS instances. It uses the clusterer module to manage the replicating nodes, and the proto_bin module to send the replicated information.

The setup topology is simple: we have two OpenSIPS nodes running on two separate machines (although they could run on the same machine as well): Node A has IP 192.168.0.5 and Node B has IP 192.168.0.6. Both have, besides the traffic listeners (UDP, TCP, etc.), bin listeners bound on port 5566. These listeners will be used by the ratelimit module to replicate the pipes. Therefore, we have to provision them in the clusterer table.

Example 1.20. Example database content - clusterer table

+----+------------+---------+----------------------+-------+-----------+------------+-----------------+----------+----------+-------------+
| id | cluster_id | node_id | url                  | state | ls_seq_no | top_seq_no | no_ping_retries | priority | sip_addr | description |
+----+------------+---------+----------------------+-------+-----------+------------+-----------------+----------+----------+-------------+
|  1 |          1 |       1 | bin:192.168.0.5:5566 |     1 |         0 |          0 |               3 |       50 | NULL     | Node A      |
|  2 |          1 |       2 | bin:192.168.0.6:5566 |     1 |         0 |          0 |               3 |       50 | NULL     | Node B      |
+----+------------+---------+----------------------+-------+-----------+------------+-----------------+----------+----------+-------------+

“cluster_id” - this column represents the identifier of the cluster. All nodes within a group/cluster should have the same id (in our example, both nodes have ID 1). The values must be greater than 0.
“node_id” - this represents the identifier of the machine/node, and each instance within a cluster should have a different ID. The values must be greater than 0. In our example, Node A will have ID 1, and Node B ID 2.
“url” - this indicates the URL where the instance will receive the replication information. In our example, each node will receive the data over the bin protocol.
“state” - this is the state of the machine: 1 means Enabled, 0 means Disabled; if we had a third machine that we didn't want to use for the moment, we would have set the state to 0
“ls_seq_no” and “top_seq_no” are fields used for the probing and topology discovery mechanisms, and should be set to 0 by default; they are automatically updated by the clusterer module and you shouldn't change them even if a node fails or you disable it
“no_ping_retries” - is used to specify the maximum number of ping retries before the link with a node is considered down
“priority” - is used to specify the node priority, which decides the next hop among paths of the same length (number of hops) when rerouting messages; it is not relevant for this two-node topology example
“sip_addr” - is a SIP address for the node with currently no application in replication scenarios; reserved for further development of other modules which might use the clusterer module for communication
“description” - is an opaque value used to identify the node

After provisioning the two nodes in the database, we have to configure the two instances of OpenSIPS. First, we configure Node A:

Example 1.21. Node A configuration

...
listen = bin:192.168.0.5:5566 # bin listener for Node A

loadmodule "proto_bin.so"

loadmodule "clusterer.so"
modparam("clusterer", "db_url", "mysql://opensips@192.168.0.7/opensips")
modparam("clusterer", "current_id", 1) # node_id for Node A

loadmodule "ratelimit.so"
# replicate pipes to cluster id 1
modparam("ratelimit", "replicate_pipes_to", 1)
# accept replicated data from nodes within cluster 1
modparam("ratelimit", "accept_pipes_from", 1)
...

Similarly, the configuration for Node B is as follows:

Example 1.22. Node B configuration

...
listen = bin:192.168.0.6:5566 # bin listener for Node B

loadmodule "proto_bin.so"

loadmodule "clusterer.so"
# ideally, use the same database for both nodes
modparam("clusterer", "db_url", "mysql://opensips@192.168.0.7/opensips")
modparam("clusterer", "current_id", 2) # node_id for Node B

loadmodule "ratelimit.so"
# replicate pipes to cluster id 1
modparam("ratelimit", "replicate_pipes_to", 1)
# accept replicated data from nodes within cluster 1
modparam("ratelimit", "accept_pipes_from", 1)
...

Note that the current_id parameter for Node B is 2, matching its node_id in the database. Starting the two OpenSIPS instances with the above configurations gives your platform the ability to use shared ratelimit pipes in an efficient and scalable way.

Chapter 2. Developer Guide

2.1. Available Functions

2.1.1. `get_nodes(cluster_id)`

This function returns a list of all the reachable nodes in the specified cluster. A node whose direct link is down or probing still counts as reachable if a path through intermediary nodes exists.

The returned nodes structure:

...
typedef struct clusterer_node {
    int node_id;
    union sockaddr_union addr;
    str sip_addr;
    str description;
    struct clusterer_node *next;
} clusterer_node_t;
...
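
Since the result is a singly linked list chained through the next field, a caller would typically walk it node by node. The following is a minimal C sketch of such a traversal. The str and sockaddr_union definitions are simplified stand-ins so that the example compiles on its own (real module code would take them from the OpenSIPS headers), and the helpers count_nodes and find_node are hypothetical names, not part of the clusterer API. How the list is obtained and released is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Stand-ins for OpenSIPS core types, for this sketch only. */
typedef struct { char *s; int len; } str;
union sockaddr_union {
    struct sockaddr s;
    struct sockaddr_in sin;
    struct sockaddr_in6 sin6;
};

/* Same layout as the structure documented above. */
typedef struct clusterer_node {
    int node_id;
    union sockaddr_union addr;
    str sip_addr;
    str description;
    struct clusterer_node *next;
} clusterer_node_t;

/* Returns the number of entries in the list (0 for an empty list). */
int count_nodes(const clusterer_node_t *list)
{
    int count = 0;

    for (; list != NULL; list = list->next)
        count++;
    return count;
}

/* Returns the entry with the given node id, or NULL if it is not in the
 * list, i.e. the node is not currently reachable. */
const clusterer_node_t *find_node(const clusterer_node_t *list, int node_id)
{
    assert(node_id > 0); /* node ids must be greater than 0 */

    for (; list != NULL; list = list->next)
        if (list->node_id == node_id)
            return list;
    return NULL;
}
```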