StrongKey Tellaro Appliances are always installed in a clustered configuration for high availability (HA) in production environments. All servers operate in a master-master configuration: any server can receive transactions and asynchronously replicates data records to all other nodes in the cluster.
Replication can nevertheless run into problems. The methods below describe how to debug replication issues.
ISSUE:
If the output of the repl command consistently increases each time the command is run from the command line, there may be an issue with replication.
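One way to judge "consistently increases" without eyeballing the numbers is sketched below. It assumes, hypothetically, that each run of repl can be reduced to a single integer and that those samples are collected one per line. The actual output format of repl is not described here, so the function name and the input format are illustrative only.

```shell
# Hypothetical helper: reads one integer sample per line on stdin and
# succeeds only if there are at least two samples and each one is larger
# than the sample before it.
is_steadily_increasing() {
    awk '
        BEGIN { rising = 1 }
        NR > 1 && ($1 + 0) <= prev { rising = 0 }   # a flat or falling step
        { prev = $1 + 0 }
        END { exit !(NR >= 2 && rising) }
    '
}
```

For example, `printf '%s\n' 12 15 21 | is_steadily_increasing && echo "backlog is growing"` prints the message, while a series that levels off or drops does not.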
DEBUGGING METHOD 1:
Check /etc/hosts or the DNS entry and confirm that the FQDN and IP address entries are correct for all the nodes in the cluster.
Log in as the root user and type the following command in the terminal:
shell > cat /etc/hosts
The output should resemble the following, with IP addresses in the left column and domain names on the right. Verify that the IP address and domain name of every node are accurate.
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
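When the file is long, a small lookup helper can make the check less error-prone. The sketch below prints every IP address that a hosts-format file maps to a given name. The function name hosts_lookup and the host names used with it are illustrative, not part of the appliance.

```shell
# Print every IP address that a hosts-format file maps to the given name.
# Usage: hosts_lookup <fqdn> [hosts_file]   (hosts_file defaults to /etc/hosts)
hosts_lookup() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v name="$name" '
        { sub(/#.*/, "") }    # strip comments before matching
        { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
    ' "$file"
}
```

Running `hosts_lookup node2.example.com` (a placeholder FQDN) should print exactly one address per node. No output means the entry is missing, and more than one line points to conflicting entries.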
DEBUGGING METHOD 2:
Check the iptables file and confirm that ports 7001, 7002, and 7003 are open to the correct IP addresses of all the nodes in the cluster. If they are not, update the IP addresses and reload the iptables rules.
Open a terminal and log in as the root user. Type the following command:
shell > cat /etc/sysconfig/iptables
To change the file entries, use a text editor such as nano or vi.
If the ports are not open for the correct IP addresses, update the IP addresses in the iptables file, then run the following command to reload the rules:
shell > service iptables reload
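To pick the replication rules out of a long rules file, a filter along these lines may help. It is a sketch that assumes each rule names a single port in the common `--dport <port>` form. Rules written with a port range or the multiport match would not be found, and the function name replication_rules is illustrative.

```shell
# List the rules in an iptables rules file that mention a replication port.
# Assumes one "--dport <port>" per rule; ranges and multiport rules are missed.
# Usage: replication_rules [rules_file]   (defaults to /etc/sysconfig/iptables)
replication_rules() {
    grep -E -- '--dport (7001|7002|7003)( |$)' "${1:-/etc/sysconfig/iptables}"
}
```

Each line printed should show the source address of a peer node. A port with no matching line, or a line with an outdated address, is a candidate cause of the replication problem.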
DEBUGGING METHOD 3:
If the issue still persists, verify whether firewall rules between data centers are blocking access to the ZeroMQ replication ports 7001, 7002, and 7003.
After checking the /etc/hosts file to confirm that the fully qualified domain name (FQDN) and IP addresses are correct, execute the following command on the Tellaro for each of the other nodes and each replication port:
shell > nc -zv <FQDN> <PORT#>
This command confirms whether the replication ports on the other nodes in the cluster are reachable. If the nc command takes a long time to respond or produces no output at all, communication on that port may not be possible, and the port might be blocked by a firewall between the two data centers.
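Checking three ports against every peer by hand is repetitive, so the probe can be wrapped in a loop. The sketch below assumes the installed nc supports the -w connect timeout (the 5-second value is an arbitrary choice). The function name check_replication_ports and the host name in the usage note are illustrative.

```shell
# Probe each replication port on a peer node and report the result.
# Returns non-zero if any port could not be reached.
# Usage: check_replication_ports <fqdn>
check_replication_ports() {
    host=$1
    status=0
    for port in 7001 7002 7003; do
        # -z: connect without sending data; -w 5: give up after 5 seconds
        if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
            echo "$host:$port reachable"
        else
            echo "$host:$port NOT reachable"
            status=1
        fi
    done
    return "$status"
}
```

Calling `check_replication_ports node2.example.com` (a placeholder FQDN) prints one line per port. The timeout keeps a silently dropped connection from hanging the check, which is the typical symptom of a firewall between data centers.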
ADDITIONAL INFORMATION
ZeroMQ Port Use in SAKA
ZeroMQ (also known as ØMQ, 0MQ, or zmq) is a messaging library whose sockets work on top of Transmission Control Protocol (TCP) ports. ØMQ sockets are like mailboxes with routing.
ØMQ speeds up messaging compared with traditional messaging systems. Other tools can be used for group messaging and multicasting, but ØMQ does this more efficiently because a single ØMQ socket can bind to multiple ports and protocols concurrently.