I recently set up a two-node replica Gluster cluster on CentOS 6.6 with Gluster version 3.6.2-1. I set up a private network specifically for Gluster replication and communication and bound the Gluster daemons to this network.
I bound the Gluster daemons by adding an option transport.socket.bind-address line to the main Gluster configuration file at /etc/glusterfs/glusterd.vol on each Gluster node (on the second Gluster node, 192.168.3.1 would be changed to 192.168.3.2):
volume management
type mgmt/glusterd
option working-directory /var/lib/glusterd
option transport-type socket,rdma
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
option transport.socket.read-fail-log off
option transport.socket.bind-address 192.168.3.1
# option base-port 49152
end-volume
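One thing to keep in mind: glusterd only reads glusterd.vol when it starts, so the daemon has to be restarted after editing the file for the bind address to take effect. On CentOS 6 that is simply:
# restart glusterd so the new bind-address is picked up
service glusterd restart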
Everything was working fine until I began playing with Gluster heal.
When I ran the command gluster volume heal gv0, it printed the following and everything looked normal:
Launching heal operation to perform index self heal on volume gv0 has been successful
Use heal info commands to check status
However, when I ran the command gluster volume heal gv0 info, the following was displayed:
gv0: Not able to fetch volfile from glusterd
Volume heal failed
Something was wrong. The Gluster heal log file for this particular volume, located at /var/log/glusterfs/glfsheal-gv0.log, contained the following:
glfsheal-gv0.log:[2015-02-24 16:31:53.788775] E [socket.c:2267:socket_connect_finish] 0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)
Gluster heal was trying to connect to port 24007 on localhost, but because I had bound the Gluster daemon to the 192.168.3.0 network, that connection was refused. This happens because the Gluster heal daemon has localhost hardcoded. A very old email thread explains why this is the case.
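This is easy to see on the node itself. With the bind-address option in place, glusterd listens for management traffic on the private address only, so nothing answers on the loopback address that the heal helper tries to reach. A quick look at the listening sockets (output trimmed and approximate) shows something like:
# only the private address is bound to the management port 24007
netstat -tln | grep 24007
tcp        0      0 192.168.3.1:24007       0.0.0.0:*               LISTEN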
That same email thread suggests trying gluster --remote-host=<IP of Gluster Node> volume heal gv0 info, but this yielded the same error as above.
The only fix I found was to unbind the Gluster daemon by removing the transport.socket.bind-address line from the main Gluster configuration file mentioned above.
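As a rough sketch of the workaround (adjust the address to whatever you bound on each node):
# in /etc/glusterfs/glusterd.vol, comment out or delete the bind-address line:
#     option transport.socket.bind-address 192.168.3.1
# then restart glusterd so the change takes effect
service glusterd restart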
After that, running gluster volume heal gv0 info displayed the proper output:
Brick gluster1.example.com:/export/xvdb1/brick/
Number of entries: 0
Brick gluster2.example.com:/export/xvdb1/brick/
Number of entries: 0