Gluster Heal: "Not able to fetch volfile from glusterd"

Wednesday, February 25, 2015

I recently set up a two-node replica Gluster cluster on CentOS 6.6 with Gluster version 3.6.2-1. I set up a private network specifically for Gluster replication and communication, and bound the Gluster daemons to that network.

I bound the Gluster daemons by adding an option transport.socket.bind-address line to the main Gluster configuration file at /etc/glusterfs/glusterd.vol on each Gluster node (on the second node, the bind address would be that node's own private IP):

volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option transport.socket.bind-address <private network IP of this node>
#   option base-port 49152
end-volume

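To confirm the bind took effect, restart glusterd and check which address owns the management port. A quick sketch, assuming the standard management port 24007 and that ss or netstat is available (the restart command is shown commented out; run it on the node itself):

```shell
# Restart glusterd so the new bind address takes effect (CentOS 6 init script).
# service glusterd restart

# Show what is listening on the glusterd management port (24007).
# ss ships with iproute; fall back to netstat on older systems.
(ss -tln 2>/dev/null || netstat -tln 2>/dev/null) | grep 24007 \
    || echo "nothing listening on 24007"
```

When the bind is in place, the listener should show the private network IP rather than 0.0.0.0.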
Everything was working fine until I began playing with Gluster heal.

When I ran the command gluster volume heal gv0, it output the following and everything looked normal:

Launching heal operation to perform index self heal on volume gv0 has been successful 
Use heal info commands to check status

However, when I ran the command gluster volume heal gv0 info, the following was displayed:

gv0: Not able to fetch volfile from glusterd
Volume heal failed

Something was wrong. The Gluster heal log file for that particular Gluster volume, located at /var/log/glusterfs/glfsheal-gv0.log, contained the following:

glfsheal-gv0.log:[2015-02-24 16:31:53.788775] E [socket.c:2267:socket_connect_finish] 0-gfapi: connection to failed (Connection refused)

Gluster heal was trying to connect to port 24007 on localhost but, because I had bound the Gluster daemon to the private network, that connection was refused. This happens because the Gluster heal daemon has localhost hardcoded. A very old email thread explains why this is the case.

That same email thread suggests running gluster --remote-host=<IP of Gluster Node> volume heal gv0 info, but this yielded the same error as above.

The only fix I found was to unbind the Gluster daemon by removing the transport.socket.bind-address line from /etc/glusterfs/glusterd.vol on each node and restarting glusterd.
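Commenting the line out, rather than deleting it, keeps it handy in case the hardcoded-localhost behavior is ever fixed. A sketch of the edit, shown here against a scratch copy of the file (the IP is a placeholder); on a real node you would edit /etc/glusterfs/glusterd.vol and then restart glusterd:

```shell
# Work on a scratch copy; the real file is /etc/glusterfs/glusterd.vol,
# and 10.0.0.1 is a placeholder private IP.
vol=/tmp/glusterd.vol
cat > "$vol" <<'EOF'
volume management
    type mgmt/glusterd
    option transport.socket.bind-address 10.0.0.1
end-volume
EOF

# Comment out the bind-address line so glusterd listens on all addresses again.
sed -i '/option transport\.socket\.bind-address/ s/^/# /' "$vol"

grep 'bind-address' "$vol"   # the line is now prefixed with "# "
```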

After that, running gluster volume heal gv0 info displayed the proper output, one section per brick (brick hostnames and paths elided here):

Brick <node 1>:<brick path>
Number of entries: 0

Brick <node 2>:<brick path>
Number of entries: 0