[DRBD-user] Split brain problem.

Digimer linux at alteeve.com
Fri Dec 2 05:05:57 CET 2011

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On 12/01/2011 07:30 PM, Ivan Pavlenko wrote:
> Hi ALL,
> 
> Could you help me to fix a problem with split brain, please?
> 
> I have a Red Hat cluster based on RHEL 5.7 that provides an nfs-over-gfs2
> service, and I use DRBD as the storage.
> 
> # cat /etc/drbd.conf
> #
> # please have a look at the example configuration file in
> # /usr/share/doc/drbd83/drbd.conf
> #
> include "/etc/drbd.d/global_common.conf";

This is a good file to see. Can you share it, please?
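For what it's worth, the automatic split-brain recovery policies live in the
net section of that file. A rough sketch of what it might contain on 8.3
follows; the policy values here are only an example, not your actual settings:

common {
  net {
    # Needed if the GFS2 filesystem is mounted on both nodes at once.
    allow-two-primaries;
    # Automatic split-brain recovery policies (DRBD 8.3 syntax):
    after-sb-0pri discard-zero-changes;  # neither node was Primary at the split
    after-sb-1pri discard-secondary;     # one was Primary; drop the Secondary's changes
    after-sb-2pri disconnect;            # both were Primary; never resolve automatically
  }
}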

> include "/etc/drbd.d/r0.res";
> 
> # cat /etc/drbd.d/r0.res
> resource r0 {
>   on infplsm017 {
>     device    /dev/drbd1;
>     disk      /dev/sdb1;
>     address   10.10.24.10:7789;
>     meta-disk internal;
>   }
>   on infplsm018 {
>     device    /dev/drbd1;
>     disk      /dev/sdb1;
>     address   10.10.24.11:7789;
>     meta-disk internal;
>   }
> }
> 
> As you can see, there is nothing sophisticated here.
> 
> I have:
> 
> # cat /proc/drbd
> version: 8.3.8 (api:88/proto:86-94)
> GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by
> mockbuild at builder10.centos.org, 2010-06-04 08:04:09
> 
>  1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r----
>     ns:0 nr:0 dw:0 dr:332 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:524288
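cs:StandAlone with ds:UpToDate/DUnknown usually means DRBD dropped the link
itself, typically after detecting a split brain, rather than the network
failing underneath it. If your logs confirm a split brain, the usual manual
recovery on 8.3 looks roughly like the sketch below, where the "victim" is
whichever node's changes you are prepared to throw away, and only after its
GFS2 mount and cluster services have been stopped:

# On the victim node (it must not be Primary):
drbdadm secondary r0
drbdadm -- --discard-my-data connect r0

# On the surviving node, if it is also StandAlone:
drbdadm connect r0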
> 
> # ping 10.10.24.11
> PING 10.10.24.11 (10.10.24.11) 56(84) bytes of data.
> 64 bytes from 10.10.24.11: icmp_seq=1 ttl=64 time=2.99 ms
> 64 bytes from 10.10.24.11: icmp_seq=2 ttl=64 time=13.9 ms
> 
> But when I try to telnet to port 7789, I get:
> 
> # telnet 10.10.24.11 7789
> Trying 10.10.24.11...
> telnet: connect to address 10.10.24.11: Connection refused
> telnet: Unable to connect to remote host: Connection refused
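That "Connection refused" is actually expected: while a resource is
StandAlone, DRBD does not listen on its port at all, so nothing answers on
7789 from either side. Something like this should confirm it (a sketch):

netstat -tlnp | grep 7789   # no listener while the resource is StandAlone
drbdadm cstate r0           # reports the connection state directly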
> 
> But at the same time:
> 
> # service iptables status
> Table: filter
> Chain INPUT (policy ACCEPT)
> num  target     prot opt source               destination
> 
> Chain FORWARD (policy ACCEPT)
> num  target     prot opt source               destination
> 
> Chain OUTPUT (policy ACCEPT)
> num  target     prot opt source               destination
> 
> 
> I ran these checks from my first server (INFPLSM017), and I get exactly the
> same results from the second one (INFPLSM018). Could you tell me, please,
> what the possible reason for this problem is and how I can fix it?
> 
> Thank you in advance,
> Ivan

Is this a network or split-brain problem?

What happens when you try to connect?

What state is the other node in?

Anything interesting in /var/log/messages?
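Something along these lines should pull out the relevant entries; adjust the
path and count to taste:

grep -i drbd /var/log/messages | tail -n 50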

How does DRBD tie into the cluster? What is the cluster's configuration?
Are you using fencing?
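If fencing is not configured yet, the DRBD side of it normally looks
something like the sketch below. The handler script is an assumption on my
part (crm-fence-peer.sh ships with DRBD for Pacemaker clusters, and RHCS
setups have typically used obliterate-peer.sh), so substitute whatever your
cluster actually provides:

resource r0 {
  disk {
    fencing resource-and-stonith;   # suspend I/O and fence the peer on connection loss
  }
  handlers {
    fence-peer "/path/to/your/fence-handler.sh";   # placeholder path, not a real default
  }
}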

More details are needed to provide assistance.

-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"omg my singularity battery is dead again.
stupid hawking radiation." - epitron


