Note: "permalinks" may not be as permanent as we would like;
direct links from old sources may well be a few messages off.
/ 2004-08-17 16:34:12 +0200
\ Alex Ongena:
> Somehow I have a Split-Brain between my two
> nodes, but I have no way to correct it.
>
> They both think they are 'Consistent', while there
> is apparently a difference and they immediately drop
> connection.
>
> Trying to 'invalidate' one does not work:
>
> # drbdsetup /dev/drbd0 invalidate
> ioctl(,INVALIDATE_REM,) failed: Operation now in progress
> Only in 'Connected' cstate possible.
>
> Also there is some internal inconsistency,
> # umount /dev/drbd0
> # drbdsetup /dev/drbd0 secondary
> ioctl(,SET_STATE,) failed: Device or resource busy
> Someone has opened the device for RW access!
>
> while this is not correct:
> # fuser -mav /dev/drbd0
> USER PID ACCESS COMMAND
> /dev/drbd0

kernel version?
drbd svn revision?
output of
    lsmod | grep drbd; cat /proc/drbd; drbdsetup /dev/drbd0 show
on both nodes?

part of the log from the last time they were up and running?
part of the log during the event that led to connection loss at that time?

> # this is part of the log
> 16:11:33 AXSDBG debug boot.pl[331] Service::system(170) run drbdsetup /dev/drbd0 syncer -r 10000
> 16:11:33 AXSDBG debug boot.pl[331] Ha::Service::isConfigured(55) master = 0
> 16:11:33 AXSDBG debug boot.pl[331] Service::system(170) run drbdsetup /dev/drbd0 net 192.168.5.229 192.168.5.228 C
> 16:11:33 SYSLOG info kernel drbd0: drbdsetup [745]: cstate StandAlone --> Unconnected
> 16:11:33 SYSLOG info kernel drbd0: drbd0_receiver [746]: cstate Unconnected --> WFConnection
> 16:11:33 SYSLOG info kernel drbd0: drbd0_receiver [746]: cstate WFConnection --> WFReportParams
> 16:11:33 SYSLOG info kernel drbd0: Handshake successful: DRBD Network Protocol version 74
> 16:11:33 SYSLOG info kernel drbd0: Connection established.
> 16:11:33 SYSLOG alert kernel drbd0: Split-Brain detected, dropping connection!
> 16:11:33 SYSLOG info kernel drbd0: drbd0_receiver [746]: cstate WFReportParams --> StandAlone
> 16:11:33 SYSLOG err kernel drbd0: error receiving ReportParams, l: 72!
> 16:11:33 SYSLOG info kernel drbd0: asender terminated
> 16:11:33 SYSLOG info kernel drbd0: worker terminated
> 16:11:33 SYSLOG info kernel drbd0: drbd0_receiver [746]: cstate StandAlone --> StandAlone
> 16:11:33 SYSLOG info kernel drbd0: Connection lost.
> 16:11:33 SYSLOG info kernel drbd0: receiver terminated
>
> Any idea how I can get them in sync again?
>
> Thanks
> alex

	Lars Ellenberg

-- 
please use the "List-Reply" function of your email client.
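The state information requested above could be gathered on each node with a small script along these lines. This is only a sketch and not part of the original message: the `section` and `collect` helper names are made up for illustration, `uname -r` is assumed to be an acceptable answer to "kernel version?", and the device defaults to /dev/drbd0 as in the thread. The drbd svn revision and the log excerpts are not covered and still have to be supplied by hand.

```shell
#!/bin/sh
# Sketch: collect the requested DRBD state on one node.
# Run it on both nodes and post both reports.

# Device to inspect; defaults to the one named in the thread.
DEVICE="${1:-/dev/drbd0}"

# Hypothetical helper: print a header, run the command, and note a
# failure instead of aborting (a node may lack the module or the tool).
section() {
    title="$1"; shift
    printf '=== %s ===\n' "$title"
    "$@" 2>&1 || printf '(command failed or unavailable: %s)\n' "$*"
    printf '\n'
}

# Hypothetical helper: run every check asked for in the reply.
collect() {
    section "kernel version" uname -r
    section "lsmod | grep drbd" sh -c 'lsmod | grep drbd'
    section "/proc/drbd" cat /proc/drbd
    section "drbdsetup show" drbdsetup "$DEVICE" show
}

collect
```

Saved under a name of your choosing and run as root on each node, the output can be redirected to a file per node so that both reports can be attached to the reply.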