Late last night I started getting paged for a DRBD issue. It appears that the two servers have lost their connection for an unknown reason. Here is an excerpt from the logs; it should cover a complete startup:

Aug 13 08:53:36 plccnfs02 kernel: drbd0: drbdsetup [7274]: cstate WFConnection --> Unconnected
Aug 13 08:53:36 plccnfs02 kernel: drbd0: worker terminated
Aug 13 08:53:36 plccnfs02 kernel: drbd0: drbd0_receiver [6956]: cstate Unconnected --> StandAlone
Aug 13 08:53:36 plccnfs02 kernel: drbd0: Connection lost.
Aug 13 08:53:36 plccnfs02 kernel: drbd0: Discarding network configuration.
Aug 13 08:53:36 plccnfs02 kernel: drbd0: drbd0_receiver [6956]: cstate StandAlone --> StandAlone
Aug 13 08:53:36 plccnfs02 kernel: drbd0: receiver terminated
Aug 13 08:53:36 plccnfs02 kernel: drbd0: drbdsetup [7274]: cstate StandAlone --> StandAlone
Aug 13 08:53:36 plccnfs02 kernel: drbd0: drbdsetup [7274]: cstate StandAlone --> Unconfigured
Aug 13 08:53:36 plccnfs02 kernel: drbd0: worker terminated
Aug 13 08:53:42 plccnfs02 kernel: drbd0: resync bitmap: bits=10453652 words=326678
Aug 13 08:53:42 plccnfs02 kernel: drbd0: size = 39 GB (41814608 KB)
Aug 13 08:53:43 plccnfs02 kernel: drbd0: 1116 KB marked out-of-sync by on disk bit-map.
Aug 13 08:53:43 plccnfs02 kernel: drbd0: Found 4 transactions (192 active extents) in activity log.
Aug 13 08:53:43 plccnfs02 kernel: drbd0: drbdsetup [7284]: cstate Unconfigured --> StandAlone
Aug 13 08:53:43 plccnfs02 kernel: drbd0: drbdsetup [7287]: cstate StandAlone --> Unconnected
Aug 13 08:53:43 plccnfs02 kernel: drbd0: drbd0_receiver [7288]: cstate Unconnected --> WFConnection
Aug 13 08:54:01 plccnfs02 kernel: drbd0: drbdsetup [7295]: cstate WFConnection --> Unconnected
Aug 13 08:54:01 plccnfs02 kernel: drbd0: worker terminated
Aug 13 08:54:01 plccnfs02 kernel: drbd0: drbd0_receiver [7288]: cstate Unconnected --> StandAlone
Aug 13 08:54:01 plccnfs02 kernel: drbd0: Connection lost.
Aug 13 08:54:01 plccnfs02 kernel: drbd0: Discarding network configuration.
Aug 13 08:54:01 plccnfs02 kernel: drbd0: drbd0_receiver [7288]: cstate StandAlone --> StandAlone
Aug 13 08:54:01 plccnfs02 kernel: drbd0: receiver terminated
Aug 13 08:54:01 plccnfs02 kernel: drbd0: drbdsetup [7295]: cstate StandAlone --> StandAlone
Aug 13 08:54:01 plccnfs02 kernel: drbd0: drbdsetup [7295]: cstate StandAlone --> Unconfigured
Aug 13 08:54:01 plccnfs02 kernel: drbd0: worker terminated
Aug 13 08:54:05 plccnfs02 kernel: drbd0: resync bitmap: bits=10453652 words=326678
Aug 13 08:54:05 plccnfs02 kernel: drbd0: size = 39 GB (41814608 KB)
Aug 13 08:54:05 plccnfs02 kernel: drbd0: 1116 KB marked out-of-sync by on disk bit-map.
Aug 13 08:54:05 plccnfs02 kernel: drbd0: Found 4 transactions (192 active extents) in activity log.
Aug 13 08:54:05 plccnfs02 kernel: drbd0: drbdsetup [7304]: cstate Unconfigured --> StandAlone
Aug 13 08:54:05 plccnfs02 kernel: drbd0: drbdsetup [7307]: cstate StandAlone --> Unconnected
Aug 13 08:54:05 plccnfs02 kernel: drbd0: drbd0_receiver [7308]: cstate Unconnected --> WFConnection

Here is /etc/drbd.conf (the same on both machines):

resource drbd0 {
  protocol C;
  incon-degr-cmd "halt -f"; # killall heartbeat would be a good alternative :->

  startup {
    degr-wfc-timeout 120; # 2 minutes
  }

  disk {
    on-io-error detach;
  }

  syncer {
    rate 10M; # Note: 'M' is MegaBytes, not MegaBits
  }

  on plccnfs01 {
    device /dev/drbd0;
    disk /dev/cciss/c0d1p1;
    address 10.1.100.173:7789;
    meta-disk internal;
  }

  on plccnfs02 {
    device /dev/drbd0;
    disk /dev/sdb1;
    address 10.1.100.172:7789;
    meta-disk internal;
  }
}

On plccnfs01 there are no DRBD messages in /var/log/messages at all. The log excerpt above is from the secondary (plccnfs02). At this time I cannot get the secondary device to come up as part of the cluster. I have tried restarting DRBD, rebooting the machine, using drbdadm, and pretty much everything else I could think of.

Any help at all would be greatly appreciated.

Best Regards,
Mark L. Potter
Systems Engineer
Academy Sports & Outdoors
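
P.S. In case it helps anyone reading the excerpt, here is a rough Python sketch that pulls only the cstate transitions out of kernel log lines like the ones above, so the repeating WFConnection -> Unconnected -> StandAlone -> Unconfigured cycle is easier to see. The regular expression is my own assumption based on the line format shown in this message, not an official DRBD log format, and the function names are made up for illustration.

```python
import re

# Assumed line shape, taken from the excerpt in this message:
#   "... kernel: drbd0: drbdsetup [7274]: cstate WFConnection --> Unconnected"
CSTATE_RE = re.compile(
    r"kernel: (?P<dev>drbd\d+): (?P<proc>\S+) \[(?P<pid>\d+)\]: "
    r"cstate (?P<old>\w+) --> (?P<new>\w+)"
)


def cstate_transitions(lines):
    """Return (device, process, old_state, new_state) for each cstate line."""
    found = []
    for line in lines:
        m = CSTATE_RE.search(line)
        if m:
            found.append((m["dev"], m["proc"], m["old"], m["new"]))
    return found


def real_changes(transitions):
    """Drop no-op entries such as StandAlone --> StandAlone."""
    return [t for t in transitions if t[2] != t[3]]


# A few lines copied from the excerpt, used as sample input.
SAMPLE = [
    "Aug 13 08:53:36 plccnfs02 kernel: drbd0: drbdsetup [7274]: cstate WFConnection --> Unconnected",
    "Aug 13 08:53:36 plccnfs02 kernel: drbd0: worker terminated",
    "Aug 13 08:53:36 plccnfs02 kernel: drbd0: drbd0_receiver [6956]: cstate Unconnected --> StandAlone",
    "Aug 13 08:53:36 plccnfs02 kernel: drbd0: drbd0_receiver [6956]: cstate StandAlone --> StandAlone",
]

if __name__ == "__main__":
    for dev, proc, old, new in real_changes(cstate_transitions(SAMPLE)):
        print(f"{dev} {proc}: {old} -> {new}")
```

To run it against a real log, replace SAMPLE with the lines read from /var/log/messages (for example, open the file and pass the file object to cstate_transitions).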