[DRBD-user] dopd Crash

Art Age Software artagesw at gmail.com
Wed Jun 11 01:05:11 CEST 2008



I was recently testing a 2-node drbd-heartbeat setup, and everything
was operating fine. However, when I rebooted the server on which drbd
was secondary, the primary node's system log showed the following
worrisome messages:

------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Jun 10 22:09:34 node1 /usr/lib64/heartbeat/dopd: [5283]: info: sending start_outdate message to the other node node1 -> node2
Jun 10 22:09:39 node1 /usr/lib64/heartbeat/dopd: [5283]: ERROR: ipc_bufpool_update: magic number in head does not match.Something very bad happened, abort now, farside pid =6678
Jun 10 22:09:39 node1 /usr/lib64/heartbeat/dopd: [5283]: ERROR: magic=63203a72, expected value=abcd
Jun 10 22:09:39 node1 /usr/lib64/heartbeat/dopd: [5283]: info: pool: refcount=1, startpos=0x38b7838, currpos=0x38b78e5,consumepos=0x38b78a3, endpos=0x38b8808, size=4096
Jun 10 22:09:39 node1 /usr/lib64/heartbeat/dopd: [5283]: info: nmsgs=0
Jun 10 22:09:39 node1 heartbeat: [4999]: WARN: Managed /usr/lib64/heartbeat/dopd process 5283 killed by signal 6 [SIGABRT - Abort].
Jun 10 22:09:39 node1 heartbeat: [4999]: ERROR: Managed /usr/lib64/heartbeat/dopd process 5283 dumped core
Jun 10 22:09:39 node1 heartbeat: [4999]: ERROR: Respawning client "/usr/lib64/heartbeat/dopd":
Jun 10 22:09:39 node1 heartbeat: [4999]: info: Starting child client "/usr/lib64/heartbeat/dopd" (90,90)
Jun 10 22:09:39 node1 heartbeat: [6679]: info: Starting "/usr/lib64/heartbeat/dopd" as uid 90  gid 90 (pid 6679)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
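
One detail in the log that may help with diagnosis: the reported magic
value 63203a72 is made up entirely of printable ASCII byte values, and
the pool positions can be turned into offsets from startpos. The Python
sketch below is only an illustration for reading the log; it is not
part of dopd or heartbeat, the helper names decode_magic and
pool_offsets are hypothetical, and it assumes the magic is a 32-bit
integer printed in hex.

```python
import struct


def decode_magic(hex_value: str) -> dict:
    """Render a 32-bit hex value as ASCII text in both byte orders."""
    value = int(hex_value, 16)
    return {
        # Bytes in the order the hex digits are printed.
        "big_endian": struct.pack(">I", value).decode("ascii", errors="replace"),
        # Bytes as they would sit in memory on a little-endian host.
        "little_endian": struct.pack("<I", value).decode("ascii", errors="replace"),
    }


def pool_offsets(startpos: int, consumepos: int, currpos: int, endpos: int) -> dict:
    """Express the logged pool pointers as byte offsets from startpos."""
    return {
        "consumepos": consumepos - startpos,
        "currpos": currpos - startpos,
        "endpos": endpos - startpos,
    }


# Values copied from the log lines above.
decoded = decode_magic("63203a72")
offsets = pool_offsets(0x38B7838, 0x38B78A3, 0x38B78E5, 0x38B8808)

print(decoded)
print(offsets)
```

If the host is little-endian (the post does not say), the four bytes
would read "r: c" in memory, which looks like a fragment of text rather
than a message header. That could mean the reader was out of step with
the message framing in the buffer, but this is a guess and not a
confirmed diagnosis.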

The node seemed to recover from this condition with no apparent
problems once the secondary node came back online. Still, I am concerned
that dopd crashed like that. Has anyone else seen this behavior? Is it
a known issue? If I can provide any more info that would help explain
or fix the cause, please let me know.

Thanks.


