Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Bump. Nobody else is seeing dopd crashes in their logs?

On Tue, Jun 10, 2008 at 4:05 PM, Art Age Software <artagesw at gmail.com> wrote:
> I was recently performing some testing of a 2-node drbd-heartbeat
> setup. Everything is operating fine. However, when I rebooted the
> server on which drbd was secondary, the primary node's system log
> output the following worrisome messages:
>
> ------------------------------------------------------------------------
> Jun 10 22:09:34 node1 /usr/lib64/heartbeat/dopd: [5283]: info: sending
> start_outdate message to the other node node1 -> node2
> Jun 10 22:09:39 node1 /usr/lib64/heartbeat/dopd: [5283]: ERROR:
> ipc_bufpool_update: magic number in head does not match.Something very
> bad happened, abort now, farside pid =6678
> Jun 10 22:09:39 node1 /usr/lib64/heartbeat/dopd: [5283]: ERROR:
> magic=63203a72, expected value=abcd
> Jun 10 22:09:39 node1 /usr/lib64/heartbeat/dopd: [5283]: info: pool:
> refcount=1, startpos=0x38b7838,
> currpos=0x38b78e5, consumepos=0x38b78a3, endpos=0x38b8808, size=4096
> Jun 10 22:09:39 node1 /usr/lib64/heartbeat/dopd: [5283]: info: nmsgs=0
> Jun 10 22:09:39 node1 heartbeat: [4999]: WARN: Managed
> /usr/lib64/heartbeat/dopd process 5283 killed by signal 6 [SIGABRT -
> Abort].
> Jun 10 22:09:39 node1 heartbeat: [4999]: ERROR: Managed
> /usr/lib64/heartbeat/dopd process 5283 dumped core
> Jun 10 22:09:39 node1 heartbeat: [4999]: ERROR: Respawning client
> "/usr/lib64/heartbeat/dopd":
> Jun 10 22:09:39 node1 heartbeat: [4999]: info: Starting child client
> "/usr/lib64/heartbeat/dopd" (90,90)
> Jun 10 22:09:39 node1 heartbeat: [6679]: info: Starting
> "/usr/lib64/heartbeat/dopd" as uid 90 gid 90 (pid 6679)
> ------------------------------------------------------------------------
>
> The node seemed to recover from this condition with no apparent
> problems as the primary node came back online. Still, I am concerned
> that dopd crashed like that. Has anyone else seen this behavior? Is it
> a known issue? If there is any more info I can provide that would help
> in explaining or possibly finding/fixing the cause, please let me
> know.
>
> Thanks.
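
For anyone trying to reproduce or compare setups: dopd is the DRBD
outdate-peer daemon, invoked via DRBD's fence-peer handler over
heartbeat's IPC layer (which is the buffer pool the "magic number"
error comes from). A typical way it is wired up looks roughly like the
following sketch. The /usr/lib64 paths match the poster's log; the
resource name r0, the -t 5 timeout and the hacluster/haclient accounts
are my assumptions, not details taken from the poster's configuration.

In drbd.conf (only the fencing-related fragment of the resource):

    resource r0 {
      disk {
        # only outdate the peer's data, do not fence the whole node
        fencing resource-only;
      }
      handlers {
        # asks dopd on the peer (via heartbeat IPC) to mark its data outdated
        fence-peer "/usr/lib64/heartbeat/drbd-peer-outdater -t 5";
      }
    }

In ha.cf:

    # run dopd as an unprivileged heartbeat client and respawn it if it dies
    respawn hacluster /usr/lib64/heartbeat/dopd
    apiauth dopd gid=haclient uid=hacluster

If your setup differs from the above (different handler, different
user/group), it would be worth mentioning that along with the heartbeat
and drbd versions when reporting the crash.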