On Monday 01 August 2011 13:58:55 Trevor Hemsley wrote:
> Today they did it again. And then several more times - about every 20
> minutes in fact. The servers are in a remote data centre and I have no
> console access and the iLO's on these two servers are not set up and I'm
> unable to use them so I can see no output on the console. There's no
> information in /var/log about what the problem is, all I see is that one
> of the servers reboots itself and then 5 to 10 seconds later, the 2nd
> one follows it. I've seen from the logs that it's not always the same
> one that reboots first, sometimes it's one and sometimes the other. The
> only way I've managed to get the servers out of their 20 minute reboot
> loop is to stop drbd on one of the pair and migrate all my VMs to run on
> the other with all the DRBD devices in standalone mode. This seems to me
> to indicate that DRBD is most probably involved in the reboot.

Just a shot in the dark (because I was hit by the same last Friday): Is
there a watchdog active and set to a timeout of 20 minutes? Could be the
corresponding userspace tool was removed or rendered unusable during the
update...

Good luck,

Arnold
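
One way to check the watchdog question is to see what the kernel reports about its watchdog devices. The following is only a minimal sketch, not something taken from the thread: it assumes a kernel that exposes watchdog attributes under /sys/class/watchdog (this needs the watchdog sysfs support to be built in, which may well be missing on older kernels), and the function name `list_watchdogs` is made up for illustration. If the directory is absent, that does not prove no watchdog is running.

```python
from pathlib import Path


def list_watchdogs(sysfs_root="/sys/class/watchdog"):
    """Return one dict per watchdog device found under sysfs_root.

    Assumption: the kernel exposes watchdog attributes in sysfs.
    An empty list only means nothing is visible there.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return []  # no watchdog class directory (no driver, or older kernel)

    def read_attr(device, name):
        # Attributes are optional; return None when one is missing or unreadable.
        try:
            return (device / name).read_text().strip()
        except OSError:
            return None

    devices = []
    for device in sorted(root.glob("watchdog*")):
        timeout = read_attr(device, "timeout")
        devices.append({
            "name": device.name,
            "identity": read_attr(device, "identity"),
            "state": read_attr(device, "state"),
            # Timeout is reported in seconds; 20 minutes would be 1200.
            "timeout_seconds": int(timeout) if timeout and timeout.isdigit() else None,
        })
    return devices


if __name__ == "__main__":
    found = list_watchdogs()
    if not found:
        print("no watchdog devices visible in sysfs")
    for dev in found:
        print(dev)
```

If a device shows up as active with a timeout near 1200 seconds, the next thing to check would be whether the userspace daemon that is supposed to feed it is still installed and running after the update.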