On Wed, Feb 9, 2011 at 6:30 AM, Dario Fiumicello - Antek <fiumicello at antek.it> wrote:

> Hi all, I have two Virtualbox VM running on two different physical hosts.
> The vm are interconnected with two gigabit ethernet for drbd sync and
> heartbeat.
>
> Suddenly I get this on master machine:
>
> Feb 9 10:53:24 mail1 kernel: [136200.650336] INFO: task jbd2/drbd0-8:13739 blocked for more than 120 seconds.
> Feb 9 10:53:24 mail1 kernel: [136200.650967] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>

This is a warning, not an error. It simply states that a task has been blocked for more than 2 minutes. Some tasks legitimately take more than 120 seconds to complete; the message above is simply informative.

> And from this moment many other errors of blocked tasks appears (postfix,
> pickup and so on). The machine load was more than 25!
>

It sounds like the DRBD block device is hung due to slow I/O response from one of the backing devices on your VMs.

> Obviously I cannot use the machine anymore and I needed to kill it in order
> to force the takeover on the slave. Halt didn't work either.
>

That's not obvious at all. Your system shouldn't be entirely on DRBD. Even if your DRBD block device is unresponsive, you should still be able to log in and look around. What was your CPU load?

> My question is: why did I get this error? What can I do to avoid it?
>

You got this error because one of your VMs likely couldn't keep up, probably because of load on one of the host servers. You can avoid it by going bare metal. The VMs are on different host servers, right?

-JR
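
For anyone who can still log in while this is happening, here is a rough sketch (not from the original thread) of how one might pull the hung-task warnings out of a log and list the tasks currently stuck in uninterruptible sleep (state D), which is what the kernel message is reporting. The function names are made up for illustration, and the sketch assumes a Linux /proc layout and the log line format quoted above.

```python
import re
from pathlib import Path

# Matches the kernel warning quoted above, e.g.
# "INFO: task jbd2/drbd0-8:13739 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def parse_hung_tasks(lines):
    """Return (task name, pid, threshold seconds) for each hung-task warning."""
    found = []
    for line in lines:
        m = HUNG_RE.search(line)
        if m:
            found.append((m.group("comm"), int(m.group("pid")), int(m.group("secs"))))
    return found


def parse_stat_state(stat_text):
    """Return (comm, state) from the contents of /proc/<pid>/stat.

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so split on the first "(" and the last ")".
    """
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for processes currently in state D."""
    tasks = []
    for entry in Path(proc_root).glob("[0-9]*"):
        try:
            pid = int(entry.name)
            comm, state = parse_stat_state((entry / "stat").read_text())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or not a pid directory
        if state == "D":
            tasks.append((pid, comm))
    return sorted(tasks)


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

If the list keeps showing the same jbd2/drbd0 thread plus whatever writes to that filesystem (postfix, pickup), that would be consistent with I/O on the DRBD device not completing rather than with a CPU problem.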