Hi all,

I am using DRBD 8.3.7 under HA. When the Master host dies and HA switches from Master to Slave, DRBD cannot switch over in time because it takes 10 minutes to mount its partition. That exceeds the HA timeout (the default timeout in HA is 2 minutes). Why does DRBD take that long?

The log is:

Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739458] block drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739468] block drbd1: asender terminated
Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739470] block drbd1: Terminating asender thread
Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739526] block drbd1: short read expecting header on sock: r=-512
Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739666] block drbd1: Connection closed
Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739672] block drbd1: conn( NetworkFailure -> Unconnected )
Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739678] block drbd1: receiver terminated
Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739680] block drbd1: Restarting receiver thread
Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739683] block drbd1: receiver (re)started
Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739687] block drbd1: conn( Unconnected -> WFConnection )
Jul 22 21:06:39 QD-CS-MDC-B pengine: [17776]: info: crm_log_init: Changed active directory to /usr/var/lib/heartbeat/cores/root
Jul 22 21:06:47 QD-CS-MDC-B kernel: [325573.727331] NET: Registered protocol family 17
Jul 22 21:06:47 QD-CS-MDC-B kernel: [325573.768912] block drbd0: role( Secondary -> Primary )
Jul 22 21:06:47 QD-CS-MDC-B kernel: [325573.772742] block drbd1: role( Secondary -> Primary )
Jul 22 21:06:47 QD-CS-MDC-B kernel: [325573.772997] block drbd1: Creating new current UUID
Jul 22 21:08:47 QD-CS-MDC-B su: (to hitv) root on none
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032485] block drbd0: PingAck did not arrive in time.
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032493] block drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032503] block drbd0: asender terminated
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032506] block drbd0: Terminating asender thread
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032514] block drbd0: Creating new current UUID
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032567] block drbd0: short read expecting header on sock: r=-512
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032868] block drbd0: Connection closed
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032875] block drbd0: conn( NetworkFailure -> Unconnected )
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032879] block drbd0: receiver terminated
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032881] block drbd0: Restarting receiver thread
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032884] block drbd0: receiver (re)started
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032888] block drbd0: conn( Unconnected -> WFConnection )
Jul 22 21:16:48 QD-CS-MDC-B kernel: [326174.600888] kjournald starting. Commit interval 15 seconds
Jul 22 21:16:48 QD-CS-MDC-B kernel: [326174.600956] EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
Jul 22 21:16:48 QD-CS-MDC-B kernel: [326174.601330] EXT3 FS on drbd0, internal journal
Jul 22 21:16:48 QD-CS-MDC-B kernel: [326174.601334] EXT3-fs: recovery complete.
Jul 22 21:16:48 QD-CS-MDC-B kernel: [326174.601392] EXT3-fs: mounted filesystem with ordered data mode.

According to the log, the time appears to be spent waiting for the PingAck timeout on drbd0.

Thanks for your help.

simon
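
To put a number on the delay, here is a minimal Python sketch that reads the kernel uptime stamps from two lines of the log: the promotion of drbd0 to Primary and the later PingAck timeout on the same device. The names kernel_events, PATTERN and LOG are made up for this illustration and are not part of DRBD or Heartbeat. The sketch only measures the gap; it does not explain its cause.

```python
import re

# Two lines copied from the log in this message: the promotion of drbd0
# and the PingAck timeout on the same device.
LOG = """\
Jul 22 21:06:47 QD-CS-MDC-B kernel: [325573.768912] block drbd0: role( Secondary -> Primary )
Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032485] block drbd0: PingAck did not arrive in time.
"""

# Kernel uptime stamp in brackets, then the DRBD device name and the message.
PATTERN = re.compile(r"\[\s*(\d+\.\d+)\] block (drbd\d+): (.*)")


def kernel_events(text):
    """Return (uptime_seconds, device, message) for each DRBD kernel line."""
    events = []
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match:
            uptime, device, message = match.groups()
            events.append((float(uptime), device, message.strip()))
    return events


events = kernel_events(LOG)
# Difference between the two uptime stamps, in seconds.
gap = events[1][0] - events[0][0]
print(f"{events[0][1]}: {gap:.1f} s between promotion and PingAck timeout")
```

For these two lines the gap comes out at roughly 600 seconds, which matches the 10 minutes reported above and is well beyond the 2 minute HA timeout.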