Hi Guys,

I had a fault with 8.2.7 last night. While DRBD was handling the fault I had a kernel panic/hang. I believe the panic was probably caused by DRBD.

Because of the panic/hang there is very little in the log file. What I have is listed below.

Can anyone suggest whether this may be a DRBD problem? I want to put this server live this evening, and I'm now very worried about it!

Any help is very welcome!

Regards,

Ben

--------------------------
Linux hp-tm-12 2.6.25.18-0.2-default
version: 8.2.7 (api:88/proto:86-88)
GIT-hash: 61b7f4c2fc34fe3d2acf7be6bcc1fc2684708a7d build by root at hp-tm-12, 2008-11-25 17:19:15
--------------------------
/var/log/messages on dead server:

00:06:25 drbd0: sock was shut down by peer
00:06:25 drbd0: peer( Primary -> Unknown ) conn( Connected -> BrokenPipe ) pdsk( UpToDate -> DUnknown )
00:06:25 drbd0: short read expecting header on sock: r=0

** Kernel Hang/Panic until reboot **

08:22:48 [<ffffffff80217368>] mtrr_add_page+0x270/0x34d
08:22:48 [<ffffffff80217745>] mtrr_file_add+0x91/0xaa
08:22:48 [<ffffffff80217b12>] mtrr_ioctl+0x3b4/0x542
08:22:48 [<ffffffff802df107>] proc_reg_unlocked_ioctl+0x7c/0xd7
08:22:48 [<ffffffff802acada>] vfs_ioctl+0x2a/0x78
08:22:48 [<ffffffff802acd6f>] do_vfs_ioctl+0x247/0x261
08:22:48 [<ffffffff802acdde>] sys_ioctl+0x55/0x77
08:22:48 [<ffffffff8020bffa>] system_call_after_swapgs+0x8a/0x8f
08:22:48 [<00007fc201c72b67>]
08:22:48 kernel:
08:22:48 ---[ end trace 0a6413c31e348d2f ]---
08:22:48 ------------[ cut here ]------------
--------------------------
/var/log/messages on server which did not hang:

00:06:25 drbd0: PingAck did not arrive in time.
00:06:25 drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
00:06:25 drbd0: asender terminated
00:06:25 drbd0: Terminating asender thread
00:06:25 drbd0: Creating new current UUID
00:06:25 drbd0: short read expecting header on sock: r=-512
00:06:25 drbd0: Connection closed
00:06:25 drbd0: conn( NetworkFailure -> Unconnected )
00:06:25 drbd0: receiver terminated
00:06:25 drbd0: Restarting receiver thread
00:06:25 drbd0: receiver (re)started
00:06:25 drbd0: conn( Unconnected -> WFConnection )
00:06:41 drbd0: Handshake successful: Agreed network protocol version 88
00:06:41 drbd0: conn( WFConnection -> WFReportParams )
00:06:41 drbd0: Starting asender thread (from drbd0_receiver [3491])
00:06:41 drbd0: data-integrity-alg: <not-used>
00:06:41 drbd0: drbd_sync_handshake:
00:06:41 drbd0: self 6C43B920C2584C8B:65FCA875E302675F:B4D1BAA9F48439CF:90CE33222F63999E
00:06:41 drbd0: peer 65FCA875E302675F:0000000000000000:B4D1BAA9F48439CE:90CE33222F63999E
00:06:41 drbd0: uuid_compare()=1 by rule 7
00:06:41 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
00:06:51 drbd0: PingAck did not arrive in time.
00:06:51 drbd0: peer( Secondary -> Unknown ) conn( WFBitMapS -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
00:06:51 drbd0: asender terminated
00:06:51 drbd0: Terminating asender thread
00:06:51 drbd0: error receiving ReportBitMap, l: 4088!
00:06:51 drbd0: Connection closed
00:06:51 drbd0: conn( NetworkFailure -> Unconnected )
00:06:51 drbd0: receiver terminated
00:06:51 drbd0: Restarting receiver thread
00:06:51 drbd0: receiver (re)started
00:06:51 drbd0: conn( Unconnected -> WFConnection )
00:07:16 drbd0: Handshake successful: Agreed network protocol version 88
00:07:16 drbd0: conn( WFConnection -> WFReportParams )
00:07:16 drbd0: Starting asender thread (from drbd0_receiver [3491])
00:07:16 drbd0: data-integrity-alg: <not-used>
00:07:16 drbd0: drbd_sync_handshake:
00:07:16 drbd0: self 6C43B920C2584C8B:65FCA875E302675F:B4D1BAA9F48439CF:90CE33222F63999E
00:07:16 drbd0: peer 65FCA875E302675F:0000000000000000:B4D1BAA9F48439CE:90CE33222F63999E
00:07:16 drbd0: uuid_compare()=1 by rule 7
00:07:16 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
00:07:16 drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
00:07:16 drbd0: Began resync as SyncSource (will sync 9768 KB [2442 bits set]).
00:10:23 drbd1: PingAck did not arrive in time.
00:10:23 drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
00:10:23 drbd1: asender terminated
00:10:23 drbd1: Terminating asender thread
00:10:23 drbd1: short read expecting header on sock: r=-512
00:10:23 drbd1: Connection closed
00:10:23 drbd1: conn( NetworkFailure -> Unconnected )
00:10:23 drbd1: receiver terminated
00:10:23 drbd1: Restarting receiver thread
00:10:23 drbd1: receiver (re)started
00:10:23 drbd1: conn( Unconnected -> WFConnection )
00:10:26 drbd0: PingAck did not arrive in time.
00:10:26 drbd0: peer( Secondary -> Unknown ) conn( SyncSource -> NetworkFailure )
00:10:26 drbd0: asender terminated
00:10:26 drbd0: Terminating asender thread
00:10:26 drbd0: short read expecting header on sock: r=-512
00:10:26 drbd0: Connection closed
00:10:26 drbd0: conn( NetworkFailure -> Unconnected )
00:10:26 drbd0: receiver terminated
00:10:26 drbd0: Restarting receiver thread
00:10:26 drbd0: receiver (re)started
00:10:26 drbd0: conn( Unconnected -> WFConnection )
00:12:10 drbd1: Handshake successful: Agreed network protocol version 88
00:12:10 drbd1: conn( WFConnection -> WFReportParams )
00:12:10 drbd1: Starting asender thread (from drbd1_receiver [3494])
00:12:10 drbd1: data-integrity-alg: <not-used>
00:12:10 drbd1: drbd_sync_handshake:
00:12:10 drbd1: self D5FEF42F4E1EBD62:0000000000000000:FBB62BAB73B59A46:0437B1B84EC0633D
00:12:10 drbd1: peer 1419EE6C8C122AAF:D5FEF42F4E1EBD63:FBB62BAB73B59A46:0437B1B84EC0633D
00:12:10 drbd1: uuid_compare()=-1 by rule 5
00:12:10 drbd1: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
00:12:21 drbd1: PingAck did not arrive in time.
00:12:21 drbd1: peer( Primary -> Unknown ) conn( WFBitMapT -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
00:12:21 drbd1: asender terminated
00:12:21 drbd1: Terminating asender thread
00:12:21 drbd1: error receiving ReportBitMap, l: 4088!
00:12:21 drbd1: Connection closed
00:12:21 drbd1: conn( NetworkFailure -> Unconnected )
00:12:21 drbd1: receiver terminated
00:12:21 drbd1: Restarting receiver thread
00:12:21 drbd1: receiver (re)started
00:12:21 drbd1: conn( Unconnected -> WFConnection )

Last entry until peer was rebooted.