Hi Lars,

Thanks for the opinion it was nothing to do with DRBD. All I wanted to
know :)

I'll see what I can do with a serial cable...

Ben

Lars Ellenberg wrote:
> On Tue, Dec 16, 2008 at 08:53:58AM +0000, Ben Clewett wrote:
>>
>> Hi Guys,
>>
>> I had a fault with 8.2.7 last night. During the time where DRBD was
>> handling the fault I had a kernel panic/hang. I believe the panic was
>> probably caused by DRBD. Because of the panic/hang there is very little
>> in the log file. What I have is listed below.
>>
>> Can any person suggest whether this may be a DRBD problem? Only I want
>> to put this server live this evening, and I'm now very worried about it!
>>
>> Any help very welcome!
>>
>> Regards, Ben
>
> I don't see anything in the messages below that suggests drbd is the
> problem here. for the information given so far, it can be anything.
>
> hook up a serial console and log it to capture
> any future oops/panic message.
>
>> --------------------------
>>
>> Linux hp-tm-12 2.6.25.18-0.2-default
>>
>> version: 8.2.7 (api:88/proto:86-88)
>> GIT-hash: 61b7f4c2fc34fe3d2acf7be6bcc1fc2684708a7d build by root at hp-tm-12, 2008-11-25 17:19:15
>>
>> --------------------------
>>
>> /var/log/messages on dead server:
>>
>> 00:06:25 drbd0: sock was shut down by peer
>> 00:06:25 drbd0: peer( Primary -> Unknown ) conn( Connected -> BrokenPipe ) pdsk( UpToDate -> DUnknown )
>> 00:06:25 drbd0: short read expecting header on sock: r=0
>>
>> ** Kernel Hang/Panic until reboot **
>>
>> 08:22:48 [<ffffffff80217368>] mtrr_add_page+0x270/0x34d
>> 08:22:48 [<ffffffff80217745>] mtrr_file_add+0x91/0xaa
>> 08:22:48 [<ffffffff80217b12>] mtrr_ioctl+0x3b4/0x542
>> 08:22:48 [<ffffffff802df107>] proc_reg_unlocked_ioctl+0x7c/0xd7
>> 08:22:48 [<ffffffff802acada>] vfs_ioctl+0x2a/0x78
>> 08:22:48 [<ffffffff802acd6f>] do_vfs_ioctl+0x247/0x261
>> 08:22:48 [<ffffffff802acdde>] sys_ioctl+0x55/0x77
>> 08:22:48 [<ffffffff8020bffa>] system_call_after_swapgs+0x8a/0x8f
>> 08:22:48 [<00007fc201c72b67>]
>> 08:22:48 kernel:
>> 08:22:48 ---[ end trace 0a6413c31e348d2f ]---
>> 08:22:48 ------------[ cut here ]------------
>>
>> --------------------------
>>
>> /var/log/messages on server which did not hang:
>>
>>
>> 00:06:25 drbd0: PingAck did not arrive in time.
>> 00:06:25 drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
>> 00:06:25 drbd0: asender terminated
>> 00:06:25 drbd0: Terminating asender thread
>> 00:06:25 drbd0: Creating new current UUID
>> 00:06:25 drbd0: short read expecting header on sock: r=-512
>> 00:06:25 drbd0: Connection closed
>> 00:06:25 drbd0: conn( NetworkFailure -> Unconnected )
>> 00:06:25 drbd0: receiver terminated
>> 00:06:25 drbd0: Restarting receiver thread
>> 00:06:25 drbd0: receiver (re)started
>> 00:06:25 drbd0: conn( Unconnected -> WFConnection )
>> 00:06:41 drbd0: Handshake successful: Agreed network protocol version 88
>> 00:06:41 drbd0: conn( WFConnection -> WFReportParams )
>> 00:06:41 drbd0: Starting asender thread (from drbd0_receiver [3491])
>> 00:06:41 drbd0: data-integrity-alg: <not-used>
>> 00:06:41 drbd0: drbd_sync_handshake:
>> 00:06:41 drbd0: self 6C43B920C2584C8B:65FCA875E302675F:B4D1BAA9F48439CF:90CE33222F63999E
>> 00:06:41 drbd0: peer 65FCA875E302675F:0000000000000000:B4D1BAA9F48439CE:90CE33222F63999E
>> 00:06:41 drbd0: uuid_compare()=1 by rule 7
>> 00:06:41 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
>> 00:06:51 drbd0: PingAck did not arrive in time.
>> 00:06:51 drbd0: peer( Secondary -> Unknown ) conn( WFBitMapS -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
>> 00:06:51 drbd0: asender terminated
>> 00:06:51 drbd0: Terminating asender thread
>> 00:06:51 drbd0: error receiving ReportBitMap, l: 4088!
>> 00:06:51 drbd0: Connection closed
>> 00:06:51 drbd0: conn( NetworkFailure -> Unconnected )
>> 00:06:51 drbd0: receiver terminated
>> 00:06:51 drbd0: Restarting receiver thread
>> 00:06:51 drbd0: receiver (re)started
>> 00:06:51 drbd0: conn( Unconnected -> WFConnection )
>> 00:07:16 drbd0: Handshake successful: Agreed network protocol version 88
>> 00:07:16 drbd0: conn( WFConnection -> WFReportParams )
>> 00:07:16 drbd0: Starting asender thread (from drbd0_receiver [3491])
>> 00:07:16 drbd0: data-integrity-alg: <not-used>
>> 00:07:16 drbd0: drbd_sync_handshake:
>> 00:07:16 drbd0: self 6C43B920C2584C8B:65FCA875E302675F:B4D1BAA9F48439CF:90CE33222F63999E
>> 00:07:16 drbd0: peer 65FCA875E302675F:0000000000000000:B4D1BAA9F48439CE:90CE33222F63999E
>> 00:07:16 drbd0: uuid_compare()=1 by rule 7
>> 00:07:16 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
>> 00:07:16 drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
>> 00:07:16 drbd0: Began resync as SyncSource (will sync 9768 KB [2442 bits set]).
>> 00:10:23 drbd1: PingAck did not arrive in time.
>> 00:10:23 drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
>> 00:10:23 drbd1: asender terminated
>> 00:10:23 drbd1: Terminating asender thread
>> 00:10:23 drbd1: short read expecting header on sock: r=-512
>> 00:10:23 drbd1: Connection closed
>> 00:10:23 drbd1: conn( NetworkFailure -> Unconnected )
>> 00:10:23 drbd1: receiver terminated
>> 00:10:23 drbd1: Restarting receiver thread
>> 00:10:23 drbd1: receiver (re)started
>> 00:10:23 drbd1: conn( Unconnected -> WFConnection )
>> 00:10:26 drbd0: PingAck did not arrive in time.
>> 00:10:26 drbd0: peer( Secondary -> Unknown ) conn( SyncSource -> NetworkFailure )
>> 00:10:26 drbd0: asender terminated
>> 00:10:26 drbd0: Terminating asender thread
>> 00:10:26 drbd0: short read expecting header on sock: r=-512
>> 00:10:26 drbd0: Connection closed
>> 00:10:26 drbd0: conn( NetworkFailure -> Unconnected )
>> 00:10:26 drbd0: receiver terminated
>> 00:10:26 drbd0: Restarting receiver thread
>> 00:10:26 drbd0: receiver (re)started
>> 00:10:26 drbd0: conn( Unconnected -> WFConnection )
>> 00:12:10 drbd1: Handshake successful: Agreed network protocol version 88
>> 00:12:10 drbd1: conn( WFConnection -> WFReportParams )
>> 00:12:10 drbd1: Starting asender thread (from drbd1_receiver [3494])
>> 00:12:10 drbd1: data-integrity-alg: <not-used>
>> 00:12:10 drbd1: drbd_sync_handshake:
>> 00:12:10 drbd1: self D5FEF42F4E1EBD62:0000000000000000:FBB62BAB73B59A46:0437B1B84EC0633D
>> 00:12:10 drbd1: peer 1419EE6C8C122AAF:D5FEF42F4E1EBD63:FBB62BAB73B59A46:0437B1B84EC0633D
>> 00:12:10 drbd1: uuid_compare()=-1 by rule 5
>> 00:12:10 drbd1: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
>> 00:12:21 drbd1: PingAck did not arrive in time.
>> 00:12:21 drbd1: peer( Primary -> Unknown ) conn( WFBitMapT -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
>> 00:12:21 drbd1: asender terminated
>> 00:12:21 drbd1: Terminating asender thread
>> 00:12:21 drbd1: error receiving ReportBitMap, l: 4088!
>> 00:12:21 drbd1: Connection closed
>> 00:12:21 drbd1: conn( NetworkFailure -> Unconnected )
>> 00:12:21 drbd1: receiver terminated
>> 00:12:21 drbd1: Restarting receiver thread
>> 00:12:21 drbd1: receiver (re)started
>> 00:12:21 drbd1: conn( Unconnected -> WFConnection )
>>
>>
>> Last entry until peer was rebooted.
>>
>
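For anyone comparing logs like the ones quoted in this thread, the state-change lines ( peer(...), conn(...), pdsk(...) ) can be pulled into a rough timeline with a short script. This is only a minimal sketch and not part of DRBD: the name parse_transitions and both regular expressions are assumptions based purely on the line format quoted above, and it expects each log entry to sit on a single line (re-join any mail-wrapped lines first).

```python
import re

# Assumed format: one "name( Old -> New )" group,
# e.g. "conn( Connected -> NetworkFailure )".
TRANSITION = re.compile(r"(\w+)\(\s*(\w+)\s*->\s*(\w+)\s*\)")
# Assumed format: the "HH:MM:SS drbdN:" prefix seen in the quoted log lines.
PREFIX = re.compile(r"(\d{2}:\d{2}:\d{2})\s+(drbd\d+):")


def parse_transitions(lines):
    """Return (time, device, field, old_state, new_state) tuples.

    Lines without a timestamp/device prefix or without any
    "Old -> New" group are skipped.
    """
    events = []
    for line in lines:
        prefix = PREFIX.search(line)
        if not prefix:
            continue
        time, device = prefix.groups()
        # Only look at the text after the prefix for state changes.
        for field, old, new in TRANSITION.findall(line[prefix.end():]):
            events.append((time, device, field, old, new))
    return events


# Sample lines copied from the log of the server which did not hang.
sample = [
    "00:06:25 drbd0: PingAck did not arrive in time.",
    "00:06:25 drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )",
    "00:06:25 drbd0: conn( NetworkFailure -> Unconnected )",
]

for event in parse_transitions(sample):
    print(*event)
```

Filtering the result on the conn field for one device gives a quick view of how often the connection dropped and how long each reconnect took, which may help when lining up the two servers' logs.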