Tonight my secondary crashed while importing a large set of (OpenStreetMap) data with PostgreSQL on the primary. I had to extract this from the remote syslogging facility, so I do not know whether it is fully complete:

Jul 8 00:03:28 nadir kernel: [125018.133603] block drbd0: Digest integrity check FAILED.
Jul 8 00:03:28 nadir kernel: [125018.138980] block drbd0: error receiving Data, l: 131112!
Jul 8 00:03:28 nadir kernel: [125018.144492] block drbd0: peer( Primary -> Unknown ) conn( Connected -> ProtocolError ) pdsk( UpToDate -> DUnknown )
Jul 8 00:03:28 nadir kernel: [125018.144953] block drbd0: asender terminated
Jul 8 00:03:28 nadir kernel: [125018.144956] block drbd0: Terminating drbd0_asender
Jul 8 00:03:29 nadir kernel: [125018.440849] block drbd0: Connection closed
Jul 8 00:03:29 nadir kernel: [125018.440853] block drbd0: conn( ProtocolError -> Unconnected )
Jul 8 00:03:29 nadir kernel: [125018.440856] block drbd0: receiver terminated
Jul 8 00:03:29 nadir kernel: [125018.440858] block drbd0: Restarting drbd0_receiver
Jul 8 00:03:29 nadir kernel: [125018.440860] block drbd0: receiver (re)started
Jul 8 00:03:29 nadir kernel: [125018.440863] block drbd0: conn( Unconnected -> WFConnection )
Jul 8 00:03:29 nadir kernel: [125018.536341] block drbd0: Handshake successful: Agreed network protocol version 95
Jul 8 00:03:29 nadir kernel: [125018.536483] block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jul 8 00:03:29 nadir kernel: [125018.536487] block drbd0: conn( WFConnection -> WFReportParams )
Jul 8 00:03:29 nadir kernel: [125018.536506] block drbd0: Starting asender thread (from drbd0_receiver [2131])
Jul 8 00:03:29 nadir kernel: [125018.536552] block drbd0: data-integrity-alg: md5
Jul 8 00:03:29 nadir kernel: [125018.536573] block drbd0: max_segment_size ( = BIO size ) = 65536
Jul 8 00:03:29 nadir kernel: [125018.536588] block drbd0: drbd_sync_handshake:
Jul 8 00:03:29 nadir kernel: [125018.536590] block drbd0: self E34E53A9081B1F64:0000000000000000:0F9D607E2DF004B4:4DEE98D54259C08D bits:0 flags:0
Jul 8 00:03:29 nadir kernel: [125018.536596] block drbd0: peer 55D7620CF1CEA711:E34E53A9081B1F65:0F9D607E2DF004B5:4DEE98D54259C08D bits:69526 flags:0
Jul 8 00:03:29 nadir kernel: [125018.536599] block drbd0: uuid_compare()=-1 by rule 50
Jul 8 00:03:29 nadir kernel: [125018.536603] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
Jul 8 00:03:30 nadir kernel: [125019.926570] block drbd0: conn( WFBitMapT -> WFSyncUUID )
Jul 8 00:03:30 nadir kernel: [125019.933012] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
Jul 8 00:03:30 nadir kernel: [125019.934579] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
Jul 8 00:03:30 nadir kernel: [125019.934583] block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
Jul 8 00:03:30 nadir kernel: [125019.934588] block drbd0: Began resync as SyncTarget (will sync 313168 KB [78292 bits set]).
Jul 8 00:03:31 nadir kernel: [125020.350423] general protection fault: 0000 [#1] SMP
Jul 8 00:03:31 nadir kernel: [125020.355525] last sysfs file: /sys/module/drbd/parameters/cn_idx
Jul 8 00:03:31 nadir kernel: [125020.361534] CPU 0
Jul 8 00:03:31 nadir kernel: [125020.363457] Modules linked in: sha1_generic drbd lru_cache coretemp dcdbas pl2303 usbserial ghes ipmi_si ipmi_msghandler bonding power_meter hed lp parport xfs exportfs raid10 raid456 usb_storage async_pq async_xor uas xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 ahci raid0 bnx2 libahci igb multipath dca linear
Jul 8 00:03:31 nadir kernel: [125020.393345]
Jul 8 00:03:31 nadir kernel: [125020.394940] Pid: 30452, comm: drbd0_asender Tainted: G W 2.6.38-8-server #42-Ubuntu Dell Inc. PowerEdge R210/05KX61
Jul 8 00:03:31 nadir kernel: [125020.406362] RIP: 0010:[<ffffffff811175c9>] [<ffffffff811175c9>] put_page+0x9/0x40
Jul 8 00:03:31 nadir kernel: [125020.414055] RSP: 0018:ffff8803eed93ae0 EFLAGS: 00010246
Jul 8 00:03:31 nadir kernel: [125020.419455] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000004
Jul 8 00:03:31 nadir kernel: [125020.426689] RDX: ffff8804066a1500 RSI: ffff8804066a14b4 RDI: 656e6f6870656c65
Jul 8 00:03:31 nadir kernel: [125020.433921] RBP: ffff8803eed93ae0 R08: 00000000975c6e7f R09: 0000000000004100
Jul 8 00:03:31 nadir kernel: [125020.441155] R10: ffff8804060ab400 R11: 0000000000000018 R12: ffff88041e4c2000
Jul 8 00:03:31 nadir kernel: [125020.448390] R13: ffff8804060ab470 R14: ffff8804060ab874 R15: 0000000000000000
Jul 8 00:03:31 nadir kernel: [125020.455623] FS: 0000000000000000(0000) GS:ffff8800bf200000(0000) knlGS:0000000000000000
Jul 8 00:03:31 nadir kernel: [125020.463809] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jul 8 00:03:31 nadir kernel: [125020.469643] CR2: 0000000001f17130 CR3: 0000000001a03000 CR4: 00000000000006f0
Jul 8 00:03:31 nadir kernel: [125020.476875] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 8 00:03:31 nadir kernel: [125020.484110] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 8 00:03:31 nadir kernel: [125020.491344] Process drbd0_asender (pid: 30452, threadinfo ffff8803eed92000, task ffff8804133ddb80)
Jul 8 00:03:31 nadir kernel: [125020.500396] Stack:
Jul 8 00:03:31 nadir kernel: [125020.502513] ffff8803eed93b00 ffffffff814d70c4 ffff88041e4c2000 0000000000000020
Jul 8 00:03:31 nadir kernel: [125020.510098] ffff8803eed93b20 ffffffff814d710e 0000000000000020 ffff88041e4c2000
Jul 8 00:03:31 nadir kernel: [125020.517677] ffff8803eed93bd0 ffffffff81527953 ffff8803eed93b50 0000000000000246
Jul 8 00:03:31 nadir kernel: [125020.525257] Call Trace:
Jul 8 00:03:31 nadir kernel: [125020.527803] [<ffffffff814d70c4>] skb_release_data+0xb4/0xe0
Jul 8 00:03:31 nadir kernel: [125020.533555] [<ffffffff814d710e>] __kfree_skb+0x1e/0xa0
Jul 8 00:03:31 nadir kernel: [125020.538871] [<ffffffff81527953>] tcp_recvmsg+0xb03/0xbb0
Jul 8 00:03:31 nadir kernel: [125020.544361] [<ffffffff8154a88b>] inet_recvmsg+0x6b/0x80
Jul 8 00:03:31 nadir kernel: [125020.549769] [<ffffffff8104da6c>] ? resched_task+0x2c/0x80
Jul 8 00:03:31 nadir kernel: [125020.555344] [<ffffffff814ce3cd>] sock_recvmsg+0xfd/0x130
Jul 8 00:03:31 nadir kernel: [125020.560834] [<ffffffffa02495aa>] ? __bm_change_bits_to.clone.8+0xaa/0x140 [drbd]
Jul 8 00:03:31 nadir kernel: [125020.568418] [<ffffffffa022264c>] ? lc_get+0x3c/0x130 [lru_cache]
Jul 8 00:03:31 nadir kernel: [125020.574602] [<ffffffffa0253680>] drbd_recv_short.clone.22+0x70/0x80 [drbd]
Jul 8 00:03:31 nadir kernel: [125020.581656] [<ffffffffa025d09f>] drbd_asender+0x15f/0x590 [drbd]
Jul 8 00:03:31 nadir kernel: [125020.587843] [<ffffffffa02651a0>] ? drbd_thread_setup+0x0/0xf0 [drbd]
Jul 8 00:03:31 nadir kernel: [125020.594378] [<ffffffffa0265204>] drbd_thread_setup+0x64/0xf0 [drbd]
Jul 8 00:03:31 nadir kernel: [125020.600823] [<ffffffffa02651a0>] ? drbd_thread_setup+0x0/0xf0 [drbd]
Jul 8 00:03:31 nadir kernel: [125020.607356] [<ffffffff810871f6>] kthread+0x96/0xa0
Jul 8 00:03:31 nadir kernel: [125020.612331] [<ffffffff8100cde4>] kernel_thread_helper+0x4/0x10
Jul 8 00:03:31 nadir kernel: [125020.618339] [<ffffffff81087160>] ? kthread+0x0/0xa0
Jul 8 00:03:31 nadir kernel: [125020.623393] [<ffffffff8100cde0>] ? kernel_thread_helper+0x0/0x10
Jul 8 00:03:31 nadir kernel: [125020.629574] Code: de fe ff ff eb c9 48 8b 03 eb e6 89 c2 0f 1f 44 00 00 e9 5d ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 <48> f7 07 00 c0 00 00 75 1f 8b 47 08 f0 ff 4f 08 0f 94 c0 84 c0
Jul 8 00:03:31 nadir kernel: [125020.649899] RIP [<ffffffff811175c9>] put_page+0x9/0x40
Jul 8 00:03:31 nadir kernel: [125020.655237] RSP <ffff8803eed93ae0>
Jul 8 00:03:31 nadir kernel: [125020.659209] ---[ end trace 2aa310a34052a046 ]---

Is there anything I can do to help debug this? If I must, I can repeat the import of the data (220G, it's not done yet :). Is the config needed? Should I try upgrading the module?

Thanks for any reply,
Maarten.