[DRBD-user] Reply: Re: proto c - corrupt files - directories missing

Bauer, Stefan (IZLBW Extern) Stefan.Bauer at iz.bwl.de
Thu Jan 23 09:47:10 CET 2014



Hi List,

I just set up online verify and ran it manually, and guess what: both nodes started to respond very slowly, and in the end I had to reboot both of them.

It's DRBD 8.3.11 with kernel 3.2.0.0 from bpo on Debian 6.0, 64-bit.
The hardware is an HP ML320 server with hardware RAID and 8 GB RAM.
It's hard to believe that, as some web pages suggest, this could be a hardware limitation.

Thank you.

Stefan 


It started with a kernel oops, mixed with the following messages:


kernel: drbd0: [kjournald/10662] sock_sendmsg time expired, ko = 4294967295


Jan 23 08:02:37 node2 kernel: [1295929.993590] block drbd0: conn( Connected -> VerifyS )
Jan 23 08:02:37 node2 kernel: [1295929.993599] block drbd0: Starting Online Verify from sector 0
Jan 23 08:08:01 node2 kernel: [1296253.890624] jbd2/drbd0-8    D ffff8801a4451020     0 49080      2 0x00000000
Jan 23 08:08:01 node2 kernel: [1296253.890634]  ffff8801a4451020 0000000000000046 0000000000000000 ffff880202230f60
Jan 23 08:08:01 node2 kernel: [1296253.890644]  0000000000013740 ffff8801f34d9fd8 ffff8801f34d9fd8 0000000000013740
Jan 23 08:08:01 node2 kernel: [1296253.890652]  ffff8801a4451020 0000000000013740 0000000000013740 ffff8801f34d8010
Jan 23 08:08:01 node2 kernel: [1296253.890660] Call Trace:
Jan 23 08:08:01 node2 kernel: [1296253.890674]  [<ffffffff810bd839>] ? __lock_page+0x63/0x63
Jan 23 08:08:01 node2 kernel: [1296253.890707]  [<ffffffff81368b1e>] ? io_schedule+0x84/0xc3
Jan 23 08:08:01 node2 kernel: [1296253.890717]  [<ffffffff811c0d07>] ? radix_tree_gang_lookup_tag_slot+0x7a/0x9f
Jan 23 08:08:01 node2 kernel: [1296253.890725]  [<ffffffff810bd842>] ? sleep_on_page+0x9/0xd
Jan 23 08:08:01 node2 kernel: [1296253.890732]  [<ffffffff81368f19>] ? __wait_on_bit+0x3e/0x6f
Jan 23 08:08:01 node2 kernel: [1296253.890739]  [<ffffffff810bd9f0>] ? wait_on_page_bit+0x6a/0x70
Jan 23 08:08:01 node2 kernel: [1296253.890749]  [<ffffffff81063cb7>] ? autoremove_wake_function+0x2a/0x2a
Jan 23 08:08:01 node2 kernel: [1296253.890757]  [<ffffffff810c6b28>] ? pagevec_lookup_tag+0x18/0x1f
Jan 23 08:08:01 node2 kernel: [1296253.890775]  [<ffffffff810bdcfd>] ? filemap_fdatawait_range+0x96/0x150
Jan 23 08:08:01 node2 kernel: [1296253.890798]  [<ffffffffa011075c>] ? jbd2_journal_commit_transaction+0x8e0/0x1160 [jbd2]
Jan 23 08:08:01 node2 kernel: [1296253.890812]  [<ffffffffa0114e7b>] ? kjournald2+0xc0/0x20e [jbd2]
Jan 23 08:08:01 node2 kernel: [1296253.890819]  [<ffffffff81063c8d>] ? wake_up_bit+0x20/0x20
Jan 23 08:08:01 node2 kernel: [1296253.890829]  [<ffffffffa0114dbb>] ? commit_timeout+0xa/0xa [jbd2]
Jan 23 08:08:01 node2 kernel: [1296253.890839]  [<ffffffffa0114dbb>] ? commit_timeout+0xa/0xa [jbd2]
Jan 23 08:08:01 node2 kernel: [1296253.890846]  [<ffffffff81063841>] ? kthread+0x7a/0x82
Jan 23 08:08:01 node2 kernel: [1296253.890855]  [<ffffffff81371434>] ? kernel_thread_helper+0x4/0x10
Jan 23 08:08:01 node2 kernel: [1296253.890862]  [<ffffffff810637c7>] ? kthread_worker_fn+0x147/0x147
Jan 23 08:08:01 node2 kernel: [1296253.890869]  [<ffffffff81371430>] ? gs_change+0x13/0x13
Jan 23 08:08:01 node2 kernel: [1296253.890990] mysqld          D ffff8800154f49f0     0 49165      1 0x00000000
Jan 23 08:08:01 node2 kernel: [1296253.890998]  ffff8800154f49f0 0000000000000086 0000000000000000 ffff8802024be0c0
Jan 23 08:08:01 node2 kernel: [1296253.891006]  0000000000013740 ffff880124dbbfd8 ffff880124dbbfd8 0000000000013740
Jan 23 08:08:01 node2 kernel: [1296253.891013]  ffff8800154f49f0 0000000000013740 0000000000013740 ffff880124dba010
Jan 23 08:08:01 node2 kernel: [1296253.891021] Call Trace:
Jan 23 08:08:01 node2 kernel: [1296253.891033]  [<ffffffffa0114b16>] ? jbd2_log_wait_commit+0xc0/0x111 [jbd2]
Jan 23 08:08:01 node2 kernel: [1296253.891040]  [<ffffffff81063c8d>] ? wake_up_bit+0x20/0x20
Jan 23 08:08:01 node2 kernel: [1296253.891050]  [<ffffffffa0114cbb>] ? jbd2_log_start_commit+0x21/0x2f [jbd2]
Jan 23 08:08:01 node2 kernel: [1296253.891066]  [<ffffffffa012a0dc>] ? ext4_sync_file+0x314/0x380 [ext4]
Jan 23 08:08:01 node2 kernel: [1296253.891076]  [<ffffffff81129791>] ? do_fsync+0x27/0x3b
Jan 23 08:08:01 node2 kernel: [1296253.891083]  [<ffffffff811297c2>] ? sys_fsync+0xb/0xf
Jan 23 08:08:01 node2 kernel: [1296253.891090]  [<ffffffff8136f2d2>] ? system_call_fastpath+0x16/0x1b
Jan 23 08:08:01 node2 kernel: [1296253.891205] mysqld          D ffff88016fed41c0     0 45369      1 0x00000000
Jan 23 08:08:01 node2 kernel: [1296253.891212]  ffff88016fed41c0 0000000000000086 0000000000000000 ffff8802021ff610
Jan 23 08:08:01 node2 kernel: [1296253.891219]  0000000000013740 ffff8801f19e3fd8 ffff8801f19e3fd8 0000000000013740
Jan 23 08:08:01 node2 kernel: [1296253.891226]  ffff88016fed41c0 0000000000013740 0000000000013740 ffff8801f19e2010
Jan 23 08:08:01 node2 kernel: [1296253.891234] Call Trace:
Jan 23 08:08:01 node2 kernel: [1296253.891241]  [<ffffffff810139f1>] ? read_tsc+0x5/0x16
Jan 23 08:08:01 node2 kernel: [1296253.891249]  [<ffffffff810bd839>] ? __lock_page+0x63/0x63
Jan 23 08:08:01 node2 kernel: [1296253.891255]  [<ffffffff81368b1e>] ? io_schedule+0x84/0xc3
Jan 23 08:08:01 node2 kernel: [1296253.891262]  [<ffffffff811c0d07>] ? radix_tree_gang_lookup_tag_slot+0x7a/0x9f
Jan 23 08:08:01 node2 kernel: [1296253.891270]  [<ffffffff810bd842>] ? sleep_on_page+0x9/0xd
Jan 23 08:08:01 node2 kernel: [1296253.891276]  [<ffffffff81368f19>] ? __wait_on_bit+0x3e/0x6f
Jan 23 08:08:01 node2 kernel: [1296253.891283]  [<ffffffff810bd9f0>] ? wait_on_page_bit+0x6a/0x70
Jan 23 08:08:01 node2 kernel: [1296253.891291]  [<ffffffff81063cb7>] ? autoremove_wake_function+0x2a/0x2a
Jan 23 08:08:01 node2 kernel: [1296253.891298]  [<ffffffff810c6b28>] ? pagevec_lookup_tag+0x18/0x1f
Jan 23 08:08:01 node2 kernel: [1296253.891305]  [<ffffffff810bdcfd>] ? filemap_fdatawait_range+0x96/0x150
Jan 23 08:08:01 node2 kernel: [1296253.891314]  [<ffffffff810bde68>] ? filemap_write_and_wait_range+0x3d/0x4f
Jan 23 08:08:01 node2 kernel: [1296253.891329]  [<ffffffffa0129e5a>] ? ext4_sync_file+0x92/0x380 [ext4]
Jan 23 08:08:01 node2 kernel: [1296253.891336]  [<ffffffff81129791>] ? do_fsync+0x27/0x3b
Jan 23 08:08:01 node2 kernel: [1296253.891343]  [<ffffffff811297c2>] ? sys_fsync+0xb/0xf
Jan 23 08:08:01 node2 kernel: [1296253.891349]  [<ffffffff8136f2d2>] ? system_call_fastpath+0x16/0x1b
Jan 23 08:10:01 node2 kernel: [1296373.858860] mysqld          D ffff8800154f49f0     0 49165      1 0x00000000
Jan 23 08:10:01 node2 kernel: [1296373.858870]  ffff8800154f49f0 0000000000000086 0000000000000000 ffff8802024be0c0
Jan 23 08:10:01 node2 kernel: [1296373.858879]  0000000000013740 ffff880124dbbfd8 ffff880124dbbfd8 0000000000013740
Jan 23 08:10:01 node2 kernel: [1296373.858887]  ffff8800154f49f0 0000000000013740 0000000000013740 ffff880124dba010
Jan 23 08:10:01 node2 kernel: [1296373.858895] Call Trace:
Jan 23 08:10:01 node2 kernel: [1296373.858922]  [<ffffffffa0114b16>] ? jbd2_log_wait_commit+0xc0/0x111 [jbd2]
Jan 23 08:10:01 node2 kernel: [1296373.858934]  [<ffffffff81063c8d>] ? wake_up_bit+0x20/0x20
Jan 23 08:10:01 node2 kernel: [1296373.858945]  [<ffffffffa0114cbb>] ? jbd2_log_start_commit+0x21/0x2f [jbd2]
Jan 23 08:10:01 node2 kernel: [1296373.858962]  [<ffffffffa012a0dc>] ? ext4_sync_file+0x314/0x380 [ext4]
Jan 23 08:10:01 node2 kernel: [1296373.858972]  [<ffffffff81129791>] ? do_fsync+0x27/0x3b
Jan 23 08:10:01 node2 kernel: [1296373.858979]  [<ffffffff811297c2>] ? sys_fsync+0xb/0xf
Jan 23 08:10:01 node2 kernel: [1296373.858987]  [<ffffffff8136f2d2>] ? system_call_fastpath+0x16/0x1b
Jan 23 08:10:01 node2 kernel: [1296373.859102] mysqld          D ffff88016fed41c0     0 45369      1 0x00000000
Jan 23 08:10:01 node2 kernel: [1296373.859109]  ffff88016fed41c0 0000000000000086 0000000000000000 ffff8802021ff610
Jan 23 08:10:01 node2 kernel: [1296373.859117]  0000000000013740 ffff8801f19e3fd8 ffff8801f19e3fd8 0000000000013740
Jan 23 08:10:01 node2 kernel: [1296373.859124]  ffff88016fed41c0 0000000000013740 0000000000013740 ffff8801f19e2010
Jan 23 08:10:01 node2 kernel: [1296373.859132] Call Trace:
Jan 23 08:10:01 node2 kernel: [1296373.859139]  [<ffffffff810139f1>] ? read_tsc+0x5/0x16
Jan 23 08:10:01 node2 kernel: [1296373.859147]  [<ffffffff810bd839>] ? __lock_page+0x63/0x63
Jan 23 08:10:01 node2 kernel: [1296373.859155]  [<ffffffff81368b1e>] ? io_schedule+0x84/0xc3
Jan 23 08:10:01 node2 kernel: [1296373.859165]  [<ffffffff811c0d07>] ? radix_tree_gang_lookup_tag_slot+0x7a/0x9f
Jan 23 08:10:01 node2 kernel: [1296373.859173]  [<ffffffff810bd842>] ? sleep_on_page+0x9/0xd
Jan 23 08:10:01 node2 kernel: [1296373.859179]  [<ffffffff81368f19>] ? __wait_on_bit+0x3e/0x6f
Jan 23 08:10:01 node2 kernel: [1296373.859186]  [<ffffffff810bd9f0>] ? wait_on_page_bit+0x6a/0x70
Jan 23 08:10:01 node2 kernel: [1296373.859193]  [<ffffffff81063cb7>] ? autoremove_wake_function+0x2a/0x2a
Jan 23 08:10:01 node2 kernel: [1296373.859202]  [<ffffffff810c6b28>] ? pagevec_lookup_tag+0x18/0x1f
Jan 23 08:10:01 node2 kernel: [1296373.859209]  [<ffffffff810bdcfd>] ? filemap_fdatawait_range+0x96/0x150
Jan 23 08:10:01 node2 kernel: [1296373.859218]  [<ffffffff810bde68>] ? filemap_write_and_wait_range+0x3d/0x4f
Jan 23 08:10:01 node2 kernel: [1296373.859233]  [<ffffffffa0129e5a>] ? ext4_sync_file+0x92/0x380 [ext4]
Jan 23 08:10:01 node2 kernel: [1296373.859241]  [<ffffffff81129791>] ? do_fsync+0x27/0x3b
Jan 23 08:10:01 node2 kernel: [1296373.859248]  [<ffffffff811297c2>] ? sys_fsync+0xb/0xf
Jan 23 08:10:01 node2 kernel: [1296373.859254]  [<ffffffff8136f2d2>] ? system_call_fastpath+0x16/0x1b
Jan 23 08:11:20 node2 rsyslogd-2177: imuxsock begins to drop messages from pid 1219 due to rate-limiting
Jan 23 08:11:40 node2 rsyslogd-2177: imuxsock lost 14 messages from pid 1219 due to rate-limiting
Jan 23 08:12:00 node2 rsyslogd-2177: imuxsock begins to drop messages from pid 1219 due to rate-limiting
Jan 23 08:12:01 node2 kernel: [1296493.827126] jbd2/drbd0-8    D ffff8801a4451020     0 49080      2 0x00000000
Jan 23 08:12:01 node2 kernel: [1296493.827136]  ffff8801a4451020 0000000000000046 0000000000000000 ffff880202230f60
Jan 23 08:12:01 node2 kernel: [1296493.827146]  0000000000013740 ffff8801f34d9fd8 ffff8801f34d9fd8 0000000000013740
Jan 23 08:12:01 node2 kernel: [1296493.827154]  ffff8801a4451020 0000000000013740 0000000000013740 ffff8801f34d8010
Jan 23 08:12:01 node2 kernel: [1296493.827162] Call Trace:
Jan 23 08:12:01 node2 kernel: [1296493.827176]  [<ffffffff810bd839>] ? __lock_page+0x63/0x63
Jan 23 08:12:01 node2 kernel: [1296493.827185]  [<ffffffff81368b1e>] ? io_schedule+0x84/0xc3
Jan 23 08:12:01 node2 kernel: [1296493.827196]  [<ffffffff811c0d07>] ? radix_tree_gang_lookup_tag_slot+0x7a/0x9f
Jan 23 08:12:01 node2 kernel: [1296493.827203]  [<ffffffff810bd842>] ? sleep_on_page+0x9/0xd
Jan 23 08:12:01 node2 kernel: [1296493.827210]  [<ffffffff81368f19>] ? __wait_on_bit+0x3e/0x6f
Jan 23 08:12:01 node2 kernel: [1296493.827218]  [<ffffffff810bd9f0>] ? wait_on_page_bit+0x6a/0x70
Jan 23 08:12:01 node2 kernel: [1296493.827227]  [<ffffffff81063cb7>] ? autoremove_wake_function+0x2a/0x2a
Jan 23 08:12:01 node2 kernel: [1296493.827235]  [<ffffffff810c6b28>] ? pagevec_lookup_tag+0x18/0x1f
Jan 23 08:12:01 node2 kernel: [1296493.827243]  [<ffffffff810bdcfd>] ? filemap_fdatawait_range+0x96/0x150
Jan 23 08:12:01 node2 kernel: [1296493.827266]  [<ffffffffa011075c>] ? jbd2_journal_commit_transaction+0x8e0/0x1160 [jbd2]
Jan 23 08:12:01 node2 kernel: [1296493.827280]  [<ffffffffa0114e7b>] ? kjournald2+0xc0/0x20e [jbd2]
Jan 23 08:12:01 node2 kernel: [1296493.827287]  [<ffffffff81063c8d>] ? wake_up_bit+0x20/0x20
Jan 23 08:12:01 node2 kernel: [1296493.827298]  [<ffffffffa0114dbb>] ? commit_timeout+0xa/0xa [jbd2]
Jan 23 08:12:01 node2 kernel: [1296493.827308]  [<ffffffffa0114dbb>] ? commit_timeout+0xa/0xa [jbd2]
Jan 23 08:12:01 node2 kernel: [1296493.827315]  [<ffffffff81063841>] ? kthread+0x7a/0x82
Jan 23 08:12:01 node2 kernel: [1296493.827324]  [<ffffffff81371434>] ? kernel_thread_helper+0x4/0x10
Jan 23 08:12:01 node2 kernel: [1296493.827331]  [<ffffffff810637c7>] ? kthread_worker_fn+0x147/0x147
Jan 23 08:12:01 node2 kernel: [1296493.827338]  [<ffffffff81371430>] ? gs_change+0x13/0x13
Jan 23 08:12:01 node2 kernel: [1296493.827449] mysqld          D ffff8800154f49f0     0 49165      1 0x00000000
Jan 23 08:12:01 node2 kernel: [1296493.827456]  ffff8800154f49f0 0000000000000086 0000000000000000 ffff8802024be0c0
Jan 23 08:12:01 node2 kernel: [1296493.827464]  0000000000013740 ffff880124dbbfd8 ffff880124dbbfd8 0000000000013740
Jan 23 08:12:01 node2 kernel: [1296493.827472]  ffff8800154f49f0 0000000000013740 0000000000013740 ffff880124dba010
Jan 23 08:12:01 node2 kernel: [1296493.827479] Call Trace:
Jan 23 08:12:01 node2 kernel: [1296493.827491]  [<ffffffffa0114b16>] ? jbd2_log_wait_commit+0xc0/0x111 [jbd2]
Jan 23 08:12:01 node2 kernel: [1296493.827498]  [<ffffffff81063c8d>] ? wake_up_bit+0x20/0x20
Jan 23 08:12:01 node2 kernel: [1296493.827508]  [<ffffffffa0114cbb>] ? jbd2_log_start_commit+0x21/0x2f [jbd2]
Jan 23 08:12:01 node2 kernel: [1296493.827525]  [<ffffffffa012a0dc>] ? ext4_sync_file+0x314/0x380 [ext4]
Jan 23 08:12:01 node2 kernel: [1296493.827535]  [<ffffffff81129791>] ? do_fsync+0x27/0x3b
Jan 23 08:12:01 node2 kernel: [1296493.827542]  [<ffffffff811297c2>] ? sys_fsync+0xb/0xf
Jan 23 08:12:01 node2 kernel: [1296493.827549]  [<ffffffff8136f2d2>] ? system_call_fastpath+0x16/0x1b
Jan 23 08:12:01 node2 kernel: [1296493.827664] mysqld          D ffff88016fed41c0     0 45369      1 0x00000000
Jan 23 08:12:01 node2 kernel: [1296493.827670]  ffff88016fed41c0 0000000000000086 0000000000000000 ffff8802021ff610
Jan 23 08:12:01 node2 kernel: [1296493.827678]  0000000000013740 ffff8801f19e3fd8 ffff8801f19e3fd8 0000000000013740
Jan 23 08:12:01 node2 kernel: [1296493.827685]  ffff88016fed41c0 0000000000013740 0000000000013740 ffff8801f19e2010
Jan 23 08:12:01 node2 kernel: [1296493.827693] Call Trace:
Jan 23 08:12:01 node2 kernel: [1296493.827700]  [<ffffffff810139f1>] ? read_tsc+0x5/0x16
Jan 23 08:12:01 node2 kernel: [1296493.827707]  [<ffffffff810bd839>] ? __lock_page+0x63/0x63
Jan 23 08:12:01 node2 kernel: [1296493.827713]  [<ffffffff81368b1e>] ? io_schedule+0x84/0xc3
Jan 23 08:12:01 node2 kernel: [1296493.827721]  [<ffffffff811c0d07>] ? radix_tree_gang_lookup_tag_slot+0x7a/0x9f
Jan 23 08:12:01 node2 kernel: [1296493.827728]  [<ffffffff810bd842>] ? sleep_on_page+0x9/0xd
Jan 23 08:12:01 node2 kernel: [1296493.827734]  [<ffffffff81368f19>] ? __wait_on_bit+0x3e/0x6f
Jan 23 08:12:01 node2 kernel: [1296493.827742]  [<ffffffff810bd9f0>] ? wait_on_page_bit+0x6a/0x70
Jan 23 08:12:01 node2 kernel: [1296493.827749]  [<ffffffff81063cb7>] ? autoremove_wake_function+0x2a/0x2a
Jan 23 08:12:01 node2 kernel: [1296493.827756]  [<ffffffff810c6b28>] ? pagevec_lookup_tag+0x18/0x1f
Jan 23 08:12:01 node2 kernel: [1296493.827763]  [<ffffffff810bdcfd>] ? filemap_fdatawait_range+0x96/0x150
Jan 23 08:12:01 node2 kernel: [1296493.827772]  [<ffffffff810bde68>] ? filemap_write_and_wait_range+0x3d/0x4f
Jan 23 08:12:01 node2 kernel: [1296493.827786]  [<ffffffffa0129e5a>] ? ext4_sync_file+0x92/0x380 [ext4]
Jan 23 08:12:01 node2 kernel: [1296493.827794]  [<ffffffff81129791>] ? do_fsync+0x27/0x3b
Jan 23 08:12:01 node2 kernel: [1296493.827801]  [<ffffffff811297c2>] ? sys_fsync+0xb/0xf
Jan 23 08:12:01 node2 kernel: [1296493.827807]  [<ffffffff8136f2d2>] ? system_call_fastpath+0x16/0x1b
Jan 23 08:12:11 node2 rsyslogd-2177: imuxsock lost 1690 messages from pid 1219 due to rate-limiting
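Since the same traces repeat every two minutes, it may help to condense them into a list of which tasks were reported in the uninterruptible "D" state and how often. The following is only a rough sketch: the regular expression is an assumption based on the header lines as they appear in this log, and the names `TASK_RE` and `blocked_tasks` are made up for illustration.

```python
import re
from collections import Counter

# Header line of a blocked-task dump as seen in this log:
#   kernel: [<uptime>] <task name>  D <address>  0 <pid> <ppid> <flags>
# The pattern is an assumption derived from the lines above, not a general parser.
TASK_RE = re.compile(
    r"kernel: \[\s*\d+\.\d+\] (?P<task>\S+)\s+D [0-9a-f]{16}"
    r"\s+\d+\s+(?P<pid>\d+)\s+\d+ 0x[0-9a-f]+"
)

def blocked_tasks(lines):
    """Count how often each (task name, pid) pair is reported in D state."""
    counts = Counter()
    for line in lines:
        m = TASK_RE.search(line)
        if m:
            counts[(m.group("task"), int(m.group("pid")))] += 1
    return counts

# A few lines copied from the log above.
sample = [
    "Jan 23 08:08:01 node2 kernel: [1296253.890624] jbd2/drbd0-8    D ffff8801a4451020     0 49080      2 0x00000000",
    "Jan 23 08:08:01 node2 kernel: [1296253.890660] Call Trace:",
    "Jan 23 08:08:01 node2 kernel: [1296253.890990] mysqld          D ffff8800154f49f0     0 49165      1 0x00000000",
    "Jan 23 08:10:01 node2 kernel: [1296373.858860] mysqld          D ffff8800154f49f0     0 49165      1 0x00000000",
]

for (task, pid), n in sorted(blocked_tasks(sample).items()):
    print(f"{task} (pid {pid}): reported {n} time(s)")
```

To run it over a whole log, the lines of the syslog file can be passed to `blocked_tasks` instead of `sample`.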



