Hi guys,

I am having big trouble with drbd. The resources detach themselves.
I get a lot of these logs from the kernel on both nodes:

[11304.180242] PAX: refcount overflow detected in: drbd2_worker:2212, uid/euid: 0/0
[11304.180243] CPU 1
[11304.180243] Modules linked in: ipt_LOG drbd lru_cache ip6table_filter ip6table_mangle ip6_tables ipt_REJECT xt_recent xt_state xt_tcpudp iptable_filter iptable_mangle kvm_intel kvm authenc esp4 ah4 xfrm4_mode_transport deflate zlib_deflate ctr twofish_generic twofish_x86_64_3way twofish_x86_64 twofish_common camellia serpent blowfish_generic blowfish_x86_64 blowfish_common cast5 des_generic xcbc rmd160 sha512_generic crypto_null af_key xt_addrtype i7core_edac ipt_MASQUERADE nfsd shpchp iptable_nat nf_nat nfs nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables lockd psmouse dcdbas mac_hid serio_raw acpi_power_meter wmi edac_core bridge fscache tpm_tis auth_rpcgss stp nfs_acl sunrpc lp parport usbhid hid ses enclosure megaraid_sas bnx2
[11304.180261]
[11304.180262] Pid: 2212, comm: drbd2_worker Not tainted 3.2.29-grsec
[11304.180264] RIP: 0010:[<ffffffffa03fd9d3>] [<ffffffffa03fd9d3>] bm_page_io_async+0x1a3/0x250 [drbd]
[11304.180267] RSP: 0018:ffff880612fb5c60 EFLAGS: 00000a12
[11304.180268] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000000
[11304.180269] RDX: 000000000000a23e RSI: ffffffff8132d670 RDI: ffff8805fae32144
[11304.180271] RBP: ffff880612fb5cf0 R08: 0000000000000000 R09: ffff88060907c0c0
[11304.180272] R10: 00000001b77407b0 R11: 0000000000000000 R12: ffff88060907c0c0
[11304.180273] R13: ffff8806125ba000 R14: ffff880612fb5d20 R15: ffff8805fc629200
[11304.180274] FS: 0000000000000000(0000) GS:ffff88063f620000(0000) knlGS:0000000000000000
[11304.180275] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[11304.180276] CR2: 000001866c3c1158 CR3: 00000000016bf000 CR4: 00000000000006f0
[11304.180277] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11304.180279] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[11304.180280] Process drbd2_worker (pid: 2212, threadinfo ffff8805fa419630, task ffff8805fa419100)
[11304.180281] Stack:
[11304.180281] ffff8805fa419630 00000000dbb9ffff fffe484800000001 00000000dbb9fff8
[11304.180283] ffff880612fb5c90 ffffea0018288670 ffff880612fb5cb0 ffff8805fa419630
[11304.180285] 0000000000000a9b 0000000000000000 0000000000002a9a ffff8806125ba100
[11304.180288] Call Trace:
[11304.180291] [<ffffffffa03fdf28>] bm_rw+0x178/0x440 [drbd]
[11304.180294] [<ffffffffa03ff7a5>] drbd_bm_write+0x15/0x20 [drbd]
[11304.180298] [<ffffffffa041c672>] w_bitmap_io+0xe2/0x2a0 [drbd]
[11304.180302] [<ffffffffa04059be>] drbd_worker+0x21e/0x4c0 [drbd]
[11304.180306] [<ffffffffa04193d0>] ? drbd_open+0xb0/0xb0 [drbd]
[11304.180310] [<ffffffffa0419434>] drbd_thread_setup+0x64/0xf0 [drbd]
[11304.180314] [<ffffffffa04193d0>] ? drbd_open+0xb0/0xb0 [drbd]
[11304.180316] [<ffffffff8108dafc>] kthread+0x8c/0xa0
[11304.180318] [<ffffffff8169c744>] kernel_thread_helper+0x4/0x10
[11304.180320] [<ffffffff8108da70>] ? flush_kthread_worker+0xa0/0xa0
[11304.180322] [<ffffffff8169c740>] ? gs_change+0x13/0x13
[11304.180322] Code: 80 4d 89 74 24 58 4c 89 e6 49 c7 44 24 50 80 da 3f a0 e8 a1 2d f0 e0 f0 41 01 9d d8 0a 00 00 71 0a f0 41 29 9d d8 0a 00 00 cd 04 <48> 83 c4 68 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 0f 1f 44 00 00
[11304.180335] Call Trace:
[11304.180338] [<ffffffffa03fdf28>] bm_rw+0x178/0x440 [drbd]
[11304.180341] [<ffffffffa03ff7a5>] drbd_bm_write+0x15/0x20 [drbd]
[11304.180345] [<ffffffffa041c672>] w_bitmap_io+0xe2/0x2a0 [drbd]
[11304.180349] [<ffffffffa04059be>] drbd_worker+0x21e/0x4c0 [drbd]
[11304.180353] [<ffffffffa04193d0>] ? drbd_open+0xb0/0xb0 [drbd]
[11304.180357] [<ffffffffa0419434>] drbd_thread_setup+0x64/0xf0 [drbd]
[11304.180362] [<ffffffffa04193d0>] ? drbd_open+0xb0/0xb0 [drbd]
[11304.180363] [<ffffffff8108dafc>] kthread+0x8c/0xa0
[11304.180365] [<ffffffff8169c744>] kernel_thread_helper+0x4/0x10
[11304.180367] [<ffffffff8108da70>] ? flush_kthread_worker+0xa0/0xa0
[11304.180369] [<ffffffff8169c740>] ? gs_change+0x13/0x13
[11304.180371] block drbd2: bitmap WRITE of 5871 pages took 265 jiffies
[11304.180382] block drbd2: 734 GB (192322088 bits) marked out-of-sync by on disk bit-map.
[11304.180386] block drbd2: ASSERT FAILED: drbd_worker: (get_t_state(thi) == Running) in drivers/block/drbd/drbd_worker.c:1645
[11304.180465] block drbd2: Connection closed
[11304.180469] block drbd2: conn( BrokenPipe -> Unconnected )
[11304.180471] block drbd2: receiver terminated
[11304.180472] block drbd2: Restarting drbd2_receiver
[11304.180473] block drbd2: receiver (re)started
[11304.180476] block drbd2: conn( Unconnected -> WFConnection )

Does this look familiar to anyone? What is the risk to the data? What is the recommended action?

Thank you