Hi Lars,

We continue to experience intermittent kernel warnings (oops) in kernel thread
drbd_w_db. This occurs on servers having a DRBD resource in the primary role.
While we are currently unable to reproduce the issue in a controlled manner,
we have mapped call traces to line numbers in the 4.1 kernel.org sources.
This mapping is for kernels currently running on our production servers.

Our kernel is compiled from the 4.1 kernel.org source code with the following
lines defining default barrier options for ext3 and ext4 file systems
commented out:

fs/ext3/super.c: 1788
    // set_opt(sbi->s_mount_opt, BARRIER);

fs/ext4/super.c: 3578,3579
    // if ((def_mount_opts & EXT4_DEFM_NOBARRIER) == 0)
    //     set_opt(sb, BARRIER);

One observation after examining the source code is that the iov_iter interface
code where the oops is triggered changed significantly between kernels 3.18
and 3.19 (a topic discussed here: https://lwn.net/Articles/625077/ ).
Since DRBD is dependent upon networking code that calls this updated code,
perhaps the changes are related to the issue we are experiencing with recent
kernels.

Below I've included the following:

1) A drbd configuration file from one of our servers (i.e. /etc/drbd.conf)
2) For each 4.1 kernel oops, the oops report followed by matching source code
   line numbers.
3) A description of the process we used to map call stack addresses to kernel
   source line numbers.

Hopefully the line numbers will prove helpful when debugging this issue.
Regards,
Adrian

==========================================
1) DRBD CONFIGURATION

--------/etc/drbd.conf--------
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
# include "drbd.d/global_common.conf";
# include "drbd.d/*.res";
global { usage-count no; }
common {
    net { protocol A; }
    syncer { rate 100M; }
}
resource db {
    disk { on-io-error detach; }
    on uk3 {
        device /dev/drbd0;
        disk /dev/xvdd;
        address 192.168.133.115:7791;
        meta-disk internal;
    }
    on uk4 {
        device /dev/drbd0;
        disk /dev/xvdd;
        address 192.168.133.117:7791;
        meta-disk internal;
    }
}

========================================================================
2) KERNEL OOPS INFORMATION - For Each: Error log followed by line numbers

KERNEL OOPS 1 - ERROR LOG - Version 4.1.0 with barrier patch - 2015-07-08 18:11:09
uk3 kern.alert kernel BUG: unable to handle kernel paging request at 0000000000001000
uk3 kern.alert kernel IP: [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk3 kern.warning kernel PGD 77b46067 PUD 6f0df067 PMD 0
uk3 kern.warning kernel Oops: 0000 [#1] SMP
uk3 kern.warning kernel Modules linked in:
uk3 kern.warning kernel CPU: 0 PID: 4516 Comm: drbd_w_db Not tainted 4.1.0-x86_64-linode59 #1
uk3 kern.warning kernel Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
uk3 kern.warning kernel task: ffff88006f024a40 ti: ffff8800793b0000 task.ti: ffff8800793b0000
uk3 kern.warning kernel RIP: 0010:[<ffffffff8158ba60>] [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk3 kern.warning kernel RSP: 0018:ffff8800793b3b60 EFLAGS: 00010292
uk3 kern.warning kernel RAX: 0000000000000028 RBX: 0000000000000028 RCX: 0000000000000028
uk3 kern.warning kernel RDX: ffff8800793b3c90 RSI: 0000000000001000 RDI: ffff8800384eae98
uk3 kern.warning kernel RBP: ffff8800793b3c80 R08: 0000000000000000 R09: ffff8800384eaec0
uk3 kern.warning kernel R10: ffff8800384eae98 R11: 000000000000bab2 R12: 00000000000005f0
uk3 kern.warning kernel R13: ffff880068334800 R14: 00000000000005f0 R15: ffff8800716e3640
uk3 kern.warning kernel FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
uk3 kern.warning kernel CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
uk3 kern.warning kernel CR2: 0000000000001000 CR3: 00000000760ef000 CR4: 00000000001407f0
uk3 kern.warning kernel Stack:
uk3 kern.warning kernel 00000000000005f0 ffff880073f43200 00000000000005f0 ffff8800793b3c70
uk3 kern.warning kernel 0000000000000000 ffff880068334800 00000000000005f0 ffff8800716e3640
uk3 kern.warning kernel ffffffff81838006 000040000000f428 ffff8800793b3c80 0000000000000000
uk3 kern.warning kernel Call Trace:
uk3 kern.warning kernel [<ffffffff81838006>] ? tcp_sendmsg+0x3ee/0x9b4
uk3 kern.warning kernel [<ffffffff8175c6de>] ? sock_sendmsg+0x2e/0x3b
uk3 kern.warning kernel [<ffffffff81666c4a>] ? drbd_send+0xa5/0x171
uk3 kern.warning kernel [<ffffffff81666d24>] ? drbd_send_all+0xe/0x23
uk3 kern.warning kernel [<ffffffff816682a5>] ? _drbd_no_send_page+0x47/0x5d
uk3 kern.warning kernel [<ffffffff8166887d>] ? drbd_send_dblock+0x2e1/0x4ab
uk3 kern.warning kernel [<ffffffff811075ae>] ? __wake_up_common+0x47/0x7d
uk3 kern.warning kernel [<ffffffff81107a84>] ? __wake_up+0x3a/0x4b
uk3 kern.warning kernel [<ffffffff81651643>] ? w_send_dblock+0xd3/0x139
uk3 kern.warning kernel [<ffffffff8165279e>] ? drbd_worker+0x124/0x302
uk3 kern.warning kernel [<ffffffff81665f5b>] ? w_complete+0x13/0x13
uk3 kern.warning kernel [<ffffffff81665fa2>] ? drbd_thread_setup+0x47/0x10c
uk3 kern.warning kernel [<ffffffff81665f5b>] ? w_complete+0x13/0x13
uk3 kern.warning kernel [<ffffffff810f088c>] ? kthread+0xca/0xd2
uk3 kern.warning kernel [<ffffffff810f07c2>] ? kthread_freezable_should_stop+0x40/0x40
uk3 kern.warning kernel [<ffffffff81962362>] ? ret_from_fork+0x42/0x70
uk3 kern.warning kernel [<ffffffff810f07c2>] ? kthread_freezable_should_stop+0x40/0x40
uk3 kern.warning kernel Code: eb 2b 4c 39 42 08 4c 89 c0 48 0f 46 42 08 48 85 c0 74 1a 49 01 c1 48 8b 32 48 89 c1 4d 89 ca 49 29 c0 48 89 c3 49 29 c2 4c 89 d7 <f3> a4 48 83 c2 10 4d 85 c0 48 8d 42 f0 75 c8 48 3b 58 08 75 05
uk3 kern.alert kernel RIP [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk3 kern.warning kernel RSP <ffff8800793b3b60>
uk3 kern.warning kernel CR2: 0000000000001000
uk3 kern.warning kernel ---[ end trace e217b95518b37926 ]---

KERNEL OOPS 1 - LINE NUMBERS - Version 4.1.0 with barrier patch - 2015-07-08 18:11:09
ffffffff8158ba60:copy_from_iter lib/iov_iter.c:416
**********inline:skb_do_copy_data_nocache include/net/sock.h:1791
**********inline:skb_add_data_nocache include/net/sock.h:1802
ffffffff81838006:tcp_sendmsg net/ipv4/tcp.c:1177
**********inline:sock_sendmsg_nosec net/socket.c:613 (***addr2line reports 614)
ffffffff8175c6de:sock_sendmsg net/socket.c:623
ffffffff81666c4a:drbd_send drivers/block/drbd/drbd_main.c:1805 (***addr2line reports 1806)
ffffffff81666d24:drbd_send_all drivers/block/drbd/drbd_main.c:1849 (***addr2line reports 1850)
ffffffff816682a5:_drbd_no_send_page drivers/block/drbd/drbd_main.c:1492 (***addr2line reports 1494)
ffffffff8166887d:drbd_send_dblock drivers/block/drbd/drbd_main.c:1678
ffffffff811075ae:__wake_up_common kernel/sched/wait.c:73
ffffffff81107a84:__wake_up kernel/sched/wait.c:95 (***addr2line reports 97)
ffffffff81651643:w_send_dblock drivers/block/drbd/drbd_worker.c:1404 (***addr2line reports 1405)
*** Developer analysis of call stack stopped here ***
ffffffff8165279e:drbd_worker drivers/block/drbd/drbd_worker.c:2122
ffffffff81665f5b:drbd_thread_setup drivers/block/drbd/drbd_main.c:324
ffffffff81665fa2:drbd_thread_setup drivers/block/drbd/drbd_main.c:337
ffffffff81665f5b:drbd_thread_setup drivers/block/drbd/drbd_main.c:324
ffffffff810f088c:kthread kernel/kthread.c:210
ffffffff810f07c2:kthread kernel/kthread.c:176
ffffffff81962362:ret_from_fork arch/x86/kernel/entry_64.S:640
ffffffff810f07c2:kthread kernel/kthread.c:176

-----------------------------------
KERNEL OOPS 2 - ERROR LOG - Version 4.1.0 with barrier patch - 2015-07-27 07:44:17
uk3 kern.alert kernel BUG: unable to handle kernel NULL pointer dereference at 0000000000000003
uk3 kern.alert kernel IP: [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk3 kern.warning kernel PGD 759b0067 PUD 759b1067 PMD 0
uk3 kern.warning kernel Oops: 0000 [#1] SMP
uk3 kern.warning kernel Modules linked in:
uk3 kern.warning kernel CPU: 0 PID: 4448 Comm: drbd_w_db Not tainted 4.1.0-x86_64-linode59 #1
uk3 kern.warning kernel Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
uk3 kern.warning kernel task: ffff880075a87380 ti: ffff880075a68000 task.ti: ffff880075a68000
uk3 kern.warning kernel RIP: 0010:[<ffffffff8158ba60>] [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk3 kern.warning kernel RSP: 0018:ffff880075a6bb60 EFLAGS: 00010282
uk3 kern.warning kernel RAX: 00000000000005f0 RBX: 00000000000005f0 RCX: 00000000000005f0
uk3 kern.warning kernel RDX: ffff880075a6bc80 RSI: 0000000000000003 RDI: ffff88006c9e8aa8
uk3 kern.warning kernel RBP: ffff880075a6bc80 R08: 0000000000000190 R09: ffff88006c9e9098
uk3 kern.warning kernel R10: ffff88006c9e8aa8 R11: 00000000000005a8 R12: 0000000000000a10
uk3 kern.warning kernel R13: ffff8800788fd000 R14: ffff880075a87b40 R15: ffff88007bae3e00
uk3 kern.warning kernel FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
uk3 kern.warning kernel CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
uk3 kern.warning kernel CR2: 0000000000000003 CR3: 000000007597a000 CR4: 00000000001407f0
uk3 kern.warning kernel Stack:
uk3 kern.warning kernel 00000000000005f0 ffff8800788fca00 0000000000000a10 ffff880075a6bc70
uk3 kern.warning kernel 00000000000005f0 ffff8800788fd000 ffff880075a87b40 ffff88007bae3e00
uk3 kern.warning kernel ffffffff81838225 000040000000f6e0 ffff880075a6bc80 0000000000000a10
uk3 kern.warning kernel Call Trace:
uk3 kern.warning kernel [<ffffffff81838225>] ? tcp_sendmsg+0x60d/0x9b4
uk3 kern.warning kernel [<ffffffff8175c6de>] ? sock_sendmsg+0x2e/0x3b
uk3 kern.warning kernel [<ffffffff81666c4a>] ? drbd_send+0xa5/0x171
uk3 kern.warning kernel [<ffffffff81666d24>] ? drbd_send_all+0xe/0x23
uk3 kern.warning kernel [<ffffffff816682a5>] ? _drbd_no_send_page+0x47/0x5d
uk3 kern.warning kernel [<ffffffff8166887d>] ? drbd_send_dblock+0x2e1/0x4ab
uk3 kern.warning kernel [<ffffffff81107a84>] ? __wake_up+0x3a/0x4b
uk3 kern.warning kernel [<ffffffff81651643>] ? w_send_dblock+0xd3/0x139
uk3 kern.warning kernel [<ffffffff8165279e>] ? drbd_worker+0x124/0x302
uk3 kern.warning kernel [<ffffffff81665f5b>] ? w_complete+0x13/0x13
uk3 kern.warning kernel [<ffffffff81665fa2>] ? drbd_thread_setup+0x47/0x10c
uk3 kern.warning kernel [<ffffffff81665f5b>] ? w_complete+0x13/0x13
uk3 kern.warning kernel [<ffffffff810f088c>] ? kthread+0xca/0xd2
uk3 kern.warning kernel [<ffffffff810f07c2>] ? kthread_freezable_should_stop+0x40/0x40
uk3 kern.warning kernel [<ffffffff81962362>] ? ret_from_fork+0x42/0x70
uk3 kern.warning kernel [<ffffffff810f07c2>] ? kthread_freezable_should_stop+0x40/0x40
uk3 kern.warning kernel Code: eb 2b 4c 39 42 08 4c 89 c0 48 0f 46 42 08 48 85 c0 74 1a 49 01 c1 48 8b 32 48 89 c1 4d 89 ca 49 29 c0 48 89 c3 49 29 c2 4c 89 d7 <f3> a4 48 83 c2 10 4d 85 c0 48 8d 42 f0 75 c8 48 3b 58 08 75 05
uk3 kern.alert kernel RIP [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk3 kern.warning kernel RSP <ffff880075a6bb60>
uk3 kern.warning kernel CR2: 0000000000000003
uk3 kern.warning kernel ---[ end trace 72c4b2b13ae06bb9 ]---

KERNEL OOPS 2 - LINE NUMBERS - Version 4.1.0 with barrier patch - 2015-07-27 07:44:17
ffffffff8158ba60:copy_from_iter lib/iov_iter.c:416
**********inline:skb_do_copy_data_nocache include/net/sock.h:1791
**********inline:skb_copy_to_page_nocache include/net/sock.h:1817
ffffffff81838225:tcp_sendmsg net/ipv4/tcp.c:1202
**********inline:sock_sendmsg_nosec net/socket.c:613 (***addr2line reports 614)
ffffffff8175c6de:sock_sendmsg net/socket.c:623
ffffffff81666c4a:drbd_send drivers/block/drbd/drbd_main.c:1805 (***addr2line reports 1806)
ffffffff81666d24:drbd_send_all drivers/block/drbd/drbd_main.c:1849 (***addr2line reports 1850)
ffffffff816682a5:_drbd_no_send_page drivers/block/drbd/drbd_main.c:1492 (***addr2line reports 1494)
**********inline:_drbd_send_bio drivers/block/drbd/drbd_main.c:1557 (***addr2line reports 1561)
ffffffff8166887d:drbd_send_dblock drivers/block/drbd/drbd_main.c:1678
*** Developer analysis of call stack stopped here ***
ffffffff81107a84:__wake_up kernel/sched/wait.c:97
**********inline:req_mod drivers/block/drbd/drbd_req.h:320
ffffffff81651643:w_send_dblock drivers/block/drbd/drbd_worker.c:1405
ffffffff8165279e:drbd_worker drivers/block/drbd/drbd_worker.c:2122
ffffffff81665f5b:drbd_thread_setup drivers/block/drbd/drbd_main.c:324
ffffffff81665fa2:drbd_thread_setup drivers/block/drbd/drbd_main.c:337
ffffffff81665f5b:drbd_thread_setup drivers/block/drbd/drbd_main.c:324
ffffffff810f088c:kthread kernel/kthread.c:210
ffffffff810f07c2:kthread kernel/kthread.c:176
ffffffff81962362:ret_from_fork arch/x86/kernel/entry_64.S:640
ffffffff810f07c2:kthread kernel/kthread.c:176

--------------------------------
KERNEL OOPS 3 - ERROR LOG - Version 4.1.0 with barrier patch - 2015-07-27 18:04:49
uk4 kern.alert kernel BUG: unable to handle kernel NULL pointer dereference at 0000000000000003
uk4 kern.alert kernel IP: [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk4 kern.warning kernel PGD 77e8d067 PUD 78b0c067 PMD 0
uk4 kern.warning kernel Oops: 0000 [#1] SMP
uk4 kern.warning kernel Modules linked in:
uk4 kern.warning kernel CPU: 0 PID: 4166 Comm: drbd_w_db Not tainted 4.1.0-x86_64-linode59 #1
uk4 kern.warning kernel Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
uk4 kern.warning kernel task: ffff8800741498c0 ti: ffff88007b7f0000 task.ti: ffff88007b7f0000
uk4 kern.warning kernel RIP: 0010:[<ffffffff8158ba60>] [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk4 kern.warning kernel RSP: 0018:ffff88007b7f3b60 EFLAGS: 00010282
uk4 kern.warning kernel RAX: 00000000000005f0 RBX: 00000000000005f0 RCX: 00000000000005f0
uk4 kern.warning kernel RDX: ffff88007b7f3c80 RSI: 0000000000000003 RDI: ffff880002af9be0
uk4 kern.warning kernel RBP: ffff88007b7f3c80 R08: 00000000000003f0 R09: ffff880002afa1d0
uk4 kern.warning kernel R10: ffff880002af9be0 R11: 00000000000005a8 R12: 0000000000000a10
uk4 kern.warning kernel R13: ffff88006a7d8e00 R14: ffff88007414a080 R15: ffff880077421f00
uk4 kern.warning kernel FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
uk4 kern.warning kernel CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
uk4 kern.warning kernel CR2: 0000000000000003 CR3: 0000000077e8c000 CR4: 00000000001407f0
uk4 kern.warning kernel Stack:
uk4 kern.warning kernel 00000000000005f0 ffff88006a7d9e00 0000000000000a10 ffff88007b7f3c70
uk4 kern.warning kernel 00000000000005f0 ffff88006a7d8e00 ffff88007414a080 ffff880077421f00
uk4 kern.warning kernel ffffffff81838225 000040000000f480 ffff88007b7f3c80 0000000000000a10
uk4 kern.warning kernel Call Trace:
uk4 kern.warning kernel [<ffffffff81838225>] ? tcp_sendmsg+0x60d/0x9b4
uk4 kern.warning kernel [<ffffffff8175c6de>] ? sock_sendmsg+0x2e/0x3b
uk4 kern.warning kernel [<ffffffff81666c4a>] ? drbd_send+0xa5/0x171
uk4 kern.warning kernel [<ffffffff81666d24>] ? drbd_send_all+0xe/0x23
uk4 kern.warning kernel [<ffffffff816682a5>] ? _drbd_no_send_page+0x47/0x5d
uk4 kern.warning kernel [<ffffffff8166887d>] ? drbd_send_dblock+0x2e1/0x4ab
uk4 kern.warning kernel [<ffffffff81107a84>] ? __wake_up+0x3a/0x4b
uk4 kern.warning kernel [<ffffffff81651643>] ? w_send_dblock+0xd3/0x139
uk4 kern.warning kernel [<ffffffff8165279e>] ? drbd_worker+0x124/0x302
uk4 kern.warning kernel [<ffffffff81665f5b>] ? w_complete+0x13/0x13
uk4 kern.warning kernel [<ffffffff81665fa2>] ? drbd_thread_setup+0x47/0x10c
uk4 kern.warning kernel [<ffffffff81665f5b>] ? w_complete+0x13/0x13
uk4 kern.warning kernel [<ffffffff810f088c>] ? kthread+0xca/0xd2
uk4 kern.warning kernel [<ffffffff810f07c2>] ? kthread_freezable_should_stop+0x40/0x40
uk4 kern.warning kernel [<ffffffff81962362>] ? ret_from_fork+0x42/0x70
uk4 kern.warning kernel [<ffffffff810f07c2>] ? kthread_freezable_should_stop+0x40/0x40
uk4 kern.warning kernel Code: eb 2b 4c 39 42 08 4c 89 c0 48 0f 46 42 08 48 85 c0 74 1a 49 01 c1 48 8b 32 48 89 c1 4d 89 ca 49 29 c0 48 89 c3 49 29 c2 4c 89 d7 <f3> a4 48 83 c2 10 4d 85 c0 48 8d 42 f0 75 c8 48 3b 58 08 75 05
uk4 kern.alert kernel RIP [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk4 kern.warning kernel RSP <ffff88007b7f3b60>
uk4 kern.warning kernel CR2: 0000000000000003
uk4 kern.warning kernel ---[ end trace 1277a4d533113ace ]---

KERNEL OOPS 3 - LINE NUMBERS - Version 4.1.0 with barrier patch - 2015-07-27 18:04:49
ffffffff8158ba60:copy_from_iter lib/iov_iter.c:416
**********inline:skb_do_copy_data_nocache include/net/sock.h:1791
**********inline:skb_copy_to_page_nocache include/net/sock.h:1817
ffffffff81838225:tcp_sendmsg net/ipv4/tcp.c:1202
**********inline:sock_sendmsg_nosec net/socket.c:613 (***addr2line reports 614)
ffffffff8175c6de:sock_sendmsg net/socket.c:623
ffffffff81666c4a:drbd_send drivers/block/drbd/drbd_main.c:1805 (***addr2line reports 1806)
ffffffff81666d24:drbd_send_all drivers/block/drbd/drbd_main.c:1849 (***addr2line reports 1850)
ffffffff816682a5:_drbd_no_send_page drivers/block/drbd/drbd_main.c:1492 (***addr2line reports 1494)
**********inline:_drbd_send_bio drivers/block/drbd/drbd_main.c:1557 (***addr2line reports 1561)
ffffffff8166887d:drbd_send_dblock drivers/block/drbd/drbd_main.c:1678
*** Developer analysis of call stack stopped here ***
ffffffff81107a84:__wake_up kernel/sched/wait.c:97
**********inline:req_mod drivers/block/drbd/drbd_req.h:320
ffffffff81651643:w_send_dblock drivers/block/drbd/drbd_worker.c:1405
ffffffff8165279e:drbd_worker drivers/block/drbd/drbd_worker.c:2122
ffffffff81665f5b:drbd_thread_setup drivers/block/drbd/drbd_main.c:324
ffffffff81665fa2:drbd_thread_setup drivers/block/drbd/drbd_main.c:337
ffffffff81665f5b:drbd_thread_setup drivers/block/drbd/drbd_main.c:324
ffffffff810f088c:kthread kernel/kthread.c:210
ffffffff810f07c2:kthread kernel/kthread.c:176
ffffffff81962362:ret_from_fork arch/x86/kernel/entry_64.S:640
ffffffff810f07c2:kthread kernel/kthread.c:176

--------------------------------
KERNEL OOPS 4 - ERROR LOG - Version 4.1.0 with barrier patch - 2015-08-13 16:39:28
uk3 kern.alert kernel BUG: unable to handle kernel paging request at 0000000000001000
uk3 kern.alert kernel IP: [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk3 kern.warning kernel PGD 0
uk3 kern.warning kernel Oops: 0000 [#1] SMP
uk3 kern.warning kernel Modules linked in:
uk3 kern.warning kernel CPU: 0 PID: 6121 Comm: drbd_w_db Not tainted 4.1.0-x86_64-linode59 #1
uk3 kern.warning kernel task: ffff88007c0d3180 ti: ffff8800776b8000 task.ti: ffff8800776b8000
uk3 kern.warning kernel RIP: e030:[<ffffffff8158ba60>] [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk3 kern.warning kernel RSP: e02b:ffff8800776bbb60 EFLAGS: 00010282
uk3 kern.warning kernel RAX: 00000000000004e0 RBX: 00000000000004e0 RCX: 00000000000004e0
uk3 kern.warning kernel RDX: ffff8800776bbc90 RSI: 0000000000001000 RDI: ffff880077acd9e0
uk3 kern.warning kernel RBP: ffff8800776bbc80 R08: 0000000000000000 R09: ffff880077acdec0
uk3 kern.warning kernel R10: ffff880077acd9e0 R11: 00000000000005a8 R12: 00000000000005f0
uk3 kern.warning kernel R13: ffff88007c74de00 R14: 00000000000005f0 R15: ffff880000ed5540
uk3 kern.warning kernel FS: 0000000000000000(0000) GS:ffff88007d200000(0000) knlGS:ffff88007d200000
uk3 kern.warning kernel CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
uk3 kern.warning kernel CR2: 0000000000001000 CR3: 00000000771b3000 CR4: 0000000000042660
uk3 kern.warning kernel Stack:
uk3 kern.warning kernel 00000000000005f0 ffff880077211000 00000000000005f0 ffff8800776bbc70
uk3 kern.warning kernel 0000000000000000 ffff88007c74de00 00000000000005f0 ffff880000ed5540
uk3 kern.warning kernel ffffffff81838006 000040000000ef70 ffff8800776bbc80 0000000000000000
uk3 kern.warning kernel Call Trace:
uk3 kern.warning kernel [<ffffffff81838006>] ? tcp_sendmsg+0x3ee/0x9b4
uk3 kern.warning kernel [<ffffffff8175c6de>] ? sock_sendmsg+0x2e/0x3b
uk3 kern.warning kernel [<ffffffff81666c4a>] ? drbd_send+0xa5/0x171
uk3 kern.warning kernel [<ffffffff81666d24>] ? drbd_send_all+0xe/0x23
uk3 kern.warning kernel [<ffffffff816682a5>] ? _drbd_no_send_page+0x47/0x5d
uk3 kern.warning kernel [<ffffffff8166887d>] ? drbd_send_dblock+0x2e1/0x4ab
uk3 kern.warning kernel [<ffffffff811075ae>] ? __wake_up_common+0x47/0x7d
uk3 kern.warning kernel [<ffffffff81107a84>] ? __wake_up+0x3a/0x4b
uk3 kern.warning kernel [<ffffffff81651643>] ? w_send_dblock+0xd3/0x139
uk3 kern.warning kernel [<ffffffff8165279e>] ? drbd_worker+0x124/0x302
uk3 kern.warning kernel [<ffffffff81665f5b>] ? w_complete+0x13/0x13
uk3 kern.warning kernel [<ffffffff81665fa2>] ? drbd_thread_setup+0x47/0x10c
uk3 kern.warning kernel [<ffffffff81665f5b>] ? w_complete+0x13/0x13
uk3 kern.warning kernel [<ffffffff810f088c>] ? kthread+0xca/0xd2
uk3 kern.warning kernel [<ffffffff810f07c2>] ? kthread_freezable_should_stop+0x40/0x40
uk3 kern.warning kernel [<ffffffff81962362>] ? ret_from_fork+0x42/0x70
uk3 kern.warning kernel [<ffffffff810f07c2>] ? kthread_freezable_should_stop+0x40/0x40
uk3 kern.warning kernel Code: eb 2b 4c 39 42 08 4c 89 c0 48 0f 46 42 08 48 85 c0 74 1a 49 01 c1 48 8b 32 48 89 c1 4d 89 ca 49 29 c0 48 89 c3 49 29 c2 4c 89 d7 <f3> a4 48 83 c2 10 4d 85 c0 48 8d 42 f0 75 c8 48 3b 58 08 75 05
uk3 kern.alert kernel RIP [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk3 kern.warning kernel RSP <ffff8800776bbb60>
uk3 kern.warning kernel CR2: 0000000000001000
uk3 kern.warning kernel ---[ end trace edb6ef1729c2cd08 ]---

KERNEL OOPS 4 - LINE NUMBERS - Version 4.1.0 with barrier patch - 2015-08-13 16:39:28
ffffffff8158ba60:copy_from_iter lib/iov_iter.c:416
**********inline:skb_do_copy_data_nocache include/net/sock.h:1791
**********inline:skb_add_data_nocache include/net/sock.h:1802
ffffffff81838006:tcp_sendmsg net/ipv4/tcp.c:1177
**********inline:sock_sendmsg_nosec net/socket.c:613 (***addr2line reports 614)
ffffffff8175c6de:sock_sendmsg net/socket.c:623
ffffffff81666c4a:drbd_send drivers/block/drbd/drbd_main.c:1805 (***addr2line reports 1806)
ffffffff81666d24:drbd_send_all drivers/block/drbd/drbd_main.c:1849 (***addr2line reports 1850)
ffffffff816682a5:_drbd_no_send_page drivers/block/drbd/drbd_main.c:1492 (***addr2line reports 1494)
**********inline:_drbd_send_bio drivers/block/drbd/drbd_main.c:1557 (***addr2line reports 1561)
ffffffff8166887d:drbd_send_dblock drivers/block/drbd/drbd_main.c:1678
*** Developer analysis of call stack stopped here ***
ffffffff811075ae:__wake_up_common kernel/sched/wait.c:73
ffffffff81107a84:__wake_up kernel/sched/wait.c:97
**********inline:req_mod drivers/block/drbd/drbd_req.h:320
ffffffff81651643:w_send_dblock drivers/block/drbd/drbd_worker.c:1405
ffffffff8165279e:drbd_worker drivers/block/drbd/drbd_worker.c:2122
ffffffff81665f5b:drbd_thread_setup drivers/block/drbd/drbd_main.c:324
ffffffff81665fa2:drbd_thread_setup drivers/block/drbd/drbd_main.c:337
ffffffff81665f5b:drbd_thread_setup drivers/block/drbd/drbd_main.c:324
ffffffff810f088c:kthread kernel/kthread.c:210
ffffffff810f07c2:kthread kernel/kthread.c:176
ffffffff81962362:ret_from_fork arch/x86/kernel/entry_64.S:640
ffffffff810f07c2:kthread kernel/kthread.c:176

----------------------------------------
KERNEL OOPS 5 - ERROR LOG - Version 4.1.0 with barrier patch - 2015-08-28 15:58:21
uk4 kern.alert kernel IP: [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk4 kern.warning kernel PGD 76f43067 PUD 76d64067 PMD 0
uk4 kern.warning kernel Oops: 0000 [#1] SMP
uk4 kern.warning kernel Modules linked in:
uk4 kern.warning kernel CPU: 0 PID: 3967 Comm: drbd_w_db Not tainted 4.1.0-x86_64-linode59 #1
uk4 kern.warning kernel task: ffff88007c0b39c0 ti: ffff8800776a4000 task.ti: ffff8800776a4000
uk4 kern.warning kernel RIP: e030:[<ffffffff8158ba60>] [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk4 kern.warning kernel RSP: e02b:ffff8800776a7b60 EFLAGS: 00010286
uk4 kern.warning kernel RAX: 00000000000005f0 RBX: 00000000000005f0 RCX: 00000000000005f0
uk4 kern.warning kernel RDX: ffff8800776a7c80 RSI: 0000000000000003 RDI: ffff880013a30c30
uk4 kern.warning kernel RBP: ffff8800776a7c80 R08: 00000000000000d8 R09: ffff880013a31220
uk4 kern.warning kernel R10: ffff880013a30c30 R11: 000000000000bc4a R12: 0000000000000a10
uk4 kern.warning kernel R13: ffff8800023fc000 R14: ffff88007c0b4180 R15: ffff8800358df440
uk4 kern.warning kernel FS: 0000000000000000(0000) GS:ffff88007d200000(0000) knlGS:ffff88007d200000
uk4 kern.warning kernel CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
uk4 kern.warning kernel CR2: 0000000000000003 CR3: 000000007802f000 CR4: 0000000000042660
uk4 kern.warning kernel Stack:
uk4 kern.warning kernel 00000000000005f0 ffff880074ad4c00 0000000000000a10 ffff8800776a7c70
uk4 kern.warning kernel 00000000000005f0 ffff8800023fc000 ffff88007c0b4180 ffff8800358df440
uk4 kern.warning kernel ffffffff81838225 0000000000000000 ffff8800776a7c80 0000000000000a10
uk4 kern.warning kernel Call Trace:
uk4 kern.warning kernel [<ffffffff81838225>] ? tcp_sendmsg+0x60d/0x9b4
uk4 kern.warning kernel [<ffffffff8175c6de>] ? sock_sendmsg+0x2e/0x3b
uk4 kern.warning kernel [<ffffffff81666c4a>] ? drbd_send+0xa5/0x171
uk4 kern.warning kernel [<ffffffff81666d24>] ? drbd_send_all+0xe/0x23
uk4 kern.warning kernel [<ffffffff816682a5>] ? _drbd_no_send_page+0x47/0x5d
uk4 kern.warning kernel [<ffffffff8166887d>] ? drbd_send_dblock+0x2e1/0x4ab
uk4 kern.warning kernel [<ffffffff8165f050>] ? mod_rq_state+0x463/0x4e1
uk4 kern.warning kernel [<ffffffff81961b39>] ? _raw_spin_lock_irq+0xc/0x65
uk4 kern.warning kernel [<ffffffff81651643>] ? w_send_dblock+0xd3/0x139
uk4 kern.warning kernel [<ffffffff8165279e>] ? drbd_worker+0x124/0x302
uk4 kern.warning kernel [<ffffffff81665f5b>] ? w_complete+0x13/0x13
uk4 kern.warning kernel [<ffffffff81665fa2>] ? drbd_thread_setup+0x47/0x10c
uk4 kern.warning kernel [<ffffffff81665f5b>] ? w_complete+0x13/0x13
uk4 kern.warning kernel [<ffffffff810f088c>] ? kthread+0xca/0xd2
uk4 kern.warning kernel [<ffffffff810f07c2>] ? kthread_freezable_should_stop+0x40/0x40
uk4 kern.warning kernel [<ffffffff81962362>] ? ret_from_fork+0x42/0x70
uk4 kern.warning kernel [<ffffffff810f07c2>] ? kthread_freezable_should_stop+0x40/0x40
uk4 kern.warning kernel Code: eb 2b 4c 39 42 08 4c 89 c0 48 0f 46 42 08 48 85 c0 74 1a 49 01 c1 48 8b 32 48 89 c1 4d 89 ca 49 29 c0 48 89 c3 49 29 c2 4c 89 d7 <f3> a4 48 83 c2 10 4d 85 c0 48 8d 42 f0 75 c8 48 3b 58 08 75 05
uk4 kern.alert kernel RIP [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
uk4 kern.warning kernel RSP <ffff8800776a7b60>
uk4 kern.warning kernel CR2: 0000000000000003
uk4 kern.warning kernel ---[ end trace 44570100730370c8 ]---

KERNEL OOPS 5 - LINE NUMBERS - Version 4.1.0 with barrier patch - 2015-08-28 15:58:21
ffffffff8158ba60:copy_from_iter lib/iov_iter.c:416
**********inline:skb_do_copy_data_nocache include/net/sock.h:1791
**********inline:skb_copy_to_page_nocache include/net/sock.h:1817
ffffffff81838225:tcp_sendmsg net/ipv4/tcp.c:1202
**********inline:sock_sendmsg_nosec net/socket.c:613 (***addr2line reports 614)
ffffffff8175c6de:sock_sendmsg net/socket.c:623
ffffffff81666c4a:drbd_send drivers/block/drbd/drbd_main.c:1805 (***addr2line reports 1806)
ffffffff81666d24:drbd_send_all drivers/block/drbd/drbd_main.c:1849 (***addr2line reports 1850)
ffffffff816682a5:_drbd_no_send_page drivers/block/drbd/drbd_main.c:1492 (***addr2line reports 1494)
**********inline:_drbd_send_bio drivers/block/drbd/drbd_main.c:1557 (***addr2line reports 1561)
ffffffff8166887d:drbd_send_dblock drivers/block/drbd/drbd_main.c:1678
*** Developer analysis of call stack stopped here ***
ffffffff8165f050:mod_rq_state drivers/block/drbd/drbd_req.c:533
**********inline:do_raw_spin_lock ./arch/x86/include/asm/spinlock.h:106
**********inline:__raw_spin_lock_irq include/linux/spinlock_api_smp.h:131
ffffffff81961b39:_raw_spin_lock_irq kernel/locking/spinlock.c:167
**********inline:req_mod drivers/block/drbd/drbd_req.h:320
ffffffff81651643:w_send_dblock drivers/block/drbd/drbd_worker.c:1405
ffffffff8165279e:drbd_worker drivers/block/drbd/drbd_worker.c:2122
ffffffff81665f5b:drbd_thread_setup drivers/block/drbd/drbd_main.c:324
ffffffff81665fa2:drbd_thread_setup drivers/block/drbd/drbd_main.c:337
ffffffff81665f5b:drbd_thread_setup drivers/block/drbd/drbd_main.c:324
ffffffff810f088c:kthread kernel/kthread.c:210
ffffffff810f07c2:kthread kernel/kthread.c:176
ffffffff81962362:ret_from_fork arch/x86/kernel/entry_64.S:640
ffffffff810f07c2:kthread kernel/kthread.c:176

=============================================================================
3) MAPPING PROCESS - Call Stack Addresses To Kernel Source Line Numbers:

- Download 4.1 kernel sources from kernel.org.
- Modify source code as described above (i.e. commenting out barrier related lines).
- Obtain a .config file from a live server: zcat /proc/config.gz
- Obtain compiler version: cat /proc/version,
  which identifies gcc version 4.7.2 (Debian 4.7.2-5)
- Build kernel with matching compiler on dual core Debian 7 (Wheezy) system:
  make -j3 deb-pkg
- Compare System.map file with one obtained from live server via
  cat /proc/kallsyms to ensure function addresses match.
- Extract kernel sources again and run menuconfig to update .config as follows:
    < # CONFIG_DEBUG_INFO is not set
    ---
    > CONFIG_DEBUG_INFO=y
    > CONFIG_DEBUG_INFO_REDUCED=y
    > CONFIG_DEBUG_INFO_SPLIT=y
    > CONFIG_DEBUG_INFO_DWARF4=y
    > CONFIG_GDB_SCRIPTS=y
- Build a matching debug version of the kernel: make -j3 deb-pkg
- Compare System.map file with one obtained from live server via
  cat /proc/kallsyms to ensure function addresses match.

At this point, the command addr2line -fie ./vmlinux <address> is used to map
each address reported in the Oops call trace to one or more lines in the
kernel source. Minor adjustments are made to the lines reported by addr2line,
as they often fall just after a function call in the code. Finally, inline
function calls reported by addr2line are manually inserted into the call trace
to clarify the lines of code called that lead to the Oops.
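The addr2line step of the mapping process could be scripted roughly as in the sketch below. It is only an illustration, not the procedure we actually ran: the function names (parse_call_trace, addr2line_command, resolve) are hypothetical, the default ./vmlinux path assumes the debug build sits in the current directory, and the regular expression assumes the syslog format shown in the error logs above. The manual adjustments (moving a reported line back to the call site) are not automated here.

```python
import re
import subprocess

# Matches entries such as
#   [<ffffffff81838006>] ? tcp_sendmsg+0x3ee/0x9b4
#   IP: [<ffffffff8158ba60>] copy_from_iter+0x140/0x24e
# (assumed format, taken from the Oops logs in this message)
TRACE_ENTRY = re.compile(
    r"\[<([0-9a-f]{16})>\]\s+(?:\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_call_trace(log_text):
    """Return unique (address, symbol) pairs in first-seen order."""
    pairs = [(m.group(1), m.group(2)) for m in TRACE_ENTRY.finditer(log_text)]
    # dict.fromkeys drops repeats (e.g. the IP and RIP lines) but keeps order
    return list(dict.fromkeys(pairs))


def addr2line_command(address, vmlinux="./vmlinux"):
    """Build the same invocation used above: addr2line -fie ./vmlinux <address>."""
    return ["addr2line", "-fie", vmlinux, address]


def resolve(address, vmlinux="./vmlinux"):
    """Run addr2line for one address; needs binutils and a vmlinux with debug info."""
    result = subprocess.run(
        addr2line_command(address, vmlinux),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A possible use would be to read a saved Oops report, call parse_call_trace() on its text, and print resolve(address) for each pair, then insert the inline frames by hand as described above.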
On 08/18/2015 06:40 PM, Lars Ellenberg wrote:
> On Fri, Aug 14, 2015 at 12:32:48PM +0200, Ben Siemerink wrote:
>> Hello,
>>
>>
>> Lately we have experienced five kernel oopses in our DRBD setup. The
>> stack trace is very similar every time.
>>
>> If you need more information, please let me know. Thank you in advance.
>>
>>
>> Kind regards,
>> Ben.
>>
>> ----------
>>
>> 2015-05-05T09:31:02.759829+00:00 uk3 kern.info kernel Linux version 4.0.1-x86_64-linode55 (maker at build) (gcc version 4.7.2 (Debian 4.7.2-5) ) #3 SMP Wed Apr 29 11:10:11 EDT 2015
>> 2015-05-05T09:31:02.760314+00:00 uk3 kern.info kernel drbd: initialized. Version: 8.4.5 (api:1/proto:86-101)
> reproduce with kernel.org kernel, and resolve the symbols to line numbers.
> preferably even reproduce with drbd 8.4 from git,
> and still resolve line numbers.
>
>> 2015-05-13T14:14:58.163337+00:00 uk3 kern.alert kernel BUG: unable to handle kernel NULL pointer dereference at 0000000000000003
>> 2015-05-13T14:14:58.163365+00:00 uk3 kern.alert kernel IP: [<ffffffff8157804f>] copy_from_iter+0x140/0x24e
>> 2015-05-13T14:14:58.163369+00:00 uk3 kern.warning kernel PGD 759f7067 PUD 77777067 PMD 0
>> 2015-05-13T14:14:58.163372+00:00 uk3 kern.warning kernel Oops: 0000 [#1] SMP
>> 2015-05-13T14:14:58.163374+00:00 uk3 kern.warning kernel Modules linked in:
>> 2015-05-13T14:14:58.163390+00:00 uk3 kern.warning kernel CPU: 0 PID: 3606 Comm: drbd_w_db Not tainted 4.0.1-x86_64-linode55 #3
>> 2015-05-13T14:14:58.163392+00:00 uk3 kern.warning kernel task: ffff88007c5071c0 ti: ffff880075bd0000 task.ti: ffff880075bd0000
>> 2015-05-13T14:14:58.163395+00:00 uk3 kern.warning kernel RIP: e030:[<ffffffff8157804f>] [<ffffffff8157804f>] copy_from_iter+0x140/0x24e
>> 2015-05-13T14:14:58.163397+00:00 uk3 kern.warning kernel RSP: e02b:ffff880075bd3ae0 EFLAGS: 00010286
>> 2015-05-13T14:14:58.163399+00:00 uk3 kern.warning kernel RAX: 0000000000000440 RBX: 0000000000000440 RCX: 0000000000000440
>> 2015-05-13T14:14:58.163401+00:00 uk3 kern.warning kernel RDX: ffff880075bd3c88 RSI: 0000000000000003 RDI: ffff880005f25088
>> 2015-05-13T14:14:58.163403+00:00 uk3 kern.warning kernel RBP: ffff880075bd3c88 R08: 0000000000000000 R09: ffff880005f254c8
>> 2015-05-13T14:14:58.163409+00:00 uk3 kern.warning kernel R10: ffff880005f25088 R11: ffff88007d2149c0 R12: 0000000000000a10
>> 2015-05-13T14:14:58.163411+00:00 uk3 kern.warning kernel R13: ffff8800720f3000 R14: ffff88007c507978 R15: ffff880075ab57c0
>> 2015-05-13T14:14:58.163413+00:00 uk3 kern.warning kernel FS: 0000000000000000(0000) GS:ffff88007d200000(0000) knlGS:ffff880179b40000
>> 2015-05-13T14:14:58.163415+00:00 uk3 kern.warning kernel CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
>> 2015-05-13T14:14:58.163417+00:00 uk3 kern.warning kernel CR2: 0000000000000003 CR3: 00000000771e1000 CR4: 0000000000042660
>> 2015-05-13T14:14:58.163419+00:00 uk3 kern.warning kernel Stack:
>> 2015-05-13T14:14:58.163421+00:00 uk3 kern.warning kernel 00000000000005f0 ffff88002d75f200 0000000000000a10 ffff880075bd3c78
>> 2015-05-13T14:14:58.163422+00:00 uk3 kern.warning kernel 00000000000005f0 ffff8800720f3000 ffff88007c507978 ffff880075ab57c0
>> 2015-05-13T14:14:58.163429+00:00 uk3 kern.warning kernel ffffffff817d0f76 000040000000fa20 ffff880075bd3c88 0000000000000a10
>> 2015-05-13T14:14:58.163430+00:00 uk3 kern.warning kernel Call Trace:
>> 2015-05-13T14:14:58.163432+00:00 uk3 kern.warning kernel [<ffffffff817d0f76>] ? tcp_sendmsg+0x610/0x9b7
>> 2015-05-13T14:14:58.163434+00:00 uk3 kern.warning kernel [<ffffffff816f8787>] ? sock_sendmsg+0x59/0x73
>> 2015-05-13T14:14:58.163436+00:00 uk3 kern.warning kernel [<ffffffff816f87ee>] ? kernel_sendmsg+0x4d/0x5d
>> 2015-05-13T14:14:58.163441+00:00 uk3 kern.warning kernel [<ffffffff81645a3f>] ? drbd_send+0xa2/0x16d
>> 2015-05-13T14:14:58.163473+00:00 uk3 kern.warning kernel [<ffffffff81645b18>] ? drbd_send_all+0xe/0x23
>> 2015-05-13T14:14:58.163481+00:00 uk3 kern.warning kernel [<ffffffff81647099>] ? _drbd_no_send_page+0x47/0x5d
>> 2015-05-13T14:14:58.163483+00:00 uk3 kern.warning kernel [<ffffffff81647671>] ? drbd_send_dblock+0x2e1/0x4ab
>> 2015-05-13T14:14:58.163485+00:00 uk3 kern.warning kernel [<ffffffff8163de54>] ? mod_rq_state+0x463/0x4e1
>> 2015-05-13T14:14:58.163487+00:00 uk3 kern.warning kernel [<ffffffff818f8229>] ? _raw_spin_lock_irq+0xc/0x65
>> 2015-05-13T14:14:58.163503+00:00 uk3 kern.warning kernel [<ffffffff8162c72b>] ? dequeue_work_batch+0x63/0x7d
>> 2015-05-13T14:14:58.163507+00:00 uk3 kern.warning kernel [<ffffffff8162cab6>] ? wait_for_work+0x59/0x2e0
>> 2015-05-13T14:14:58.163509+00:00 uk3 kern.warning kernel [<ffffffff8163044b>] ? w_send_dblock+0xd3/0x139
>> 2015-05-13T14:14:58.163511+00:00 uk3 kern.warning kernel [<ffffffff816315a6>] ? drbd_worker+0x124/0x302
>> 2015-05-13T14:14:58.163524+00:00 uk3 kern.warning kernel [<ffffffff81644d53>] ? w_complete+0x13/0x13
>> 2015-05-13T14:14:58.163526+00:00 uk3 kern.warning kernel [<ffffffff81644d9a>] ? drbd_thread_setup+0x47/0x10c
>> 2015-05-13T14:14:58.163528+00:00 uk3 kern.warning kernel [<ffffffff81644d53>] ? w_complete+0x13/0x13
>> 2015-05-13T14:14:58.163530+00:00 uk3 kern.warning kernel [<ffffffff810ecfb0>] ? kthread+0xca/0xd2
>> 2015-05-13T14:14:58.163532+00:00 uk3 kern.warning kernel [<ffffffff810ecee6>] ? kthread_freezable_should_stop+0x40/0x40
>> 2015-05-13T14:14:58.163534+00:00 uk3 kern.warning kernel [<ffffffff818f86d8>] ? ret_from_fork+0x58/0x90
>> 2015-05-13T14:14:58.163536+00:00 uk3 kern.warning kernel [<ffffffff810ecee6>] ? kthread_freezable_should_stop+0x40/0x40
>> 2015-05-13T14:14:58.163547+00:00 uk3 kern.warning kernel Code: eb 2b 4c 39 42 08 4c 89 c0 48 0f 46 42 08 48 85 c0 74 1a 49 01 c1 48 8b 32 48 89 c1 4d 89 ca 49 29 c0 48 89 c3 49 29 c2 4c 89 d7 <f3> a4 48 83 c2 10 4d 85 c0 48 8d 42 f0 75 c8 48 3b 58 08 75 05
>> 2015-05-13T14:14:58.163551+00:00 uk3 kern.alert kernel RIP [<ffffffff8157804f>] copy_from_iter+0x140/0x24e
>> 2015-05-13T14:14:58.163553+00:00 uk3 kern.warning kernel RSP <ffff880075bd3ae0>
>> 2015-05-13T14:14:58.163555+00:00 uk3 kern.warning kernel CR2: 0000000000000003
>> 2015-05-13T14:14:58.163557+00:00 uk3 kern.warning kernel ---[ end trace 74bd7ba7a08a9e0b ]---
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
>
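A first step toward the request quoted above (resolving the symbols to line numbers) is to pull the addresses out of the oops text. The following is only a minimal sketch in Python, assuming the `[<address>] symbol+0xoffset/0xsize` frame format seen in the trace. The function name `parse_frames` and its dictionary keys are hypothetical helpers for illustration, not part of DRBD or the kernel.

```python
import re

# Matches oops frames such as "[<ffffffff817d0f76>] ? tcp_sendmsg+0x610/0x9b7".
# The optional "?" marks a frame the kernel's stack unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(log_text):
    """Return one dict per symbol+offset frame found in an oops log (hypothetical helper)."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        address = int(match.group("addr"), 16)
        offset = int(match.group("offset"), 16)
        frames.append({
            "address": address,
            "symbol": match.group("symbol"),
            "offset": offset,
            "size": int(match.group("size"), 16),
            # The start of the function is the reported address minus the offset.
            "function_start": address - offset,
            "unreliable": match.group("unreliable") is not None,
        })
    return frames


if __name__ == "__main__":
    sample = (
        "kernel IP: [<ffffffff8157804f>] copy_from_iter+0x140/0x24e\n"
        "kernel [<ffffffff817d0f76>] ? tcp_sendmsg+0x610/0x9b7\n"
    )
    for frame in parse_frames(sample):
        print(f"{frame['address']:#x} {frame['symbol']}+{frame['offset']:#x}")
```

Each extracted address could then be passed to binutils, for example `addr2line -e vmlinux -f -i <address>`, assuming a `vmlinux` with debug info built from exactly the same source and configuration as the running kernel.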