[DRBD-user] NULL deref at drbd_submit_peer_request

Tadashi Abe tabe at mvista.com
Thu Feb 9 11:46:23 CET 2017



Hi,

Thanks for the answer.

I tried the latest drbd-8.4 and still see the same NULL deref.

It looks to me as if an invalid (negative) device->local_cnt causes it in the end.
In fact, if I force the variable back to 0 whenever it goes negative, using the
following change (I know it's not a clean fix), the NULL deref no longer occurs.

--- linux.orig/drivers/block/drbd/drbd_int.h
+++ linux/drivers/block/drbd/drbd_int.h
@@ -2278,6 +2278,8 @@ static inline void put_ldev(struct drbd_
                                 drbd_device_post_work(device, GO_DISKLESS);
                 wake_up(&device->misc_wait);
         }
+       if (i < 0)
+               atomic_set(&device->local_cnt, 0);
  }

  #ifndef __CHECKER__

This is why I suspect that put_ldev is not paired one-to-one with get_ldev:
it seems to be executed twice somewhere, which drives device->local_cnt
negative.
Does anyone have a clue about the sequence that leads to this? Thanks.

Regards,
Tadashi


 > Hi!
 >
 > Look at commit e0645836e870346cafe688cbdd8ec29092f6cdb5 (Tue Nov 8
 > 11:43:09 2016) and commit d9aea72bb66bb27f815de082d5b347fcddfc9c1b (Thu Nov 10
 > 14:48:33 2016) in http://git.linbit.com/drbd-8.4.git/
 >
 > I am not sure if this solves your particular problem, but you could use the
 > newest version, 8.4.9-2, and test whether it is gone, if you have a test
 > scenario where it happens easily.
 >
 > BR,
 >     Jasmin
 >
 > *******************************************************************************
 >
 > On 02/07/2017 06:28 PM, Tadashi Abe wrote:
 > > Hi,
 > >
 > > I'm using DRBD 8.4.8-1 with linux-2.6.32 kernel, on 2 nodes.
 > >
 > > # cat /proc/drbd
 > > version: 8.4.8-1 (api:1/proto:86-101)
 > > GIT-hash: 22b4c802192646e433d3f7399d578ec7fecc6272
 > >
 > > When running a kind of system test (detach/attach loop under high system load),
 > > a NULL pointer deref occurs on one node in drbd_submit_peer_request.
 > > One thing I noticed is that the following 2 assertion failures for the same
 > > drbd device (drbd6) are seen many times before the NULL deref occurs.
 > >
 > > Jan 18 12:19:29 HOSTA_101 kernel: : [161157.608191] block drbd6: ASSERT( i >= 0 ) in drivers/block/drbd/drbd_int.h:2270
 > > Jan 18 12:19:29 HOSTA_101 kernel: : [161158.332840] block drbd6: ASSERT( atomic_read(&device->local_cnt) ) in drivers/block/drbd/drbd_actlog.c:691
 > >
 > > These make me suspect that device->local_cnt holds an invalid count
 > > (the former assertion failure occurs in put_ldev() in the drbd code I'm using).
 > >
 > > Here's the syslog snippet of the BUG.
 > >
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.600846] BUG: unable to handle kernel NULL pointer dereference at (null)
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.601807] IP: [<ffffffffa010f07d>] drbd_submit_peer_request+0x8d/0x4c0 [drbd]
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.610078] PGD 765c6067 PUD 7645d067 PMD 0
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.610078] Oops: 0000 [#1] PREEMPT SMP
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.610078] last sysfs file: /sys/devices/virtual/block/drbd7/removable
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.644817] CPU 4
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.660713] Modules linked in: e1000 igb mlx4_core mlx4_en virtio_net virtio_balloon ipmi_msghandler ipmi_watchdog kplugdr libcrc32c crc32c drbd scsi_transport_iscsi libiscsi libiscsi_tcp iscsi_tcp [last unloaded: ipmi_msghandler]
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.692781] Pid: 25503, comm: drbd_r_r4 Tainted: G        W  2.6.32.59.cge 60-A64-N1.07 KVM
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.710234] RIP: 0010:[<ffffffffa010f07d>] [<ffffffffa010f07d>] drbd_submit_peer_request+0x8d/0x4c0 [drbd]
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.714138] RSP: 0018:ffff88004fc09d90 EFLAGS: 00010286
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.725343] RAX: 0000000000000000 RBX: ffff88004fc09e60 RCX: ffff8800376bf8c0
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.725343] RDX: ffff88007b863800 RSI: 0000000000011200 RDI: 0000000000000246
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.738464] RBP: ffff88004fc09df0 R08: 0000000000000000 R09: 0000000000000000
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.738464] R10: 0000000000000000 R11: 0000000000000000 R12: ffffea0001a1b550
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.757103] R13: 0000000000064000 R14: 0000000000043400 R15: 0000000000000064
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.777606] FS: 0000000000000000(0000) GS:ffff880001d00000(0000) knlGS:0000000000000000
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.783633] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.796394] CR2: 0000000000000000 CR3: 000000004f345000 CR4: 00000000000406e0
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.804852] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.809335] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.809335] Process drbd_r_r4 (pid: 25503, threadinfo ffff88004fc08000, task ffff88007b6bb2a0)
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.838090] Stack:
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.838090] 0000000000000000 0000000000000001 0000000000000002 ffff88007b863800
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.838090] <0> ffff88004ed31810 ffff8800376bf8c0 ffff88004fc09df0 ffff88004fc09e60
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.838090] <0> ffff88007b863800 ffff88004ed31810 ffff88007b863800 0000000000043400
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.852971] Call Trace:
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.856040] [<ffffffffa01126ab>] receive_RSDataReply+0x13b/0x490 [drbd]
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.856040] [<ffffffffa0110880>] drbd_receiver+0x100/0x2e0 [drbd]
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.856040] [<ffffffffa0124740>] ? drbd_thread_setup+0x0/0x110 [drbd]
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.856040] [<ffffffffa012476d>] drbd_thread_setup+0x2d/0x110 [drbd]
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.856040] [<ffffffffa0124740>] ? drbd_thread_setup+0x0/0x110 [drbd]
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.871833] [<ffffffff811251b6>] kthread+0x96/0xa0
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.871833] [<ffffffff81125120>] ? kthread+0x0/0xa0
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.871833] [<ffffffff810b445a>] child_rip+0xa/0x20
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.880486] [<ffffffff81125120>] ? kthread+0x0/0xa0
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.880486] [<ffffffff81125120>] ? kthread+0x0/0xa0
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.885895] [<ffffffff810b4450>] ? child_rip+0x0/0x20
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.887926] Code: 00 00 00 e8 76 45 0f e1 48 85 c0 48 89 45 c8 4c 8b 4d a0 0f 84 f6 03 00 00 48 8b 4d c8 4c 89 31 48 8b 55 b8 48 8b 82 80 00 00 00 <48> 8b 00 48 89 41 10 48 8b 45 a8 48 89 41 20 48 8b 55 c0 48 c7
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.894609] RIP [<ffffffffa010f07d>] drbd_submit_peer_request+0x8d/0x4c0 [drbd]
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.894609]  RSP <ffff88004fc09d90>
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.894609] CR2: 0000000000000000
 > > Jan 18 12:20:38 HOSTA_101 kernel: : [161228.924389] ---[ end trace d6d8759a31519f4d ]---
 > >
 > > Any help is really appreciated. Thanks a lot.
 > >
 > > Regards,
 > > Tadashi