[DRBD-user] Segfaulting DRBD

Mark Steele msteele at beringmedia.com
Fri Nov 20 16:25:36 CET 2009


Hi everyone,

I've been trying to get DRBD up for a few hours now with no success.

My config:

global {
  usage-count yes;
}
common {
  protocol C;
}

resource r0 {
  device    /dev/drbd1;
  disk      /dev/sdb1;
  meta-disk internal;
  on test1 {
    address   10.2.0.1:7789;
  }
  on test2 {
    address   10.2.0.2:7789;
  }
}


Running on Gentoo. I've tried the following combinations:

Kernel 2.6.30-gentoo-r4, drbd 8.3.2
Kernel 2.6.31-gentoo-r6, drbd 8.3.6

I also tried installing from the official DRBD source tarball with a vanilla
2.6.31.6 kernel, with the same results.

gcc version 4.3.4
glibc 2.11

I'm not using any strange compilation flags and don't get any warnings or
errors when compiling the module or software.

Here's what I'm seeing:

# modprobe drbd
[  265.322164] drbd: initialized. Version: 8.3.6 (api:88/proto:86-91)
[  265.322166] drbd: GIT-hash: f3606c47cc6fcf6b3f086e425cb34af8b7a81bbf build by root@test1, 2009-11-20 03:46:09
[  265.322168] drbd: registered as block device major 147
[  265.322170] drbd: minor_table @ 0xffff88083dbf8e00

After that, I attempt to bring up the device:

# drbdadm up r0

[  279.930487] BUG: unable to handle kernel paging request at ffffffff9ffcc900
[  279.939386] IP: [<ffffffffa002ed02>] drbd_connector_callback+0xe2/0x280 [drbd]
[  279.948949] PGD 1003067 PUD 1007063 PMD 0
[  279.958662] Oops: 0000 [#1] SMP
[  279.968259] last sysfs file: /sys/module/drbd/parameters/cn_idx
[  279.978043] CPU 3
[  279.987524] Modules linked in: drbd
[  279.997023] Pid: 612, comm: cqueue Not tainted 2.6.31.6 #2 PowerEdge R710
[  280.006375] RIP: 0010:[<ffffffffa002ed02>]  [<ffffffffa002ed02>] drbd_connector_callback+0xe2/0x280 [drbd]
[  280.016297] RSP: 0018:ffff88083d969df0  EFLAGS: 00010293
[  280.026260] RAX: 00000000ffff8808 RBX: ffff88043d4ed010 RCX: ffff88043d5ea7e0
[  280.036517] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81832b80
[  280.046943] RBP: ffff88083d969e30 R08: 0000000000000001 R09: ffff88083d969ba8
[  280.057539] R10: 0000000000000008 R11: 0000000000000000 R12: fffffffffff88080
[  280.068594] R13: ffff88083d933170 R14: ffff88043d4ed024 R15: ffff88043d546000
[  280.079121] FS:  0000000000000000(0000) GS:ffffc90000600000(0000) knlGS:0000000000000000
[  280.089629] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  280.100209] CR2: ffffffff9ffcc900 CR3: 000000083e561000 CR4: 00000000000006e0
[  280.110820] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  280.121171] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  280.131523] Process cqueue (pid: 612, threadinfo ffff88083d968000, task ffff88083d933170)
[  280.142246] Stack:
[  280.152869]  ffff88083d969e30 ffffffff814ece24 000000005544e207 ffff88083c5d3680
[  280.153330] <0> ffff88083c5d3690 ffff88083d933170 ffffc90000018e40 ffffffff813788d0
[  280.164644] <0> ffff88083d969e60 ffffffff81378902 000000005544e207 ffffc90000018e40
[  280.187044] Call Trace:
[  280.198552]  [<ffffffff814ece24>] ? kfree_skb+0x74/0xb0
[  280.210210]  [<ffffffff813788d0>] ? cn_queue_wrapper+0x0/0x70
[  280.221911]  [<ffffffff81378902>] cn_queue_wrapper+0x32/0x70
[  280.233665]  [<ffffffff81069cb2>] worker_thread+0x182/0x270
[  280.245632]  [<ffffffff810702f0>] ? autoremove_wake_function+0x0/0x60
[  280.257626]  [<ffffffff81069b30>] ? worker_thread+0x0/0x270
[  280.269738]  [<ffffffff8106fdc6>] kthread+0xb6/0xc0
[  280.281777]  [<ffffffff8100c97a>] child_rip+0xa/0x20
[  280.293677]  [<ffffffff8106fd10>] ? kthread+0x0/0xc0
[  280.305606]  [<ffffffff8100c970>] ? child_rip+0x0/0x20
[  280.317462] Code: 00 00 49 89 c7 48 85 c0 74 a5 8b 35 21 5e 01 00 85 f6 0f 85 46 01 00 00 8b 43 14 83 f8 1a 0f 8f fd 00 00 00 4c 63 e0 49 c1 e4 04 <49> 83 bc 24 80 48 04 a0 00 0f 84 e7 00 00 00 41 8b bc 24 88 48
[  280.343452] RIP  [<ffffffffa002ed02>] drbd_connector_callback+0xe2/0x280 [drbd]
[  280.356357]  RSP <ffff88083d969df0>
[  280.369148] CR2: ffffffff9ffcc900
[  280.381970] ---[ end trace 6465e891e665cb33 ]---

# cat /proc/drbd
version: 8.3.6 (api:88/proto:86-91)
GIT-hash: f3606c47cc6fcf6b3f086e425cb34af8b7a81bbf build by root@test1, 2009-11-20 03:46:09

 1: cs:Unconfigured

I've got a couple of NICs (Broadcom NetXtreme II with CNIC support enabled)
back-to-back on a bonded interface (if that matters), although I get the same
results if I just use one interface instead of two. The server is an
8-processor box (x86-64) with 32 GB of RAM (Dell PowerEdge R710).

Anyone have some tips on how to go about troubleshooting this? Kernel
options to avoid/use? Known working kernel/drbd version combos?

Thanks,

Mark Steele
Director of development
Bering Media Inc.