[DRBD-user] kernel panics with infiniband and 10 resources (Xen dom0)

Michael Snow michael.snow at 4bright.com
Wed Sep 30 19:17:10 CEST 2009


Has anyone else run into kernel panics once a certain number of resources
are added to a DRBD setup?

We have a DRBD setup with 10 resources (moving to 18) under Xen dom0,
using InfiniBand as the interconnect.

As soon as both DRBD systems go active and start the sync process, one
or both servers kernel panic.  If we put a skip {} block around any one
(or more) of the resource entries, DRBD syncs the drives and everything
works just fine.
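
For reference, the workaround looks like this in drbd.conf (the resource
name and devices here are illustrative; DRBD treats everything inside a
skip block as a comment):

```
skip {
    resource ol05 {
        device          /dev/drbd5;
        disk            /dev/vsdf;
        meta-disk       internal;
    }
}
```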

We use the "fencing dont-care;" option in DRBD because of the
primary/primary issue with the version we are running (8.3.0).

This also happens with the vanilla Xen kernel; the kernel below just adds
an MD RAID patch.

The closest issue similar to ours appears to be Ethernet-driver related
(https://bugzilla.redhat.com/show_bug.cgi?id=476897), since it throws a
similar error and also involves a large MTU.

Kernel panic stack backtrace (I cannot get kdump to work under Xen dom0,
so this is from a serial console capture):

Unable to handle kernel NULL pointer dereference at 0000000000000000  
  [<ffffffff8027cc53>] xen_destroy_contiguous_region+0x83/0x3d6
PGD 5bb3f8067 PUD 5bad6b067 PMD 0
Oops: 0002 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
Modules linked in: drbd(U) vsd(U) xt_physdev netloop netbk blktap  
blkbk ipt_MASQUERADE iptable_nat ip_nat xt_state ip_conntrack  
nfnetlink ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables  
bridge autofs4 nfs lockd fscache nfs_acl sunrpc cpufreq_ondemand  
acpi_cpufreq freq_table rdma_ucm(U) qlgc_vnic(U) ib_sdp(U) rdma_cm(U)  
iw_cm(U) ib_addr(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ipv6  
xfrm_nalgo crypto_api ib_uverbs(U) ib_umad(U) iw_cxgb3(U) cxgb3(U)  
ib_ipath(U) mlx4_ib(U) mlx4_core(U) dm_multipath scsi_dh video hwmon  
backlight sbs i2c_ec button battery asus_acpi ac parport_pc lp parport  
joydev sr_mod sg i5000_edac i2c_i801 e1000e edac_mc i2c_core ib_mthca 
(U) ide_cd ib_mad(U) ib_core(U) serial_core pcspkr serio_raw cdrom  
dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero  
dm_mirror dm_log dm_mod usb_storage qla2xxx scsi_transport_fc ata_piix  
libata shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod  
scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 4724, comm: ib_cm/1 Tainted: G      2.6.18-128.el5.bsi.01xen #1
RIP: e030:[<ffffffff8027cc53>]  [<ffffffff8027cc53>]  
RSP: e02b:ffff8805c69ef770  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000001000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffff8068fa40 R09: 0000000000000000
R10: ffff8805c69ef770 R11: 0000000000000048 R12: 0000000000000001
R13: 0000000000000005 R14: 0000000000008000 R15: ffff8802fa52d400
FS:  00002b5f13ed52b0(0000) GS:ffffffff805ba080(0000) knlGS: 
CS:  e033 DS: 0000 ES: 0000
Process ib_cm/1 (pid: 4724, threadinfo ffff8805c69ee000, task  
Stack:  ffff8805c69ef7b8  0000000000000001  0000000000000000   
  ffffffff8068ea40  0000000000000001  0000000000000000  0000000000007ff0
  0000000000000000  ffffffff804eac80
Call Trace:
  [<ffffffff80271292>] dma_free_coherent+0x69/0x77
  [<ffffffff883b596e>] :ib_mthca:mthca_buf_free+0x73/0x9c
  [<ffffffff883b5ef8>] :ib_mthca:mthca_buf_alloc+0x273/0x297
  [<ffffffff802629d6>] mutex_lock+0xd/0x1d
  [<ffffffff883baa6d>] :ib_mthca:mthca_alloc_qp_common+0x23d/0x517
  [<ffffffff8025d6e8>] del_timer_sync+0xc/0x16
  [<ffffffff883bb08c>] :ib_mthca:mthca_alloc_qp+0xab/0x106
  [<ffffffff883bf9f7>] :ib_mthca:mthca_create_qp+0x12d/0x28e
  [<ffffffff883b2453>] :ib_mthca:mthca_cmd_wait+0x183/0x1d7
  [<ffffffff8836ef48>] :ib_core:ib_create_qp+0x17/0xb4
  [<ffffffff887009e4>] :rdma_cm:rdma_create_qp+0x2d/0x153
  [<ffffffff803a345c>] dma_pool_free+0x83/0x144
  [<ffffffff8020b7bf>] kfree+0x15/0xc5
  [<ffffffff883b879f>] :ib_mthca:mthca_init_cq+0x2f5/0x39f
  [<ffffffff883bfc50>] :ib_mthca:mthca_create_cq+0xf8/0x1c8
  [<ffffffff88716354>] :ib_sdp:sdp_completion_handler+0x0/0xc
  [<ffffffff88714904>] :ib_sdp:sdp_cq_event_handler+0x0/0x1
  [<ffffffff8836f00c>] :ib_core:ib_create_cq+0x27/0x55
  [<ffffffff88714c27>] :ib_sdp:sdp_init_qp+0x321/0x43a
  [<ffffffff88714905>] :ib_sdp:sdp_qp_event_handler+0x0/0x1
  [<ffffffff8871551d>] :ib_sdp:sdp_cma_handler+0x4d2/0x1309
  [<ffffffff886fd797>] :rdma_cm:cma_acquire_dev+0xec/0x113
  [<ffffffff8871504b>] :ib_sdp:sdp_cma_handler+0x0/0x1309
  [<ffffffff8870015d>] :rdma_cm:cma_req_handler+0x30a/0x3c3
  [<ffffffff886abc7d>] :ib_cm:cm_process_work+0x48/0x97
  [<ffffffff886ad076>] :ib_cm:cm_req_handler+0x832/0x89f
  [<ffffffff886ad0e3>] :ib_cm:cm_work_handler+0x0/0xa9f
  [<ffffffff886ad113>] :ib_cm:cm_work_handler+0x30/0xa9f
  [<ffffffff886ad0e3>] :ib_cm:cm_work_handler+0x0/0xa9f
  [<ffffffff8024ee11>] run_workqueue+0x94/0xe4
  [<ffffffff8024b71a>] worker_thread+0x0/0x122
  [<ffffffff80299db3>] keventd_create_kthread+0x0/0xc4
  [<ffffffff8024b80a>] worker_thread+0xf0/0x122
  [<ffffffff80286daf>] default_wake_function+0x0/0xe
  [<ffffffff80299db3>] keventd_create_kthread+0x0/0xc4
  [<ffffffff80299db3>] keventd_create_kthread+0x0/0xc4
  [<ffffffff80233476>] kthread+0xfe/0x132
  [<ffffffff8025fb2c>] child_rip+0xa/0x12
  [<ffffffff80299db3>] keventd_create_kthread+0x0/0xc4
  [<ffffffff80233378>] kthread+0x0/0x132
  [<ffffffff8025fb22>] child_rip+0x0/0x12

Code: f3 aa 48 c7 c7 80 31 53 80 e8 8f 6d fe ff 49 89 c3 48 b8 ff
RIP  [<ffffffff8027cc53>] xen_destroy_contiguous_region+0x83/0x3d6
  RSP <ffff8805c69ef770>
CR2: 0000000000000000
  <0>Kernel panic - not syncing: Fatal exception
  (XEN) Domain 0 crashed: rebooting machine in 5 seconds.

Our drbd.conf:

global {
         usage-count no;
}

common {
         protocol C;
         net {
                 timeout                 60;
                 max-epoch-size          2048;
                 max-buffers             2048;
                 unplug-watermark        128;
                 connect-int             10;
                 ping-int                10;
                 sndbuf-size             32764;
                 ko-count                2;
                 ping-timeout            10;
                 after-sb-0pri           discard-zero-changes;
                 after-sb-1pri           discard-secondary;
                 after-sb-2pri           disconnect;
         }
         startup {
                 wfc-timeout             60;
                 degr-wfc-timeout        15;
                 become-primary-on       both;
         }
         handlers {
                 local-io-error       "/usr/lib/drbd/brtHandler.pl local-io-error";
                 pri-on-incon-degr    "/usr/lib/drbd/brtHandler.pl pri-on-incon-degr";
                 pri-lost-after-sb    "/usr/lib/drbd/brtHandler.pl pri-lost-after-sb";
                 pri-lost             "/usr/lib/drbd/brtHandler.pl pri-lost";
                 split-brain          "/usr/lib/drbd/brtHandler.pl split-brain";
                 before-resync-target "/usr/lib/drbd/brtHandler.pl before-resync-target";
                 after-resync-target  "/usr/lib/drbd/brtHandler.pl after-resync-target";
                 out-of-sync          "/usr/lib/drbd/brtHandler.pl out-of-sync";
                 fence-peer           "/usr/lib/drbd/brtHandler.pl fence-peer";
                 outdate-peer         "/usr/lib/drbd/brtHandler.pl outdate-peer";
         }
         disk {
                 fencing dont-care;
                 max-bio-bvecs 1;
                 on-io-error call-local-io-error;
         }
}

resource ol01 {
         device          /dev/drbd1;
         disk            /dev/vsdb;
         meta-disk       internal;
         on g2-0937-xxxx-1host-1 {
                 address         sci;
         }
         on g2-0937-xxxx-1host-2 {
                 address         sci;
         }
}

(resources ol02 - ol08 and nl01 omitted; they follow the same pattern)

resource nl02 {
         device          /dev/drbd10;
         disk            /dev/vsdk;
         meta-disk       internal;
         on g2-0937-xxxx-1host-1 {
                 address         sci;
         }
         on g2-0937-xxxx-1host-2 {
                 address         sci;
         }
}



ib0       Link encap:InfiniBand  HWaddr 80:00:04:04:FE: 
           inet addr:  Bcast:  Mask:
           inet6 addr: fe80::202:c902:22:b9a9/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
           RX packets:153 errors:0 dropped:0 overruns:0 frame:0
           TX packets:153 errors:0 dropped:5 overruns:0 carrier:0
           collisions:0 txqueuelen:256
           RX bytes:8568 (8.3 KiB)  TX bytes:9188 (8.9 KiB)
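
The 65520 MTU above is what IPoIB reports in connected mode, and the Red
Hat bug referenced earlier implicates large MTUs, so one experiment might
be to compare behavior at the datagram-mode MTU (2044).  A small
diagnostic sketch (the sysfs paths are the standard ipoib attributes;
the loop simply prints nothing on a host with no IB interfaces):

```shell
# Report each IPoIB interface's MTU and transport mode.
ipoib_summary=""
for dev in /sys/class/net/ib*; do
    [ -e "$dev" ] || continue                      # glob did not match
    name=$(basename "$dev")
    mtu=$(cat "$dev/mtu")
    mode=$(cat "$dev/mode" 2>/dev/null || echo "n/a")
    ipoib_summary="$ipoib_summary $name:mtu=$mtu,mode=$mode"
done
echo "IPoIB interfaces:${ipoib_summary:- none}"

# To actually switch ib0 to datagram mode for a test (not run here,
# and only if the ipoib module exposes the mode attribute):
#   echo datagram > /sys/class/net/ib0/mode
#   ip link set ib0 mtu 2044
```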

lspci -v

06:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx  
HCA] (rev a0)
	Subsystem: Mellanox Technologies MT25204 [InfiniHost III Lx HCA]
	Flags: bus master, fast devsel, latency 0, IRQ 20
	Memory at b9100000 (64-bit, non-prefetchable) [size=1M]
	Memory at b8000000 (64-bit, prefetchable) [size=8M]
	Capabilities: [40] Power Management version 2
	Capabilities: [48] Vital Product Data
	Capabilities: [90] Message Signalled Interrupts: 64bit+ Queue=0/5  
	Capabilities: [84] MSI-X: Enable+ Mask- TabSize=32
	Capabilities: [60] Express Endpoint IRQ 0
