[DRBD-user] kernel panic with rhel 5.4, drbd 8.3.2

atp Andrew.Phillips at lmax.com
Fri Feb 26 12:21:13 CET 2010



Hi,

I get a kernel panic when I attach a DRBD resource backed by a 500GB
iSCSI volume. The panic does not occur with a local 500GB LVM volume.

I can dd to the iSCSI volume, or write data to it through a file
system, without any problem. Any ideas on what to try next? Is this a
DRBD bug?

I'm setting up a secondary; the primary is working fine. We have
several other pairs of systems configured almost identically, but this
is the first one running on iSCSI.

Googling for this does not turn up much. Also, if someone could point
me at the bug tracker, that would be great.

Thanks for any assistance.

[root at drvms03 ~]# drbdadm create-md vm1
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.

[root at drvms03 ~]# drbdadm attach vm1
Unable to handle kernel NULL pointer dereference at 0000000000000000
RIP: 
 [<ffffffff8002de3c>] blk_recount_segments+0x74/0x36f
PGD 223e04067 PUD 222e38067 PMD 0 
Oops: 0000 [1] SMP 
last sysfs file: /module/drbd/parameters/cn_idx
CPU 2 
Modules linked in: mptctl mptbase hidp l2cap bluetooth sg lockd sunrpc
bridge bonding ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr
iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3 8021q
libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi
cpufreq_ondemand powernow_k8 freq_table drbd(U) dm_multipath scsi_dh
video hwmon backlight sbs i2c_ec button battery asus_acpi
acpi_memhotplug ac parport_pc lp parport ksm(U) kvm_amd(U) kvm(U) shpchp
i2c_piix4 i2c_core tg3 e1000e pcspkr serio_raw dm_raid45 dm_message
dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod
sata_svw libata cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd
ehci_hcd
Pid: 347, comm: cqueue/2 Tainted: G      2.6.18-164.11.1.el5 #1
RIP: 0010:[<ffffffff8002de3c>]  [<ffffffff8002de3c>] blk_recount_segments+0x74/0x36f
RSP: 0000:ffff810427acda90  EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffff8102273d9340 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8102273d9340 RDI: ffff8102273f26f0
RBP: ffff810427a44e60 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000050 R12: 000000005732bdb0
R13: 0000000000000000 R14: 000000005732bdb0 R15: 0000000000000000
FS:  00002b23e6f4a230(0000) GS:ffff8101079591c0(0000) knlGS:00000000f7f6f6c0
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 000000021daaa000 CR4: 00000000000006e0
Process cqueue/2 (pid: 347, threadinfo ffff810427acc000, task ffff810227968860)
Stack:  ffff8102273f26f0 000000000000bdb0 0000000000000000 ffff810200000000
 0000000000000001 ffffffff800289b5 694420286b736964 ffff8102273d9340
 ffff8102273d9340 000000005732bdb0 0000000000000008 000000005732bdb0
Call Trace:
 [<ffffffff800289b5>] get_request_wait+0x21/0x11f
 [<ffffffff8004254a>] bio_phys_segments+0xf/0x15
 [<ffffffff800258d4>] init_request_from_bio+0xc8/0x198
 [<ffffffff8000c011>] __make_request+0x34f/0x401
 [<ffffffff8001c028>] generic_make_request+0x211/0x228
 [<ffffffff80033437>] submit_bio+0xe4/0xeb
 [<ffffffff8835dc02>] :drbd:_drbd_md_sync_page_io+0x156/0x171
 [<ffffffff8835e1e7>] :drbd:drbd_md_sync_page_io+0x3c6/0x4e6
 [<ffffffff883675c2>] :drbd:drbd_md_read+0xb2/0x256
 [<ffffffff8836d189>] :drbd:drbd_nl_disk_conf+0x6a6/0xc69
 [<ffffffff800da72c>] __cache_alloc_node+0x9d/0xd2
 [<ffffffff8836be5e>] :drbd:drbd_connector_callback+0xe9/0x1a8
 [<ffffffff801b9722>] cn_queue_wrapper+0x0/0x23
 [<ffffffff801b972d>] cn_queue_wrapper+0xb/0x23
 [<ffffffff8004d8ed>] run_workqueue+0x94/0xe4
 [<ffffffff8004a12f>] worker_thread+0x0/0x122
 [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8004a21f>] worker_thread+0xf0/0x122
 [<ffffffff8008c86c>] default_wake_function+0x0/0xe
 [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032950>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032852>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11


Code: 4d 8b 1a 49 c1 eb 33 4c 89 d8 48 c1 e8 09 48 8b 3c c5 80 8b 
RIP  [<ffffffff8002de3c>] blk_recount_segments+0x74/0x36f
 RSP <ffff810427acda90>
CR2: 0000000000000000
 <0>Kernel panic - not syncing: Fatal exception

 Details of the disk/system.

uname -a
Linux drvms03.dr.tradefair 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

[root at drvms03 ~]# rpm -qa | grep drbd
drbd83-8.3.2-6
kmod-drbd83-8.3.2-6.el5_3

[root at drvms03 ~]# fdisk /dev/sda   
Note: sector size is 4096 (not 512)

The number of cylinders for this disk is set to 11383.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/sda: 749.0 GB, 749043974144 bytes
255 heads, 63 sectors/track, 11383 cylinders
Units = cylinders of 16065 * 4096 = 65802240 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1        7600   488375748   83  Linux

Command (m for help): q
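
As a sanity check on the geometry fdisk reports, the short sketch below recomputes the cylinder size and the partition size from the figures printed above. It assumes two things that the output does not state: that the Blocks column is in 1 KiB units, and that the partition starts one track (63 sectors) into cylinder 1, as classic DOS-style partitioning does. The variable names are illustrative only.

```python
# Figures copied from the fdisk output above.
SECTOR_BYTES = 4096          # "sector size is 4096 (not 512)"
HEADS = 255
SECTORS_PER_TRACK = 63
PART_START_CYL = 1
PART_END_CYL = 7600

# One cylinder = heads * sectors per track, in 4096-byte sectors.
sectors_per_cylinder = HEADS * SECTORS_PER_TRACK
cylinder_bytes = sectors_per_cylinder * SECTOR_BYTES

# Assumption: the first track of cylinder 1 is reserved, so the
# partition is one track shorter than its cylinder span.
part_cylinders = PART_END_CYL - PART_START_CYL + 1
part_sectors = part_cylinders * sectors_per_cylinder - SECTORS_PER_TRACK
part_bytes = part_sectors * SECTOR_BYTES

# Assumption: fdisk's "Blocks" column counts 1 KiB blocks.
part_blocks_1k = part_bytes // 1024

print("sectors per cylinder:", sectors_per_cylinder)
print("bytes per cylinder:  ", cylinder_bytes)
print("partition bytes:     ", part_bytes)
print("partition 1K blocks: ", part_blocks_1k)
```

Under those assumptions the computed values agree with the "Units" line and the Blocks column, which puts /dev/sda1 at roughly 500 GB on a device with 4096-byte sectors.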

[root at drvms03 ~]# cat /etc/drbd.conf
#
# please have a look at the example configuration file in
# /usr/share/doc/drbd83/drbd.conf
#


global {
    # Participate in DRBD's online usage counter at http://usage.drbd.org
    # possible options: ask, yes, no. Default is ask. In case you do not
    # know, set it to ask, and follow the on screen instructions later.
    usage-count no;
}

common {

# this is the maximum rate at which to send data updates to partners - 100M
# is about right for a 9000MTU Gig link
  syncer {      rate 100M;
                verify-alg md5; }

# disable syncing after each update - fast, but requires battery backed
# caches
        disk {
                no-disk-flushes;
                no-md-flushes;
        }

        handlers {
                local-io-error /etc/ha.d/resource.d/drbd_error_handler.sh ;
                pri-on-incon-degr "echo 'DRBD: primary requested but inconsistent!' | logger; /etc/ha.d/resource.d/drbd_error_handler.sh";
                pri-lost-after-sb "echo 'DRBD: primary requested but lost!' | logger; /etc/ha.d/resource.d/drbd_error_handler.sh ";
        }
}

resource vm1 {
  protocol        B;
  startup { wfc-timeout 20; degr-wfc-timeout      10; }
#  disk { on-io-error call-local-io-error; }
  disk { on-io-error detach; }
  on drvms03.dr.tradefair {
        device    /dev/drbd0;
        disk            /dev/sda1;
        address  192.168.199.103:8000;
        meta-disk   internal;
  }
  on drvms04.dr.tradefair {
        device    /dev/drbd0;
        disk            /dev/vg.data/lv.vmstorage;
        address  192.168.199.104:8000;
        meta-disk   internal;
  }
}


  
Andrew Phillips
Head of Systems

www.lmax.com 

Office: +44 203 1922509
Mobile: +44 (0)7595 242 900

LMAX | Level 2, Yellow Building | 1 Nicholas Road | London | W11 4AN



