[DRBD-user] Linux kernel panic when detach the DRBD resource on Primary node

Jason sz_byb at huawei.com
Wed Apr 1 14:14:21 CEST 2009


Hi,

I am having a problem with two DRBD machines. The machines are identical in hardware and software; both run SUSE Linux (kernel 2.6.16.60-0.21-bigsmp) and DRBD 0.7.24.

The primary node is "ATCAX86_F0S5" and the secondary node is "ATCAX86_F0S9". When I detach the DRBD resource on the primary node and then copy a file to the DRBD device, the kernel panics and the machine restarts.
First, I run "drbdadm primary all" on the primary node. The DRBD state is then:
ATCAX86_F0S5:/ # cat /proc/drbd
version: 0.7.24 (api:79/proto:74)
SVN Revision: 2875 build by root at ccf01, 2008-10-24 06:23:46
 0: cs:Unconfigured
 1: cs:DiskLessClient st:Primary/Secondary ld:Consistent
    ns:68 nr:0 dw:68 dr:172 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
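
The status above shows minor 1 as Primary with no local disk (cs:DiskLessClient). As a rough sketch, this condition could be checked from a script before writing to the device. The function name and the SAMPLE text are only illustrative; the parsing assumes the 0.7-style /proc/drbd layout exactly as pasted here and has not been checked against other DRBD versions.

```python
import re

# Sample copied from the /proc/drbd output above (DRBD 0.7.24 layout).
SAMPLE = """version: 0.7.24 (api:79/proto:74)
SVN Revision: 2875 build by root at ccf01, 2008-10-24 06:23:46
 0: cs:Unconfigured
 1: cs:DiskLessClient st:Primary/Secondary ld:Consistent
    ns:68 nr:0 dw:68 dr:172 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
"""

def parse_proc_drbd(text):
    """Return {minor: {"cs": ..., "st": ..., "ld": ...}} for each device line."""
    devices = {}
    for line in text.splitlines():
        # Device lines look like " 1: cs:... st:... ld:..."; the version
        # header and the counter lines do not start with "<digits>:".
        m = re.match(r"\s*(\d+):\s+(.*)", line)
        if not m:
            continue
        fields = dict(tok.split(":", 1) for tok in m.group(2).split())
        devices[int(m.group(1))] = fields
    return devices

def diskless_primaries(devices):
    """Minors that are Primary locally but have no backing disk attached."""
    return [minor for minor, f in sorted(devices.items())
            if f.get("cs") == "DiskLessClient"
            and f.get("st", "").startswith("Primary/")]

if __name__ == "__main__":
    print(diskless_primaries(parse_proc_drbd(SAMPLE)))
```

On a live node the text would come from reading /proc/drbd instead of SAMPLE.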

Then I copy a file to the DRBD device on the primary node. The OS on the primary node panics and then restarts. The system log shows:
Mar 31 10:07:34 ATCAX86_F0S4 kernel: ReiserFS: drbd1: Using r5 hash to sort names
Mar 31 10:12:24 ATCAX86_F0S4 kernel: Unable to handle kernel NULL pointer dereference at virtual address 0000001e
Mar 31 10:12:24 ATCAX86_F0S4 kernel:  printing eip:
Mar 31 10:12:24 ATCAX86_F0S4 kernel: f92a3d25
Mar 31 10:12:24 ATCAX86_F0S4 kernel: *pde = 35c54001
Mar 31 10:12:24 ATCAX86_F0S4 kernel: Oops: 0000 [#1]
Mar 31 10:12:25 ATCAX86_F0S4 syslog-ng[3184]: Changing permissions on special file /dev/console
Mar 31 10:12:24 ATCAX86_F0S4 kernel: SMP 
Mar 31 10:12:24 ATCAX86_F0S4 kernel: last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
Mar 31 10:12:24 ATCAX86_F0S4 kernel: Modules linked in: drbd kbox_V100R001C01B003_20090211114813_24945 pmcint dpukernel_V100R003C01B612_20090312062515_21878 mcenonfatal_V100R003C02B015_20090113102406_26186 mchesb_V100R003C02B015_20090113102413_4859 lpcbios_V100R003C02B015_20090113102412_13838 nfsd exportfs lockd nfs_acl sunrpc ipv6 gab af_packet llt intermodule ipmi_watchdog ipmi_si ipmi_devintf ipmi_poweroff ipmi_msghandler tg3 e1000 dock button battery ac loop dm_mod usbhid i2c_i801 i2c_core mptctl qla2xxx uhci_hcd ehci_hcd firmware_class usbcore scsi_transport_fc reiserfs ext3 jbd mppVhba edd fan thermal processor mptsas mptscsih mptbase scsi_transport_sas ata_piix libata mppUpper sg sd_mod scsi_mod
Mar 31 10:12:24 ATCAX86_F0S4 kernel: CPU:    0
Mar 31 10:12:24 ATCAX86_F0S4 kernel: EIP:    0060:[<f92a3d25>]    Tainted: PF    U VLI
Mar 31 10:12:24 ATCAX86_F0S4 kernel: EFLAGS: 00010202   (2.6.16.60-0.21-bigsmp #1) 
Mar 31 10:12:24 ATCAX86_F0S4 kernel: EIP is at drbd_send_dblock+0x1c9/0x36a [drbd]
Mar 31 10:12:24 ATCAX86_F0S4 kernel: eax: 00000001   ebx: f4ad1904   ecx: 00008000   edx: 00000000
Mar 31 10:12:24 ATCAX86_F0S4 kernel: esi: c593e000   edi: f4ad1528   ebp: f61935a4   esp: c593fbcc
Mar 31 10:12:24 ATCAX86_F0S4 kernel: ds: 007b   es: 007b   ss: 0068
Mar 31 10:12:24 ATCAX86_F0S4 kernel: Process pdflush (pid: 212, threadinfo=c593e000 task=c593d6b0)
Mar 31 10:12:25 ATCAX86_F0S4 kernel: Stack: <0>00000000 00000000 00000001 00000082 ffffffff ffffffff 67027483 10100000 
Mar 31 10:12:25 ATCAX86_F0S4 kernel:        00000000 00400400 f61935a4 00000000 f7445bec f110b0c0 00000001 00000000 
Mar 31 10:12:25 ATCAX86_F0S4 kernel:        f929de40 00001000 00000001 f4ad1528 01e39800 f61935a4 0000000a 00000000 
Mar 31 10:12:25 ATCAX86_F0S4 kernel: Call Trace:
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<f929de40>] drbd_make_request_common+0x727/0x961 [drbd]
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<f929e2bd>] drbd_make_request_26+0x1cc/0x1d5 [drbd]
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c01bdf34>] generic_make_request+0x29c/0x2ac
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c01c02d2>] blk_do_ordered+0x18c/0x2b6
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c011b486>] find_busiest_group+0x13c/0x2fa
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c014b20d>] mempool_alloc+0x28/0xc5
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c01bfa5a>] submit_bio+0xa6/0xad
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c016b144>] bio_alloc_bioset+0xb2/0x117
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c01680e0>] submit_bh+0xe3/0x101
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<f8a21277>] write_ordered_chunk+0x47/0x6d [reiserfs]
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<f8a2164b>] write_ordered_buffers+0x1a7/0x28c [reiserfs]
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c0148610>] find_get_page+0x18/0x47
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c01683e3>] __find_get_block_slow+0x10b/0x115
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c0168730>] __find_get_block+0x185/0x18f
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<f8a219a2>] flush_commit_list+0x190/0x5a6 [reiserfs]
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<f8a247b3>] do_journal_end+0xbe5/0xc31 [reiserfs]
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c011b486>] find_busiest_group+0x13c/0x2fa
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<f8a24865>] journal_end_sync+0x66/0x6b [reiserfs]
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<f8a15e97>] reiserfs_sync_fs+0x32/0x54 [reiserfs]
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c016cd5a>] sync_supers+0x72/0xd2
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c014dc92>] wb_kupdate+0x2a/0xf4
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c014e58e>] pdflush+0x116/0x1ad
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c014dc68>] wb_kupdate+0x0/0xf4
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c013436f>] kthread+0xca/0xf7
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c014e478>] pdflush+0x0/0x1ad
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c01342a5>] kthread+0x0/0xf7
Mar 31 10:12:25 ATCAX86_F0S4 kernel:  [<c0102005>] kernel_thread_helper+0x5/0xb
Mar 31 10:12:25 ATCAX86_F0S4 kernel: Code: ff 89 44 24 10 89 f0 e8 e1 6c e7 c6 5e 58 83 7c 24 08 00 74 34 b9 01 00 00 00 ba 10 00 00 00 89 e8 e8 23 a8 ff ff eb 21 8b 55 28 <0f> b7 42 1e 6b c0 0c 03 42 34 8b 48 08 8b 10 ff 70 04 89 f8 e8 
Mar 31 10:15:19 ATCAX86_F0S4 syslog-ng[3209]: syslog-ng version 1.6.8 starting
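
For reference, the faulting instruction can be read from the "Code:" line, where the byte in angle brackets marks the EIP. The sketch below hand-decodes only that one instruction form (0F B7 with an 8-bit displacement), so it is not a general disassembler; the helper names are made up for this example. The bytes `0f b7 42 1e` decode to a 16-bit load from offset 0x1e of the pointer in edx, and since edx is 0 in the register dump, the load hits address 0x1e, which matches the reported fault address 0000001e. In other words, drbd_send_dblock appears to dereference a NULL structure pointer. Which structure that is cannot be told from the log alone.

```python
# "Code:" bytes and registers copied from the oops above.
CODE = ("ff 89 44 24 10 89 f0 e8 e1 6c e7 c6 5e 58 83 7c 24 08 00 74 34 b9 "
        "01 00 00 00 ba 10 00 00 00 89 e8 e8 23 a8 ff ff eb 21 8b 55 28 "
        "<0f> b7 42 1e 6b c0 0c 03 42 34 8b 48 08 8b 10 ff 70 04 89 f8 e8")
REGS = {"eax": 0x00000001, "ebx": 0xF4AD1904, "ecx": 0x00008000,
        "edx": 0x00000000, "esi": 0xC593E000, "edi": 0xF4AD1528,
        "ebp": 0xF61935A4, "esp": 0xC593FBCC}

# 32-bit x86 register numbering used in the ModRM byte.
REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def faulting_bytes(code_line):
    """Return the bytes starting at the <..> marker, i.e. at the EIP."""
    tokens = code_line.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [int(t.strip("<>"), 16) for t in tokens[start:]]

def decode_movzx_disp8(b):
    """Decode 'movzx r32, word [reg+disp8]' (0F B7 /r, mod=01, no SIB)."""
    if b[0] != 0x0F or b[1] != 0xB7:
        raise ValueError("not a 0F B7 (movzx r32, r/m16) instruction")
    modrm = b[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only the [reg+disp8] form is handled here")
    disp = b[3] - 0x100 if b[3] >= 0x80 else b[3]  # sign-extend disp8
    return REG_NAMES[reg], REG_NAMES[rm], disp

dest, base, disp = decode_movzx_disp8(faulting_bytes(CODE))
fault_address = (REGS[base] + disp) & 0xFFFFFFFF

if __name__ == "__main__":
    print(f"movzwl {disp:#x}(%{base}),%{dest} -> address {fault_address:#010x}")
```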

Could you tell me why this problem happens and how to solve it?

Thanks,
Jason

### /etc/drbd.conf ###
#
# please have a look at the example configuration file in
# /usr/share/doc/packages/drbd.conf
#
resource r0 {
  protocol C;

  startup {
    wfc-timeout  20;
    degr-wfc-timeout 60;
  }

  syncer {
    rate 50M;
  }

  on ATCAX86_F0S5 {
    device /dev/drbd1;
    disk /dev/sda5;
    address 172.30.128.125:7789;
    meta-disk internal;
  }
  on ATCAX86_F0S9 {
    device /dev/drbd1;
    disk /dev/sda5;
    address 172.30.128.131:7789;
    meta-disk internal;
  }
}