Hello Michael,
I can confirm this issue.
Our secondary crashed last night after an online verify was started via cron.
I had to push The Button...
Found these last messages in syslog:
Sep 2 00:18:01 bach-s52 kernel: block drbd0: Online Verify start sector: 0
Sep 2 00:18:01 bach-s52 kernel: block drbd1: conn( Connected -> VerifyT )
Sep 2 00:18:01 bach-s52 kernel: block drbd1: Online Verify start sector: 0
Sep 2 00:18:04 bach-s52 kernel: Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
Sep 2 00:18:04 bach-s52 kernel: [<ffffffff88440fbf>] :drbd:w_e_end_ov_req+0x29/0x136
Sep 2 00:18:04 bach-s52 kernel: PGD 0
Sep 2 00:18:04 bach-s52 kernel: Oops: 0000 [1] SMP
Sep 2 00:18:04 bach-s52 kernel: last sysfs file: /devices/pci0000:00/0000:00:1c.2/0000:03:00.1/irq
DRBD Version: 8.3.8.1
HW: HP DL380G6 (1 x Xeon X5570)
OS: RHEL 5.5 x86_64
Kernel: 2.6.18-194.11.3.el5 #1 SMP Mon Aug 23 15:51:38 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
It has not happened again so far.
Kind Regards,
Roland
DRBD resource configs here:
# drbdsetup 0 show | grep -v _is_default
disk {
on-io-error detach;
no-disk-barrier ;
no-disk-flushes ;
no-md-flushes ;
}
net {
max-epoch-size 20000;
max-buffers 32000;
unplug-watermark 16;
ko-count 6;
allow-two-primaries;
cram-hmac-alg "sha1";
shared-secret "39urXnII331";
after-sb-0pri discard-zero-changes;
after-sb-1pri discard-secondary;
data-integrity-alg "md5";
}
syncer {
rate 33792k; # bytes/second
al-extents 3389;
csums-alg "md5";
verify-alg "md5";
cpu-mask "255";
}
protocol C;
_this_host {
device minor 0;
disk "/dev/vg_drbdsec_raid10/lv_drbd_disk_01";
meta-disk internal;
address ipv4 10.0.0.2:7789;
}
_remote_host {
address ipv4 10.0.0.1:7789;
}
# drbdsetup 1 show | grep -v _is_default
disk {
on-io-error detach;
no-disk-barrier ;
no-disk-flushes ;
no-md-flushes ;
}
net {
max-epoch-size 20000;
max-buffers 32000;
unplug-watermark 16;
ko-count 6;
allow-two-primaries;
cram-hmac-alg "sha1";
shared-secret "39urXnII332";
after-sb-0pri discard-zero-changes;
after-sb-1pri discard-secondary;
data-integrity-alg "md5";
}
syncer {
rate 33792k; # bytes/second
after 0;
al-extents 3389;
csums-alg "md5";
verify-alg "md5";
cpu-mask "255";
}
protocol C;
_this_host {
device minor 1;
disk "/dev/vg_drbdsec_raid6/lv_drbd_disk_02";
meta-disk internal;
address ipv4 10.0.0.2:7790;
}
_remote_host {
address ipv4 10.0.0.1:7790;
}
On Sunday, 04 July 2010, you wrote:
> Hi All,
>
> I've installed 8.3.8 from source (as always).
> Everything seems to work fine, but when I tried to start online-verify
> I got an oops on the secondary node:
> Is anyone else having this issue?
>
> Jul 5 07:38:45 vhost2 kernel: block drbd3: conn( Connected ->
> VerifyT )
> Jul 5 07:38:45 vhost2 kernel: block drbd3: Online Verify
> start sector: 0
> Jul 5 07:38:46 vhost2 kernel: BUG: unable to handle
> kernel NULL pointer dereference at 0000000000000030
> Jul 5 07:38:46 vhost2 kernel: IP: [<ffffffffa029ce5f>]
> w_e_end_ov_req+0x36/0x154 [drbd]
> Jul 5 07:38:46 vhost2 kernel: PGD 41aa0d067 PUD 4178f8067 PMD 0
> Jul 5 07:38:46 vhost2 kernel: Oops: 0000 [#1] SMP
> Jul 5 07:38:46 vhost2 kernel: last sysfs file:
> /sys/module/drbd/parameters/cn_idx
> Jul 5 07:38:46 vhost2 kernel: CPU 3
> Jul 5 07:38:46 vhost2 kernel: Modules linked in: ppdev vmnet(P)
> parport_pc parport vmmon(P) drbd ext4 jbd2 crc16 crc32c ac battery cn
> coretemp w83627hf w83793 hwmon_vid bonding loop rtc_cmos rtc_core
> i2c_i801 iTCO_wdt i5000_edac serio_raw ioatdma container rtc_lib
> i2c_core e1000e tpm_tis tpm tpm_bios dca edac_core shpchp pci_hotplug
> psmouse pcspkr i5k_amb button processor evdev ext3 jbd mbcache
> dm_mirror dm_region_hash dm_log dm_snapshot dm_mod raid456 async_pq
> async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid10
> raid1 md_mod ide_cd_mod cdrom sd_mod pata_acpi ata_generic ata_piix
> ide_pci_generic ahci sata_mv piix ehci_hcd ide_core libata scsi_mod
> uhci_hcd floppy thermal fan [last unloaded: vmnet]
> Jul 5 07:38:46 vhost2 kernel:
> Jul 5 07:38:46 vhost2 kernel: Pid: 7314, comm: drbd3_worker Tainted:
> P W 2.6.34-vs2.3.0.36.30.4.pre8 #2 X7DB8/X7DB8
> Jul 5 07:38:46 vhost2 kernel: RIP: 0010:[<ffffffffa029ce5f>]
> [<ffffffffa029ce5f>] w_e_end_ov_req+0x36/0x154 [drbd]
> Jul 5 07:38:46 vhost2 kernel: RSP: 0000:ffff8804131c5e20 EFLAGS:
> 00010202
> Jul 5 07:38:46 vhost2 kernel: RAX: 0000000000000008 RBX:
> ffff88042b09c9c0 RCX: ffff88036b08c430
> Jul 5 07:38:46 vhost2 kernel: RDX: 0000000000000000 RSI:
> 0000000000000010 RDI: ffff880416900000
> Jul 5 07:38:46 vhost2 kernel: RBP: ffff880416900000 R08:
> 0000000000000004 R09: ffff8800016cd1c0
> Jul 5 07:38:46 vhost2 kernel: R10: 0000000000000001 R11:
> ffffffff8126109b R12: ffff880416900138
> Jul 5 07:38:46 vhost2 kernel: R13: ffff8804169006b8 R14:
> ffff88036b08c430 R15: ffff880416900118
> Jul 5 07:38:46 vhost2 kernel: FS: 0000000000000000(0000)
> GS:ffff8800016c0000(0000) knlGS:0000000000000000
> Jul 5 07:38:46 vhost2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
> 000000008005003b
> Jul 5 07:38:46 vhost2 kernel: CR2: 0000000000000030 CR3:
> 00000004179ed000 CR4: 00000000000006e0
> Jul 5 07:38:46 vhost2 kernel: DR0: 0000000000000000 DR1:
> 0000000000000000 DR2: 0000000000000000
> Jul 5 07:38:46 vhost2 kernel: DR3: 0000000000000000 DR6:
> 00000000ffff0ff0 DR7: 0000000000000400
> Jul 5 07:38:46 vhost2 kernel: Process drbd3_worker (pid: 7314,
> threadinfo ffff8804131c4000, task ffff88042b09c9c0)
> Jul 5 07:38:46 vhost2 kernel: Stack:
> Jul 5 07:38:46 vhost2 kernel: ffff880416900118 ffff88042b09c9c0
> ffff880416900000 ffff880416900138
> Jul 5 07:38:46 vhost2 kernel: <0> ffff8804169006b8 0000000000000000
> ffff880416900118 ffffffffa029c0f8
> Jul 5 07:38:46 vhost2 kernel: <0> 0000000101fbdd14 ffffffff81040c79
> ffff88042e4dc000 ffff88042b09c9c0
> Jul 5 07:38:46 vhost2 kernel: Call Trace:
> Jul 5 07:38:46 vhost2 kernel: [<ffffffffa029c0f8>] ?
> drbd_worker+0x284/0x4b4 [drbd]
> Jul 5 07:38:46 vhost2 kernel: [<ffffffff81040c79>] ?
> del_timer_sync+0xc/0x16
> Jul 5 07:38:46 vhost2 kernel: [<ffffffff81040c83>] ?
> process_timeout+0x0/0x5
> Jul 5 07:38:46 vhost2 kernel: [<ffffffffa02b63b7>] ?
> drbd_thread_setup+0x166/0x227 [drbd]
> Jul 5 07:38:46 vhost2 kernel: [<ffffffff810036d4>] ?
> kernel_thread_helper+0x4/0x10
> Jul 5 07:38:46 vhost2 kernel: [<ffffffffa02b6251>] ?
> drbd_thread_setup+0x0/0x227 [drbd]
> Jul 5 07:38:46 vhost2 kernel: [<ffffffff810036d0>] ?
> kernel_thread_helper+0x0/0x10
> Jul 5 07:38:46 vhost2 kernel: Code: 55 48 89 fd 53 48 83 ec 08 85 d2
> 0f 85 c9 00 00 00 f6 46 48 10 0f 85 bf 00 00 00 48 8b 87 60 06 00 00
> be 10 00 00 00 48 83 c0 08 <8b> 58 28 48 63 fb e8 29 51 e3 e0 48 85
> c0 49 89 c5 0f 84 98 00
> Jul 5 07:38:46 vhost2 kernel: RIP [<ffffffffa029ce5f>]
> w_e_end_ov_req+0x36/0x154 [drbd]
> Jul 5 07:38:46 vhost2 kernel: RSP <ffff8804131c5e20>
> Jul 5 07:38:46 vhost2 kernel: CR2: 0000000000000030
> Jul 5 07:38:46 vhost2 kernel: ---[ end trace 11e0b3ada4e13922 ]---
--
Roland.Friedwagner at wu.ac.at Phone: +43 1 31336 5377
IT Services - WU (Vienna University of Economics and Business)