I'm experiencing several random crashes a day with my latest DRBD
configuration.
It's exactly the same as my other (crash-free) setups, except
that the kernel is compiled with SMP support (Hyper-Threading).
I'm wondering if anyone is running a heavily loaded production NFS
server with DRBD + XFS + SMP.
Here's the complete configuration I'm using:
- DRBD 0.7.5
- Linux 2.6.9-1-686-smp (Debian)
- XFS over software RAID
This morning I reproduced the crash several times with this simple
test: make -C /usr/src/linux -j modules
The machine stops responding to ICMP requests after ~20 seconds; the
same test runs just fine if I use a UP kernel.
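To time the hang more precisely, something like the following could be
run on a second machine while the build runs on the server. It is only
a sketch: the function names (host_alive, seconds_until_silent) are
made up for illustration, and it assumes the Linux iputils ping with
its -c and -W options.

```python
import subprocess
import time


def host_alive(host, timeout_s=1):
    """Send one ICMP echo request via the system ping; True if answered.

    Assumes Linux iputils ping: -c 1 sends one packet, -W sets the
    reply timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def seconds_until_silent(probe, interval_s=1.0, max_s=120.0, misses=3,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll probe() until it fails `misses` times in a row.

    Returns the elapsed seconds at the first miss of that run, or None
    if the host kept answering for max_s seconds. A single lost packet
    is not counted as a hang, hence the `misses` threshold.
    """
    start = clock()
    first_miss = None
    missed = 0
    while clock() - start < max_s:
        if probe():
            # Host answered: forget any earlier isolated misses.
            first_miss, missed = None, 0
        else:
            if first_miss is None:
                first_miss = clock() - start
            missed += 1
            if missed >= misses:
                return first_miss
        sleep(interval_s)
    return None
```

For example, seconds_until_silent(lambda: host_alive("nfs1")) started
at the same moment as the make run, where "nfs1" is a placeholder for
the server's hostname.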
I've just booted the UP kernel, so I'm not sure yet whether the random
daily crashes have gone away too. I'll let you know.
Most of the time the machine crashes without anything in the logs, but
I got this message last time. I don't know if it's DRBD related; maybe
it's another bug:
------------[ cut here ]------------
kernel BUG at mm/rmap.c:474!
invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in: drbd ext2 mbcache nfsd exportfs lockd sunrpc sd_mod sg sr_mod scsi_mod cdrom ipt_LOG iptable_nat ip_conntrack iptable_filter ip_tables ipv6 dm_mod raid0 md capability commoncap r8169 tg3 firmware_class 3c59x 8139too mii crc32 forcedeth rtc xfs ide_generic piix ide_disk ide_core unix fbcon font vesafb cfbcopyarea cfbimgblt cfbfillrect
CPU: 1
EIP: 0060:[<c015199c>] Tainted: GF VLI
EFLAGS: 00010286 (2.6.9-1-686-smp)
EIP is at page_remove_rmap+0x3c/0x50
eax: ffffffff ebx: 00002000 ecx: da6add4c edx: c139d9a0
esi: da56df00 edi: c139d9a0 ebp: 00000000 esp: da6adc78
ds: 007b es: 007b ss: 0068
Process munin-node (pid: 14532, threadinfo=da6ac000 task=f7321150)
Stack: c014af5e c139d9a0 00000000 c028fe10 da6adcd8 b83be000 da5bbb80 b7fe0000
00000000 c014b143 c18143a0 da5bbb7c b7fbe000 00022000 00000000 c18143a0
b7fbe000 da5bbb80 b7fe0000 00000000 c014b1b3 c18143a0 da5bbb7c b7fbe000
Call Trace:
[<c014af5e>] zap_pte_range+0x14e/0x2d0
[<c028fe10>] schedule+0x520/0xbe0
[<c014b143>] zap_pmd_range+0x63/0x80
[<c014b1b3>] unmap_page_range+0x53/0x80
[<c014b2e6>] unmap_vmas+0x106/0x220
[<c014fa8f>] exit_mmap+0x9f/0x190
[<c011de6b>] mmput+0x6b/0xa0
[<c0166f9d>] exec_mmap+0xfd/0x1e0
[<c016729a>] flush_old_exec+0x15a/0x870
[<c015b9c7>] vfs_read+0x107/0x160
[<c0166e8e>] kernel_read+0x4e/0x60
[<c01861bb>] load_elf_binary+0x2db/0xbd0
[<c011db60>] autoremove_wake_function+0x0/0x60
[<c0166e8e>] kernel_read+0x4e/0x60
[<c0185ee0>] load_elf_binary+0x0/0xbd0
[<c0167d5e>] search_binary_handler+0x18e/0x2d0
[<c0185565>] load_script+0x215/0x250
[<c0140f55>] __alloc_pages+0x1d5/0x390
[<c01b0aa2>] copy_from_user+0x42/0x70
[<c01669d1>] copy_strings+0x1d1/0x220
[<c0185350>] load_script+0x0/0x250
[<c0167d5e>] search_binary_handler+0x18e/0x2d0
[<c0168059>] do_execve+0x1b9/0x270
[<c0104c62>] sys_execve+0x42/0x80
[<c0106199>] sysenter_past_esp+0x52/0x71
Code: 98 c0 84 c0 74 24 8b 42 08 40 78 1f 9c 59 fa b8 00 e0 ff ff ba 00 4b 37 c0 21 e0 8b 40 10 03 14 85 20 80 37 c0 ff 4a 10 51 9d c3 <0f> 0b da 01 65 3a 2a c0 eb d7 0f 0b d7 01 65 3a 2a c0 eb bb 83
<6>note: munin-node[14532] exited with preempt_count 1
find_exported_dentry: npd != pd
find_exported_dentry: npd != pd
--
Cyril Bouthors