[DRBD-user] drbd-0.8pre6 (svn rev 2590) - kernel fails

Lars Ellenberg Lars.Ellenberg at linbit.com
Tue Nov 7 15:12:40 CET 2006


/ 2006-11-07 14:23:26 +0300
\ Vitaly Kuznetsov:
> I tried to use it with Xen-3.0
> It works well on 2 servers, but on the third it fails after some time of
> usage (about 2 hours).  I tried Primary/Primary and Primary/Secondary. Only
> the Primary host fails.
> Here are the log messages:
> 
> First fail:
> Nov  7 11:05:39 cluster02 kernel: Unable to handle kernel paging request
> at virtual address ea66e000
> Nov  7 11:05:39 cluster02 kernel:  printing eip:
> Nov  7 11:05:39 cluster02 kernel: c0171175
> Nov  7 11:05:39 cluster02 kernel: 00b60000 -> *pde = 00000000:3eb18001
> Nov  7 11:05:39 cluster02 kernel: 00b18000 -> *pme = 00000000:3ec3c067
> Nov  7 11:05:39 cluster02 kernel: 00c3c000 -> *pte = 00000000:00000000
> Nov  7 11:05:39 cluster02 kernel: Oops: 0000 [#1]
> Nov  7 11:05:39 cluster02 kernel: SMP
> Nov  7 11:05:39 cluster02 kernel: last sysfs file:
> /devices/pci0000:00/0000:00:06.0/0000:06:00.0/subsystem_device
> Nov  7 11:05:39 cluster02 kernel: Modules linked in: xt_physdev
> iptable_filter ip_tables x_tables af_packet bridge ipv6 drbd blkbk netbk
> netloop raw button battery ac apparmor aamatch_pcre loop hw_random
> i8xx_tco tg3 shpchp pci_hotplug i2c_i801 i2c_core ehci_hcd uhci_hcd
> usbcore reiserfs dm_snapshot dm_mod fan thermal processor mptspi
> mptscsih mptbase scsi_transport_spi sg sr_mod cdrom ata_piix libata
> sd_mod scsi_mod
> Nov  7 11:05:39 cluster02 kernel: CPU:    0
> Nov  7 11:05:39 cluster02 kernel: EIP:    0061:[<c0171175>]    Tainted:
> G     U VLI
> Nov  7 11:05:39 cluster02 kernel: EFLAGS: 00010206
> (2.6.16.21-0.25-xenpae #1)
> Nov  7 11:05:39 cluster02 kernel: EIP is at __bio_clone+0x35/0xc0
> Nov  7 11:05:39 cluster02 kernel: eax: 000000c0   ebx: ea41e380   ecx: 00000002   edx: ea66dec0
> Nov  7 11:05:39 cluster02 kernel: esi: ea66e000   edi: eac87a38   ebp: ea41e380   esp: c7443b88
> Nov  7 11:05:39 cluster02 kernel: ds: 007b   es: 007b   ss: 0069
> Nov  7 11:05:39 cluster02 kernel: Process xvd 2 93:02 (pid: 5602, threadinfo=c7442000 task=c0b05630)
> Nov  7 11:05:39 cluster02 kernel: Stack: <0>ead32b60 ea41e380 ea66dec0 ea41b288 ea41b000 c0171230 00000800 00111bba
> Nov  7 11:05:39 cluster02 kernel:        ee4cbe81 c154cda0 c154cda0 00000001 ea66d000 c0166c96 00000c00 00000001
> Nov  7 11:05:39 cluster02 kernel:        c0bed264 ffffffff c81cfcc8 c0bef680 0000002d 00000000 00000000 00000000
> Nov  7 11:05:39 cluster02 kernel: Call Trace:
> Nov  7 11:05:39 cluster02 kernel:  [<c0171230>] bio_clone+0x30/0x40

> Nov  7 11:05:39 cluster02 kernel: Code: 5c 24 04 89 74 24 08 89 7c 24 0c
> 8b 42 0c 8b 40 58 8b 40 38 89 04 24 8b 42 2c 8b 7d 30 8b 72 30 8d 04 40
> c1 e0 02 89 c1 c1 e9 02 <f3> a5 89 c1 83 e1 03 74 02 f3 a4 8b 42 0c 8b
> 0a 8b 5a 04 83 4d

there are not too many ways __bio_clone
could do something stupid like that.

judging from the disassembly of the "Code:" bytes,
the faulting instruction belongs to this memcpy in __bio_clone:

	memcpy(bio->bi_io_vec, bio_src->bi_io_vec,
		bio_src->bi_max_vecs * sizeof(struct bio_vec));

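the faulting address 0xea66e000 is what esi holds.
esi is the source pointer of the "rep movsl" (f3 a5),
loaded from bio_src->bi_io_vec (8b 72 30), and "Oops: 0000" means a read,
so the read from the source vector ran into that address,
and for some reason your kernel
thinks it cannot access it.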
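
a rough sketch (plain userspace C, not kernel code) of the arithmetic
behind reading the register dump this way. the helper names are made up
for illustration. it assumes what the "Code:" bytes suggest: eax holds the
byte count of the copy, ecx the number of 4-byte words still to be copied
by the "rep movsl", and esi the pointer the copy had reached when it
faulted. the 12 bytes per struct bio_vec is inferred from the
"lea (%eax,%eax,2)" / "shl $2" pair (times 3, then times 4) on this
32-bit kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of struct bio_vec on this 32-bit kernel, inferred from
 * the multiply-by-3 then shift-left-by-2 seen in the "Code:" bytes. */
#define BVEC_SIZE 12u

/* Number of bio_vec entries the memcpy was asked to copy (from eax). */
static uint32_t vec_count(uint32_t total_bytes)
{
	return total_bytes / BVEC_SIZE;
}

/* Bytes already copied when "rep movsl" faulted: the total minus the
 * 4-byte words still pending in ecx. */
static uint32_t bytes_copied(uint32_t total_bytes, uint32_t dwords_left)
{
	assert(4u * dwords_left <= total_bytes);
	return total_bytes - 4u * dwords_left;
}

/* Address at which the pointer in esi started, given where it stood
 * at the time of the fault. */
static uint32_t copy_start(uint32_t cur_ptr, uint32_t total_bytes,
			   uint32_t dwords_left)
{
	return cur_ptr - bytes_copied(total_bytes, dwords_left);
}
```

with the values from the Oops (eax = 0xc0, ecx = 2, esi = 0xea66e000)
this gives 16 vectors, 184 bytes already copied, and a start address of
0xea66df48. in other words, the copy ran fine up to the page boundary at
0xea66e000 and faulted on the first access to the next page.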

I cannot figure out any more from what we have so far, sorry.


-- 
: Lars Ellenberg                                  Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH            Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe   http://www.linbit.com :
__
please use the "List-Reply" function of your email client.