FWIW, this very repeatable problem on 2.6.15-1.2054_FC5xen0 no longer seems to exist on 2.6.16-1.2080_FC5xen0.

On Mar 27, 2006, at 10:40 AM, Ben wrote:

> Hey guys, I'm trying to run some Xen virtual machines on top of
> DRBD to get failover protection, and it's almost working great.
> Unfortunately, I occasionally get an oops on one of my domUs, and
> it looks like this:
>
> Mar 27 01:39:36 johnny kernel: Unable to handle kernel paging
> request at ffff8800e53ba000 RIP:
> Mar 27 01:39:36 johnny kernel: <ffffffff80179b5b>{__bio_clone+46}
> Mar 27 01:39:36 johnny kernel: PGD 10d9067 PUD 16dd067 PMD 1807067
> PTE 0
> Mar 27 01:39:36 johnny kernel: Oops: 0000 [1] SMP
> Mar 27 01:39:36 johnny kernel: CPU 0
> Mar 27 01:39:36 johnny kernel: Modules linked in: xt_physdev drbd
> (U) ipv6 bridge w83627hf hwmon_vid hwmon eeprom i2c_isa
> ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink
> ipt_LOG xt_tcp udp iptable_filter ip_tables x_tables video button
> battery ac lp parport_pc parport nvram ohci1394 ieee1394 sg e100
> mii i2c_nforce2 i2c_core forcedeth dm_snapshot dm_zero dm_mirror
> dm_mod ext3 jbd sata_nv libata aacraid sd_mod scsi_mod
> Mar 27 01:39:36 johnny kernel: Pid: 5229, comm: xvd 8 93:02 Not
> tainted 2.6.15-1.2054_FC5xen0 #1
> Mar 27 01:39:36 johnny kernel: RIP: e030:[<ffffffff80179b5b>]
> <ffffffff80179b5b>{__bio_clone+46}
> Mar 27 01:39:36 johnny kernel: RSP: e02b:ffff8800ac24d948 EFLAGS:
> 00010216
> Mar 27 01:39:36 johnny kernel: RAX: ffff8800e53b9f50 RBX:
> ffff8800e49b9d40 RCX: 0000000000000050
> Mar 27 01:39:36 johnny kernel: RDX: ffff8800e53b9e80 RSI:
> ffff8800e53ba000 RDI: ffff8800a9785d30
> Mar 27 01:39:36 johnny kernel: RBP: ffff8800e702b338 R08:
> 0000000006ffb100 R09: ffff88000189c000
> Mar 27 01:39:36 johnny kernel: R10: 0000000000001000 R11:
> 0000000000000001 R12: 0000000000000023
> Mar 27 01:39:36 johnny kernel: R13: ffff8800e53b9e80 R14:
> ffff8800e4766150 R15: 0000000000000008
> Mar 27 01:39:36 johnny kernel: FS: 00002abf280251c0(0000)
> GS:ffffffff80499000(0000) knlGS:0000000000000000
> Mar 27 01:39:36 johnny kernel: CS: e033 DS: 0000 ES: 0000
> Mar 27 01:39:36 johnny kernel: Process xvd 8 93:02 (pid: 5229,
> threadinfo ffff8800ac24c000, task ffff8800a6ff6040)
> Mar 27 01:39:36 johnny kernel: Stack: ffff8800e53b9e80
> ffff8800e49b9d40 ffff8800e53b9e80 ffffffff80179bed
> Mar 27 01:39:36 johnny kernel: ffff8800e70250d0
> 0000000000000023 ffff8800e70250d0 ffffffff88209471
> Mar 27 01:39:36 johnny kernel: 0000000000047ffd
> 00000001f1ba2a08
> Mar 27 01:39:36 johnny kernel: Call Trace: <ffffffff80179bed>
> {bio_clone+53} <ffffffff88209471>{:drbd:drbd_make_request_26+1046}
> Mar 27 01:39:36 johnny kernel: <ffffffff80155bbf>
> {mempool_alloc+66} <ffffffff8032835c>{_spin_unlock_irqrestore+9}
> Mar 27 01:39:36 johnny kernel: <ffffffff88086544>
> {:dm_mod:dm_request+345} <ffffffff8820924a>
> {:drbd:drbd_make_request_26+495}
> Mar 27 01:39:36 johnny kernel: <ffffffff801e9225>
> {generic_make_request+365} <ffffffff801ea61a>{submit_bio+186}
> Mar 27 01:39:36 johnny kernel: <ffffffff80266851>
> {dispatch_rw_block_io+994} <ffffffff80266c6a>{blkif_schedule+944}
> Mar 27 01:39:36 johnny kernel: <ffffffff80124780>
> {__wake_up_common+62} <ffffffff80141339>{autoremove_wake_function+0}
> Mar 27 01:39:36 johnny kernel: <ffffffff80140f17>
> {keventd_create_kthread+0} <ffffffff802668ba>{blkif_schedule+0}
> Mar 27 01:39:37 johnny kernel: <ffffffff80140f17>
> {keventd_create_kthread+0} <ffffffff80141200>{kthread+212}
> Mar 27 01:39:37 johnny kernel: <ffffffff8010b856>{child_rip
> +8} <ffffffff80140f17>{keventd_create_kthread+0}
> Mar 27 01:39:37 johnny kernel: <ffffffff8014112c>{kthread+0}
> <ffffffff8010b84e>{child_rip+0}
> Mar 27 01:39:37 johnny kernel:
> Mar 27 01:39:37 johnny kernel: Code: f3 a4 48 8b 02 48 89 03 48 8b
> 42 10 48 89 43 10 48 83 4b 18
> Mar 27 01:39:37 johnny kernel: RIP <ffffffff80179b5b>{__bio_clone
> +46} RSP <ffff8800ac24d948>
> Mar 27 01:39:37 johnny kernel: CR2: ffff8800e53ba000
>
>
> It appears the domU is trying to write to the DRBD resource, and
> DRBD has issues with that. The domU then becomes unresponsive and
> xen itself begins to degrade ungracefully from there.
>
> I'm running 3 other domUs on top of DRBD, and all of them are
> working flawlessly. So maybe it's an issue with this particular
> resource? It seems to happen somewhat randomly, but has a higher
> chance with higher IO levels.
>
> I'm using DRBD 0.7.17 on 2.6.15-1.2054_FC5xenU.