[DRBD-user] lvm+drbd+xen problem

Thu Jun 12 13:50:09 CEST 2008

Jan Kellermann|werk21 wrote:
> Dear Gabriele,
>
> We are running a similar system on RHEL5 (Xen over drbd over lvm) and
> do not have this problem. Our drbd version is 8.2.
>
> Did you test writing in a split-brain situation without Xen?
> Maybe drbd is configured a little "too hard", so that it throws a
> panic when the nodes are out of sync. Then Xen has no chance to keep
> working. There are a lot of parameters in drbd.conf for that issue...
I have tested writing in a split-brain situation with the drbd
devices mounted, and it works as it should.
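
To confirm what drbd itself reports while the cable is pulled, the
connection state of each device can be read from /proc/drbd. The
following is only a minimal sketch: the sample text is an assumed
illustration of the /proc/drbd layout, not output captured from these
servers, and the helper names are hypothetical.

```python
import re

# Assumed, illustrative /proc/drbd content; on a real host one would
# read the text with open("/proc/drbd").read() instead.
SAMPLE = """\
 0: cs:WFConnection st:Primary/Unknown ld:Consistent
 1: cs:Connected st:Primary/Secondary ld:Consistent
"""

def connection_states(proc_drbd_text):
    """Map each drbd minor number to the value of its cs: field."""
    states = {}
    for line in proc_drbd_text.splitlines():
        match = re.match(r"\s*(\d+):\s+cs:(\w+)", line)
        if match:
            states[int(match.group(1))] = match.group(2)
    return states

def not_connected(states):
    """Return minors whose state is anything other than 'Connected'.

    Note that resync states are also reported here, since they are
    not literally 'Connected'.
    """
    return sorted(minor for minor, cs in states.items() if cs != "Connected")

states = connection_states(SAMPLE)
print(states)
print(not_connected(states))
```

Running this before and after disconnecting the replication link
would show whether the devices leave the Connected state as expected.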
>
> Or the problem may be in how Xen attaches the disks. What path is
> in your domU config? We use "drbd:resourcename", so drbd manages
> everything for Xen.
> Have a look:
> http://fghaas.wordpress.com/2007/09/03/drbd-806-brings-full-live-migration-for-xen-on-drbd/ 
>
I use the drbd rpm that comes with SLES10 and unfortunately it's not
drbd 8.x, so I can't use the new drbd:resourcename parameter. If I try
to use it, the Xen host sees it as a file and not a block device.
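
For comparison, the two ways of naming the root disk in a domU config
could be sketched roughly like this. Xen's xm config files use Python
syntax, so the fragment is plain Python. The helper domu_disks, the
device /dev/drbd0, the resource name mail-root and the guest device
names xvda/xvdb are assumptions for illustration, not taken from
either setup. The drbd: form relies on the block-drbd helper script
that ships with drbd 8.0.6 and later.

```python
# Sketch only: every device and resource name here is hypothetical.

def domu_disks(root_spec, swap_lv):
    """Build a Xen 'disk' list: replicated root plus local swap.

    root_spec is the backend for the root disk, e.g. 'phy:/dev/drbd0'
    or 'drbd:<resource>'. swap_lv is a 'vg/lv' name that is attached
    directly, so swap is not replicated by drbd.
    """
    return [
        "%s,xvda,w" % root_spec,
        "phy:/dev/%s,xvdb,w" % swap_lv,
    ]

# drbd 0.7: hand the block device to Xen directly.
disk = domu_disks("phy:/dev/drbd0", "system/mail-swap")

# drbd 8.0.6 or later: let the block-drbd script manage the resource.
# disk = domu_disks("drbd:mail-root", "system/mail-swap")

print(disk)
```

With the phy: form, the drbd device must already be Primary on the
host before the domU starts.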
>
> I have another issue: on our system we have big trouble with iowait.
> The disks perform well in a raid5 and the I/O values are very good.
> But in the domU we see iowait of up to 80% (e.g. during rsync or
> bzip). Can you tell me about your experience? We have not been able
> to narrow the problem down so far.
I'm sorry but I can't help you with the iowait problem since I really 
haven't had a chance to implement my setup yet.
>
> Regards to Lund (jealous of midsommar...)
> Jan Kellermann
>
>
> Gabriele Kalus wrote:
>> I don’t know if this is a drbd or a xen issue so I will post this on 
>> both lists.
>>
>> I have 2 HP Proliant servers with SLES10 SP2 installed. There are 8 
>> disks on the system, 2 of the disks are in a hardware raid1 and the 
>> rest is raid5, also hardware. The servers have several network cards; 
>> one is exclusively for drbd replication. I use the drbd 0.7.22-42 
>> that ships with SLES10 and kernel 2.6.16.60-0.23-xen.
>>
>> The Xen host is installed on a non-lvm2 partition of 20 GB on the
>> raid1. The rest of the raid1 is a volume group called system. I have
>> several logical volumes on the volume group system, called mail-root,
>> mail-swap, webmail-root, webmail-swap, etc. The *-root lvs are to be
>> the drbd devices where the domU systems are going to be installed.
>> The *-swap lvs are just for swap, which I don’t want to replicate.
>>
>> I have tested the setup without the domUs running but with mounted 
>> drbd devices and pulled the network cable that manages the 
>> replication. Everything works as it should; both servers continue 
>> working and replication starts as soon as I connect the cable again.
>>
>> I have tested installing the domUs on file systems on the drbd
>> devices, i.e. with the drbd devices mounted, and it also works as it
>> should.
>>
>> But when I install the domUs on the physical devices /dev/drbdx and
>> pull the network cable between the replicating nics, the Xen host
>> with the domUs running hangs. Only a cold boot gets it started
>> again. The replication works as it should as long as the cable is
>> there and both servers are up; everything gets replicated through
>> the dedicated nic. But if I reboot the passive server, the active
>> server hangs, i.e. the same situation as when I pull the network
>> cable. I have all the domUs on one of the servers; the other server
>> is just passive.
>>
>> I haven't installed heartbeat, to avoid adding complexity to the
>> situation.
>>
>> Can somebody help me figure out why I can’t get it to work when
>> the domUs are installed on physical devices?
>>
>> Regards
>>
>> Gabriele
>>
>


-- 
Gabriele Kalus, Ph.D.
IT-Manager
Lund University, Physics Department
Box 118 SE-22100 Lund, SWEDEN
Phone:	+46-462229675
Mobil:	0702-901227
Fax:	+46-462224709