[DRBD-user] Problem with drbd and software RAID0 on hardware RAID5

Lars Ellenberg Lars.Ellenberg at linbit.com
Wed Jan 28 13:37:01 CET 2004



/ 2004-01-28 09:02:43 -0000
\ Dave Smith:
> Hi all,
> 
> I am currently struggling with a Linux HA cluster.   This is
> my first venture into this area, but I am looking for a HA cluster
> of smb file servers.
> 
> The setup I am trying to get working is:
> 
> 2 identical servers, each with:
>  Tyan Thunder i7501 Pro Motherboard
>  Dual Pentium Xeon 2.4Ghz 533 FSB
>  2Gb ECC 266 RAM
>  2 Gigabit Ethernet ports
>  4Gb SCSI HDD (/dev/sdc) SWAP Partition
>  4Gb SCSI HDD (/dev/sdd) System disk
>  2 x 3Ware Escalade 8506 SATA RAID controllers
>   Each with 4 SATA 200Gb HDD (8Mb Cache)
>   Configured as RAID5 with 1 Hot Spare
>  I want to RAID0 the two RAID5 arrays with software RAID for performance
> 
> The two servers will be clustered together with drbd and heartbeat over a
> dedicated Gigabit link.
> 
> The Problem
> -----------
> 
> The hardware RAID5 seems to be fine, and working well, but when I introduce
> the RAID0 level on top, drbd seems to hang the PRIMARY machine
> after a relatively short period of time.
> 
> I have tried drbd with just RAID5 (hardware), and that seems fine.
> 
> When I put RAID0 on top it all falls down.
> 
> I have tried RAID0 on top without a file system and with a filesystem.
> I have tried creating the RAID0 using both mdadm and raidtools.
> 
> I have updated the kernel to the latest version
> I have rebuilt the 3ware drivers to the latest version
> I have updated the 3ware firmware

What about the drbd version? Did you use 0.6.10+cvs?
And which "latest kernel version": kernel.org, or some vendor kernel?
Not that I think it is a kernel problem, but it would not be the first
interoperability problem ...

> The signs
> ---------
> The primary machine just hangs.   There are no panics or logs of anything
> unusual.
> The secondary machine gives a c:WFConnect s:Secondary/Unknown status from
> /proc/drbd
> /var/log/messages reports a ping ack timeout error.   This is all that
> happens.
> 
> Thank you in advance to anyone who might be able to help me.
> 
> Neil
> 
> 
> Below are the drbd.conf file and the raidtab file (which I used in the
> raidtools test)

> 
> # drbd.conf
> resource drbd0 {
>   protocol = B

You should use protocol C, unless you are mirroring over some
long-distance link, in which case you should use A. Benchmarks
suggest that protocol B does not cut it at all, even though one
may think it does.
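
To make the difference between the three protocols concrete, here is a
minimal sketch in Python. It is an illustrative model only, not DRBD code:
the event names and the write_complete() helper are made up for this
example, and the rules encode the commonly documented meaning of
protocols A, B and C (when a write is reported as complete on the primary).

```python
# Illustrative model only, not DRBD code. Event names are hypothetical.
from enum import Enum


class Event(Enum):
    LOCAL_DISK = "local disk write finished"
    SENT = "packet placed in the local TCP send buffer"
    PEER_RECEIVED = "packet reached the peer node"
    PEER_DISK = "peer confirmed its own disk write"


# Events that must have happened before the primary reports a write
# as complete, per protocol (assumption: commonly documented semantics).
REQUIRED = {
    "A": {Event.LOCAL_DISK, Event.SENT},           # asynchronous
    "B": {Event.LOCAL_DISK, Event.PEER_RECEIVED},  # memory synchronous
    "C": {Event.LOCAL_DISK, Event.PEER_DISK},      # fully synchronous
}


def write_complete(protocol, seen):
    """Return True if every event required by `protocol` is in `seen`."""
    return REQUIRED[protocol] <= set(seen)


if __name__ == "__main__":
    seen = {Event.LOCAL_DISK, Event.SENT, Event.PEER_RECEIVED}
    for proto in "ABC":
        print(proto, write_complete(proto, seen))
```

With only protocol C is a completed write known to be on both disks, which
is why it is the safe default on a dedicated local link.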

[...]

> # raidtab
> raiddev /dev/md0
>  raid-level 0
>  nr-raid-disks 2
>  persistent-superblock 1
>  chunk-size 64
>  device  /dev/sda1
>  raid-disk 0
>  device  /dev/sdb1
>  raid-disk 1

I don't see anything suspicious.
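
For reference, the layout that raidtab describes can be sketched as
follows. This is a simplified model of RAID0 striping, assuming both
members are the same size and that chunk-size is given in KiB; the
locate() helper is hypothetical and is not how the md driver is
implemented.

```python
# Simplified RAID0 striping model; locate() is a hypothetical helper.
CHUNK_KIB = 64                          # chunk-size from the raidtab
DEVICES = ["/dev/sda1", "/dev/sdb1"]    # raid-disk 0 and raid-disk 1


def locate(offset_bytes, chunk_kib=CHUNK_KIB, devices=DEVICES):
    """Map a byte offset on the array to (member device, offset on it)."""
    chunk_bytes = chunk_kib * 1024
    # Which chunk of the array the offset falls into, and where inside it.
    chunk_index, within = divmod(offset_bytes, chunk_bytes)
    # Chunks are dealt out round-robin across the member devices.
    stripe, disk = divmod(chunk_index, len(devices))
    return devices[disk], stripe * chunk_bytes + within


if __name__ == "__main__":
    for off in (0, 64 * 1024, 128 * 1024 + 10):
        print(off, locate(off))
```

Consecutive 64 KiB chunks alternate between the two hardware RAID5
arrays, so large sequential I/O is spread over both controllers.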

	Lars Ellenberg


