Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
/ 2004-01-28 09:02:43 -0000
\ Dave Smith:
> Hi all,
>
> I am currently struggling with a Linux HA cluster. This is
> my first venture into this area, but I am looking for an HA cluster
> of smb file servers.
>
> The setup I am trying to get working is:
>
> 2 identical servers, each with:
>   Tyan Thunder i7501 Pro Motherboard
>   Dual Pentium Xeon 2.4GHz 533 FSB
>   2GB ECC 266 RAM
>   2 Gigabit Ethernet ports
>   4GB SCSI HDD (/dev/sdc) SWAP Partition
>   4GB SCSI HDD (/dev/sdd) System disk
>   2 x 3Ware Escalade 8506 SATA RAID controllers
>     Each with 4 SATA 200GB HDD (8MB Cache)
>     Configured as RAID5 with 1 Hot Spare
>   I want to RAID0 the two RAID5 arrays with software RAID for performance
>
> The two servers will be clustered together with drbd and heartbeat over a
> dedicated Gigabit link.
>
> The Problem
> -----------
>
> The hardware RAID5 seems to be fine, and working well, but when I introduce
> the RAID0 level on top, drbd seems to hang the PRIMARY machine
> after a relatively short period of time.
>
> I have tried drbd with just RAID5 (hardware), and that seems fine.
>
> When I put RAID0 on top it all falls down.
>
> I have tried RAID0 on top without a file system and with a filesystem.
> I have tried creating the RAID0 using both mdadm and raidtools.
>
> I have updated the kernel to the latest version
> I have rebuilt the 3ware drivers to the latest version
> I have updated the 3ware firmware

What about the drbd version? Did you use 0.6.10+cvs?
And which "latest kernel version": kernel.org, or some vendor kernel?
Not that I think it is a kernel problem, but it would not be the first
interoperability problem ...

> The signs
> ---------
> The primary machine just hangs. There are no panics or logs of anything
> unusual.
> The secondary machine gives a c:WFConnect s:Secondary/Unknown status from
> /proc/drbd
> /var/log/messages reports a ping ack timeout error. This is all that
> happens.
>
> Thank you in advance to anyone who might be able to help me.
>
> Neil
>
>
> Below are the drbd.conf file and the raidtab file (which I used in the
> raidtools test)
>
> # drbd.conf
> resource drbd0 {
>   protocol = B

You should use protocol C, unless you are mirroring over some
long-distance link, in which case you should use A.
Benchmarks suggest that protocol B does not cut it at all, even though
one might think it does.

[...]

> # raidtab
> raiddev /dev/md0
>     raid-level 0
>     nr-raid-disks 2
>     persistent-superblock 1
>     chunk-size 64
>     device /dev/sda1
>     raid-disk 0
>     device /dev/sdb1
>     raid-disk 1

I don't see anything suspicious.

	Lars Ellenberg
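
For readers unsure how the three drbd protocols differ, the rule of thumb given above (C by default, A only for long-distance links) can be sketched roughly as follows. This is only an illustration: the `Protocol` enum and the `recommend_protocol` helper are hypothetical names, not part of drbd, and the descriptions are a summary of when each protocol reports a write as complete.

```python
from enum import Enum


class Protocol(Enum):
    """drbd replication protocols: when a write is reported as complete."""
    A = "local disk write done, packet queued in the local TCP send buffer"
    B = "local disk write done, packet received by the peer (in memory)"
    C = "local and peer disk writes both done"


def recommend_protocol(long_distance_link: bool) -> Protocol:
    # Hypothetical helper encoding the rule of thumb from the reply above:
    # protocol C unless mirroring over a long-distance link, then A.
    # Protocol B is never recommended here.
    return Protocol.A if long_distance_link else Protocol.C


if __name__ == "__main__":
    for link, far in (("dedicated Gigabit link", False),
                      ("long-distance link", True)):
        choice = recommend_protocol(far)
        print(f"{link}: protocol {choice.name} ({choice.value})")
```

For the setup described in the original post (a dedicated Gigabit link between two servers), this would pick protocol C rather than the B set in the posted drbd.conf.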