Hi all,

I am currently struggling with a Linux HA cluster. This is my first
venture into this area; I am trying to build an HA cluster of SMB file
servers.

The setup I am trying to get working is 2 identical servers, each with:

  Tyan Thunder i7501 Pro motherboard
  Dual Pentium Xeon 2.4 GHz, 533 FSB
  2 GB ECC 266 RAM
  2 Gigabit Ethernet ports
  4 GB SCSI HDD (/dev/sdc), swap partition
  4 GB SCSI HDD (/dev/sdd), system disk
  2 x 3Ware Escalade 8506 SATA RAID controllers,
    each with 4 SATA 200 GB HDDs (8 MB cache),
    configured as RAID5 with 1 hot spare

I want to RAID0 the two RAID5 arrays with software RAID for performance.

The two servers will be clustered together with drbd and heartbeat over
a dedicated Gigabit link.

The Problem
-----------

The hardware RAID5 seems to be fine and working well, but when I
introduce the RAID0 level on top, drbd seems to hang the PRIMARY machine
after a relatively short period of time.

I have tried drbd with just RAID5 (hardware), and that seems fine. When
I put RAID0 on top, it all falls down.

So far:

  I have tried RAID0 on top both without and with a filesystem.
  I have tried creating the RAID0 using both mdadm and raidtools.
  I have updated the kernel to the latest version.
  I have rebuilt the 3ware drivers to the latest version.
  I have updated the 3ware firmware.

The signs
---------

The primary machine just hangs. There are no panics or logs of anything
unusual.

The secondary machine gives a c:WFConnect s:Secondary/Unknown status
from /proc/drbd.

/var/log/messages reports a ping ack timeout error.

This is all that happens.

Thank you in advance to anyone who might be able to help me.

Neil

Below are the drbd.conf file and the raidtab file (which I used in the
raidtools test).

# drbd.conf
resource drbd0 {
  protocol = B
  fsckcmd = /bin/true
  disk {
    do-panic
    disk-size = 796582784k
  }
  net {
    sync-nice = -18
    sync-min = 4M
    sync-max = 500M
    tl-size = 5000
    timeout = 60
    connect-int = 10
    ping-int = 10
  }
  on babbage {
    device = /dev/nb0
    disk = /dev/md0
    address = 172.16.0.1
    port = 7788
  }
  on newton {
    device = /dev/nb0
    disk = /dev/md0
    address = 172.16.0.2
    port = 7788
  }
}

# raidtab
raiddev /dev/md0
    raid-level 0
    nr-raid-disks 2
    persistent-superblock 1
    chunk-size 64
    device /dev/sda1
    raid-disk 0
    device /dev/sdb1
    raid-disk 1
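
For anyone trying to reproduce this, the status line quoted under "The signs" could be checked from a script on the secondary, which may help to pin down when the primary stops answering. What follows is only a minimal sketch. It assumes the status fields appear as whitespace-separated key:value tokens, exactly as in the "c:WFConnect s:Secondary/Unknown" excerpt above; the real layout of /proc/drbd may differ between drbd versions. The function names (parse_status, peer_unknown, read_status) are hypothetical and not part of drbd.

```python
import re

# Assumption: fields look like "c:WFConnect s:Secondary/Unknown", as quoted
# in the message. This is not a documented drbd format.
TOKEN = re.compile(r"(\w+):(\S+)")


def parse_status(line):
    """Return a dict of the key:value tokens found in one status line."""
    return dict(TOKEN.findall(line))


def peer_unknown(fields):
    """True when the state field ("s") reports the peer as Unknown."""
    return fields.get("s", "").endswith("/Unknown")


def read_status(path="/proc/drbd"):
    """Parse every line of the status file that contains key:value tokens."""
    with open(path) as handle:
        parsed = [parse_status(line) for line in handle]
    # Drop lines (such as a version banner) that yielded no tokens.
    return [fields for fields in parsed if fields]
```

Calling parse_status on the quoted excerpt gives a dict with the keys "c" and "s", and peer_unknown on that dict returns True. Running read_status from cron or a loop and logging a timestamp whenever peer_unknown first becomes True would give a rough time for the hang to compare against /var/log/messages.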