Hi all,
I am currently struggling with a Linux HA cluster. This is
my first venture into this area, and I am trying to build an HA cluster
of SMB file servers.
The setup I am trying to get working is:
2 identical servers, each with:
Tyan Thunder i7501 Pro Motherboard
Dual Pentium Xeon 2.4GHz 533 FSB
2GB ECC 266 RAM
2 Gigabit Ethernet ports
4GB SCSI HDD (/dev/sdc) SWAP Partition
4GB SCSI HDD (/dev/sdd) System disk
2 x 3Ware Escalade 8506 SATA RAID controllers
  Each with 4 SATA 200GB HDD (8MB Cache)
  Configured as RAID5 with 1 Hot Spare
I want to stripe the two RAID5 arrays together as RAID0 with software RAID,
for performance.
The two servers will be clustered together with drbd and heartbeat over a
dedicated Gigabit link.
The Problem
-----------
The hardware RAID5 seems to be fine and working well, but when I introduce
the RAID0 level on top, drbd seems to hang the PRIMARY machine after a
relatively short period of time.
I have tried drbd with just RAID5 (hardware), and that seems fine.
When I put RAID0 on top it all falls down.
I have tried RAID0 on top both without a filesystem and with a filesystem.
I have tried creating the RAID0 using both mdadm and raidtools.
I have updated the kernel to the latest version.
I have rebuilt the 3ware drivers to the latest version.
I have updated the 3ware firmware.
The signs
---------
The primary machine just hangs. There are no panics or logs of anything
unusual.
The secondary machine gives a c:WFConnect s:Secondary/Unknown status from
/proc/drbd.
/var/log/messages reports a ping ack timeout error. This is all that
happens.
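
Since the only visible sign is the status line in /proc/drbd, recording that
line at intervals on the secondary may help narrow down when the primary
stops responding. The following is a minimal Python sketch, not something
that was run on these machines. It assumes a device line of the form
"0: cs:WFConnection st:Secondary/Unknown ..." and also accepts the short
"c:" / "s:" spelling quoted above; the exact layout of /proc/drbd and the
function names are assumptions.

```python
import re

# Assumed layout of a /proc/drbd device line, for example:
#   0: cs:WFConnection st:Secondary/Unknown ns:0 nr:0
# The short "c:" / "s:" spelling is accepted as well.
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs?:(?P<conn>\S+)\s+st?:(?P<local>[^/\s]+)/(?P<peer>\S+)"
)


def parse_drbd_status(text):
    """Return one dict per device line found in the text of /proc/drbd."""
    devices = []
    for line in text.splitlines():
        match = STATUS_RE.match(line)
        if match:
            devices.append({
                "minor": int(match.group("minor")),
                "connection": match.group("conn"),
                "local": match.group("local"),
                "peer": match.group("peer"),
            })
    return devices


def is_degraded(device):
    """True when the link is not Connected or the peer state is Unknown."""
    return device["connection"] != "Connected" or device["peer"] == "Unknown"
```

Calling parse_drbd_status(open("/proc/drbd").read()) from a cron job or a
sleep loop and appending the result with a timestamp to a file on the local
system disk would give a rough timeline of the state changes.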
Thank you in advance to anyone who might be able to help me.
Neil
Below are the drbd.conf file and the raidtab file (which I used in the
raidtools test).

# drbd.conf
resource drbd0 {
  protocol = B
  fsckcmd = /bin/true
  disk {
    do-panic
    disk-size = 796582784k
  }
  net {
    sync-nice = -18
    sync-min = 4M
    sync-max = 500M
    tl-size = 5000
    timeout = 60
    connect-int = 10
    ping-int = 10
  }
  on babbage {
    device = /dev/nb0
    disk = /dev/md0
    address = 172.16.0.1
    port = 7788
  }
  on newton {
    device = /dev/nb0
    disk = /dev/md0
    address = 172.16.0.2
    port = 7788
  }
}

# raidtab
raiddev /dev/md0
  raid-level 0
  nr-raid-disks 2
  persistent-superblock 1
  chunk-size 64
  device /dev/sda1
  raid-disk 0
  device /dev/sdb1
  raid-disk 1
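
For comparison with the mdadm attempt, the raidtab above can be translated
into the matching mdadm --create arguments. The Python sketch below only
builds the argument list and does not run anything. The helper name
raidtab_to_mdadm is hypothetical, and it assumes a raidtab that describes a
single array with the devices listed in raid-disk order.

```python
def raidtab_to_mdadm(raidtab_text):
    """Build an mdadm --create argument list from a single-array raidtab.

    Assumes one raiddev section and devices listed in raid-disk order.
    """
    settings = {}
    devices = []
    for raw in raidtab_text.splitlines():
        # Drop comments, then split "key value" on whitespace.
        fields = raw.split("#", 1)[0].split()
        if len(fields) != 2:
            continue  # skip blank lines and anything unexpected
        key, value = fields
        if key == "device":
            devices.append(value)
        else:
            settings[key] = value
    # raidtab chunk-size and mdadm --chunk are both given in kilobytes.
    return [
        "mdadm", "--create", settings["raiddev"],
        "--level=" + settings["raid-level"],
        "--raid-devices=" + settings["nr-raid-disks"],
        "--chunk=" + settings["chunk-size"],
    ] + devices
```

For the raidtab shown here this yields /dev/md0 at level 0 with two devices
and a 64k chunk, built from /dev/sda1 and /dev/sdb1. Note that running
mdadm --create writes new superblocks to the member devices.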