Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hello,

I am testing drbd + heartbeat for an HA setup consisting of two cluster members. The first is a Dell 2400, 256MB, dual PIII 500, HW RAID. The second is a Dell 2300, 128MB, single PIII 500, SW RAID. Both systems are running RedHat 9 with the 2.4.20-31.9smp kernel (the single-processor box also runs the SMP kernel because of a bug in the 440GX chipset: APIC only works with an SMP kernel). I am using drbd 0.6.12, as 0.7 seemed hell on my machines (loads of kernel oopses, panics, hangs etc.).

So far I've been having good results. I tested failover between the nodes, and it all worked well, until I decided to test the all-out disaster scenario. First I took down my primary cluster node (by disconnecting all its NICs). Failover went well, as expected. Then I decided to go all the way and gracefully shut down the secondary node as well.

In this scenario you would boot the secondary cluster node first, as it has the latest data set. And since I want HA, I decided not to wait for the other side of drbd to show up before making the disks primary. Up to this point there was still no problem: the disks were mounted and data was served from the secondary cluster node.

But when I booted my primary cluster node, the shit really hit the fan (you should see my office, it smells terrible ;-). As soon as it started replicating data from the secondary cluster node, the problems began. Both nodes immediately showed lock-up symptoms (e.g. not being able to log in on the console or via ssh). Sessions that were already logged in kept working, except that running su would also lock up.

A 'cat /proc/drbd' would initially show acceptable speeds (around 5MB/s, which is my sync-min; syncing from the primary node to the secondary reaches 10MB/s+). The system load would also slowly increase, up to the point where heartbeat initiated a failover (if I run softdog, it even just resets the machine):

  11:09:37  up 10:23,  1 user,  load average: 3.58, 3.00, 2.41
  85 processes: 75 sleeping, 7 running, 3 zombie, 0 stopped
  CPU states:  70.9% user  29.0% system  0.0% nice  0.0% iowait  0.0% idle
  Mem:  125412k av, 122820k used,   2592k free,      0k shrd,  36628k buff
                     78112k actv,    796k in_d,   1624k in_c
  Swap: 787064k av,   1184k used, 785880k free                 54192k cached

(CPU was usually not at 100%, but more like 25 to 30%.) A load of 3+ on a single-CPU machine, while not using that much memory and CPU time, is weird. At this point sync speeds would also drop to under 1MB/s, and the console got overloaded with these messages:

  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967295
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967294
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967295
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967295
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967294
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967295
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967294
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967295
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967295
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967294
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967295
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967294
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967295
  drbd1: [drbd_syncer_1/4321] sock_sendmsg time expired, ko = 4294967295
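For completeness, this is roughly how I was keeping an eye on things while the resync was running (from memory, so treat the exact invocations as approximate, nothing beyond standard tools):

  watch -n 5 cat /proc/drbd   # resync progress and speed
  top                         # load average and memory, as quoted above
  dmesg | tail -n 20          # the sock_sendmsg messages also end up in the kernel ring buffer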
I've tried fiddling with the sync parameters (sync-nice, sync-group, tl-size, etc.). Nothing helped, although the symptoms did vary (time before the systems locked up, time before heartbeat failed over, more or fewer of the sock_sendmsg messages). As soon as heartbeat had shut itself down, the sync speed would sometimes go up again, but at other times it remained low. Same with the load: sometimes it went back down to normal values, sometimes not. The same goes for the lock-ups. Stopping the sync by disconnecting the secondary cluster node always brought the systems back to normal.

The only way the systems remained stable was by doing the sync in single-user mode. But since we're talking about 70GB of data, a sync at 5MB/s would take the better part of four hours, which is unacceptable downtime. I will now start with a new data set and see if I can reproduce the problem; I am not going to wait for the sync to finish in single-user mode. I would not mind if, in a situation like this, syncing the data back to the primary node takes a day, but it has to be stable and the secondary node has to keep serving the data in the meantime.

My drbd.conf:

  resource drbd0 {
    protocol = C
    fsckcmd  = /bin/true

    disk {
      disk-size = 4890000k
      do-panic
    }

    net {
      sync-group = 0
      sync-rate  = 8M
      sync-min   = 5M
      sync-max   = 10M
      sync-nice  = 0
      tl-size    = 5000
      ping-int   = 10
      timeout    = 9
    }

    on syslogcs-cla {
      device  = /dev/nb0
      disk    = /dev/sdb2
      address = 10.0.0.1
      port    = 7788
    }

    on syslogcs-clb {
      device  = /dev/nb0
      disk    = /dev/md14
      address = 10.0.0.2
      port    = 7788
    }
  }

  resource drbd1 {
    protocol = C
    fsckcmd  = /bin/true

    disk {
      disk-size = 64700000k
      do-panic
    }

    net {
      sync-group = 1
      sync-rate  = 8M
      sync-min   = 5M
      sync-max   = 10M
      sync-nice  = 19
      tl-size    = 5000
      ping-int   = 10
      timeout    = 9
    }

    on syslogcs-cla {
      device  = /dev/nb1
      disk    = /dev/sdb3
      address = 10.0.0.1
      port    = 7789
    }

    on syslogcs-clb {
      device  = /dev/nb1
      disk    = /dev/md15
      address = 10.0.0.2
      port    = 7789
    }
  }

/dev/md14 is a RAID0 made of two RAID1 pairs (md9 & md10).
/dev/md15 is a RAID0 made of two RAID1 pairs (md11 & md12).

Output of the mount commands:

  drbd1: blksize=1024 B
  drbd1: blksize=4096 B
  kjournald starting.  Commit interval 5 seconds
  EXT3 FS 2.4-0.9.19, 19 August 2002 on drbd(43,1), internal journal
  EXT3-fs: mounted filesystem with ordered data mode.

Why the different block sizes? Both disks show this when mounting. Sometimes I get the message that an md device used an obsolete ioctl, but that should only be cosmetic. On the SW RAID system I sometimes got the message that the block size couldn't be determined and 512b was assumed. The SW RAID seems to outperform the HW RAID by 100%.

On rare occasions I saw lock-ups of fsck or mount during heartbeat start-up, one time even causing the entire system to hang during reboot (killall was not able to kill the hanging mount process).

Maybe also important: some md devices were resyncing at the same time as the drbd devices were syncing. The md resync was not achieving high speeds either. You would expect that while the drbd sync is using 5MB/s, but not once that drops; you would then expect the md sync to go faster, but it didn't, it stayed at 100-300KB/s.

Lots of information, but probably more is needed. I will let you know if I can reproduce the problem once I have created new data sets to test with.

Sietse
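P.S. In case anyone wants to check the downtime estimate: assuming the full 70GB has to go over the wire at a steady 5MB/s (which it never was once the slowdowns started), a quick back-of-the-envelope with bc gives

  $ echo "scale=1; 70 * 1024 / 5 / 3600" | bc
  3.9

so close to four hours in the best case, and far longer once the sync rate collapses to under 1MB/s.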