Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
On Wed, Nov 16, 2011 at 01:01:38PM +1100, Steve Kieu wrote:
> Hello,
>
> I am experimenting drbd and not quite good in stability (un usable). I saw
> this in dmesg log:
>
> block drbd1: md_sync_timer expired! Worker calls drbd_md_sync().

Usually, especially with "huge" devices, this is no reason to worry.
No need to do _anything_.

> At fist restart it works for a while, and then all of sudden - cat
> /proc/drbd show ProtocolError and system hang (mysql or any other process
> read/write to the drbd partitions.
>
> It is repeatable and when it happend network is not busy, machine load is
> nearly 0 and all other network connectivity is normal.
>
> Googling show me that many users has same problem and one suggested to
> lower the rate of resync and sync, I did that (for 100Mbit ethernet I set
> resync is 3M and in syncer rate 40M; I setup two volumes . Problem still.
>
>
> Here is the short description of the system:
>
> * Centos 6 x86_64
> * Kernel 2.6.32.43-vs2.3.0.36.29.8-h1-32cpu-noselinux which is vanilar
> kernel 2.6.32.43 with vserver patch vs2.3.0.36.29.8 - compile with HZ = 100
> and SMP for 32 cpu
> * DRBD compiled from source, version 8.4.0 (including kernel module)

8.4.0 seems to have serious stability issues under moderate to heavy IO
when actually using the multi volume feature :-(

We are preparing a 8.4.1.

> * DRBD build on top of LVM here is the config
>
> resource r0 {
>
>   on cosmos {
>     volume 0 {
>       #device minor 0;
>       device /dev/drbd0;
>       meta-disk internal;
>       disk /dev/vs-resource1/mysqldata;
>     }
>
>     volume 1 {
>       device /dev/drbd1;
>       meta-disk internal;
>       disk /dev/vs-resource1/pgsqldata;
>     }
>
>     address 10.200.11.4:7789;
>   }
>
>   on seaspray {
>     volume 0 {
>       # device minor 0;
>       device /dev/drbd0;
>       meta-disk internal;
>       disk /dev/vg_seaspray/mysqldata;
>     }
>
>     volume 1 {
>       device /dev/drbd1;
>       meta-disk internal;
>       disk /dev/vg_seaspray/pgsqldata;
>     }
>
>     address 10.200.11.3:7789;
>   }
>
>   startup {
>     #become-primary-on both;
>
>   }
>   net {
>     #allow-two-primaries;
>     protocol C;
>     after-sb-0pri discard-zero-changes;
>     after-sb-1pri discard-secondary;
>     after-sb-2pri disconnect;
>     #cram-hmac-alg sha1;
>     #shared-secret "FooFunFactory";
>
>   }
>
>
> }
>
> * DRBD runs in Primary/Secondary mode for now. The device is mounted into a
> vserver instance and mysql and postgres is running from the vserver
> * IPtables is setup to allow DRBD trafic - it happened even iptables is off
>
> * Network route
> route
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 10.200.11.0     *               255.255.255.224 U     0      0        0 eth0
> 10.200.11.128   *               255.255.255.192 U     0      0        0 eth1.503
> 192.168.100.0   *               255.255.255.0   U     0      0        0 dummy0
> 1.1.1.0         *               255.255.255.0   U     0      0        0 vmbr0
> link-local      *               255.255.0.0     U     1002   0        0 eth0
> link-local      *               255.255.0.0     U     1003   0        0 eth1
> link-local      *               255.255.0.0     U     1004   0        0 eth1.503
> default         10.200.11.1     0.0.0.0         UG    0      0        0 eth0
>
> I attach the dmesg here as well if it helps to debug. I would like to have
> it fixed so please help.
>
> Many thanks,
>
> --
> Steve Kieu
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
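
Archive annotation (not part of the original message): the report above
mentions spotting ProtocolError by running cat /proc/drbd by hand. Below is a
minimal sketch of how that check might be scripted. It assumes each device
line in /proc/drbd carries a "cs:<State>" field, as DRBD 8.x output does. The
sample text, the function names, and the set of states treated as healthy are
illustrative assumptions, not taken from the thread.

```python
import re

# States this sketch treats as healthy. This set is an assumption made for
# illustration; adjust it to whatever your own monitoring should tolerate.
HEALTHY_STATES = {"Connected", "SyncSource", "SyncTarget"}

# Matches the connection-state field, e.g. "cs:Connected".
CS_FIELD = re.compile(r"\bcs:(\w+)")


def connection_states(proc_drbd_text):
    """Return every cs:<State> value found in the text, in order."""
    return CS_FIELD.findall(proc_drbd_text)


def unhealthy_states(proc_drbd_text):
    """Return the connection states that are not in HEALTHY_STATES."""
    return [s for s in connection_states(proc_drbd_text)
            if s not in HEALTHY_STATES]


# Made-up sample shaped like two /proc/drbd device lines (hypothetical data).
SAMPLE = """\
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
 1: cs:ProtocolError ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
"""

if __name__ == "__main__":
    print(unhealthy_states(SAMPLE))
```

On a real node, the text would come from reading /proc/drbd instead of SAMPLE.
A check like this only reports the state; it does nothing about the
underlying 8.4.0 multi-volume instability described in the reply.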