On Wed, Nov 16, 2011 at 01:01:38PM +1100, Steve Kieu wrote:
> Hello,
>
> I am experimenting with DRBD and it is not very stable (unusable). I saw
> this in the dmesg log:
>
> block drbd1: md_sync_timer expired! Worker calls drbd_md_sync().
Usually, especially with "huge" devices, this is no reason to worry.
No need to do _anything_.
> At first, after a restart, it works for a while, and then all of a sudden
> cat /proc/drbd shows ProtocolError and the system hangs (mysql or any other
> process reading from or writing to the DRBD partitions).
>
> It is repeatable, and when it happens the network is not busy, machine load
> is nearly 0 and all other network connectivity is normal.
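For anyone trying to catch the moment the connection state changes, here is a minimal Python sketch that pulls the `cs:` field out of /proc/drbd-style text. The sample text is hypothetical (it is not output from the poster's machine), and the function name `connection_states` is made up for illustration; the only thing relied on is the `N: cs:State ...` layout of the per-device lines.

```python
import re

# Hypothetical sample in the style of /proc/drbd; NOT taken from the poster's system.
SAMPLE = """\
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
 1: cs:ProtocolError ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
"""

def connection_states(text):
    """Return {minor number: connection state} for each device line found."""
    states = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+):\s+cs:(\S+)", line)
        if m:
            states[int(m.group(1))] = m.group(2)
    return states

if __name__ == "__main__":
    # On a real node one would read open("/proc/drbd").read() instead of SAMPLE.
    for minor, cs in sorted(connection_states(SAMPLE).items()):
        flag = "" if cs == "Connected" else "  <-- not Connected"
        print(f"drbd{minor}: {cs}{flag}")
```

Run periodically (from cron or a loop), something like this could timestamp the transition to ProtocolError so it can be matched against the kernel log.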
>
> Googling shows me that many users have the same problem, and one suggested
> lowering the resync and sync rates. I did that (for 100Mbit ethernet I set
> resync to 3M and the syncer rate to 40M; I set up two volumes). The problem
> persists.
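To put the 40M figure in perspective: a 100 Mbit/s link carries at most about 12.5 MB/s. A rough Python sketch of the arithmetic follows. The 30% fraction is the rule of thumb the DRBD User's Guide gives for the resync rate; the function names are made up for illustration, and protocol overhead and the MB/MiB distinction are ignored.

```python
def link_capacity_mb_per_s(mbit_per_s):
    """Convert a link speed in Mbit/s to MB/s (8 bits per byte, no overhead)."""
    return mbit_per_s / 8.0

def suggested_resync_rate(mbit_per_s, fraction=0.3):
    """Rule-of-thumb resync rate: a fraction of the raw link capacity."""
    return link_capacity_mb_per_s(mbit_per_s) * fraction

if __name__ == "__main__":
    cap = link_capacity_mb_per_s(100)
    print(f"link capacity:      {cap:.2f} MB/s")
    print(f"30% rule of thumb:  {suggested_resync_rate(100):.2f} MB/s")
    print(f"40M exceeds link:   {40 > cap}")
```

By this estimate a setting of 3M is roughly in line with the rule of thumb, while 40M is more than a 100 Mbit/s link can carry at all.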
>
>
> Here is the short description of the system:
>
> * Centos 6 x86_64
> * Kernel 2.6.32.43-vs2.3.0.36.29.8-h1-32cpu-noselinux, which is a vanilla
> 2.6.32.43 kernel with the vserver patch vs2.3.0.36.29.8, compiled with
> HZ = 100 and SMP for 32 CPUs
> * DRBD compiled from source, version 8.4.0 (including kernel module)
8.4.0 seems to have serious stability issues under moderate to heavy IO
when actually using the multi volume feature :-(
We are preparing an 8.4.1 release.
> * DRBD built on top of LVM; here is the config:
>
> resource r0 {
>
>   on cosmos {
>     volume 0 {
>       #device minor 0;
>       device /dev/drbd0;
>       meta-disk internal;
>       disk /dev/vs-resource1/mysqldata;
>     }
>
>     volume 1 {
>       device /dev/drbd1;
>       meta-disk internal;
>       disk /dev/vs-resource1/pgsqldata;
>     }
>
>     address 10.200.11.4:7789;
>   }
>
>   on seaspray {
>     volume 0 {
>       # device minor 0;
>       device /dev/drbd0;
>       meta-disk internal;
>       disk /dev/vg_seaspray/mysqldata;
>     }
>
>     volume 1 {
>       device /dev/drbd1;
>       meta-disk internal;
>       disk /dev/vg_seaspray/pgsqldata;
>     }
>
>     address 10.200.11.3:7789;
>   }
>
>   startup {
>     #become-primary-on both;
>   }
>
>   net {
>     #allow-two-primaries;
>     protocol C;
>     after-sb-0pri discard-zero-changes;
>     after-sb-1pri discard-secondary;
>     after-sb-2pri disconnect;
>     #cram-hmac-alg sha1;
>     #shared-secret "FooFunFactory";
>   }
>
> }
>
> * DRBD runs in Primary/Secondary mode for now. The device is mounted into a
> vserver instance, and mysql and postgres are running from the vserver
> * iptables is set up to allow DRBD traffic; it happens even when iptables
> is off
>
> * Network route
> route
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 10.200.11.0     *               255.255.255.224 U     0      0        0 eth0
> 10.200.11.128   *               255.255.255.192 U     0      0        0 eth1.503
> 192.168.100.0   *               255.255.255.0   U     0      0        0 dummy0
> 1.1.1.0         *               255.255.255.0   U     0      0        0 vmbr0
> link-local      *               255.255.0.0     U     1002   0        0 eth0
> link-local      *               255.255.0.0     U     1003   0        0 eth1
> link-local      *               255.255.0.0     U     1004   0        0 eth1.503
> default         10.200.11.1     0.0.0.0         UG    0      0        0 eth0
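Both DRBD addresses in the config (10.200.11.3 and 10.200.11.4) appear to fall inside the directly connected 10.200.11.0/255.255.255.224 network on eth0, so replication traffic should not need the default gateway. A small Python check of that, using only the standard `ipaddress` module; the helper name `on_network` is an assumption made for illustration:

```python
import ipaddress

# Network and addresses copied from the routing table and the resource config.
ETH0_NET = ipaddress.ip_network("10.200.11.0/255.255.255.224")
DRBD_ADDRS = {"cosmos": "10.200.11.4", "seaspray": "10.200.11.3"}

def on_network(addr, net=ETH0_NET):
    """True if addr lies inside net, i.e. is reachable without a gateway."""
    return ipaddress.ip_address(addr) in net

if __name__ == "__main__":
    for host, addr in DRBD_ADDRS.items():
        print(f"{host} {addr} in {ETH0_NET}: {on_network(addr)}")
```

This only confirms the addressing is consistent with the route table; it says nothing about the health of the link itself.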
>
> I attach the dmesg output here as well in case it helps to debug. I would
> like to have this fixed, so please help.
>
> Many thanks,
>
>
>
>
> --
> Steve Kieu
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed