[DRBD-user] Congratulations and problems...
Felix Ide
felix.ide-drbd at educators.de
Thu Feb 12 11:35:09 CET 2004
Hi list, hi Philipp R.,
first of all, congratulations on the article about your company in the
German "magazine" Computerpartner ;-)
However, we have a problem with DRBD here: whenever I run bonnie++ tests on the
DRBD device, I get errors like these:
Feb 9 18:53:52 drbd1-bfd kernel: drbd0: [kupdated/7] sock_sendmsg returned -32
Feb 9 18:53:57 drbd1-bfd kernel: drbd0: Connection lost.
Feb 9 18:53:57 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 9 18:53:57 drbd1-bfd kernel: drbd0: Synchronisation started blks=15
Feb 9 18:53:59 drbd1-bfd kernel: drbd0: [drbd_syncer_0/23587] sock_sendmsg returned -104
Feb 9 18:53:59 drbd1-bfd kernel: drbd0: Syncer send failed.
Feb 9 18:54:05 drbd1-bfd kernel: drbd0: Connection lost.
Feb 9 18:54:05 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 9 18:54:05 drbd1-bfd kernel: drbd0: Synchronisation started blks=15
-----
Feb 9 21:57:45 drbd1-bfd kernel: drbd0: [bdflush/6] send timed out!!
Feb 9 21:57:45 drbd1-bfd kernel: drbd0: Syncer send failed.
Feb 9 21:57:50 drbd1-bfd kernel: drbd0: Connection lost.
Feb 9 21:57:50 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 9 21:57:50 drbd1-bfd kernel: drbd0: Synchronisation started blks=15
-----
Feb 10 00:09:51 drbd1-bfd kernel: drbd0: [bdflush/6] sock_sendmsg returned -32
Feb 10 00:09:55 drbd1-bfd kernel: drbd0: Connection lost.
Feb 10 00:09:55 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 10 00:09:55 drbd1-bfd kernel: drbd0: Synchronisation started blks=15
Feb 10 00:09:57 drbd1-bfd kernel: drbd0: [bdflush/6] send timed out!!
Feb 10 00:10:28 drbd1-bfd kernel: drbd0: Syncer aborted.
Feb 10 00:10:32 drbd1-bfd kernel: drbd0: Connection lost.
Feb 10 00:10:32 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 10 00:10:32 drbd1-bfd kernel: drbd0: Synchronisation started blks=15
-----
Feb 11 10:54:14 drbd1-bfd kernel: drbd0: [kupdated/7] sock_sendmsg returned -32
Feb 11 10:54:20 drbd1-bfd kernel: drbd0: Connection lost.
Feb 11 10:54:20 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 11 10:54:20 drbd1-bfd kernel: drbd0: Synchronisation started blks=15
Feb 11 10:54:22 drbd1-bfd kernel: drbd0: [bonnie++/32678] send timed out!!
Feb 11 10:54:53 drbd1-bfd kernel: drbd0: Syncer aborted.
Feb 11 10:54:57 drbd1-bfd kernel: drbd0: Connection lost.
Feb 11 10:54:57 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 11 10:54:57 drbd1-bfd kernel: drbd0: Synchronisation started blks=15
Feb 11 10:59:01 drbd1-bfd /USR/SBIN/CRON[4858]: (root) CMD ( rm -f /var/spool/cron/lastrun/cron.hourly)
Feb 11 11:11:03 drbd1-bfd -- MARK --
Feb 11 11:31:03 drbd1-bfd -- MARK --
Feb 11 11:48:13 drbd1-bfd kernel: drbd0: [bdflush/6] sock_sendmsg returned -32
Feb 11 11:48:13 drbd1-bfd kernel: drbd0: Syncer send failed.
Feb 11 11:48:20 drbd1-bfd kernel: drbd0: Connection lost.
Feb 11 11:48:20 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 11 11:48:20 drbd1-bfd kernel: drbd0: Synchronisation started blks=15
-----
Feb 11 19:12:42 drbd1-bfd kernel: drbd0: [drbd_asender_0/5400] sock_sendmsg returned 0
Feb 11 19:12:46 drbd1-bfd kernel: drbd0: Connection lost.
Feb 11 19:12:46 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 11 19:12:46 drbd1-bfd kernel: drbd0: Synchronisation started blks=15
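For anyone reading along: the negative numbers after "sock_sendmsg returned" look like negated Linux errno values, which would make -32 EPIPE (broken pipe) and -104 ECONNRESET (connection reset by peer). Below is a minimal sketch for tallying such a log under that assumption. The errno table is hard-coded for Linux, and the function name and the short sample are only illustrative.

```python
import re
from collections import Counter

# Linux errno values, hard-coded so the sketch does not depend on the host OS.
LINUX_ERRNO = {32: "EPIPE", 104: "ECONNRESET"}

RETURNED = re.compile(r"sock_sendmsg\s+returned\s+(-?\d+)")

def summarize(log_text):
    """Count connection losses and sock_sendmsg return codes in a syslog excerpt."""
    # Mail clients wrap long lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    codes = Counter()
    for match in RETURNED.finditer(flat):
        value = int(match.group(1))
        # Assumption: a negative return value is a negated errno.
        name = LINUX_ERRNO.get(-value, "unknown") if value < 0 else "no errno"
        codes[(value, name)] += 1
    return {
        "connection_lost": flat.count("Connection lost."),
        "send_timeouts": flat.count("send timed out"),
        "return_codes": dict(codes),
    }

sample = """\
Feb 9 18:53:52 drbd1-bfd kernel: drbd0: [kupdated/7] sock_sendmsg returned
-32
Feb 9 18:53:57 drbd1-bfd kernel: drbd0: Connection lost.
Feb 9 18:53:59 drbd1-bfd kernel: drbd0: [drbd_syncer_0/23587] sock_sendmsg
returned -104
Feb 9 18:54:05 drbd1-bfd kernel: drbd0: Connection lost.
"""
summary = summarize(sample)
print(summary)
```

Feeding it the full excerpt instead of the sample would show how often each return code precedes a "Connection lost." line.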
Both systems are IBM eServer x345 machines: Xeon 2.8 GHz HT, dual Intel GBit NIC,
1 GB RAM, 2 x 36 GB U320 SCSI with software RAID 1 (the onboard hardware RAID 1
with the LSI1030 had performance problems), SuSE 8.2 with vanilla kernel 2.4.24,
DRBD 0.6.10+cvs and heartbeat 1.0.4.
The eth1 interfaces of the two servers are directly connected, and netio shows
117327 k bytes/sec. The MTU is set to 5000, so the NIC should not degrade performance.
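As a rough sanity check on that figure: Gigabit Ethernet carries at most 10^9 bits/s, i.e. 125,000,000 bytes/s before any protocol overhead. Whether netio's "k" means 1000 or 1024 bytes is an assumption either way, so this small sketch computes both readings.

```python
# Raw Gigabit Ethernet line rate, ignoring framing and TCP/IP overhead.
LINE_RATE_BYTES = 1_000_000_000 // 8  # 125,000,000 bytes/s

measured_k = 117327  # netio result quoted above, in "k bytes/sec"

def utilization(k_per_sec, bytes_per_k):
    """Fraction of the raw line rate used by the measured throughput."""
    return k_per_sec * bytes_per_k / LINE_RATE_BYTES

util_1000 = utilization(measured_k, 1000)  # assuming k = 1000 bytes
util_1024 = utilization(measured_k, 1024)  # assuming k = 1024 bytes
print(f"k = 1000 bytes: {util_1000:.1%}")
print(f"k = 1024 bytes: {util_1024:.1%}")
```

Under either reading the link runs at well over 90% of the raw line rate.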
Altogether we run 5 DRBD systems for our customers in different
configurations, but these errors occur only on this one (it is the newest and
fastest of these systems).
Running bonnie++ without DRBD shows no problems, even with 10 bonnie++ processes
in parallel. With DRBD underneath, the error sometimes occurs even with a single
bonnie++ process, and it is reproducible when running 3-4 bonnie++ processes
at once.
drbd1-bfd:~ # drbdsetup /dev/nb0 show
Lower device: 09:01 (/dev/md1)
Disk options:
do-panic
Local address: 10.20.30.1:7788
Remote address: 10.20.30.2:7788
Wire protocol: C
Net options:
timeout = 19.0 sec
ko-count = 20
tl-size = 10000
connect-int = 20 sec
ping-int = 20 sec
sndbuf-size = 131070
sync-min = 500 KB/sec
sync-max = 204800 KB/sec
drbd1-bfd:~ # cat /proc/drbd
version: 0.6.10+cvs (api:64/proto:62)
0: cs:Connected st:Primary/Secondary ns:1339387608 nr:0 dw:1590662124 dr:2003415012 pe:0 ua:0
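In case someone wants to watch these counters while bonnie++ runs, here is a minimal sketch that splits a /proc/drbd device line into its key:value fields. The function name is made up for illustration. The field meanings in the comments follow the usual DRBD documentation and are an assumption for 0.6.x.

```python
def parse_drbd_status(line):
    """Split a /proc/drbd device line into (minor, fields)."""
    # Mail wrapping may have split the line, so split on any whitespace.
    tokens = line.split()
    minor = int(tokens[0].rstrip(":"))
    fields = {}
    for token in tokens[1:]:
        key, _, value = token.partition(":")
        # Counters become integers; states such as "Connected" stay strings.
        fields[key] = int(value) if value.isdigit() else value
    return minor, fields

# Assumed meanings: cs = connection state, st = local/peer role,
# ns/nr = network send/receive, dw/dr = disk write/read,
# pe = pending requests, ua = unacknowledged requests.
status = """0: cs:Connected st:Primary/Secondary ns:1339387608 nr:0 dw:1590662124
dr:2003415012 pe:0 ua:0"""
minor, fields = parse_drbd_status(status)
print(minor, fields["cs"], fields["st"], fields["pe"], fields["ua"])
```

Sampling this periodically during a test run would show whether pe or ua climb before a "Connection lost." message, assuming the field meanings above are right.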
Thanks for your help,
Felix