On Fri, Dec 12, 2008 at 09:04:44AM -0600, Nathan Stratton wrote:
> On Fri, 12 Dec 2008, Lars Ellenberg wrote:
>
>> On Thu, Dec 11, 2008 at 08:14:17PM -0600, Nathan Stratton wrote:
>>> On Thu, 11 Dec 2008, Nathan Stratton wrote:
>>>
>>>> Any idea how to fix this? I keep getting them when trying to sync two
>>>> large systems.
>>>
>>> Running drbd-8.3.0rc2 on CentOS 5.2
>>>
>>>> Dec 11 19:59:44 xen1 kernel: drbd0: BAD! BarrierAck #3231051334
>>>> received, expected #3231051333!
>>
>> very interesting.
>> this is new paranoia code,
>> leading to reconnection.
>> no harm done.
>
> yep, only issue is access to local /dev/drbd0 freezes during the
> disconnect/reconnect of the remote nodes.
>
>> but,
>> can you give some more details?
>
> For you? Sure!
>
>> how long between two such "BAD!"s, wall clock time and approx. amount of
>> written data?
>
> Looks random, can be 100G or 2G, wall clock looks like:
>
> Dec 11 14:11:02 xen1 kernel: drbd0: BAD! BarrierAck #2399440554 received, expected #2399440553!
> Dec 11 15:06:08 xen1 kernel: drbd0: BAD! BarrierAck #3562915500 received, expected #3562915499!
> Dec 11 15:10:16 xen1 kernel: drbd0: BAD! BarrierAck #2877127253 received, expected #2877127252!
> Dec 11 17:12:49 xen1 kernel: drbd0: BAD! BarrierAck #684515493 received, expected #684515492!
> Dec 11 18:07:11 xen1 kernel: drbd0: BAD! BarrierAck #1304938437 received, expected #1304938436!
> Dec 11 18:40:48 xen1 kernel: drbd0: BAD! BarrierAck #2899175375 received, expected #2899175374!
> Dec 11 18:55:46 xen1 kernel: drbd0: BAD! BarrierAck #229959413 received, expected #229959412!
> Dec 11 19:59:44 xen1 kernel: drbd0: BAD! BarrierAck #3231051334 received, expected #3231051333!
> Dec 11 20:00:17 xen1 kernel: drbd0: BAD! BarrierAck #1512535064 received, expected #1512535063!
>
>> what access pattern?
>
> All access right now is on the Primary/UpToDate system.
>
>> only sync?
>
> Unknown since I am not doing much else.
>
>> what is "large"?
>
> /dev/drbd0 9.6T 218G 9.4T 3% /share
>
>> what is your hardware/io subsys/network/drivers?
>
> 3Ware 9650SX with 16 760 gig disks, network is Mellanox MT25204 10 Gb/s
> with IPoIB since direct infiniband is not yet supported. : )
>
>> can you give me a "dmesg | grep drbd"
>> from module load to first mount of file system?
>
> http://share.robotics.net/drbd0

the same from the other node as well, please.
actually, rather grep the kernel log,
so I see the timestamps as well.

thanks,

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
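
The question of how long passes between two "BAD!" messages can be answered straight from the kernel log. What follows is only a minimal sketch, assuming syslog-style lines like the ones quoted in this thread. The year (2008) is an assumption, since syslog timestamps carry none, and the function names are hypothetical helpers, not part of DRBD.

```python
import re
from datetime import datetime

# Matches lines such as:
# Dec 11 14:11:02 xen1 kernel: drbd0: BAD! BarrierAck #2399440554 received, expected #2399440553!
LINE_RE = re.compile(
    r"(?P<ts>[A-Z][a-z]{2}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"drbd\d+:\s+BAD!\s+BarrierAck\s+#(?P<got>\d+)\s+received,\s+"
    r"expected\s+#(?P<want>\d+)!"
)


def parse_bad_acks(lines, year=2008):
    """Return (timestamp, received, expected) for every BAD! BarrierAck line.

    The year is an assumption: classic syslog timestamps do not include it.
    Month names are parsed with %b, so an English/C locale is assumed too.
    """
    events = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a BAD! BarrierAck line
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        events.append((ts, int(m.group("got")), int(m.group("want"))))
    return events


def gaps_in_seconds(events):
    """Wall-clock seconds between consecutive events."""
    return [
        int((later[0] - earlier[0]).total_seconds())
        for earlier, later in zip(events, events[1:])
    ]


# First three log lines quoted in the thread.
SAMPLE = [
    "Dec 11 14:11:02 xen1 kernel: drbd0: BAD! BarrierAck #2399440554 received, expected #2399440553!",
    "Dec 11 15:06:08 xen1 kernel: drbd0: BAD! BarrierAck #3562915500 received, expected #3562915499!",
    "Dec 11 15:10:16 xen1 kernel: drbd0: BAD! BarrierAck #2877127253 received, expected #2877127252!",
]

events = parse_bad_acks(SAMPLE)
gaps = gaps_in_seconds(events)
offsets = [got - want for _, got, want in events]

if __name__ == "__main__":
    print("gaps (s):", gaps)
    print("received - expected:", offsets)
```

To run it over a whole log rather than the sample, pass an open file instead of the list, for example `parse_bad_acks(open(path))`, where `path` is wherever the kernel log lives on the system in question.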