[DRBD-user] BAD! BarrierAck

Nathan Stratton nathan at robotics.net
Fri Dec 12 16:41:57 CET 2008


On Fri, 12 Dec 2008, Lars Ellenberg wrote:

> On Fri, Dec 12, 2008 at 09:04:44AM -0600, Nathan Stratton wrote:
>> On Fri, 12 Dec 2008, Lars Ellenberg wrote:
>>
>>> On Thu, Dec 11, 2008 at 08:14:17PM -0600, Nathan Stratton wrote:
>>>> On Thu, 11 Dec 2008, Nathan Stratton wrote:
>>>>
>>>>> Any idea how to fix this? I keep getting them when trying to sync two
>>>>> large systems.
>>>>
>>>> Running drbd-8.3.0rc2 on CentOS 5.2
>>>>
>>>>> Dec 11 19:59:44 xen1 kernel: drbd0: BAD! BarrierAck #3231051334
>>>>> received, expected #3231051333!
>>>
>>> very interesting.
>>> this is new paranoia code,
>>> leading to reconnection.
>>> no harm done.
>>
>> yep, the only issue is that access to the local /dev/drbd0 freezes during the
>> disconnect/reconnect of the remote nodes.
>>
>>> but,
>>> can you give some more details?
>>
>> For you? Sure!
>>
>>> how long between two such "BAD!"s, wall clock time and approx. amount of
>>> written data?
>>
>> Looks random: it can be 100G or 2G of written data. Wall-clock times look like:
>>
>> Dec 11 14:11:02 xen1 kernel: drbd0: BAD! BarrierAck #2399440554 received, expected #2399440553!
>> Dec 11 15:06:08 xen1 kernel: drbd0: BAD! BarrierAck #3562915500 received, expected #3562915499!
>> Dec 11 15:10:16 xen1 kernel: drbd0: BAD! BarrierAck #2877127253 received, expected #2877127252!
>> Dec 11 17:12:49 xen1 kernel: drbd0: BAD! BarrierAck #684515493 received, expected #684515492!
>> Dec 11 18:07:11 xen1 kernel: drbd0: BAD! BarrierAck #1304938437 received, expected #1304938436!
>> Dec 11 18:40:48 xen1 kernel: drbd0: BAD! BarrierAck #2899175375 received, expected #2899175374!
>> Dec 11 18:55:46 xen1 kernel: drbd0: BAD! BarrierAck #229959413 received, expected #229959412!
>> Dec 11 19:59:44 xen1 kernel: drbd0: BAD! BarrierAck #3231051334 received, expected #3231051333!
>> Dec 11 20:00:17 xen1 kernel: drbd0: BAD! BarrierAck #1512535064 received, expected #1512535063!
>>
>>
>>> what access pattern?
>>
>> All access right now is on the Primary/UpToDate system.
>>
>>> only sync?
>>
>> Unknown since I am not doing much else.
>>
>>> what is "large"?
>>
>> /dev/drbd0            9.6T  218G  9.4T   3% /share
>>
>>> what is your hardware/io subsys/network/drivers?
>>
>> 3ware 9650SX with 16 disks of 760 GB each; the network is Mellanox MT25204
>> 10 Gb/s with IPoIB, since direct InfiniBand is not yet supported. :)
>>
>>> can you give me a "dmesg | grep drbd"
>>> from module load to first mount of file system?
>>
>> http://share.robotics.net/drbd0
>
> the same from the other node as well, please.
>
> actually, rather grep the kernel log,
> so I see the timestamps as well.


http://share.robotics.net/drbd0-SyncSource
http://share.robotics.net/drbd0-SyncTarget
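
In case it helps with reading the intervals, here is a rough sketch (plain Python, nothing to do with DRBD itself) that pulls the timestamps and barrier numbers out of the "BAD! BarrierAck" lines quoted above. The function name parse_bad_barrier_acks and the year=2008 default are my own assumptions; syslog lines carry no year, so it is taken from the date of this message.

```python
import re
from datetime import datetime

# The nine kernel log lines quoted earlier in this thread, copied verbatim.
LOG = """\
Dec 11 14:11:02 xen1 kernel: drbd0: BAD! BarrierAck #2399440554 received, expected #2399440553!
Dec 11 15:06:08 xen1 kernel: drbd0: BAD! BarrierAck #3562915500 received, expected #3562915499!
Dec 11 15:10:16 xen1 kernel: drbd0: BAD! BarrierAck #2877127253 received, expected #2877127252!
Dec 11 17:12:49 xen1 kernel: drbd0: BAD! BarrierAck #684515493 received, expected #684515492!
Dec 11 18:07:11 xen1 kernel: drbd0: BAD! BarrierAck #1304938437 received, expected #1304938436!
Dec 11 18:40:48 xen1 kernel: drbd0: BAD! BarrierAck #2899175375 received, expected #2899175374!
Dec 11 18:55:46 xen1 kernel: drbd0: BAD! BarrierAck #229959413 received, expected #229959412!
Dec 11 19:59:44 xen1 kernel: drbd0: BAD! BarrierAck #3231051334 received, expected #3231051333!
Dec 11 20:00:17 xen1 kernel: drbd0: BAD! BarrierAck #1512535064 received, expected #1512535063!
"""

# Matches the message format exactly as it appears in the lines above.
PATTERN = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: (drbd\d+): "
    r"BAD! BarrierAck #(\d+) received, expected #(\d+)!"
)


def parse_bad_barrier_acks(text, year=2008):
    """Return (timestamp, device, received, expected) for each matching line.

    The year is an assumption: syslog timestamps do not include one.
    """
    events = []
    for line in text.splitlines():
        m = PATTERN.match(line)
        if not m:
            continue  # skip anything that is not a BAD! BarrierAck line
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        events.append((stamp, m.group(2), int(m.group(3)), int(m.group(4))))
    return events


events = parse_bad_barrier_acks(LOG)

# Seconds between consecutive events.
gaps = [int((b[0] - a[0]).total_seconds()) for a, b in zip(events, events[1:])]

# Set of (received - expected) differences seen across all events.
offsets = {received - expected for _, _, received, expected in events}

print("events: ", len(events))
print("gaps (s):", gaps)
print("offsets:", offsets)
```

On those nine lines the gaps run from 33 seconds up to about two hours, and the received number is exactly one higher than the expected one every time. I can't say what that means on the DRBD side, it is just what the log shows.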


><>
Nathan Stratton                                CTO, BlinkMind, Inc.
nathan at robotics.net                         nathan at blinkmind.com
http://www.robotics.net                        http://www.blinkmind.com


