[DRBD-user] Drbd 8.0.1 Inconsistent vs UpToDate

Lars Ellenberg lars.ellenberg at linbit.com
Thu Apr 5 12:34:55 CEST 2007


On Thu, Apr 05, 2007 at 11:37:57AM +0200, Laurent CARON wrote:
> Hi
> 
> I did set up 2 servers with Drbd 8.0.1.
> 
> I've got 2 drbd devices (/dev/drbd0 and /dev/drbd1)
> 
> /dev/drbd0 is about 700 GB (XFS filesystem on top of it)
> /dev/drbd1 is about 50 GB (XFS filesystem on top of it)
> 
> drbd1 is fine and UpToDate on both nodes
> 
> After waiting 6 hrs for drbd0 to sync, it is as follows on the nodes:
> 
> Node1:
> version: 8.0.1 (api:86/proto:86)
> SVN Revision: 2784 build by root at picsou, 2007-04-03 17:06:35
>  0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate B r---
>     ns:918133770 nr:0 dw:173062530 dr:745075430 al:50223 bm:45476 lo:0 pe:0 ua:0 ap:0
>         resync: used:0/31 hits:372490144 misses:45476 starving:0 dirty:0 changed:45476
>         act_log: used:1/257 hits:43674131 misses:53808 starving:0 dirty:3585 changed:50223
> 
> Node2:
> version: 8.0.1 (api:86/proto:86)
> SVN Revision: 2784 build by root at picsou, 2007-04-03 17:06:35
>  0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate B r---
>     ns:0 nr:918133770 dw:918133770 dr:0 al:0 bm:45476 lo:0 pe:0 ua:0 ap:0
>         [===================>] sync'ed:100.0% (0/727608)M
>         stalled
>         resync: used:9/31 hits:372490144 misses:45476 starving:0 dirty:0 changed:45476
>         act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
> 
> /var/log/syslog of Node2 is filled with the following messages repeated
> over and over:
> 
> Apr  5 11:36:41 node2 kernel: drbd0: Retrying drbd_rs_del_all() later. refcnt=1

apparently there is something strange in the reference counting of some
housekeeping structures.  did you have any I/O errors?

> Messages appearing in node1's syslog:
> Apr  5 11:33:56 node1 kernel: drbd0: ASSERT( mdev->net_conf->wire_protocol == DRBD_PROT_A ) in /usr/src/drbd-8.0.1/drbd/drbd_receiver.c:3193

the assert seems to contain a typo. I still have to verify,
but I think it should assert DRBD_PROT_B, not A :(

> If I try to reboot a node, drbd1 resyncs only the changed parts,
> however drbd0 does a full resync.

does it ever completely finish?
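
One way to answer that without watching the output by hand is to parse the
device status lines of /proc/drbd, in the format quoted above, and check the
cs: and ds: fields. The following is only a minimal sketch: the helper names
parse_status and resync_finished are made up for illustration, and treating
"cs:Connected with ds:UpToDate/UpToDate" as "finished" is an assumption based
on the Node1 output in this thread, not an official DRBD definition.

```python
import re

# Matches a device status line in the 8.0.x format quoted above, e.g.
#  0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate B r---
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+st:(?P<st>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_status(text):
    """Return {minor: {"cs", "st", "local_ds", "peer_ds"}} for each device line."""
    devices = {}
    for line in text.splitlines():
        m = STATUS_RE.match(line)
        if not m:
            continue  # version lines, counters, progress bar, etc.
        # ds: is "local/peer" as seen from the node that printed it
        local_ds, _, peer_ds = m.group("ds").partition("/")
        devices[int(m.group("minor"))] = {
            "cs": m.group("cs"),
            "st": m.group("st"),
            "local_ds": local_ds,
            "peer_ds": peer_ds,
        }
    return devices


def resync_finished(dev):
    """Assumption: finished means Connected and UpToDate on both sides."""
    return (
        dev["cs"] == "Connected"
        and dev["local_ds"] == "UpToDate"
        and dev["peer_ds"] == "UpToDate"
    )


if __name__ == "__main__":
    # Reads the live status file; only meaningful on a node running DRBD.
    with open("/proc/drbd") as f:
        for minor, dev in sorted(parse_status(f.read()).items()):
            print(minor, dev["cs"], dev["local_ds"], dev["peer_ds"],
                  "finished" if resync_finished(dev) else "not finished")
```

Run on both nodes, this would flag the situation described above, where one
node reports Connected and UpToDate/UpToDate while the other still reports
SyncTarget and Inconsistent/UpToDate.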

> Did I miss something ?

most likely some strange race condition or logic corner case.

does it help if you switch to protocol C?

-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
__
please use the "List-Reply" function of your email client.