[DRBD-user] "syncer" crash when doing full resync

Kohari, Moiz mkohari at enterasys.com
Thu Sep 23 15:19:57 CEST 2004


Hi Philipp,

Thank you for your comments.  I will assume that DRBD-0.8 is at least six months out, so I need a workaround in the meantime.  I will probably try to implement your suggested fix in 0.6.12 and then send it to the development list for comments.

Do you have any pointers (a prototype, for instance) that might save me time as I tackle this issue?  My company will not allow me to use drbd in production unless I can demonstrate a stable subsystem, so I would appreciate any help.  Below is the section of the 0.8 roadmap that deals with this issue:

> 5 It is possible that a secondary node crashes a primary by 
>   returning invalid block_ids in ACK packets. [This might be 
>   either caused by faulty hardware, or by a hostile modification
>   of DRBD on the secondary node]
>  
>   Proposed solution:
>  
>  Extend the block_id field (currently 64 bit) by at least
>  32 bits (64?) (=block_id_chk field). The primary node
>  stores an encrypted (random key, changes every 15 minutes...)
>  checksum (=signature) in the second field.
>
>  The secondary node cannot fake (either intentionally or
>  unintentionally) this signature.
> 
>  The primary node will only dereference the block_id pointers
>  if the signature is right.
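
To make the proposal above concrete, here is a minimal userspace sketch in Python. It is not DRBD code. It swaps in a truncated HMAC-SHA256 for the "encrypted checksum" the roadmap mentions, and the names BlockIdSigner, SIG_BYTES, sign, verify and rotate_key are hypothetical. Keeping the previous key around so that ACKs still in flight across a key change can be verified is also an assumption; the roadmap does not say how rotation should be handled.

```python
import hashlib
import hmac
import os
import struct

SIG_BYTES = 8  # assumed width of the extra block_id_chk field (64 bits)


class BlockIdSigner:
    """Primary-side helper: signs outgoing block_ids, verifies them in ACKs."""

    def __init__(self):
        self.key = os.urandom(32)  # current random key
        self.prev_key = None       # assumption: keep one old key for in-flight ACKs

    def rotate_key(self):
        # Would be driven by a timer (the roadmap suggests every 15 minutes).
        self.prev_key = self.key
        self.key = os.urandom(32)

    @staticmethod
    def _mac(key, block_id):
        # block_id is treated as an unsigned 64-bit value.
        msg = struct.pack("<Q", block_id)
        return hmac.new(key, msg, hashlib.sha256).digest()[:SIG_BYTES]

    def sign(self, block_id):
        """Return the signature to send alongside block_id."""
        return self._mac(self.key, block_id)

    def verify(self, block_id, chk):
        """True only if chk was produced by this node for this block_id."""
        for key in (self.key, self.prev_key):
            if key is not None and hmac.compare_digest(
                self._mac(key, block_id), chk
            ):
                return True
        return False


if __name__ == "__main__":
    signer = BlockIdSigner()
    block_id = 0xDEADBEEF00001000
    chk = signer.sign(block_id)
    # An ACK echoing the pair unchanged is accepted; a flipped bit is rejected.
    print(signer.verify(block_id, chk))
    print(signer.verify(block_id ^ 0x10, chk))
```

In this sketch the primary would only treat block_id as a pointer after verify() succeeds, so a block_id corrupted on the secondary is dropped instead of being dereferenced.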

Best Regards,
Moiz



-----Original Message-----
From: drbd-user-bounces at linbit.com [mailto:drbd-user-bounces at linbit.com] On Behalf Of Philipp Reisner
Sent: Thursday, September 23, 2004 8:32 AM
To: drbd-user at linbit.com
Subject: Re: [DRBD-user] "syncer" crash when doing full resync

On Wednesday 22 September 2004 23:26, Kohari, Moiz wrote:
> Folks,
>
>
>
> I am seeing a drbdd oops almost identical to the one below; it certainly
> looks like memory corruption.  It is happening in the same spot
> (drbd_end_req()).
>
>
>
> Because this is happening fairly consistently within the drbd subsystem
> and nowhere else, I wonder whether the corruption is coming from within drbd.
> Was this issue ever resolved?
>
>
>
> I am using drbd version 0.6.12.  Has anyone seen this problem with newer
> versions of drbd?
>

The thing is, the corrupt memory bank is in the secondary, but the
primary crashes !!!

We have a fix for this on the DRBD-0.8 roadmap.

-Philipp
-- 
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria    http://www.linbit.com :