[DRBD-user] 'drbdadm connect' panic?
lars.ellenberg at linbit.com
Thu Jul 19 09:36:56 CEST 2007
On Wed, Jul 18, 2007 at 04:19:02PM -0500, alex at crackpot.org wrote:
> My 2 drbd boxen are called 42 and 43.
> drbd version: 0.7.16 (api:77/proto:74)
> * Today, 42 was primary.
> * A co-worker noticed that it was not connected to 43. (42 =
> 'st:Primary/Unknown ld:Consistent', 43 = 'st:Secondary/Unknown
> * I saw that 43 said 'cs:WFConnection'. Co-worker did 'drbdadm
> connect' on 42, and its kernel panicked.
what cs: was 42 in, before the "drbdadm connect" ?
what is in the kernel logs?
what led to them being disconnected in the first place?
what does the panic/oops look like?
did it panic in drbd or somewhere else?
was it an "intentional" panic?
> * 43 took over as primary as it should.
(with out-of-date data)
> * When 42 was rebooted, it entered Secondary status and performed a
> sync of data from 43. Since the 2 boxes had been disconnected for
> several days, the data on 43 was old, and the newer data from 42 was
> We're getting backup restores from tape. We've added better
> monitoring to catch when drbd disconnects in the future.
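for the monitoring part, a minimal sketch of such a check follows. it only
looks at the cs:/st:/ld: tokens quoted earlier in this thread. the exact
layout of /proc/drbd, the function names, and the choice to flag every
state other than cs:Connected (which also flags resyncs) are assumptions
made for illustration, not something shipped with drbd.

```python
import re

# Matches tokens such as "cs:WFConnection" or "st:Primary/Unknown".
# Only the three fields quoted in this thread are recognised.
FIELD_RE = re.compile(r"\b(cs|st|ld):(\S+)")


def parse_status(line):
    """Return a dict of the cs/st/ld fields found in one status line."""
    return dict(FIELD_RE.findall(line))


def needs_attention(line):
    """True if the line carries a cs: field that is not 'Connected'.

    Assumption: anything other than cs:Connected is worth an alert,
    including sync states; lines without a cs: field are ignored.
    """
    fields = parse_status(line)
    return "cs" in fields and fields["cs"] != "Connected"


def check(path="/proc/drbd"):
    """Return the stripped status lines in `path` that need attention.

    Assumption: `path` holds one status line per device, formatted with
    the tokens shown in this thread.
    """
    with open(path) as fh:
        return [ln.strip() for ln in fh if needs_attention(ln)]
```

run check() from cron or your monitoring agent and alert whenever the
returned list is not empty. that way a 'cs:WFConnection' gets noticed
within minutes instead of days.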
> I am writing because up to this point I thought that a 'drbdadm
> connect' was a fairly safe command to issue. Are there circumstances
> under which it should not be done, or which may cause a panic as we
> saw today?
those would be bugs.
some of them might be fixed already,
you are on 0.7.16, we are at 0.7.24?
> Would doing 'drbdadm disconnect' before 'drbdadm connect'
> have made a difference?
hard to say. maybe. probably not.
> If the 2 boxes disconnect in the future (for network failure or
> whatever other reason), what is the safe way to get them talking again?
: Lars Ellenberg Tel +43-1-8178292-0 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :
please use the "List-Reply" function of your email client.