I have a setup where I can reliably reproduce the following within a few minutes:

Jul 11 10:59:46 wrn-vm2 kernel: [236603.130604] block drbd0: uuid_compare()=-1 by rule 35
Jul 11 10:59:46 wrn-vm2 kernel: [236603.135779] block drbd0: I shall become SyncTarget, but I am primary!
Jul 11 10:59:46 wrn-vm2 kernel: [236603.142336] block drbd0: ASSERT( os.conn == C_WF_REPORT_PARAMS ) in /build/linux-s5x2oE/linux-3.2.46/drivers/block/drbd/drbd_receiver.c:3245

It's on Debian Wheezy with the stock Debian kernel (3.2.0-4-amd64):

Jun 25 15:01:27 wrn-vm1 kernel: [ 626.901545] drbd: initialized. Version: 8.3.11 (api:88/proto:86-96)
Jun 25 15:01:27 wrn-vm1 kernel: [ 626.901547] drbd: srcversion: F937DCB2E5D83C6CCE4A6C9

There are more details in this thread:
https://groups.google.com/forum/#!topic/ganeti/icqLNFk1si0

I am reproducing it using ganeti, which uses drbd on top of LVM logical volumes to replicate virtual machine images. It migrates virtual machines by issuing drbdsetup commands that switch the replication first from master->slave to multi-master, and then to slave<-master (apparently by disconnecting and reconnecting).

I believe there is some sort of race condition going on, because (a) few if any other people seem to observe what I see; and (b) although I can reproduce the problem within a few minutes, if I attach a full-blown strace to the process which is issuing the drbdsetup calls, the problem goes away.

The Google Groups thread includes an strace log of execve() calls, so you can see what sequence of drbdsetup calls is being issued.

Is it possible that ganeti is taking an unsafe approach to switching over the drbd state?

Regards,

Brian Candler.
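
For anyone trying to reproduce this, the failure signature quoted above can be picked out of syslog mechanically rather than by eye. The following is only a minimal sketch in Python, assuming the syslog line format shown in the excerpts; the names find_failures, LINE_RE and SIGNATURES are hypothetical helpers for illustration and are not part of drbd or ganeti.

```python
import re

# Assumed line layout, taken from the excerpts in this report, e.g.
# "Jul 11 10:59:46 wrn-vm2 kernel: [236603.135779] block drbd0: <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:\s+"
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+block\s+(?P<dev>drbd\d+):\s+(?P<msg>.*)$"
)

# The two messages from the report that mark the failure (hypothetical choice
# of signatures; extend as needed).
SIGNATURES = (
    "I shall become SyncTarget, but I am primary!",
    "ASSERT( os.conn == C_WF_REPORT_PARAMS )",
)


def find_failures(lines):
    """Return (host, device, message) for each line carrying a signature."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and any(sig in m.group("msg") for sig in SIGNATURES):
            hits.append((m.group("host"), m.group("dev"), m.group("msg")))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python scan_drbd.py < /var/log/kern.log
    for host, dev, msg in find_failures(sys.stdin):
        print(f"{host} {dev}: {msg}")
```

Running this over the kernel log on both nodes after each migration attempt would show which node and which drbd device hit the condition, without needing strace attached to the process issuing the drbdsetup calls.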