On Tue, Jun 09, 2009 at 03:37:31PM +0200, Reinier Haasjes wrote:
> Hi,
>
> Problem:
> cs:WFReportParams/cs:WFBitMapS status remains after changes in network.
> (Not syncing) (More detailed description below)

> Configuration
> ha1:
> ----------------------------------------
> global {
>   usage-count no;
> }
> common {
>   syncer { rate 10M; }
> }
> resource shared {
>   protocol C;
>   handlers {
>   }
>   startup {
>     degr-wfc-timeout 120;    # 2 minutes.
>     become-primary-on both;
>   }
>   disk {
>     on-io-error pass_on;
>   }
>   net {
>     allow-two-primaries;
>     cram-hmac-alg "sha256";
>     shared-secret "DRBD2KVM";
>     after-sb-0pri discard-least-changes;
>     after-sb-1pri call-pri-lost-after-sb;
>     after-sb-2pri call-pri-lost-after-sb;

you configure it to "call-pri-lost-after-sb;".
but you did not configure such a handler.
you need to add such a handler,
and that handler is expected to hard reboot the box.

>     rr-conflict disconnect;
>   }

> Jun 9 14:59:31 ha1 kernel: [18306.057311] drbd0: drbd_sync_handshake:
> Jun 9 14:59:31 ha1 kernel: [18306.057336] drbd0: self 1078AA65592ABF21:CAB5686435C06CF3:A69BFEC8ACF135DB:C09BFAC5A5B8F389
> Jun 9 14:59:31 ha1 kernel: [18306.057372] drbd0: peer 06A5AF91DACA3D7F:CAB5686435C06CF3:A69BFEC8ACF135DB:C09BFAC5A5B8F389
> Jun 9 14:59:31 ha1 kernel: [18306.057408] drbd0: uuid_compare()=100 by rule 9

> Jun 9 14:59:31 ha2 kernel: [18325.129642] drbd0: drbd_sync_handshake:
> Jun 9 14:59:31 ha2 kernel: [18325.129667] drbd0: self 06A5AF91DACA3D7F:CAB5686435C06CF3:A69BFEC8ACF135DB:C09BFAC5A5B8F389
> Jun 9 14:59:31 ha2 kernel: [18325.129717] drbd0: peer 1078AA65592ABF21:CAB5686435C06CF3:A69BFEC8ACF135DB:C09BFAC5A5B8F389
> Jun 9 14:59:31 ha2 kernel: [18325.129752] drbd0: uuid_compare()=100 by rule 9
> Jun 9 14:59:31 ha2 kernel: [18325.129774] drbd0: Split-Brain detected, 2 primaries, automatically solved. Sync from this node
> Jun 9 14:59:31 ha2 kernel: [18325.129845] drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )

hmmm.
too bad.
the combination of after-sb- configuration you chose
leads into a deadlock within the drbd state handling on ha1 :(
we'll fix that.

what would have been intended for this combination of
after-sb- configuration is
  ha1 tries to become Secondary,
  if that succeeds it becomes normal SyncTarget.
  if that does not succeed, it would call the lost-after-sb handler,
  which you need to define, and which is supposed to hard reboot the box.

because that handler is not configured,
but the rr-conflict is on the default disconnect,
you'd then get on ha1 "I shall become SyncTarget, but I am primary!",
and ha1 would go StandAlone.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
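
For illustration only, here is a rough sketch of what filling in the empty handlers section of the quoted "resource shared" might look like. The handler name pri-lost-after-sb is the drbd.conf handler that corresponds to the call-pri-lost-after-sb policy. The command string is an assumption, not something given in this thread; any command that reliably hard reboots the node would fit the description above.

```
# goes inside "resource shared { ... }", replacing the empty handlers section
handlers {
  # invoked when an after-sb-* policy is set to call-pri-lost-after-sb
  # and the node cannot be demoted to Secondary.
  # ASSUMPTION: this reboot command is illustrative, not from the thread.
  pri-lost-after-sb "echo b > /proc/sysrq-trigger ; reboot -f";
}
```

The exact command may need adjusting for the distribution and DRBD version in use, so it is worth testing on a non-production pair first.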