Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
hello, I'm in a dual primary config with 8.3.2 on f11 x86_64, kernel 2.6.29.5-191.fc11.x86_64. Behaviour seems ok, but I have doubts related to the log lines below marked with >>> output of "drbdadm dump" is at the end. Also one other question: from drbd point of view, should these two operations be the same and so managed the same way? 1) node1 and node2 active in dual primary service drbdadm stop on node2 wait some seconds service drbdadm start on node2 2) node1 and node2 active in dual primary drbd configured to start and stop in init scripts shutdown -r on node 2 Or are there any differences or latencies when I do a shutdown (so in practice a stop/start of drbd but with greater delay and with network interface of node2 going down...) Thanks in advance, Gianluca starting point is: node1 version: 8.3.2 (api:88/proto:86-90) GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by root at virtfed.domain.com, 2009-07-16 11:03:37 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r---- ns:55143748 nr:211304 dw:964992 dr:56018052 al:548 bm:3711 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0 node2 version: 8.3.2 (api:88/proto:86-90) GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by root at virtfed.domain.com, 2009-07-16 11:03:37 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r---- ns:0 nr:10088 dw:10088 dr:92 al:0 bm:18 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0 On node2 (using init script as provided by installation form source): [root at virtfedbis ~]# service drbd stop Stopping all DRBD resources: . [root at virtfedbis ~]# echo $? 0 Now on node1: version: 8.3.2 (api:88/proto:86-90) GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by root at virtfed.domain.com, 2009-07-16 11:03:37 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/Outdated C r---- ns:55144420 nr:211304 dw:966164 dr:56018052 al:548 bm:3711 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:436 logs: node1: Jul 27 15:15:41 virtfed kernel: block drbd0: peer( Primary -> Secondary ) >>> Jul 27 15:15:41 virtfed kernel: block drbd0: State change failed: Refusing to be Primary while peer is not outdated Jul 27 15:15:41 virtfed kernel: block drbd0: state = { cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate r--- } Jul 27 15:15:41 virtfed kernel: block drbd0: wanted = { cs:TearDown ro:Primary/Unknown ds:UpToDate/DUnknown s--- } Jul 27 15:15:41 virtfed kernel: block drbd0: peer( Secondary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> Outdated ) Jul 27 15:15:41 virtfed kernel: block drbd0: Creating new current UUID Jul 27 15:15:41 virtfed kernel: block drbd0: asender terminated Jul 27 15:15:41 virtfed kernel: block drbd0: Terminating asender thread Jul 27 15:15:41 virtfed kernel: block drbd0: Connection closed Jul 27 15:15:41 virtfed kernel: block drbd0: conn( TearDown -> Unconnected ) Jul 27 15:15:41 virtfed kernel: block drbd0: receiver terminated Jul 27 15:15:41 virtfed kernel: block drbd0: Restarting receiver thread Jul 27 15:15:41 virtfed kernel: block drbd0: receiver (re)started Jul 27 15:15:41 virtfed kernel: block drbd0: conn( Unconnected -> WFConnection ) logs node2: Jul 27 15:15:41 virtfedbis kernel: block drbd0: role( Primary -> Secondary ) >>> Jul 27 15:15:41 virtfedbis kernel: block drbd0: Requested state change failed by peer: Refusing to be Primary while peer is not outdated (-7) Jul 27 15:15:41 virtfedbis kernel: block drbd0: meta connection shut down by peer. Jul 27 15:15:41 virtfedbis kernel: block drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) Jul 27 15:15:41 virtfedbis kernel: block drbd0: asender terminated Jul 27 15:15:41 virtfedbis kernel: block drbd0: Terminating asender thread Jul 27 15:15:41 virtfedbis kernel: block drbd0: conn( NetworkFailure -> Disconnecting ) disk( UpToDate -> Outdated ) Jul 27 15:15:41 virtfedbis kernel: block drbd0: Connection closed Jul 27 15:15:41 virtfedbis kernel: block drbd0: conn( Disconnecting -> StandAlone ) Jul 27 15:15:41 virtfedbis kernel: block drbd0: receiver terminated Jul 27 15:15:41 virtfedbis kernel: block drbd0: Terminating receiver thread Jul 27 15:15:41 virtfedbis kernel: block drbd0: disk( Outdated -> Diskless ) Jul 27 15:15:41 virtfedbis kernel: block drbd0: drbd_bm_resize called with capacity == 0 Jul 27 15:15:41 virtfedbis kernel: block drbd0: worker terminated Jul 27 15:15:41 virtfedbis kernel: block drbd0: Terminating worker thread Jul 27 15:15:41 virtfedbis kernel: drbd: module cleanup done. Jul 27 15:15:41 virtfedbis kernel: drbd: initialised. Version: 8.3.2 (api:88/proto:86-90) Jul 27 15:15:41 virtfedbis kernel: drbd: GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by root at virtfed.domain.com, 2009-07-16 11:03:37 Jul 27 15:15:41 virtfedbis kernel: drbd: registered as block device major 147 Jul 27 15:15:41 virtfedbis kernel: drbd: minor_table @ 0xffff8808293d1200 Then restart of drbd on node2: [root at virtfedbis ~]# service drbd start Starting DRBD resources: [ d(r0) s(r0) n(r0) ]. now situation is node1 version: 8.3.2 (api:88/proto:86-90) GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by root at virtfed.domain.com, 2009-07-16 11:03:37 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r---- ns:55146520 nr:211304 dw:968180 dr:56019964 al:548 bm:3720 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0 node2 version: 8.3.2 (api:88/proto:86-90) GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by root at virtfed.domain.com, 2009-07-16 11:03:37 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r---- ns:0 nr:3764 dw:3764 dr:92 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0 logs node1: Jul 27 15:17:07 virtfed kernel: block drbd0: Handshake successful: Agreed network protocol version 90 Jul 27 15:17:07 virtfed kernel: block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC Jul 27 15:17:07 virtfed kernel: block drbd0: conn( WFConnection -> WFReportParams ) Jul 27 15:17:07 virtfed kernel: block drbd0: Starting asender thread (from drbd0_receiver [3195]) Jul 27 15:17:07 virtfed kernel: block drbd0: data-integrity-alg: <not-used> Jul 27 15:17:07 virtfed kernel: block drbd0: drbd_sync_handshake: Jul 27 15:17:07 virtfed kernel: block drbd0: self A54E721233CDE951:484E7F140D5A9EC5:8AF49A12F8186DD2:00C81C83671779C1 bits:478 flags:0 Jul 27 15:17:07 virtfed kernel: block drbd0: peer 484E7F140D5A9EC4:0000000000000000:8AF49A12F8186DD3:00C81C83671779C1 bits:0 flags:0 Jul 27 15:17:07 virtfed kernel: block drbd0: uuid_compare()=1 by rule 7 Jul 27 15:17:07 virtfed kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) Jul 27 15:17:07 virtfed kernel: block drbd0: peer( Secondary -> Primary ) Jul 27 15:17:07 virtfed kernel: block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Outdated -> Inconsistent ) Jul 27 15:17:07 virtfed kernel: block drbd0: Began resync as SyncSource (will sync 1912 KB [478 bits set]). Jul 27 15:17:07 virtfed kernel: block drbd0: Resync done (total 1 sec; paused 0 sec; 1912 K/sec) Jul 27 15:17:07 virtfed kernel: block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate ) logs node2: Jul 27 15:17:07 virtfedbis kernel: block drbd0: Starting worker thread (from cqueue [64]) Jul 27 15:17:07 virtfedbis kernel: block drbd0: disk( Diskless -> Attaching ) Jul 27 15:17:07 virtfedbis kernel: block drbd0: Found 6 transactions (64 active extents) in activity log. Jul 27 15:17:07 virtfedbis kernel: block drbd0: Method to ensure write ordering: barrier Jul 27 15:17:07 virtfedbis kernel: block drbd0: max_segment_size ( = BIO size ) = 32768 Jul 27 15:17:07 virtfedbis kernel: block drbd0: drbd_bm_resize called with capacity == 109317376 Jul 27 15:17:07 virtfedbis kernel: block drbd0: resync bitmap: bits=13664672 words=213511 Jul 27 15:17:07 virtfedbis kernel: block drbd0: size = 52 GB (54658688 KB) Jul 27 15:17:07 virtfedbis kernel: block drbd0: recounting of set bits took additional 1 jiffies Jul 27 15:17:07 virtfedbis kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map. Jul 27 15:17:07 virtfedbis kernel: block drbd0: disk( Attaching -> Outdated ) Jul 27 15:17:07 virtfedbis kernel: block drbd0: Barriers not supported on meta data device - disabling Jul 27 15:17:07 virtfedbis kernel: block drbd0: conn( StandAlone -> Unconnected ) Jul 27 15:17:07 virtfedbis kernel: block drbd0: Starting receiver thread (from drbd0_worker [9487]) Jul 27 15:17:07 virtfedbis kernel: block drbd0: receiver (re)started Jul 27 15:17:07 virtfedbis kernel: block drbd0: conn( Unconnected -> WFConnection ) Jul 27 15:17:07 virtfedbis kernel: block drbd0: Handshake successful: Agreed network protocol version 90 Jul 27 15:17:07 virtfedbis kernel: block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC Jul 27 15:17:07 virtfedbis kernel: block drbd0: conn( WFConnection -> WFReportParams ) Jul 27 15:17:07 virtfedbis kernel: block drbd0: Starting asender thread (from drbd0_receiver [9506]) Jul 27 15:17:07 virtfedbis kernel: block drbd0: data-integrity-alg: <not-used> Jul 27 15:17:07 virtfedbis kernel: block drbd0: drbd_sync_handshake: Jul 27 15:17:07 virtfedbis kernel: block drbd0: self 484E7F140D5A9EC4:0000000000000000:8AF49A12F8186DD3:00C81C83671779C1 bits:0 flags:0 Jul 27 15:17:07 virtfedbis kernel: block drbd0: peer A54E721233CDE951:484E7F140D5A9EC5:8AF49A12F8186DD2:00C81C83671779C1 bits:478 flags:0 Jul 27 15:17:07 virtfedbis kernel: block drbd0: uuid_compare()=-1 by rule 5 Jul 27 15:17:07 virtfedbis kernel: block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate ) Jul 27 15:17:07 virtfedbis kernel: block drbd0: role( Secondary -> Primary ) Jul 27 15:17:07 virtfedbis kernel: block drbd0: conn( WFBitMapT -> WFSyncUUID ) Jul 27 15:17:07 virtfedbis kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 Jul 27 15:17:07 virtfedbis kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0) Jul 27 15:17:07 virtfedbis kernel: block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent ) Jul 27 15:17:07 virtfedbis kernel: block drbd0: Began resync as SyncTarget (will sync 1912 KB [478 bits set]). Jul 27 15:17:07 virtfedbis kernel: block drbd0: Resync done (total 1 sec; paused 0 sec; 1912 K/sec) Jul 27 15:17:07 virtfedbis kernel: block drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate ) Jul 27 15:17:07 virtfedbis kernel: block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0 Jul 27 15:17:07 virtfedbis kernel: block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0 exit code 0 (0x0) Jul 27 15:17:09 virtfedbis kernel: block drbd0: Method to ensure write ordering: flush >>> Jul 27 15:17:09 virtfedbis kernel: end_request: I/O error, dev cciss/c0d0, sector 0 >>> Jul 27 15:17:09 virtfedbis kernel: block drbd0: local disk flush failed with status -95 >>> Jul 27 15:17:09 virtfedbis kernel: block drbd0: Method to ensure write ordering: drain [root at virtfed ~]# drbdadm dump # /etc/drbd.conf common { syncer { rate 100M; } } # resource r0 on virtfed.domain.com: not ignored, not stacked resource r0 { protocol C; on virtfed.domain.com { device /dev/drbd0 minor 0; disk /dev/cciss/c0d0p3; address ipv4 192.168.16.111:7788; flexible-meta-disk /dev/cciss/c0d0p4; } on virtfedbis.domain.com { device /dev/drbd0 minor 0; disk /dev/cciss/c0d0p3; address ipv4 192.168.16.112:7788; flexible-meta-disk /dev/cciss/c0d0p4; } net { allow-two-primaries; cram-hmac-alg sha1; shared-secret kvmdrbd; after-sb-0pri discard-zero-changes; after-sb-1pri discard-secondary; after-sb-2pri disconnect; rr-conflict disconnect; } disk { on-io-error detach; fencing resource-and-stonith; } syncer { al-extents 257; } startup { wfc-timeout 30; degr-wfc-timeout 10; outdated-wfc-timeout 2; become-primary-on both; } handlers { pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f"; pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f"; local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f"; fence-peer /usr/local/bin/obliterate-peer.sh; } }