[DRBD-user] Pacemaker + Dual Primary, handlers and fail-back issues

Daniel Grunblatt dgrunblatt at invap.com.ar
Thu Mar 1 18:12:36 CET 2012

Andreas, Lars,

Thanks much for the quick response.

I made the changes.
Here's the current drbd.conf:
global {
    usage-count yes;
}
common {
    protocol C;
    disk {
        on-io-error detach;
        fencing     resource-and-stonith;
    }
    syncer {
        rate       33M;
        al-extents 3389;
    }
    net {
        allow-two-primaries; # Enable this *after* initial testing
        cram-hmac-alg sha1;
        shared-secret "a6a0680c40bca2439dbe48343ddddcf4";
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    handlers {
        fence-peer "/usr/lib/drbd/stonith_admin-fence-peer.sh";
    }
}
resource vmsvn {
       device    /dev/drbd0;
       disk      /dev/sdb;
       meta-disk internal;
    on xm01 {
       address   100.0.0.1:7788;
    }
    on xm02 {
       address   100.0.0.2:7788;
    }
}

resource srvsvn1 {
       protocol C;
       device    /dev/drbd1;
       disk      /dev/sdc;
       meta-disk internal;
    on xm01 {
       address   100.0.0.1:7789;
    }
    on xm02 {
       address   100.0.0.2:7789;
    }
}

resource srvsvn2 {
       protocol C;
       device    /dev/drbd2;
       disk      /dev/sdd;
       meta-disk internal;
    on xm01 {
       address   100.0.0.1:7790;
    }
    on xm02 {
       address   100.0.0.2:7790;
    }
}

resource vmconfig {
       protocol C;
       device    /dev/drbd3;
       meta-disk internal;
    on xm01 {
       address   100.0.0.1:7791;
       disk     /dev/vg_xm01/lv_xm01_vmconfig;
    }
    on xm02 {
       address   100.0.0.2:7791;
       disk     /dev/vg_xm02/lv_xm02_vmconfig;
    }
}
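
A quick sanity check for the handlers section above, as a minimal sketch rather than anything DRBD ships: it pulls the fence-peer script path out of a drbd.conf-style snippet and reports whether that file exists and is executable for the current user. The function names (fence_peer_handlers, check_handler) and the simple regex are assumptions made for illustration; the regex is not a real drbd.conf parser.

```python
import os
import re


def fence_peer_handlers(conf_text):
    """Return the script paths configured as fence-peer handlers.

    Hypothetical helper: a plain regex over the text, not a drbd.conf parser.
    Only the script path is captured; any arguments after it are ignored.
    """
    return re.findall(r'fence-peer\s+"([^"\s]+)', conf_text)


def check_handler(path):
    """Classify a handler path as 'ok', 'missing' or 'not-executable'."""
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.X_OK):
        return "not-executable"
    return "ok"


if __name__ == "__main__":
    # Snippet copied from the handlers section of the config above.
    conf = 'handlers { fence-peer "/usr/lib/drbd/stonith_admin-fence-peer.sh"; }'
    for path in fence_peer_handlers(conf):
        print(path, check_handler(path))
```

Run on each node as the user that invokes drbdadm (normally root), this shows whether the configured script can be executed there. By shell convention, an exit status of 126 means a command was found but could not be executed, so this is one thing worth ruling out when a helper exits with that status.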

And here's what happened:
- rcnetwork stop on XM01 at 13:33:00:
Mar  1 13:32:59 xm01 ifdown:     eth0      device: Broadcom 
Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
Mar  1 13:33:00 xm01 ifdown:     eth1      device: Broadcom 
Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
Mar  1 13:33:01 xm01 /usr/sbin/cron[9479]: (root) CMD 
(/usr/sbin/logwatch --service dmeventd)
Mar  1 13:33:01 xm01 ifdown:     usb0      name: RNDIS/CDC ETHER
Mar  1 13:33:02 xm01 ifdown:     vif1.0
Mar  1 13:33:02 xm01 ifdown:               No configuration found for vif1.0
Mar  1 13:33:02 xm01 ifdown:               Nevertheless the interface 
will be shut down.

- XM01 is back:
Mar  1 13:36:35 xm01 kernel: [   51.170175] drbd: initialized. 
Version: 8.3.11 (api:88/proto:86-96)
Mar  1 13:36:35 xm01 kernel: [   51.170178] drbd: GIT-hash: 
0de839cee13a4160eed6037c4bddd066645e23c5 build by phil at fat-tyre, 
2011-06-29 11:37:11
Mar  1 13:36:35 xm01 kernel: [   51.170181] drbd: registered as block 
device major 147
Mar  1 13:36:35 xm01 kernel: [   51.170184] drbd: minor_table @ 
0xffff8807d66c5480
Mar  1 13:36:35 xm01 kernel: [   51.319210] block drbd0: Starting 
worker thread (from cqueue [4927])
Mar  1 13:36:35 xm01 kernel: [   51.319283] block drbd0: disk( 
Diskless -> Attaching )
Mar  1 13:36:35 xm01 kernel: klogd 1.4.1, ---------- state change ----------
Mar  1 13:36:35 xm01 kernel: [   51.332408] block drbd0: Found 57 
transactions (91 active extents) in activity log.
Mar  1 13:36:35 xm01 kernel: [   51.332411] block drbd0: Method to 
ensure write ordering: barrier
Mar  1 13:36:35 xm01 kernel: [   51.332414] block drbd0: max BIO size = 131072
Mar  1 13:36:35 xm01 kernel: [   51.332418] block drbd0: 
drbd_bm_resize called with capacity == 1172087720
Mar  1 13:36:35 xm01 kernel: [   51.336592] block drbd0: resync 
bitmap: bits=146510965 words=2289234 pages=4472
Mar  1 13:36:35 xm01 kernel: [   51.336598] block drbd0: size = 559 
GB (586043860 KB)
Mar  1 13:36:35 xm01 kernel: [   51.534814] block drbd0: bitmap READ 
of 4472 pages took 50 jiffies
Mar  1 13:36:35 xm01 kernel: [   51.551170] block drbd0: recounting 
of set bits took additional 4 jiffies
Mar  1 13:36:35 xm01 kernel: [   51.551174] block drbd0: 0 KB (0 
bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:35 xm01 kernel: [   51.551231] block drbd0: Marked 
additional 224 MB as out-of-sync based on AL.
Mar  1 13:36:35 xm01 kernel: [   51.551274] block drbd0: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:36:35 xm01 kernel: [   51.551296] block drbd0: 224 MB 
(57344 bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:35 xm01 kernel: [   51.551304] block drbd0: disk( 
Attaching -> Consistent )
Mar  1 13:36:35 xm01 kernel: [   51.551307] block drbd0: attached to 
UUIDs EEDF542BD48564B5:0000000000000000:AF298F27A3172092:AF288F27A3172093
Mar  1 13:36:35 xm01 kernel: [   51.567908] block drbd1: Starting 
worker thread (from cqueue [4927])
Mar  1 13:36:35 xm01 kernel: [   51.567981] block drbd1: disk( 
Diskless -> Attaching )
Mar  1 13:36:35 xm01 kernel: [   51.581253] block drbd1: Found 57 
transactions (57 active extents) in activity log.
Mar  1 13:36:35 xm01 kernel: [   51.581257] block drbd1: Method to 
ensure write ordering: barrier
Mar  1 13:36:35 xm01 kernel: [   51.581260] block drbd1: max BIO size = 131072
Mar  1 13:36:35 xm01 kernel: [   51.581265] block drbd1: 
drbd_bm_resize called with capacity == 1172087720
Mar  1 13:36:35 xm01 kernel: [   51.585510] block drbd1: resync 
bitmap: bits=146510965 words=2289234 pages=4472
Mar  1 13:36:35 xm01 kernel: [   51.585525] block drbd1: size = 559 
GB (586043860 KB)
Mar  1 13:36:36 xm01 kernel: [   51.778368] block drbd1: bitmap READ 
of 4472 pages took 48 jiffies
Mar  1 13:36:36 xm01 kernel: [   51.794740] block drbd1: recounting 
of set bits took additional 4 jiffies
Mar  1 13:36:36 xm01 kernel: [   51.794744] block drbd1: 0 KB (0 
bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:36 xm01 kernel: [   51.794797] block drbd1: Marked 
additional 120 MB as out-of-sync based on AL.
Mar  1 13:36:36 xm01 kernel: [   51.794838] block drbd1: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:36:36 xm01 kernel: [   51.794860] block drbd1: 120 MB 
(30720 bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:36 xm01 kernel: [   51.794867] block drbd1: disk( 
Attaching -> Consistent )
Mar  1 13:36:36 xm01 kernel: [   51.794871] block drbd1: attached to 
UUIDs E6E23470FD3656AD:0000000000000000:65C464E576893480:65C364E576893481
Mar  1 13:36:36 xm01 kernel: [   51.811431] block drbd2: Starting 
worker thread (from cqueue [4927])
Mar  1 13:36:36 xm01 kernel: [   51.811511] block drbd2: disk( 
Diskless -> Attaching )
Mar  1 13:36:36 xm01 kernel: [   51.825901] block drbd2: Found 57 
transactions (57 active extents) in activity log.
Mar  1 13:36:36 xm01 kernel: [   51.825905] block drbd2: Method to 
ensure write ordering: barrier
Mar  1 13:36:36 xm01 kernel: [   51.825908] block drbd2: max BIO size = 131072
Mar  1 13:36:36 xm01 kernel: [   51.825915] block drbd2: 
drbd_bm_resize called with capacity == 1172087720
Mar  1 13:36:36 xm01 kernel: [   51.830989] block drbd2: resync 
bitmap: bits=146510965 words=2289234 pages=4472
Mar  1 13:36:36 xm01 kernel: [   51.830995] block drbd2: size = 559 
GB (586043860 KB)
Mar  1 13:36:36 xm01 kernel: [   52.033592] block drbd2: bitmap READ 
of 4472 pages took 51 jiffies
Mar  1 13:36:36 xm01 kernel: [   52.050223] block drbd2: recounting 
of set bits took additional 4 jiffies
Mar  1 13:36:36 xm01 kernel: [   52.050228] block drbd2: 0 KB (0 
bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:36 xm01 kernel: [   52.050291] block drbd2: Marked 
additional 48 MB as out-of-sync based on AL.
Mar  1 13:36:36 xm01 kernel: [   52.050352] block drbd2: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:36:36 xm01 kernel: [   52.050382] block drbd2: 48 MB (12288 
bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:36 xm01 kernel: [   52.050391] block drbd2: disk( 
Attaching -> Consistent )
Mar  1 13:36:36 xm01 kernel: [   52.050396] block drbd2: attached to 
UUIDs 324E9CEEF0227FAD:0000000000000000:F91D77DB4FF3672A:F91C77DB4FF3672B
Mar  1 13:36:36 xm01 kernel: [   52.079074] block drbd3: Starting 
worker thread (from cqueue [4927])
Mar  1 13:36:36 xm01 kernel: [   52.079172] block drbd3: disk( 
Diskless -> Attaching )
Mar  1 13:36:36 xm01 kernel: [   52.118864] block drbd3: Found 29 
transactions (29 active extents) in activity log.
Mar  1 13:36:36 xm01 kernel: [   52.118868] block drbd3: Method to 
ensure write ordering: barrier
Mar  1 13:36:36 xm01 kernel: [   52.118872] block drbd3: max BIO size = 131072
Mar  1 13:36:36 xm01 kernel: [   52.118877] block drbd3: 
drbd_bm_resize called with capacity == 2097016
Mar  1 13:36:36 xm01 kernel: [   52.118888] block drbd3: resync 
bitmap: bits=262127 words=4096 pages=8
Mar  1 13:36:36 xm01 kernel: [   52.118891] block drbd3: size = 1024 
MB (1048508 KB)
Mar  1 13:36:36 xm01 kernel: [   52.125476] block drbd3: bitmap READ 
of 8 pages took 2 jiffies
Mar  1 13:36:36 xm01 kernel: [   52.125509] block drbd3: recounting 
of set bits took additional 0 jiffies
Mar  1 13:36:36 xm01 kernel: [   52.125511] block drbd3: 0 KB (0 
bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:36 xm01 kernel: [   52.125540] block drbd3: Marked 
additional 20 MB as out-of-sync based on AL.
Mar  1 13:36:36 xm01 kernel: [   52.125543] block drbd3: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:36:36 xm01 kernel: [   52.129955] block drbd3: 20 MB (5120 
bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:36 xm01 kernel: [   52.129960] block drbd3: disk( 
Attaching -> Consistent )
Mar  1 13:36:36 xm01 kernel: [   52.129964] block drbd3: attached to 
UUIDs 75C7AE841CB0682F:0000000000000000:99ABCCCBF1E4D000:99AACCCBF1E4D001
Mar  1 13:36:36 xm01 kernel: [   52.204837] padlock: VIA PadLock Hash 
Engine not detected.
Mar  1 13:36:36 xm01 modprobe: FATAL: Error inserting padlock_sha 
(/lib/modules/2.6.32.49-0.3-xen/kernel/drivers/crypto/padlock-sha.ko): 
No such device
Mar  1 13:36:36 xm01 kernel: [   52.238263] block drbd0: conn( 
StandAlone -> Unconnected )
Mar  1 13:36:36 xm01 kernel: [   52.238301] block drbd0: Starting 
receiver thread (from drbd0_worker [4938])
Mar  1 13:36:36 xm01 kernel: [   52.238341] block drbd0: receiver (re)started
Mar  1 13:36:36 xm01 kernel: [   52.238349] block drbd0: conn( 
Unconnected -> WFConnection )
Mar  1 13:36:36 xm01 kernel: [   52.241205] block drbd1: conn( 
StandAlone -> Unconnected )
Mar  1 13:36:36 xm01 kernel: [   52.241238] block drbd1: Starting 
receiver thread (from drbd1_worker [4960])
Mar  1 13:36:36 xm01 kernel: [   52.241311] block drbd1: receiver (re)started
Mar  1 13:36:36 xm01 kernel: [   52.241318] block drbd1: conn( 
Unconnected -> WFConnection )
Mar  1 13:36:36 xm01 kernel: [   52.243718] block drbd2: conn( 
StandAlone -> Unconnected )
Mar  1 13:36:36 xm01 kernel: [   52.243743] block drbd2: Starting 
receiver thread (from drbd2_worker [4986])
Mar  1 13:36:36 xm01 kernel: [   52.243808] block drbd2: receiver (re)started
Mar  1 13:36:36 xm01 kernel: [   52.243817] block drbd2: conn( 
Unconnected -> WFConnection )
Mar  1 13:36:36 xm01 kernel: [   52.246305] block drbd3: conn( 
StandAlone -> Unconnected )
Mar  1 13:36:36 xm01 kernel: [   52.246337] block drbd3: Starting 
receiver thread (from drbd3_worker [5016])
Mar  1 13:36:36 xm01 kernel: [   52.246406] block drbd3: receiver (re)started
Mar  1 13:36:36 xm01 kernel: [   52.246415] block drbd3: conn( 
Unconnected -> WFConnection )
Mar  1 13:36:37 xm01 kernel: [   52.738908] block drbd1: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:36:37 xm01 kernel: [   52.738985] block drbd0: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:36:37 xm01 kernel: [   52.739113] block drbd1: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:36:37 xm01 kernel: [   52.739122] block drbd1: conn( 
WFConnection -> WFReportParams )
Mar  1 13:36:37 xm01 kernel: [   52.739141] block drbd0: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:36:37 xm01 kernel: [   52.739146] block drbd0: conn( 
WFConnection -> WFReportParams )
Mar  1 13:36:37 xm01 kernel: [   52.739182] block drbd1: Starting 
asender thread (from drbd1_receiver [5114])
Mar  1 13:36:37 xm01 kernel: [   52.739191] block drbd0: Starting 
asender thread (from drbd0_receiver [5110])
Mar  1 13:36:37 xm01 kernel: [   52.739298] block drbd0: 
data-integrity-alg: <not-used>
Mar  1 13:36:37 xm01 kernel: [   52.739316] block drbd0: drbd_sync_handshake:
Mar  1 13:36:37 xm01 kernel: [   52.739320] block drbd0: self 
EEDF542BD48564B4:0000000000000000:AF298F27A3172092:AF288F27A3172093 
bits:57344 flags:0
Mar  1 13:36:37 xm01 kernel: [   52.739324] block drbd0: peer 
EEDF542BD48564B5:0000000000000000:AF298F27A3172093:AF288F27A3172093 
bits:0 flags:0
Mar  1 13:36:37 xm01 kernel: [   52.739328] block drbd0: 
uuid_compare()=1 by rule 40
Mar  1 13:36:37 xm01 kernel: [   52.739334] block drbd0: peer( 
Unknown -> Primary ) conn( WFReportParams -> WFBitMapS ) disk( 
Consistent -> UpToDate ) pdsk( DUnknown -> Consistent )
Mar  1 13:36:37 xm01 kernel: [   52.739374] block drbd1: 
data-integrity-alg: <not-used>
Mar  1 13:36:37 xm01 kernel: [   52.739389] block drbd1: drbd_sync_handshake:
Mar  1 13:36:37 xm01 kernel: [   52.739393] block drbd1: self 
E6E23470FD3656AC:0000000000000000:65C464E576893480:65C364E576893481 
bits:30720 flags:0
Mar  1 13:36:37 xm01 kernel: [   52.739397] block drbd1: peer 
E6E23470FD3656AD:0000000000000000:65C464E576893481:65C364E576893481 
bits:0 flags:0
Mar  1 13:36:37 xm01 kernel: [   52.739400] block drbd1: 
uuid_compare()=1 by rule 40
Mar  1 13:36:37 xm01 kernel: [   52.739406] block drbd1: peer( 
Unknown -> Primary ) conn( WFReportParams -> WFBitMapS ) disk( 
Consistent -> UpToDate ) pdsk( DUnknown -> Consistent )
Mar  1 13:36:37 xm01 kernel: [   52.739584] block drbd1: meta 
connection shut down by peer.
Mar  1 13:36:37 xm01 kernel: [   52.739590] block drbd1: peer( 
Primary -> Unknown ) conn( WFBitMapS -> NetworkFailure ) pdsk( 
Consistent -> DUnknown )
Mar  1 13:36:37 xm01 kernel: [   52.739646] block drbd0: sock_sendmsg 
returned -32
Mar  1 13:36:37 xm01 kernel: [   52.739651] block drbd0: peer( 
Primary -> Unknown ) conn( WFBitMapS -> BrokenPipe ) pdsk( Consistent 
-> DUnknown )
Mar  1 13:36:37 xm01 kernel: [   52.739657] block drbd0: short sent 
ReportBitMap size=4096 sent=3172
Mar  1 13:36:37 xm01 kernel: [   52.739674] block drbd0: meta 
connection shut down by peer.
Mar  1 13:36:37 xm01 kernel: [   52.739683] block drbd0: asender terminated
Mar  1 13:36:37 xm01 kernel: [   52.739687] block drbd0: Terminating 
asender thread
Mar  1 13:36:37 xm01 kernel: [   52.739738] block drbd1: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:36:37 xm01 kernel: [   52.741865] block drbd1: 120 MB 
(30720 bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:37 xm01 kernel: [   52.743017] block drbd2: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:36:37 xm01 kernel: [   52.743091] block drbd3: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:36:37 xm01 kernel: [   52.743270] block drbd2: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:36:37 xm01 kernel: [   52.743278] block drbd2: conn( 
WFConnection -> WFReportParams )
Mar  1 13:36:37 xm01 kernel: [   52.743309] block drbd2: Starting 
asender thread (from drbd2_receiver [5120])
Mar  1 13:36:37 xm01 kernel: [   52.743341] block drbd3: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:36:37 xm01 kernel: [   52.743348] block drbd3: conn( 
WFConnection -> WFReportParams )
Mar  1 13:36:37 xm01 kernel: [   52.743410] block drbd3: Starting 
asender thread (from drbd3_receiver [5124])
Mar  1 13:36:37 xm01 kernel: [   52.743494] block drbd3: 
data-integrity-alg: <not-used>
Mar  1 13:36:37 xm01 kernel: [   52.743532] block drbd3: drbd_sync_handshake:
Mar  1 13:36:37 xm01 kernel: [   52.743536] block drbd3: self 
75C7AE841CB0682E:0000000000000000:99ABCCCBF1E4D000:99AACCCBF1E4D001 
bits:5120 flags:0
Mar  1 13:36:37 xm01 kernel: [   52.743540] block drbd3: peer 
75C7AE841CB0682F:0000000000000000:99ABCCCBF1E4D001:99AACCCBF1E4D001 
bits:0 flags:0
Mar  1 13:36:37 xm01 kernel: [   52.743543] block drbd3: 
uuid_compare()=1 by rule 40
Mar  1 13:36:37 xm01 kernel: [   52.743550] block drbd3: peer( 
Unknown -> Primary ) conn( WFReportParams -> WFBitMapS ) disk( 
Consistent -> UpToDate ) pdsk( DUnknown -> Consistent )
Mar  1 13:36:37 xm01 kernel: [   52.743733] block drbd3: meta 
connection shut down by peer.
Mar  1 13:36:37 xm01 kernel: [   52.743740] block drbd3: peer( 
Primary -> Unknown ) conn( WFBitMapS -> NetworkFailure ) pdsk( 
Consistent -> DUnknown )
Mar  1 13:36:37 xm01 kernel: [   52.743878] block drbd3: sock_sendmsg 
returned -32
Mar  1 13:36:37 xm01 kernel: [   52.743884] block drbd3: short sent 
ReportBitMap size=4096 sent=276
Mar  1 13:36:37 xm01 kernel: [   52.743894] block drbd2: 
data-integrity-alg: <not-used>
Mar  1 13:36:37 xm01 kernel: [   52.743905] block drbd3: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:36:37 xm01 kernel: [   52.743908] block drbd2: drbd_sync_handshake:
Mar  1 13:36:37 xm01 kernel: [   52.743914] block drbd2: self 
324E9CEEF0227FAC:0000000000000000:F91D77DB4FF3672A:F91C77DB4FF3672B 
bits:12288 flags:0
Mar  1 13:36:37 xm01 kernel: [   52.743918] block drbd2: peer 
324E9CEEF0227FAD:0000000000000000:F91D77DB4FF3672B:F91C77DB4FF3672B 
bits:0 flags:0
Mar  1 13:36:37 xm01 kernel: [   52.743921] block drbd2: 
uuid_compare()=1 by rule 40
Mar  1 13:36:37 xm01 kernel: [   52.743928] block drbd2: peer( 
Unknown -> Primary ) conn( WFReportParams -> WFBitMapS ) disk( 
Consistent -> UpToDate ) pdsk( DUnknown -> Consistent )
Mar  1 13:36:37 xm01 kernel: [   52.744091] block drbd2: meta 
connection shut down by peer.
Mar  1 13:36:37 xm01 kernel: [   52.744097] block drbd2: peer( 
Primary -> Unknown ) conn( WFBitMapS -> NetworkFailure ) pdsk( 
Consistent -> DUnknown )
Mar  1 13:36:37 xm01 kernel: [   52.744279] block drbd2: sock_sendmsg 
returned -32
Mar  1 13:36:37 xm01 kernel: [   52.744283] block drbd2: short sent 
ReportBitMap size=4096 sent=2180
Mar  1 13:36:37 xm01 kernel: [   52.744335] block drbd2: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:36:37 xm01 kernel: [   52.747349] block drbd2: 48 MB (12288 
bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:37 xm01 kernel: [   52.747833] block drbd1: asender terminated
Mar  1 13:36:37 xm01 kernel: [   52.747837] block drbd1: Terminating 
asender thread
Mar  1 13:36:37 xm01 kernel: [   52.747902] block drbd1: Connection closed
Mar  1 13:36:37 xm01 kernel: [   52.747908] block drbd1: conn( 
NetworkFailure -> Unconnected )
Mar  1 13:36:37 xm01 kernel: [   52.747915] block drbd1: receiver terminated
Mar  1 13:36:37 xm01 kernel: [   52.747917] block drbd1: Restarting 
receiver thread
Mar  1 13:36:37 xm01 kernel: [   52.747933] block drbd1: receiver (re)started
Mar  1 13:36:37 xm01 kernel: [   52.747938] block drbd1: conn( 
Unconnected -> WFConnection )
Mar  1 13:36:37 xm01 kernel: [   52.749723] block drbd0: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:36:37 xm01 kernel: [   52.749734] block drbd0: 224 MB 
(57344 bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:37 xm01 kernel: [   52.749775] block drbd0: Connection closed
Mar  1 13:36:37 xm01 kernel: [   52.749780] block drbd0: conn( 
BrokenPipe -> Unconnected )
Mar  1 13:36:37 xm01 kernel: [   52.749787] block drbd0: receiver terminated
Mar  1 13:36:37 xm01 kernel: [   52.749789] block drbd0: Restarting 
receiver thread
Mar  1 13:36:37 xm01 kernel: [   52.749792] block drbd0: receiver (re)started
Mar  1 13:36:37 xm01 kernel: [   52.749796] block drbd0: conn( 
Unconnected -> WFConnection )
Mar  1 13:36:37 xm01 kernel: [   52.753343] block drbd2: asender terminated
Mar  1 13:36:37 xm01 kernel: [   52.753347] block drbd2: Terminating 
asender thread
Mar  1 13:36:37 xm01 kernel: [   52.753391] block drbd2: Connection closed
Mar  1 13:36:37 xm01 kernel: [   52.753395] block drbd2: conn( 
NetworkFailure -> Unconnected )
Mar  1 13:36:37 xm01 kernel: [   52.753399] block drbd2: receiver terminated
Mar  1 13:36:37 xm01 kernel: [   52.753401] block drbd2: Restarting 
receiver thread
Mar  1 13:36:37 xm01 kernel: [   52.753403] block drbd2: receiver (re)started
Mar  1 13:36:37 xm01 kernel: [   52.753407] block drbd2: conn( 
Unconnected -> WFConnection )
Mar  1 13:36:37 xm01 kernel: [   52.754182] block drbd3: 20 MB (5120 
bits) marked out-of-sync by on disk bit-map.
Mar  1 13:36:37 xm01 kernel: [   52.769214] block drbd3: asender terminated
Mar  1 13:36:37 xm01 kernel: [   52.769222] block drbd3: Terminating 
asender thread
Mar  1 13:36:37 xm01 kernel: [   52.769303] block drbd3: Connection closed
Mar  1 13:36:37 xm01 kernel: [   52.769309] block drbd3: conn( 
NetworkFailure -> Unconnected )
Mar  1 13:36:37 xm01 kernel: [   52.769317] block drbd3: receiver terminated
Mar  1 13:36:37 xm01 kernel: [   52.769320] block drbd3: Restarting 
receiver thread
Mar  1 13:36:37 xm01 kernel: [   52.769322] block drbd3: receiver (re)started
Mar  1 13:36:37 xm01 kernel: [   52.769327] block drbd3: conn( 
Unconnected -> WFConnection )
...
Mar  1 13:37:17 xm01 kernel: [   93.073374] block drbd0: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:37:17 xm01 kernel: [   93.073589] block drbd0: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:37:17 xm01 kernel: [   93.073609] block drbd0: conn( 
WFConnection -> WFReportParams )
Mar  1 13:37:17 xm01 kernel: [   93.073647] block drbd0: Starting 
asender thread (from drbd0_receiver [5110])
Mar  1 13:37:17 xm01 kernel: [   93.073768] block drbd0: 
data-integrity-alg: <not-used>
Mar  1 13:37:17 xm01 kernel: [   93.073786] block drbd0: drbd_sync_handshake:
Mar  1 13:37:17 xm01 kernel: [   93.073790] block drbd0: self 
EEDF542BD48564B4:0000000000000000:AF298F27A3172092:AF288F27A3172093 
bits:57344 flags:0
Mar  1 13:37:17 xm01 kernel: [   93.073794] block drbd0: peer 
EEDF542BD48564B5:0000000000000000:AF298F27A3172093:AF288F27A3172093 
bits:0 flags:0
Mar  1 13:37:17 xm01 kernel: [   93.073798] block drbd0: 
uuid_compare()=1 by rule 40
Mar  1 13:37:17 xm01 kernel: [   93.073804] block drbd0: peer( 
Unknown -> Primary ) conn( WFReportParams -> WFBitMapS ) pdsk( 
DUnknown -> Consistent )
Mar  1 13:37:17 xm01 kernel: [   93.073985] block drbd0: sock_sendmsg 
returned -32
Mar  1 13:37:17 xm01 kernel: [   93.073990] block drbd0: peer( 
Primary -> Unknown ) conn( WFBitMapS -> BrokenPipe ) pdsk( Consistent 
-> DUnknown )
Mar  1 13:37:17 xm01 kernel: [   93.073998] block drbd0: short sent 
ReportBitMap size=4096 sent=732
Mar  1 13:37:17 xm01 kernel: [   93.074015] block drbd0: meta 
connection shut down by peer.
Mar  1 13:37:17 xm01 kernel: [   93.074021] block drbd0: asender terminated
Mar  1 13:37:17 xm01 kernel: [   93.074024] block drbd0: Terminating 
asender thread
Mar  1 13:37:17 xm01 kernel: [   93.077364] block drbd3: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:37:17 xm01 kernel: [   93.078584] block drbd3: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:37:17 xm01 kernel: [   93.078593] block drbd3: conn( 
WFConnection -> WFReportParams )
Mar  1 13:37:17 xm01 kernel: [   93.078633] block drbd3: Starting 
asender thread (from drbd3_receiver [5124])
Mar  1 13:37:17 xm01 kernel: [   93.078756] block drbd3: 
data-integrity-alg: <not-used>
Mar  1 13:37:17 xm01 kernel: [   93.078786] block drbd3: drbd_sync_handshake:
Mar  1 13:37:17 xm01 kernel: [   93.078790] block drbd3: self 
75C7AE841CB0682E:0000000000000000:99ABCCCBF1E4D000:99AACCCBF1E4D001 
bits:5120 flags:0
Mar  1 13:37:17 xm01 kernel: [   93.078794] block drbd3: peer 
75C7AE841CB0682F:0000000000000000:99ABCCCBF1E4D001:99AACCCBF1E4D001 
bits:0 flags:0
Mar  1 13:37:17 xm01 kernel: [   93.078797] block drbd3: 
uuid_compare()=1 by rule 40
Mar  1 13:37:17 xm01 kernel: [   93.078803] block drbd3: peer( 
Unknown -> Primary ) conn( WFReportParams -> WFBitMapS ) pdsk( 
DUnknown -> Consistent )
Mar  1 13:37:17 xm01 kernel: [   93.078925] block drbd3: meta 
connection shut down by peer.
Mar  1 13:37:17 xm01 kernel: [   93.078930] block drbd3: peer( 
Primary -> Unknown ) conn( WFBitMapS -> NetworkFailure ) pdsk( 
Consistent -> DUnknown )
Mar  1 13:37:17 xm01 kernel: [   93.078970] block drbd3: sock_sendmsg 
returned -32
Mar  1 13:37:17 xm01 kernel: [   93.078975] block drbd3: short sent 
ReportBitMap size=4096 sent=276
Mar  1 13:37:17 xm01 kernel: [   93.078983] block drbd3: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:37:17 xm01 kernel: [   93.084657] block drbd0: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:37:17 xm01 kernel: [   93.084668] block drbd0: 224 MB 
(57344 bits) marked out-of-sync by on disk bit-map.
Mar  1 13:37:17 xm01 kernel: [   93.084678] block drbd0: Connection closed
Mar  1 13:37:17 xm01 kernel: [   93.084683] block drbd0: conn( 
BrokenPipe -> Unconnected )
Mar  1 13:37:17 xm01 kernel: [   93.084687] block drbd0: receiver terminated
Mar  1 13:37:17 xm01 kernel: [   93.084689] block drbd0: Restarting 
receiver thread
Mar  1 13:37:17 xm01 kernel: [   93.084692] block drbd0: receiver (re)started
Mar  1 13:37:17 xm01 kernel: [   93.084696] block drbd0: conn( 
Unconnected -> WFConnection )
Mar  1 13:37:17 xm01 kernel: [   93.089359] block drbd1: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:37:17 xm01 kernel: [   93.089575] block drbd1: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:37:17 xm01 kernel: [   93.089582] block drbd1: conn( 
WFConnection -> WFReportParams )
Mar  1 13:37:17 xm01 kernel: [   93.089595] block drbd1: Starting 
asender thread (from drbd1_receiver [5114])
Mar  1 13:37:17 xm01 kernel: [   93.089691] block drbd1: 
data-integrity-alg: <not-used>
Mar  1 13:37:17 xm01 kernel: [   93.089745] block drbd1: drbd_sync_handshake:
Mar  1 13:37:17 xm01 kernel: [   93.089749] block drbd1: self 
E6E23470FD3656AC:0000000000000000:65C464E576893480:65C364E576893481 
bits:30720 flags:0
Mar  1 13:37:17 xm01 kernel: [   93.089753] block drbd1: peer 
E6E23470FD3656AD:0000000000000000:65C464E576893481:65C364E576893481 
bits:0 flags:0
Mar  1 13:37:17 xm01 kernel: [   93.089757] block drbd1: 
uuid_compare()=1 by rule 40
Mar  1 13:37:17 xm01 kernel: [   93.089762] block drbd1: peer( 
Unknown -> Primary ) conn( WFReportParams -> WFBitMapS ) pdsk( 
DUnknown -> Consistent )
Mar  1 13:37:17 xm01 kernel: [   93.089862] block drbd1: meta 
connection shut down by peer.
Mar  1 13:37:17 xm01 kernel: [   93.089868] block drbd1: peer( 
Primary -> Unknown ) conn( WFBitMapS -> NetworkFailure ) pdsk( 
Consistent -> DUnknown )
Mar  1 13:37:17 xm01 kernel: [   93.089931] block drbd1: sock_sendmsg 
returned -32
Mar  1 13:37:17 xm01 kernel: [   93.089935] block drbd1: short sent 
ReportBitMap size=4096 sent=2180
Mar  1 13:37:17 xm01 kernel: [   93.089985] block drbd1: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:37:17 xm01 kernel: [   93.094402] block drbd1: 120 MB 
(30720 bits) marked out-of-sync by on disk bit-map.
Mar  1 13:37:17 xm01 kernel: [   93.100362] block drbd1: asender terminated
Mar  1 13:37:17 xm01 kernel: [   93.100367] block drbd1: Terminating 
asender thread
Mar  1 13:37:17 xm01 kernel: [   93.100451] block drbd1: Connection closed
Mar  1 13:37:17 xm01 kernel: [   93.100456] block drbd1: conn( 
NetworkFailure -> Unconnected )
Mar  1 13:37:17 xm01 kernel: [   93.100464] block drbd1: receiver terminated
Mar  1 13:37:17 xm01 kernel: [   93.100466] block drbd1: Restarting 
receiver thread
Mar  1 13:37:17 xm01 kernel: [   93.100468] block drbd1: receiver (re)started
Mar  1 13:37:17 xm01 kernel: [   93.100472] block drbd1: conn( 
Unconnected -> WFConnection )
Mar  1 13:37:17 xm01 kernel: [   93.102859] block drbd3: 20 MB (5120 
bits) marked out-of-sync by on disk bit-map.
Mar  1 13:37:17 xm01 kernel: [   93.119786] block drbd3: asender terminated
Mar  1 13:37:17 xm01 kernel: [   93.119794] block drbd3: Terminating 
asender thread
Mar  1 13:37:17 xm01 kernel: [   93.119847] block drbd3: Connection closed
Mar  1 13:37:17 xm01 kernel: [   93.119853] block drbd3: conn( 
NetworkFailure -> Unconnected )
Mar  1 13:37:17 xm01 kernel: [   93.119859] block drbd3: receiver terminated
Mar  1 13:37:17 xm01 kernel: [   93.119861] block drbd3: Restarting 
receiver thread
Mar  1 13:37:17 xm01 kernel: [   93.119864] block drbd3: receiver (re)started
Mar  1 13:37:17 xm01 kernel: [   93.119868] block drbd3: conn( 
Unconnected -> WFConnection )
Mar  1 13:37:17 xm01 kernel: [   93.625232] block drbd2: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:37:17 xm01 kernel: [   93.625450] block drbd2: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:37:17 xm01 kernel: [   93.625460] block drbd2: conn( 
WFConnection -> WFReportParams )
Mar  1 13:37:17 xm01 kernel: [   93.625476] block drbd2: Starting 
asender thread (from drbd2_receiver [5120])
Mar  1 13:37:17 xm01 kernel: [   93.625592] block drbd2: 
data-integrity-alg: <not-used>
Mar  1 13:37:17 xm01 kernel: [   93.625639] block drbd2: drbd_sync_handshake:
Mar  1 13:37:17 xm01 kernel: [   93.625643] block drbd2: self 
324E9CEEF0227FAC:0000000000000000:F91D77DB4FF3672A:F91C77DB4FF3672B 
bits:12288 flags:0
Mar  1 13:37:17 xm01 kernel: [   93.625647] block drbd2: peer 
324E9CEEF0227FAD:0000000000000000:F91D77DB4FF3672B:F91C77DB4FF3672B 
bits:0 flags:0
Mar  1 13:37:17 xm01 kernel: [   93.625651] block drbd2: 
uuid_compare()=1 by rule 40
Mar  1 13:37:17 xm01 kernel: [   93.625657] block drbd2: peer( 
Unknown -> Primary ) conn( WFReportParams -> WFBitMapS ) pdsk( 
DUnknown -> Consistent )
Mar  1 13:37:17 xm01 kernel: [   93.625804] block drbd2: meta 
connection shut down by peer.
Mar  1 13:37:17 xm01 kernel: [   93.625812] block drbd2: peer( 
Primary -> Unknown ) conn( WFBitMapS -> NetworkFailure ) pdsk( 
Consistent -> DUnknown )
Mar  1 13:37:17 xm01 kernel: [   93.625819] block drbd2: sock_sendmsg 
returned -32
Mar  1 13:37:17 xm01 kernel: [   93.625824] block drbd2: short sent 
ReportBitMap size=4096 sent=2180
Mar  1 13:37:17 xm01 kernel: [   93.625875] block drbd2: bitmap WRITE 
of 0 pages took 0 jiffies
Mar  1 13:37:17 xm01 kernel: [   93.632366] block drbd2: 48 MB (12288 
bits) marked out-of-sync by on disk bit-map.
Mar  1 13:37:17 xm01 kernel: [   93.638339] block drbd2: asender terminated
Mar  1 13:37:17 xm01 kernel: [   93.638344] block drbd2: Terminating 
asender thread
Mar  1 13:37:17 xm01 kernel: [   93.638395] block drbd2: Connection closed
Mar  1 13:37:17 xm01 kernel: [   93.638400] block drbd2: conn( 
NetworkFailure -> Unconnected )
Mar  1 13:37:17 xm01 kernel: [   93.638405] block drbd2: receiver terminated
Mar  1 13:37:17 xm01 kernel: [   93.638407] block drbd2: Restarting 
receiver thread
Mar  1 13:37:17 xm01 kernel: [   93.638409] block drbd2: receiver (re)started
Mar  1 13:37:18 xm01 kernel: [   93.638413] block drbd2: conn( 
Unconnected -> WFConnection )
Mar  1 13:37:19 xm01 lrmd: [5649]: info: rsc:vmconfig:0 promote[20] (pid 6032)
Mar  1 13:37:19 xm01 lrmd: [5649]: info: RA output: 
(vmconfig:0:promote:stdout)         allow-two-primaries;
Mar  1 13:37:19 xm01 kernel: [   94.962300] block drbd3: helper 
command: /sbin/drbdadm fence-peer minor-3
Mar  1 13:37:20 xm01 lrmd: [5649]: info: RA output: 
(vmconfig:0:promote:stderr) 3: State change failed: (-7) Refusing to 
be Primary while peer is not outdated
Mar  1 13:37:20 xm01 lrmd: [5649]: info: RA output: 
(vmconfig:0:promote:stderr) Command 'drbdsetup 3 primary
Mar  1 13:37:20 xm01 lrmd: [5649]: info: RA output: 
(vmconfig:0:promote:stderr) ' terminated with exit code 11
Mar  1 13:37:20 xm01 kernel: [   95.978113] block drbd3: helper 
command: /sbin/drbdadm fence-peer minor-3 exit code 126 (0x7e00)
Mar  1 13:37:20 xm01 kernel: [   95.978117] block drbd3: fence-peer 
helper broken, returned 126
Mar  1 13:37:20 xm01 kernel: [   95.978124] block drbd3: State change 
failed: Refusing to be Primary while peer is not outdated
Mar  1 13:37:20 xm01 kernel: [   95.978128] block drbd3:   state = { 
cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown r----- }
Mar  1 13:37:20 xm01 kernel: [   95.978132] block drbd3:  wanted = { 
cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown s---F- }
Mar  1 13:37:20 xm01 drbd[6032]: ERROR: vmconfig: Called drbdadm -c 
/etc/drbd.conf primary vmconfig
Mar  1 13:37:20 xm01 drbd[6032]: ERROR: vmconfig: Exit code 11
Mar  1 13:37:20 xm01 drbd[6032]: ERROR: vmconfig: Command output:
Mar  1 13:37:20 xm01 lrmd: [5649]: info: RA output: (vmconfig:0:promote:stdout)
Mar  1 13:37:20 xm01 kernel: [   96.012979] block drbd3: helper 
command: /sbin/drbdadm fence-peer minor-3
Mar  1 13:37:21 xm01 lrmd: [5649]: info: RA output: 
(vmconfig:0:promote:stderr) 3: State change failed: (-7) Refusing to 
be Primary while peer is not outdated
Mar  1 13:37:21 xm01 lrmd: [5649]: info: RA output: 
(vmconfig:0:promote:stderr) Command 'drbdsetup 3 primary
Mar  1 13:37:21 xm01 lrmd: [5649]: info: RA output: 
(vmconfig:0:promote:stderr) ' terminated with exit code 11
Mar  1 13:37:21 xm01 kernel: [   97.020366] block drbd3: helper 
command: /sbin/drbdadm fence-peer minor-3 exit code 126 (0x7e00)
Mar  1 13:37:21 xm01 kernel: [   97.020369] block drbd3: fence-peer 
helper broken, returned 126
Mar  1 13:37:21 xm01 kernel: [   97.020375] block drbd3: State change 
failed: Refusing to be Primary while peer is not outdated
Mar  1 13:37:21 xm01 kernel: [   97.020379] block drbd3:   state = { 
cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown r----- }
Mar  1 13:37:21 xm01 kernel: [   97.020383] block drbd3:  wanted = { 
cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown s---F- }
Mar  1 13:37:21 xm01 drbd[6032]: ERROR: vmconfig: Called drbdadm -c 
/etc/drbd.conf primary vmconfig
Mar  1 13:37:21 xm01 drbd[6032]: ERROR: vmconfig: Exit code 11

The same sequence repeats several times, until I get this:

Mar  1 13:38:47 xm01 lrmd: [5649]: info: RA output: (vmconfig:0:promote:stdout)
Mar  1 13:38:48 xm01 kernel: [  184.088528] block drbd3: helper 
command: /sbin/drbdadm fence-peer minor-3
Mar  1 13:38:49 xm01 lrmd: [5649]: WARN: vmconfig:0:promote process 
(PID 6032) timed out (try 1).  Killing with signal SIGTERM (15).
Mar  1 13:38:49 xm01 lrmd: [5649]: WARN: operation promote[20] on 
vmconfig:0 for client 5652: pid 6032 timed out
Mar  1 13:38:49 xm01 crmd: [5652]: ERROR: process_lrm_event: LRM 
operation vmconfig:0_promote_0 (20) Timed Out (timeout=90000ms)
Mar  1 13:38:49 xm01 attrd: [5650]: notice: attrd_ais_dispatch: 
Update relayed from xm02
Mar  1 13:38:49 xm01 attrd: [5650]: info: find_hash_entry: Creating 
hash entry for fail-count-vmconfig:0
Mar  1 13:38:49 xm01 crmd: [5652]: info: do_lrm_rsc_op: Performing 
key=211:6:0:8b7a050b-901b-4db7-b1f7-c3c5dd8a9653 op=vmconfig:0_notify_0 )
Mar  1 13:38:49 xm01 attrd: [5650]: info: attrd_local_callback: 
Expanded fail-count-vmconfig:0=value++ to 1
Mar  1 13:38:49 xm01 attrd: [5650]: info: attrd_trigger_update: 
Sending flush op to all hosts for: fail-count-vmconfig:0 (1)
Mar  1 13:38:49 xm01 attrd: [5650]: info: attrd_perform_update: Sent 
update 33: fail-count-vmconfig:0=1
Mar  1 13:38:49 xm01 lrmd: [5649]: info: rsc:vmconfig:0 notify[21] (pid 7100)
Mar  1 13:38:49 xm01 attrd: [5650]: notice: attrd_ais_dispatch: 
Update relayed from xm02
Mar  1 13:38:49 xm01 attrd: [5650]: info: find_hash_entry: Creating 
hash entry for last-failure-vmconfig:0
Mar  1 13:38:49 xm01 attrd: [5650]: info: attrd_trigger_update: 
Sending flush op to all hosts for: last-failure-vmconfig:0 (1330619909)
Mar  1 13:38:49 xm01 attrd: [5650]: info: attrd_perform_update: Sent 
update 36: last-failure-vmconfig:0=1330619909
Mar  1 13:38:52 xm01 lrmd: [5649]: info: RA output: 
(vmconfig:0:notify:stderr) lock on /var/lock/drbd-147-3 currently 
held by pid:7099
Mar  1 13:38:52 xm01 crm_attribute: [7128]: info: Invoked: 
crm_attribute -N xm01 -n master-vmconfig:0 -l reboot -D
Mar  1 13:38:52 xm01 attrd: [5650]: info: attrd_trigger_update: 
Sending flush op to all hosts for: master-vmconfig:0 (<null>)
Mar  1 13:38:52 xm01 attrd: [5650]: info: attrd_perform_update: Sent 
delete 38: node=xm01, attr=master-vmconfig:0, id=<n/a>, set=(null), 
section=status
Mar  1 13:38:52 xm01 attrd: [5650]: info: attrd_perform_update: Sent 
delete 40: node=xm01, attr=master-vmconfig:0, id=<n/a>, set=(null), 
section=status
Mar  1 13:38:52 xm01 lrmd: [5649]: info: RA output: (vmconfig:0:notify:stdout)
Mar  1 13:38:52 xm01 lrmd: [5649]: info: operation notify[21] on 
vmconfig:0 for client 5652: pid 7100 exited with return code 0
Mar  1 13:38:52 xm01 crmd: [5652]: info: process_lrm_event: LRM 
operation vmconfig:0_notify_0 (call=21, rc=0, cib-update=26, confirmed=true) ok
Mar  1 13:38:55 xm01 external/ipmi[7135]: [7146]: debug: ipmitool 
output: Chassis Power is on
Mar  1 13:38:56 xm01 stonith: [7131]: info: external/ipmi device OK.
Mar  1 13:39:01 xm01 /usr/sbin/cron[7148]: (root) CMD 
(/usr/sbin/logwatch --service dmeventd)
Mar  1 13:39:11 xm01 external/ipmi[7176]: [7187]: debug: ipmitool 
output: Chassis Power is on
Mar  1 13:39:12 xm01 stonith: [7172]: info: external/ipmi device OK.
Mar  1 13:39:19 xm01 crmd: [5652]: info: do_lrm_rsc_op: Performing 
key=216:8:0:8b7a050b-901b-4db7-b1f7-c3c5dd8a9653 op=vmconfig:0_notify_0 )
Mar  1 13:39:19 xm01 lrmd: [5649]: info: rsc:vmconfig:0 notify[22] (pid 7188)
Mar  1 13:39:19 xm01 crmd: [5652]: info: do_lrm_rsc_op: Performing 
key=224:8:0:8b7a050b-901b-4db7-b1f7-c3c5dd8a9653 op=vmsvn-drbd:0_notify_0 )
Mar  1 13:39:19 xm01 lrmd: [5649]: info: rsc:vmsvn-drbd:0 notify[23] (pid 7189)
Mar  1 13:39:19 xm01 crmd: [5652]: info: do_lrm_rsc_op: Performing 
key=232:8:0:8b7a050b-901b-4db7-b1f7-c3c5dd8a9653 op=srvsvn1-drbd:0_notify_0 )
Mar  1 13:39:19 xm01 lrmd: [5649]: info: rsc:srvsvn1-drbd:0 
notify[24] (pid 7190)
Mar  1 13:39:19 xm01 crmd: [5652]: info: do_lrm_rsc_op: Performing 
key=240:8:0:8b7a050b-901b-4db7-b1f7-c3c5dd8a9653 op=srvsvn2-drbd:0_notify_0 )
Mar  1 13:39:19 xm01 lrmd: [5649]: info: rsc:srvsvn2-drbd:0 
notify[25] (pid 7191)
Mar  1 13:39:19 xm01 stonith-ng: [5647]: info: crm_new_peer: Node 
xm02 now has id: 33554532
Mar  1 13:39:19 xm01 stonith-ng: [5647]: info: crm_new_peer: Node 
33554532 is now known as xm02
Mar  1 13:39:19 xm01 stonith-ng: [5647]: info: stonith_queryQuery 
<stonith_command t="stonith-ng" 
st_async_id="c1be22cc-e535-441c-a674-89551a2b9d4c" st_op="st_query" 
st_callid="0" st_callopt="0" st_
remote_op="c1be22cc-e535-441c-a674-89551a2b9d4c" st_target="xm02" 
st_device_action="reboot" 
st_clientid="bb653c7a-6351-4517-ad06-6fb0e20fe375" st_timeout="6000" 
src="xm02" seq="5" />
Mar  1 13:39:19 xm01 stonith-ng: [5647]: info: 
can_fence_host_with_device: Refreshing port list for ipmi-stonith-xm02
Mar  1 13:39:19 xm01 stonith-ng: [5647]: WARN: parse_host_line: Could 
not parse (0 0):
Mar  1 13:39:19 xm01 stonith-ng: [5647]: info: 
can_fence_host_with_device: ipmi-stonith-xm02 can fence xm02: dynamic-list
Mar  1 13:39:19 xm01 stonith-ng: [5647]: info: stonith_query: Found 1 
matching devices for 'xm02'
Mar  1 13:39:19 xm01 stonith-ng: [5647]: info: stonith_fenceExec 
<stonith_command t="stonith-ng" 
st_async_id="c1be22cc-e535-441c-a674-89551a2b9d4c" st_op="st_fence" 
st_callid="0" st_callopt="0" st_r
emote_op="c1be22cc-e535-441c-a674-89551a2b9d4c" st_target="xm02" 
st_device_action="reboot" st_timeout="54000" src="xm02" seq="7" />
Mar  1 13:39:19 xm01 stonith-ng: [5647]: info: 
can_fence_host_with_device: ipmi-stonith-xm02 can fence xm02: dynamic-list
Mar  1 13:39:19 xm01 stonith-ng: [5647]: info: stonith_fence: Found 1 
matching devices for 'xm02'
Mar  1 13:39:20 xm01 external/ipmi[7288]: [7302]: debug: ipmitool 
output: Chassis Power Control: Reset
Mar  1 13:39:21 xm01 stonith-ng: [5647]: info: log_operation: 
Operation 'reboot' [7277] for host 'xm02' with device 
'ipmi-stonith-xm02' returned: 0 (call 0 from (null))
Mar  1 13:39:21 xm01 lrmd: [5649]: info: operation notify[22] on 
vmconfig:0 for client 5652: pid 7188 exited with return code 0
Mar  1 13:39:21 xm01 crmd: [5652]: info: process_lrm_event: LRM 
operation vmconfig:0_notify_0 (call=22, rc=0, cib-update=27, confirmed=true) ok
Mar  1 13:39:22 xm01 kernel: [  218.177661] bnx2: eth1 NIC Copper Link is Down
Mar  1 13:39:24 xm01 kernel: [  220.488280] bnx2: eth1 NIC Copper 
Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
Mar  1 13:39:24 xm01 corosync[5621]:  [TOTEM ] A processor failed, 
forming new configuration.
Mar  1 13:39:27 xm01 external/ipmi[7311]: [7322]: debug: ipmitool 
output: Chassis Power is on
Mar  1 13:39:28 xm01 stonith: [7307]: info: external/ipmi device OK.
Mar  1 13:39:30 xm01 corosync[5621]:  [CLM   ] CLM CONFIGURATION CHANGE
Mar  1 13:39:30 xm01 corosync[5621]:  [CLM   ] New Configuration:
Mar  1 13:39:30 xm01 corosync[5621]:  [CLM   ]  r(0) ip(100.0.0.1)
Mar  1 13:39:30 xm01 corosync[5621]:  [CLM   ] Members Left:
Mar  1 13:39:30 xm01 corosync[5621]:  [CLM   ]  r(0) ip(100.0.0.2)
Mar  1 13:39:30 xm01 cib: [5648]: notice: ais_dispatch_message: 
Membership 1028: quorum lost
Mar  1 13:39:30 xm01 corosync[5621]:  [CLM   ] Members Joined:
Mar  1 13:39:30 xm01 crmd: [5652]: notice: ais_dispatch_message: 
Membership 1028: quorum lost
Mar  1 13:39:30 xm01 corosync[5621]:  [pcmk  ] notice: 
pcmk_peer_update: Transitional membership event on ring 1028: memb=1, 
new=0, lost=1
Mar  1 13:39:30 xm01 cib: [5648]: info: crm_update_peer: Node xm02: 
id=33554532 state=lost (new) addr=r(0) ip(100.0.0.2)  votes=1 
born=1016 seen=1024 proc=00000000000000000000000000151312
Mar  1 13:39:30 xm01 corosync[5621]:  [pcmk  ] info: 
pcmk_peer_update: memb: xm01 16777316
Mar  1 13:39:30 xm01 corosync[5621]:  [pcmk  ] info: 
pcmk_peer_update: lost: xm02 33554532
Mar  1 13:39:30 xm01 corosync[5621]:  [CLM   ] CLM CONFIGURATION CHANGE
Mar  1 13:39:30 xm01 corosync[5621]:  [CLM   ] New Configuration:
Mar  1 13:39:30 xm01 crmd: [5652]: info: ais_status_callback: status: 
xm02 is now lost (was member)
Mar  1 13:39:30 xm01 crmd: [5652]: info: crm_update_peer: Node xm02: 
id=33554532 state=lost (new) addr=r(0) ip(100.0.0.2)  votes=1 
born=1016 seen=1024 proc=00000000000000000000000000151312
Mar  1 13:39:30 xm01 stonith-ng: [5647]: info: 
process_remote_stonith_execExecResult <st-reply 
st_origin="stonith_construct_async_reply" t="stonith-ng" 
st_op="st_notify" st_remote_op="c1be22cc-e535-
441c-a674-89551a2b9d4c" st_callid="0" st_callopt="0" st_rc="0" 
st_output="Performing: stonith -t external/ipmi -T reset xm02 
success: xm02 0 " src="xm01" seq="2" />
Mar  1 13:39:30 xm01 corosync[5621]:  [CLM   ]  r(0) ip(100.0.0.1)
Mar  1 13:39:30 xm01 crmd: [5652]: WARN: check_dead_member: Our DC 
node (xm02) left the cluster
Mar  1 13:39:30 xm01 corosync[5621]:  [CLM   ] Members Left:
Mar  1 13:39:30 xm01 stonith-ng: [5647]: info: remote_op_done: 
Notifing clients of c1be22cc-e535-441c-a674-89551a2b9d4c (reboot of 
xm02 from bb653c7a-6351-4517-ad06-6fb0e20fe375 by xm01): 0, rc=0
Mar  1 13:39:30 xm01 corosync[5621]:  [CLM   ] Members Joined:
Mar  1 13:39:30 xm01 stonith-ng: [5647]: info: stonith_notify_client: 
Sending st_fence-notification to client 
5652/c9e6b033-73f2-43a9-b848-81bffa3c6d9b
Mar  1 13:39:30 xm01 crmd: [5652]: info: do_state_transition: State 
transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION 
cause=C_FSA_INTERNAL origin=check_dead_member ]
Mar  1 13:39:30 xm01 corosync[5621]:  [pcmk  ] notice: 
pcmk_peer_update: Stable membership event on ring 1028: memb=1, new=0, lost=0
Mar  1 13:39:30 xm01 crmd: [5652]: info: update_dc: Unset DC xm02
Mar  1 13:39:30 xm01 corosync[5621]:  [pcmk  ] info: 
pcmk_peer_update: MEMB: xm01 16777316
Mar  1 13:39:30 xm01 corosync[5621]:  [pcmk  ] info: 
ais_mark_unseen_peer_dead: Node xm02 was not seen in the previous transition
Mar  1 13:39:30 xm01 corosync[5621]:  [pcmk  ] info: update_member: 
Node 33554532/xm02 is now: lost
Mar  1 13:39:30 xm01 corosync[5621]:  [pcmk  ] info: 
send_member_notification: Sending membership update 1028 to 2 children
Mar  1 13:39:30 xm01 corosync[5621]:  [TOTEM ] A processor joined or 
left the membership and a new membership was formed.
Mar  1 13:39:30 xm01 corosync[5621]:  [CPG   ] chosen downlist: 
sender r(0) ip(100.0.0.1) ; members(old:2 left:1)
Mar  1 13:39:30 xm01 crmd: [5652]: info: tengine_stonith_notify: Peer 
xm02 was terminated (reboot) by xm01 for xm02 
(ref=c1be22cc-e535-441c-a674-89551a2b9d4c): OK
Mar  1 13:39:30 xm01 crmd: [5652]: notice: tengine_stonith_notify: 
Target was our leader xm02/xm02 (recorded leader: <unset>)
Mar  1 13:39:30 xm01 corosync[5621]:  [MAIN  ] Completed service 
synchronization, ready to provide service.
Mar  1 13:39:30 xm01 crmd: [5652]: info: send_stonith_update: Sending 
fencing update 28 for xm02
Mar  1 13:39:30 xm01 crmd: [5652]: notice: crmd_peer_update: Status 
update: Client xm02/crmd now has status [offline] (DC=<null>)
Mar  1 13:39:30 xm01 crmd: [5652]: info: crm_update_peer: Node xm02: 
id=33554532 state=lost addr=r(0) ip(100.0.0.2)  votes=1 born=1016 
seen=1024 proc=00000000000000000000000000000001 (new)
Mar  1 13:39:30 xm01 crmd: [5652]: info: cib_fencing_updated: Fencing 
update 28 for xm02: complete
Mar  1 13:39:30 xm01 crmd: [5652]: info: do_state_transition: State 
transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC 
cause=C_FSA_INTERNAL origin=do_election_check ]

At 13:39:32, XM02 has been STONITHed.

WHY????
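
One thing worth checking here (a guess on my part, not something the log
proves): exit code 126 is the usual shell convention for "command found but
could not be executed", so the "fence-peer helper broken, returned 126"
messages may simply mean the handler script is not executable on this node.
A minimal sketch of that check, assuming the path from the handlers section
of the drbd.conf above; the function name check_helper is made up for
illustration and is not part of DRBD:

```shell
#!/bin/sh
# Hypothetical check; check_helper is an illustrative name, not a DRBD tool.
# The default path is the fence-peer handler from the drbd.conf shown earlier.
HELPER="${HELPER:-/usr/lib/drbd/stonith_admin-fence-peer.sh}"

# Returns 0 if the file exists and is executable, 127 if it is missing,
# 126 if it exists but cannot be executed (mirroring the usual shell
# exit-code convention).
check_helper() {
    if [ ! -e "$1" ]; then
        echo "missing: $1"
        return 127
    elif [ ! -x "$1" ]; then
        echo "not executable: $1"
        return 126
    fi
    echo "ok: $1"
    return 0
}

check_helper "$HELPER" || echo "check failed with status $?"
```

If this prints "ok" on both nodes, the 126 is presumably coming from
somewhere else, for example from a command the script itself tries to run.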

With the drbd.conf modifications, I no longer get the constraints 
(which is fine!) and both nodes become Master. BUT...
the VM never fails over to XM02 as it should when XM01 goes down.
Here's the XM02 log from 13:33:00 to 13:36:40, when XM01 comes back up.

Mar  1 13:32:56 xm02 mgmtd: [6300]: info: CIB query: cib
Mar  1 13:33:01 xm02 /usr/sbin/cron[8783]: (root) CMD 
(/usr/sbin/logwatch --service dmeventd)
Mar  1 13:33:09 xm02 external/ipmi[8858]: [8869]: debug: ipmitool 
output: Chassis Power is on
Mar  1 13:33:10 xm02 stonith: [8854]: info: external/ipmi device OK.
Mar  1 13:33:21 xm02 kernel: [  238.815026] bnx2: eth1 NIC Copper Link is Down
Mar  1 13:33:23 xm02 kernel: [  241.298581] bnx2: eth1 NIC Copper 
Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
Mar  1 13:33:27 xm02 external/ipmi[9035]: [9061]: debug: ipmitool 
output: Chassis Power is on
Mar  1 13:33:28 xm02 stonith: [9031]: info: external/ipmi device OK.
Mar  1 13:33:36 xm02 kernel: [  254.005922] bnx2: eth1 NIC Copper Link is Down
Mar  1 13:33:39 xm02 kernel: [  256.432743] bnx2: eth1 NIC Copper 
Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
Mar  1 13:33:39 xm02 kernel: [  256.820486] bnx2: eth1 NIC Copper Link is Down
Mar  1 13:33:41 xm02 kernel: [  259.290456] bnx2: eth1 NIC Copper 
Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
Mar  1 13:33:44 xm02 external/ipmi[9254]: [9265]: debug: ipmitool 
output: Chassis Power is on
Mar  1 13:33:45 xm02 stonith: [9250]: info: external/ipmi device OK.
Mar  1 13:33:55 xm02 lrmd: [6296]: WARN: VMSVN:start process (PID 
8644) timed out (try 1).  Killing with signal SIGTERM (15).
Mar  1 13:34:00 xm02 lrmd: [6296]: WARN: VMSVN:start process (PID 
8644) timed out (try 2).  Killing with signal SIGKILL (9).
Mar  1 13:34:00 xm02 external/ipmi[9478]: [9489]: debug: ipmitool 
output: Chassis Power is on
Mar  1 13:34:01 xm02 /usr/sbin/cron[9491]: (root) CMD 
(/usr/sbin/logwatch --service dmeventd)
Mar  1 13:34:01 xm02 stonith: [9474]: info: external/ipmi device OK.
Mar  1 13:34:05 xm02 lrmd: [6296]: ERROR: TrackedProcTimeoutFunction: 
VMSVN:start process (PID 8644) will not die!
Mar  1 13:34:17 xm02 external/ipmi[9697]: [9708]: debug: ipmitool 
output: Chassis Power is on
Mar  1 13:34:18 xm02 stonith: [9693]: info: external/ipmi device OK.
Mar  1 13:34:31 xm02 kernel: [  308.659429] bnx2: eth1 NIC Copper Link is Down
Mar  1 13:34:33 xm02 external/ipmi[9804]: [9815]: debug: ipmitool 
output: Chassis Power is on
Mar  1 13:34:33 xm02 kernel: [  311.171354] bnx2: eth1 NIC Copper 
Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
Mar  1 13:34:34 xm02 stonith: [9800]: info: external/ipmi device OK.
Mar  1 13:34:50 xm02 external/ipmi[10014]: [10025]: debug: ipmitool 
output: Chassis Power is on
Mar  1 13:34:51 xm02 stonith: [10010]: info: external/ipmi device OK.
Mar  1 13:34:55 xm02 crmd: [6299]: WARN: action_timer_callback: Timer 
popped (timeout=60000, abort_level=1000000, complete=false)
Mar  1 13:34:55 xm02 crmd: [6299]: ERROR: print_elem: Aborting 
transition, action lost: [Action 180]: In-flight (id: VMSVN_start_0, 
loc: xm02, priority: 0)
Mar  1 13:34:55 xm02 crmd: [6299]: info: abort_transition_graph: 
action_timer_callback:486 - Triggered transition abort (complete=0) : 
Action lost
Mar  1 13:34:55 xm02 crmd: [6299]: WARN: cib_action_update: rsc_op 
180: VMSVN_start_0 on xm02 timed out
Mar  1 13:34:55 xm02 crmd: [6299]: info: create_operation_update: 
cib_action_update: Updating resouce VMSVN after Timed Out start op (interval=0)
Mar  1 13:34:55 xm02 crmd: [6299]: info: run_graph: 
====================================================
Mar  1 13:34:55 xm02 crmd: [6299]: notice: run_graph: Transition 0 
(Complete=31, Pending=0, Fired=0, Skipped=35, Incomplete=37, 
Source=/var/lib/pengine/pe-warn-309.bz2): Stopped
Mar  1 13:34:55 xm02 crmd: [6299]: info: te_graph_trigger: Transition 
0 is now complete
Mar  1 13:34:55 xm02 crmd: [6299]: info: do_state_transition: State 
transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC 
cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar  1 13:34:55 xm02 crmd: [6299]: info: do_state_transition: All 1 
cluster nodes are eligible to run resources.
Mar  1 13:34:55 xm02 crmd: [6299]: info: do_pe_invoke: Query 78: 
Requesting the current CIB: S_POLICY_ENGINE
Mar  1 13:34:55 xm02 crmd: [6299]: info: process_graph_event: Action 
VMSVN_start_0 arrived after a completed transition
Mar  1 13:34:55 xm02 crmd: [6299]: info: abort_transition_graph: 
process_graph_event:482 - Triggered transition abort (complete=1, 
tag=lrm_rsc_op, id=VMSVN_start_0, magic=2:1;180:0:0:8b7a050b-901b-4
db7-b1f7-c3c5dd8a9653, cib=0.2472.134) : Inactive graph
Mar  1 13:34:55 xm02 crmd: [6299]: WARN: update_failcount: Updating 
failcount for VMSVN on xm02 after failed start: rc=1 
(update=INFINITY, time=1330619695)
Mar  1 13:34:55 xm02 attrd: [6297]: info: find_hash_entry: Creating 
hash entry for fail-count-VMSVN
Mar  1 13:34:55 xm02 attrd: [6297]: info: attrd_trigger_update: 
Sending flush op to all hosts for: fail-count-VMSVN (INFINITY)
Mar  1 13:34:55 xm02 crmd: [6299]: info: do_pe_invoke: Query 79: 
Requesting the current CIB: S_POLICY_ENGINE
Mar  1 13:34:55 xm02 attrd: [6297]: info: attrd_perform_update: Sent 
update 35: fail-count-VMSVN=INFINITY
Mar  1 13:34:55 xm02 attrd: [6297]: info: find_hash_entry: Creating 
hash entry for last-failure-VMSVN
Mar  1 13:34:55 xm02 attrd: [6297]: info: attrd_trigger_update: 
Sending flush op to all hosts for: last-failure-VMSVN (1330619695)
Mar  1 13:34:55 xm02 crmd: [6299]: info: do_pe_invoke_callback: 
Invoking the PE: query=79, ref=pe_calc-dc-1330619695-20, seq=1020, quorate=0
Mar  1 13:34:55 xm02 crmd: [6299]: info: abort_transition_graph: 
te_update_diff:142 - Triggered transition abort (complete=1, 
tag=nvpair, id=status-xm02-fail-count-VMSVN, magic=NA, cib=0.2472.135) :
  Transient attribute: update
Mar  1 13:34:55 xm02 attrd: [6297]: info: attrd_perform_update: Sent 
update 38: last-failure-VMSVN=1330619695
Mar  1 13:34:55 xm02 pengine: [6298]: notice: unpack_config: On loss 
of CCM Quorum: Ignore
Mar  1 13:34:55 xm02 crmd: [6299]: info: abort_transition_graph: 
te_update_diff:142 - Triggered transition abort (complete=1, 
tag=nvpair, id=status-xm02-last-failure-VMSVN, magic=NA, cib=0.2472.136)
  : Transient attribute: update
Mar  1 13:34:55 xm02 crmd: [6299]: info: do_pe_invoke: Query 80: 
Requesting the current CIB: S_POLICY_ENGINE
Mar  1 13:34:55 xm02 crmd: [6299]: info: do_pe_invoke: Query 81: 
Requesting the current CIB: S_POLICY_ENGINE
Mar  1 13:34:55 xm02 pengine: [6298]: notice: unpack_rsc_op: 
Operation vmsvn-drbd:1_monitor_0 found resource vmsvn-drbd:1 active on xm02
Mar  1 13:34:55 xm02 pengine: [6298]: notice: unpack_rsc_op: 
Operation srvsvn1-drbd:1_monitor_0 found resource srvsvn1-drbd:1 active on xm02
Mar  1 13:34:55 xm02 pengine: [6298]: notice: unpack_rsc_op: 
Operation srvsvn2-drbd:1_monitor_0 found resource srvsvn2-drbd:1 active on xm02
Mar  1 13:34:55 xm02 pengine: [6298]: notice: unpack_rsc_op: 
Operation vmconfig:1_monitor_0 found resource vmconfig:1 active on xm02
Mar  1 13:34:55 xm02 pengine: [6298]: WARN: unpack_rsc_op: Processing 
failed op VMSVN_start_0 on xm02: unknown error (1)
Mar  1 13:34:55 xm02 crmd: [6299]: info: do_pe_invoke_callback: 
Invoking the PE: query=81, ref=pe_calc-dc-1330619695-21, seq=1020, quorate=0
Mar  1 13:34:55 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (30s) for VMSVN on xm02
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   ipmi-stonith-xm01     (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   ipmi-stonith-xm02     (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmconfig:0    (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmconfig:1    (Master xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmsvn-drbd:0  (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmsvn-drbd:1  (Master xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   srvsvn1-drbd:0        (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   srvsvn1-drbd:1        (Master xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   srvsvn2-drbd:0        (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   srvsvn2-drbd:1        (Master xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   dlm:0 (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   o2cb:0        (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   clvm:0        (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   dlm:1 (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   o2cb:1        (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   clvm:1        (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmconfig-pri:0        (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmconfig-pri:1        (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vg_svn:0      (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vg_svn:1      (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: Recover 
VMSVN (Started xm02)
Mar  1 13:34:55 xm02 crmd: [6299]: info: handle_response: pe_calc 
calculation pe_calc-dc-1330619695-20 is obsolete
Mar  1 13:34:55 xm02 pengine: [6298]: notice: process_pe_message: 
Transition 1: PEngine Input stored in: /var/lib/pengine/pe-input-2288.bz2
Mar  1 13:34:55 xm02 pengine: [6298]: notice: unpack_config: On loss 
of CCM Quorum: Ignore
Mar  1 13:34:55 xm02 pengine: [6298]: notice: unpack_rsc_op: 
Operation vmsvn-drbd:1_monitor_0 found resource vmsvn-drbd:1 active on xm02
Mar  1 13:34:55 xm02 pengine: [6298]: notice: unpack_rsc_op: 
Operation srvsvn1-drbd:1_monitor_0 found resource srvsvn1-drbd:1 active on xm02
Mar  1 13:34:55 xm02 pengine: [6298]: notice: unpack_rsc_op: 
Operation srvsvn2-drbd:1_monitor_0 found resource srvsvn2-drbd:1 active on xm02
Mar  1 13:34:55 xm02 pengine: [6298]: notice: unpack_rsc_op: 
Operation vmconfig:1_monitor_0 found resource vmconfig:1 active on xm02
Mar  1 13:34:55 xm02 pengine: [6298]: WARN: unpack_rsc_op: Processing 
failed op VMSVN_start_0 on xm02: unknown error (1)
Mar  1 13:34:55 xm02 pengine: [6298]: WARN: common_apply_stickiness: 
Forcing VMSVN away from xm02 after 1000000 failures (max=1000000)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   ipmi-stonith-xm01     (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   ipmi-stonith-xm02     (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmconfig:0    (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmconfig:1    (Master xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmsvn-drbd:0  (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmsvn-drbd:1  (Master xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   srvsvn1-drbd:0        (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   srvsvn1-drbd:1        (Master xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   srvsvn2-drbd:0        (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   srvsvn2-drbd:1        (Master xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   dlm:0 (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   o2cb:0        (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   clvm:0        (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   dlm:1 (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   o2cb:1        (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   clvm:1        (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmconfig-pri:0        (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmconfig-pri:1        (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vg_svn:0      (Stopped)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: 
Leave   vg_svn:1      (Started xm02)
Mar  1 13:34:55 xm02 pengine: [6298]: notice: LogActions: Stop    VMSVN (xm02)
Mar  1 13:34:55 xm02 crmd: [6299]: info: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ 
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Mar  1 13:34:55 xm02 crmd: [6299]: info: unpack_graph: Unpacked 
transition 2: 2 actions in 2 synapses
Mar  1 13:34:55 xm02 crmd: [6299]: info: do_te_invoke: Processing 
graph 2 (ref=pe_calc-dc-1330619695-21) derived from 
/var/lib/pengine/pe-input-2289.bz2
Mar  1 13:34:55 xm02 crmd: [6299]: info: te_rsc_command: Initiating 
action 5: stop VMSVN_stop_0 on xm02 (local)
Mar  1 13:34:55 xm02 crmd: [6299]: info: do_lrm_rsc_op: Performing 
key=5:2:0:8b7a050b-901b-4db7-b1f7-c3c5dd8a9653 op=VMSVN_stop_0 )
Mar  1 13:34:55 xm02 lrmd: [6296]: info: perform_op:2932: operation 
start[45] with pid 8644 on VMSVN for client 6299, its parameters: 
CRM_meta_name=[start] crm_feature_set=[3.0.5] xmfile=[/etc/xen/v
m/vmsvn] CRM_meta_timeout=[60000]  for rsc is already running.
Mar  1 13:34:55 xm02 lrmd: [6296]: info: perform_op:2942: postponing 
all ops on resource VMSVN by 1000 ms
Mar  1 13:34:55 xm02 pengine: [6298]: notice: process_pe_message: 
Transition 2: PEngine Input stored in: /var/lib/pengine/pe-input-2289.bz2
Mar  1 13:34:56 xm02 mgmtd: [6300]: info: CIB query: cib
Mar  1 13:34:56 xm02 lrmd: [6296]: info: perform_op:2932: operation 
start[45] with pid 8644 on VMSVN for client 6299, its parameters: 
CRM_meta_name=[start] crm_feature_set=[3.0.5] xmfile=[/etc/xen/v
m/vmsvn] CRM_meta_timeout=[60000]  for rsc is already running.

I get this several times, until the following appears:
Mar  1 13:36:16 xm02 lrmd: [6296]: info: perform_op:2942: postponing 
all ops on resource VMSVN by 1000 ms
Mar  1 13:36:17 xm02 kernel: [  414.513459] block drbd0: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:36:17 xm02 kernel: [  414.513468] block drbd1: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:36:17 xm02 kernel: [  414.513708] block drbd1: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:36:17 xm02 kernel: [  414.513726] block drbd1: conn( 
WFConnection -> WFReportParams )
Mar  1 13:36:17 xm02 kernel: [  414.513775] block drbd0: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:36:17 xm02 kernel: [  414.513780] block drbd0: conn( 
WFConnection -> WFReportParams )
Mar  1 13:36:17 xm02 kernel: [  414.513797] block drbd0: Starting 
asender thread (from drbd0_receiver [5689])
Mar  1 13:36:17 xm02 kernel: [  414.513822] block drbd1: Starting 
asender thread (from drbd1_receiver [5691])
Mar  1 13:36:17 xm02 kernel: [  414.513965] block drbd1: 
data-integrity-alg: <not-used>
Mar  1 13:36:17 xm02 kernel: [  414.513984] block drbd1: drbd_sync_handshake:
Mar  1 13:36:17 xm02 kernel: [  414.513988] block drbd1: self 
E6E23470FD3656AD:0000000000000000:65C464E576893481:65C364E576893481 
bits:0 flags:0
Mar  1 13:36:17 xm02 kernel: [  414.513992] block drbd1: peer 
E6E23470FD3656AC:0000000000000000:65C464E576893480:65C364E576893481 
bits:30720 flags:2
Mar  1 13:36:17 xm02 kernel: [  414.513995] block drbd1: 
uuid_compare()=-1 by rule 40
Mar  1 13:36:17 xm02 kernel: [  414.513997] block drbd1: I shall 
become SyncTarget, but I am primary!
Mar  1 13:36:17 xm02 kernel: [  414.514001] block drbd1: conn( 
WFReportParams -> Disconnecting )
Mar  1 13:36:17 xm02 kernel: [  414.514008] block drbd1: error 
receiving ReportState, l: 4!
Mar  1 13:36:17 xm02 kernel: [  414.514039] block drbd1: asender terminated
Mar  1 13:36:17 xm02 kernel: [  414.514045] block drbd1: Terminating 
asender thread
Mar  1 13:36:17 xm02 kernel: [  414.514051] block drbd0: 
data-integrity-alg: <not-used>
Mar  1 13:36:17 xm02 kernel: [  414.514090] block drbd0: drbd_sync_handshake:
Mar  1 13:36:17 xm02 kernel: [  414.514095] block drbd0: self 
EEDF542BD48564B5:0000000000000000:AF298F27A3172093:AF288F27A3172093 
bits:0 flags:0
Mar  1 13:36:17 xm02 kernel: [  414.514099] block drbd0: peer 
EEDF542BD48564B4:0000000000000000:AF298F27A3172092:AF288F27A3172093 
bits:57344 flags:2
Mar  1 13:36:17 xm02 kernel: [  414.514103] block drbd0: 
uuid_compare()=-1 by rule 40
Mar  1 13:36:17 xm02 kernel: [  414.514105] block drbd0: I shall 
become SyncTarget, but I am primary!
Mar  1 13:36:17 xm02 kernel: [  414.514109] block drbd0: conn( 
WFReportParams -> Disconnecting )
Mar  1 13:36:17 xm02 kernel: [  414.514117] block drbd0: error 
receiving ReportState, l: 4!
Mar  1 13:36:17 xm02 kernel: [  414.514158] block drbd0: asender terminated
Mar  1 13:36:17 xm02 kernel: [  414.514164] block drbd0: Terminating 
asender thread
Mar  1 13:36:17 xm02 kernel: [  414.514253] block drbd0: Connection closed
Mar  1 13:36:17 xm02 kernel: [  414.514285] block drbd1: Connection closed
Mar  1 13:36:17 xm02 kernel: [  414.514320] block drbd0: helper 
command: /sbin/drbdadm fence-peer minor-0
Mar  1 13:36:17 xm02 lrmd: [6296]: info: perform_op:2932: operation 
start[45] with pid 8644 on VMSVN for client 6299, its parameters: 
CRM_meta_name=[start] crm_feature_set=[3.0.5] xmfile=[/etc/xen/v
m/vmsvn] CRM_meta_timeout=[60000]  for rsc is already running.
Mar  1 13:36:17 xm02 lrmd: [6296]: info: perform_op:2942: postponing 
all ops on resource VMSVN by 1000 ms
Mar  1 13:36:17 xm02 kernel: [  414.514327] block drbd0: conn( 
Disconnecting -> StandAlone )
Mar  1 13:36:17 xm02 kernel: [  414.514347] block drbd1: conn( 
Disconnecting -> StandAlone )
Mar  1 13:36:17 xm02 kernel: [  414.514350] block drbd1: helper 
command: /sbin/drbdadm fence-peer minor-1
Mar  1 13:36:17 xm02 kernel: [  414.514433] block drbd0: receiver terminated
Mar  1 13:36:17 xm02 kernel: [  414.514437] block drbd0: Terminating 
receiver thread
Mar  1 13:36:17 xm02 kernel: [  414.514473] block drbd1: receiver terminated
Mar  1 13:36:17 xm02 kernel: [  414.514475] block drbd1: Terminating 
receiver thread
Mar  1 13:36:17 xm02 kernel: [  414.517576] block drbd2: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:36:17 xm02 kernel: [  414.517651] block drbd3: Handshake 
successful: Agreed network protocol version 96
Mar  1 13:36:17 xm02 kernel: [  414.517944] block drbd3: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:36:17 xm02 kernel: [  414.517956] block drbd3: conn( 
WFConnection -> WFReportParams )
Mar  1 13:36:17 xm02 kernel: [  414.517986] block drbd3: Starting 
asender thread (from drbd3_receiver [5703])
Mar  1 13:36:17 xm02 kernel: [  414.518045] block drbd2: Peer 
authenticated using 20 bytes of 'sha1' HMAC
Mar  1 13:36:17 xm02 kernel: [  414.518054] block drbd2: conn( 
WFConnection -> WFReportParams )
Mar  1 13:36:17 xm02 kernel: [  414.518073] block drbd2: Starting 
asender thread (from drbd2_receiver [5699])
Mar  1 13:36:17 xm02 kernel: [  414.518164] block drbd3: 
data-integrity-alg: <not-used>
Mar  1 13:36:17 xm02 kernel: [  414.518204] block drbd3: drbd_sync_handshake:
Mar  1 13:36:17 xm02 kernel: [  414.518210] block drbd3: self 
75C7AE841CB0682F:0000000000000000:99ABCCCBF1E4D001:99AACCCBF1E4D001 
bits:0 flags:0
Mar  1 13:36:17 xm02 kernel: [  414.518214] block drbd3: peer 
75C7AE841CB0682E:0000000000000000:99ABCCCBF1E4D000:99AACCCBF1E4D001 
bits:5120 flags:2
Mar  1 13:36:17 xm02 kernel: [  414.518218] block drbd3: 
uuid_compare()=-1 by rule 40
Mar  1 13:36:17 xm02 kernel: [  414.518220] block drbd3: I shall 
become SyncTarget, but I am primary!
Mar  1 13:36:17 xm02 kernel: [  414.518233] block drbd3: conn( 
WFReportParams -> Disconnecting )
Mar  1 13:36:17 xm02 kernel: [  414.518243] block drbd3: error 
receiving ReportState, l: 4!
Mar  1 13:36:17 xm02 kernel: [  414.518255] block drbd3: asender terminated
Mar  1 13:36:17 xm02 kernel: [  414.518258] block drbd3: Terminating 
asender thread
Mar  1 13:36:17 xm02 kernel: [  414.518333] block drbd3: Connection closed
Mar  1 13:36:17 xm02 kernel: [  414.518414] block drbd3: helper 
command: /sbin/drbdadm fence-peer minor-3
Mar  1 13:36:17 xm02 kernel: [  414.518417] block drbd3: conn( 
Disconnecting -> StandAlone )
Mar  1 13:36:17 xm02 kernel: [  414.518455] block drbd3: receiver terminated
Mar  1 13:36:17 xm02 kernel: [  414.518460] block drbd3: Terminating 
receiver thread
Mar  1 13:36:17 xm02 kernel: [  414.518551] block drbd2: 
data-integrity-alg: <not-used>
Mar  1 13:36:17 xm02 kernel: [  414.518572] block drbd2: drbd_sync_handshake:
Mar  1 13:36:17 xm02 kernel: [  414.518576] block drbd2: self 
324E9CEEF0227FAD:0000000000000000:F91D77DB4FF3672B:F91C77DB4FF3672B 
bits:0 flags:0
Mar  1 13:36:17 xm02 kernel: [  414.518580] block drbd2: peer 
324E9CEEF0227FAC:0000000000000000:F91D77DB4FF3672A:F91C77DB4FF3672B 
bits:12288 flags:2
Mar  1 13:36:17 xm02 kernel: [  414.518584] block drbd2: 
uuid_compare()=-1 by rule 40
Mar  1 13:36:17 xm02 kernel: [  414.518587] block drbd2: I shall 
become SyncTarget, but I am primary!
Mar  1 13:36:17 xm02 kernel: [  414.518592] block drbd2: conn( 
WFReportParams -> Disconnecting )
Mar  1 13:36:17 xm02 kernel: [  414.518598] block drbd2: error 
receiving ReportState, l: 4!
Mar  1 13:36:17 xm02 kernel: [  414.518616] block drbd2: asender terminated
Mar  1 13:36:17 xm02 kernel: [  414.518626] block drbd2: Terminating 
asender thread
Mar  1 13:36:17 xm02 kernel: [  414.518770] block drbd2: Connection closed
Mar  1 13:36:17 xm02 kernel: [  414.518839] block drbd2: conn( 
Disconnecting -> StandAlone )
Mar  1 13:36:17 xm02 kernel: [  414.518836] block drbd2: helper 
command: /sbin/drbdadm fence-peer minor-2
Mar  1 13:36:17 xm02 kernel: [  414.518869] block drbd2: receiver terminated
Mar  1 13:36:17 xm02 kernel: [  414.518871] block drbd2: Terminating 
receiver thread
Mar  1 13:36:17 xm02 kernel: [  414.522450] block drbd0: helper 
command: /sbin/drbdadm fence-peer minor-0 exit code 126 (0x7e00)
Mar  1 13:36:17 xm02 kernel: [  414.522454] block drbd0: fence-peer 
helper broken, returned 126
Mar  1 13:36:17 xm02 kernel: [  414.522902] block drbd1: helper 
command: /sbin/drbdadm fence-peer minor-1 exit code 126 (0x7e00)
Mar  1 13:36:17 xm02 kernel: [  414.522905] block drbd1: fence-peer 
helper broken, returned 126
Mar  1 13:36:17 xm02 kernel: [  414.526993] block drbd2: helper 
command: /sbin/drbdadm fence-peer minor-2 exit code 126 (0x7e00)
Mar  1 13:36:17 xm02 kernel: [  414.526996] block drbd2: fence-peer 
helper broken, returned 126
Mar  1 13:36:17 xm02 kernel: [  414.527230] block drbd3: helper 
command: /sbin/drbdadm fence-peer minor-3 exit code 126 (0x7e00)
Mar  1 13:36:17 xm02 kernel: [  414.527233] block drbd3: fence-peer 
helper broken, returned 126
Mar  1 13:36:18 xm02 lrmd: [6296]: info: perform_op:2932: operation 
start[45] with pid 8644 on VMSVN for client 6299, its parameters: 
CRM_meta_name=[start] crm_feature_set=[3.0.5] xmfile=[/etc/xen/vm/vmsvn] 
CRM_meta_timeout=[60000]  for rsc is already running.
Mar  1 13:36:18 xm02 lrmd: [6296]: info: perform_op:2942: postponing 
all ops on resource VMSVN by 1000 ms
Mar  1 13:36:19 xm02 lrmd: [6296]: info: perform_op:2932: operation 
start[45] with pid 8644 on VMSVN for client 6299, its parameters: 
CRM_meta_name=[start] crm_feature_set=[3.0.5] xmfile=[/etc/xen/vm/vmsvn] 
CRM_meta_timeout=[60000]  for rsc is already running.
Mar  1 13:36:19 xm02 lrmd: [6296]: info: perform_op:2942: postponing 
all ops on resource VMSVN by 1000 ms
Mar  1 13:36:20 xm02 lrmd: [6296]: info: perform_op:2932: operation 
start[45] with pid 8644 on VMSVN for client 6299, its parameters: 
CRM_meta_name=[start] crm_feature_set=[3.0.5] xmfile=[/etc/xen/vm/vmsvn] 
CRM_meta_timeout=[60000]  for rsc is already running.
Mar  1 13:36:20 xm02 lrmd: [6296]: info: perform_op:2942: postponing 
all ops on resource VMSVN by 1000 ms
Mar  1 13:36:21 xm02 lrmd: [6296]: info: perform_op:2932: operation 
start[45] with pid 8644 on VMSVN for client 6299, its parameters: 
CRM_meta_name=[start] crm_feature_set=[3.0.5] xmfile=[/etc/xen/vm/vmsvn] 
CRM_meta_timeout=[60000]  for rsc is already running.
Mar  1 13:36:21 xm02 lrmd: [6296]: info: perform_op:2942: postponing 
all ops on resource VMSVN by 1000 ms
Mar  1 13:36:22 xm02 lrmd: [6296]: info: perform_op:2932: operation 
start[45] with pid 8644 on VMSVN for client 6299, its parameters: 
CRM_meta_name=[start] crm_feature_set=[3.0.5] xmfile=[/etc/xen/vm/vmsvn] 
CRM_meta_timeout=[60000]  for rsc is already running.
Mar  1 13:36:22 xm02 lrmd: [6296]: info: perform_op:2942: postponing 
all ops on resource VMSVN by 1000 ms
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ] CLM CONFIGURATION CHANGE
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ] New Configuration:
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ]  r(0) ip(100.0.0.2)
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ] Members Left:
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ] Members Joined:
Mar  1 13:36:22 xm02 corosync[6228]:  [pcmk  ] notice: 
pcmk_peer_update: Transitional membership event on ring 1024: memb=1, 
new=0, lost=0
Mar  1 13:36:22 xm02 corosync[6228]:  [pcmk  ] info: 
pcmk_peer_update: memb: xm02 33554532
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ] CLM CONFIGURATION CHANGE
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ] New Configuration:
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ]  r(0) ip(100.0.0.1)
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ]  r(0) ip(100.0.0.2)
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ] Members Left:
Mar  1 13:36:22 xm02 ocfs2_controld: [7172]: notice: 
ais_dispatch_message: Membership 1024: quorum acquired
Mar  1 13:36:22 xm02 crmd: [6299]: notice: ais_dispatch_message: 
Membership 1024: quorum acquired
Mar  1 13:36:22 xm02 crmd: [6299]: notice: crmd_peer_update: Status 
update: Client xm01/crmd now has status [online] (DC=true)
Mar  1 13:36:22 xm02 cluster-dlm: [7099]: notice: 
ais_dispatch_message: Membership 1024: quorum acquired
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ] Members Joined:
Mar  1 13:36:22 xm02 ocfs2_controld: [7172]: info: crm_update_peer: 
Node xm01: id=16777316 state=member (new) addr=r(0) 
ip(100.0.0.1)  votes=1 born=1016 seen=1024 
proc=00000000000000000000000000151312
Mar  1 13:36:22 xm02 cib: [6295]: notice: ais_dispatch_message: 
Membership 1024: quorum acquired
Mar  1 13:36:22 xm02 cluster-dlm: [7099]: info: crm_update_peer: Node 
xm01: id=16777316 state=member (new) addr=r(0) ip(100.0.0.1)  votes=1 
born=1016 seen=1024 proc=00000000000000000000000000151312
Mar  1 13:36:22 xm02 crmd: [6299]: info: ais_status_callback: status: 
xm01 is now member (was lost)
Mar  1 13:36:22 xm02 corosync[6228]:  [CLM   ]  r(0) ip(100.0.0.1)
Mar  1 13:36:22 xm02 cib: [6295]: info: crm_update_peer: Node xm01: 
id=16777316 state=member (new) addr=r(0) ip(100.0.0.1)  votes=1 
born=1016 seen=1024 proc=00000000000000000000000000151312
Mar  1 13:36:22 xm02 cluster-dlm: update_cluster: Processing membership 1024
Mar  1 13:36:22 xm02 cib: [6295]: info: ais_dispatch_message: 
Membership 1024: quorum retained
Mar  1 13:36:22 xm02 crmd: [6299]: info: crm_update_peer: Node xm01: 
id=16777316 state=member (new) addr=r(0) ip(100.0.0.1)  votes=1 
born=1016 seen=1024 proc=00000000000000000000000000151312 (new)
Mar  1 13:36:22 xm02 corosync[6228]:  [pcmk  ] notice: 
pcmk_peer_update: Stable membership event on ring 1024: memb=2, new=1, lost=0
Mar  1 13:36:22 xm02 cluster-dlm: dlm_process_node: Adding address 
ip(100.0.0.1) to configfs for node 16777316
Mar  1 13:36:22 xm02 ocfs2_controld: [7172]: info: 
ais_dispatch_message: Membership 1024: quorum retained
Mar  1 13:36:22 xm02 corosync[6228]:  [pcmk  ] info: update_member: 
Node 16777316/xm01 is now: member
Mar  1 13:36:22 xm02 cluster-dlm: add_configfs_node: 
set_configfs_node 16777316 100.0.0.1 local 0
Mar  1 13:36:22 xm02 corosync[6228]:  [pcmk  ] info: 
pcmk_peer_update: NEW:  xm01 16777316
Mar  1 13:36:22 xm02 crmd: [6299]: info: crm_update_quorum: Updating 
quorum status to true (call=87)
Mar  1 13:36:22 xm02 cluster-dlm: dlm_process_node: Added active node 
16777316: born-on=1016, last-seen=1024, this-event=1024, last-event=1020
Mar  1 13:36:22 xm02 corosync[6228]:  [pcmk  ] info: 
pcmk_peer_update: MEMB: xm01 16777316
Mar  1 13:36:22 xm02 cluster-dlm: dlm_process_node: Skipped active 
node 33554532: born-on=1016, last-seen=1024, this-event=1024, last-event=1020
Mar  1 13:36:22 xm02 corosync[6228]:  [pcmk  ] info: 
pcmk_peer_update: MEMB: xm02 33554532
Mar  1 13:36:22 xm02 cluster-dlm: [7099]: info: ais_dispatch_message: 
Membership 1024: quorum retained
Mar  1 13:36:22 xm02 corosync[6228]:  [pcmk  ] info: 
send_member_notification: Sending membership update 1024 to 4 children
Mar  1 13:36:22 xm02 corosync[6228]:  [TOTEM ] A processor joined or 
left the membership and a new membership was formed.
Mar  1 13:36:22 xm02 corosync[6228]:  [pcmk  ] info: update_member: 
0x6acba0 Node 16777316 (xm01) born on: 1024
Mar  1 13:36:22 xm02 corosync[6228]:  [pcmk  ] info: 
send_member_notification: Sending membership update 1024 to 4 children
Mar  1 13:36:22 xm02 corosync[6228]:  [CPG   ] chosen downlist: 
sender r(0) ip(100.0.0.1) ; members(old:1 left:0)
Mar  1 13:36:22 xm02 corosync[6228]:  [MAIN  ] Completed service 
synchronization, ready to provide service.
Mar  1 13:36:22 xm02 cib: [6295]: info: cib_process_request: 
Operation complete: op cib_delete for section 
//node_state[@uname='xm01']/lrm (origin=local/crmd/83, 
version=0.2472.138): ok (rc=0)
Mar  1 13:36:22 xm02 cib: [6295]: info: cib_process_request: 
Operation complete: op cib_delete for section 
//node_state[@uname='xm01']/transient_attributes 
(origin=local/crmd/84, version=0.2472.139): ok (rc=0)
Mar  1 13:36:22 xm02 cib: [6295]: info: cib_process_request: 
Operation complete: op cib_modify for section nodes 
(origin=local/crmd/85, version=0.2472.140): ok (rc=0)
Mar  1 13:36:22 xm02 cib: [6295]: info: cib_process_request: 
Operation complete: op cib_modify for section cib 
(origin=local/crmd/87, version=0.2472.142): ok (rc=0)
Mar  1 13:36:22 xm02 crmd: [6299]: info: crmd_ais_dispatch: Setting 
expected votes to 2
Mar  1 13:36:22 xm02 crmd: [6299]: info: do_state_transition: State 
transition S_TRANSITION_ENGINE -> S_INTEGRATION [ input=I_NODE_JOIN 
cause=C_FSA_INTERNAL origin=crmd_peer_update ]
Mar  1 13:36:22 xm02 crmd: [6299]: info: abort_transition_graph: 
do_te_invoke:175 - Triggered transition abort (complete=0) : Peer Halt
Mar  1 13:36:22 xm02 crmd: [6299]: info: update_abort_priority: Abort 
priority upgraded from 0 to 1000000
Mar  1 13:36:22 xm02 crmd: [6299]: info: update_abort_priority: Abort 
action done superceeded by stop
Mar  1 13:36:22 xm02 crmd: [6299]: WARN: match_down_event: No match 
for shutdown action on xm01
Mar  1 13:36:22 xm02 crmd: [6299]: info: te_update_diff: 
Stonith/shutdown of xm01 not matched
Mar  1 13:36:22 xm02 crmd: [6299]: info: abort_transition_graph: 
te_update_diff:193 - Triggered transition abort (complete=0, 
tag=node_state, id=xm01, magic=NA, cib=0.2472.137) : Node failure
Mar  1 13:36:22 xm02 crmd: [6299]: info: update_abort_priority: Abort 
action stop superceeded by restart
Mar  1 13:36:22 xm02 crmd: [6299]: info: erase_xpath_callback: 
Deletion of "//node_state[@uname='xm01']/lrm": ok (rc=0)
Mar  1 13:36:22 xm02 crmd: [6299]: info: erase_xpath_callback: 
Deletion of "//node_state[@uname='xm01']/transient_attributes": ok (rc=0)
Mar  1 13:36:22 xm02 cib: [6295]: info: cib_process_request: 
Operation complete: op cib_modify for section crm_config 
(origin=local/crmd/89, version=0.2472.143): ok (rc=0)
Mar  1 13:36:22 xm02 crmd: [6299]: info: ais_dispatch_message: 
Membership 1024: quorum retained
Mar  1 13:36:22 xm02 cib: [6295]: info: cib_process_request: 
Operation complete: op cib_modify for section nodes 
(origin=local/crmd/90, version=0.2472.144): ok (rc=0)
Mar  1 13:36:22 xm02 crmd: [6299]: info: crmd_ais_dispatch: Setting 
expected votes to 2
Mar  1 13:36:22 xm02 crmd: [6299]: info: abort_transition_graph: 
do_te_invoke:175 - Triggered transition abort (complete=0) : Peer Halt
Mar  1 13:36:22 xm02 cib: [6295]: info: cib_process_request: 
Operation complete: op cib_modify for section crm_config 
(origin=local/crmd/93, version=0.2472.146): ok (rc=0)
Mar  1 13:36:22 xm02 crmd: [6299]: info: abort_transition_graph: 
do_te_invoke:175 - Triggered transition abort (complete=0) : Peer Halt
Mar  1 13:36:22 xm02 crmd: [6299]: info: abort_transition_graph: 
do_te_invoke:175 - Triggered transition abort (complete=0) : Peer Halt
Mar  1 13:36:23 xm02 lrmd: [6296]: info: perform_op:2932: operation 
start[45] with pid 8644 on VMSVN for client 6299, its parameters: 
CRM_meta_name=[start] crm_feature_set=[3.0.5] xmfile=[/etc/xen/vm/vmsvn] 
CRM_meta_timeout=[60000]  for rsc is already running.

...and so on, with the same lrmd "already running" message repeating,
until 13:38:59, when xm02 goes down:

Mar  1 13:38:59 xm02 pengine: [6298]: WARN: unpack_rsc_op: Processing 
failed op VMSVN_stop_0 on xm02: unknown error (1)
Mar  1 13:38:59 xm02 pengine: [6298]: WARN: pe_fence_node: Node xm02 
will be fenced to recover from resource failure(s)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: 
common_apply_stickiness: ms_drbd_vmconfig can fail 9 more times on 
xm01 before being forced off
Mar  1 13:38:59 xm02 pengine: [6298]: notice: 
common_apply_stickiness: ms_drbd_vmconfig can fail 9 more times on 
xm01 before being forced off
Mar  1 13:38:59 xm02 pengine: [6298]: WARN: common_apply_stickiness: 
Forcing VMSVN away from xm02 after 1000000 failures (max=1000000)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for vmsvn-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for vmsvn-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for srvsvn1-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for srvsvn1-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for srvsvn2-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for srvsvn2-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (10s) for dlm:1 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (10s) for o2cb:1 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (10s) for clvm:1 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for vmconfig-pri:1 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (30s) for VMSVN on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: WARN: stage6: Scheduling Node 
xm02 for STONITH
Mar  1 13:38:59 xm02 pengine: [6298]: WARN: native_stop_constraints: 
Stop of failed resource VMSVN is implicit after xm02 is fenced
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Stop    ipmi-stonith-xm01     (xm02)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   ipmi-stonith-xm02     (Started xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Demote  vmconfig:0    (Master -> Slave xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: Recover 
vmconfig:0    (Master xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Demote  vmconfig:1    (Master -> Stopped xm02)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: Promote 
vmsvn-drbd:0  (Slave -> Master xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Demote  vmsvn-drbd:1  (Master -> Stopped xm02)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: Promote 
srvsvn1-drbd:0        (Slave -> Master xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Demote  srvsvn1-drbd:1        (Master -> Stopped xm02)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: Promote 
srvsvn2-drbd:0        (Slave -> Master xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Demote  srvsvn2-drbd:1        (Master -> Stopped xm02)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   dlm:0 (Stopped)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   o2cb:0        (Stopped)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   clvm:0        (Stopped)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    dlm:1 (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    o2cb:1        (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    clvm:1        (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmconfig-pri:0        (Stopped)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    vmconfig-pri:1        (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   vg_svn:0      (Stopped)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    vg_svn:1      (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    VMSVN (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 crmd: [6299]: info: handle_response: pe_calc 
calculation pe_calc-dc-1330619939-67 is obsolete

Mar  1 13:38:59 xm02 pengine: [6298]: WARN: common_apply_stickiness: 
Forcing VMSVN away from xm02 after 1000000 failures (max=1000000)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for vmsvn-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for vmsvn-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for srvsvn1-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for srvsvn1-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for srvsvn2-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for srvsvn2-drbd:0 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (10s) for dlm:1 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (10s) for o2cb:1 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (10s) for clvm:1 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (20s) for vmconfig-pri:1 on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: notice: RecurringOp:  Start 
recurring monitor (30s) for VMSVN on xm01
Mar  1 13:38:59 xm02 pengine: [6298]: WARN: stage6: Scheduling Node 
xm02 for STONITH
Mar  1 13:38:59 xm02 pengine: [6298]: WARN: native_stop_constraints: 
Stop of failed resource VMSVN is implicit after xm02 is fenced
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Stop    ipmi-stonith-xm01     (xm02)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   ipmi-stonith-xm02     (Started xm01)
Mar  1 13:38:59 xm02 mgmtd: [6300]: info: CIB query: cib
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Demote  vmconfig:0    (Master -> Slave xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: Recover 
vmconfig:0    (Master xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Demote  vmconfig:1    (Master -> Stopped xm02)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: Promote 
vmsvn-drbd:0  (Slave -> Master xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Demote  vmsvn-drbd:1  (Master -> Stopped xm02)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: Promote 
srvsvn1-drbd:0        (Slave -> Master xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Demote  srvsvn1-drbd:1        (Master -> Stopped xm02)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: Promote 
srvsvn2-drbd:0        (Slave -> Master xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Demote  srvsvn2-drbd:1        (Master -> Stopped xm02)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   dlm:0 (Stopped)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   o2cb:0        (Stopped)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   clvm:0        (Stopped)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    dlm:1 (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    o2cb:1        (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    clvm:1        (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   vmconfig-pri:0        (Stopped)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    vmconfig-pri:1        (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Leave   vg_svn:0      (Stopped)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    vg_svn:1      (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 pengine: [6298]: notice: LogActions: 
Move    VMSVN (Started xm02 -> xm01)
Mar  1 13:38:59 xm02 lrmd: [6296]: info: perform_op:2932: operation 
start[45] with pid 8644 on VMSVN for client 6299, its parameters: 
CRM_meta_name=[start] crm_feature_set=[3.0.5] xmfile=[/etc/xen/vm/vmsvn] 
CRM_meta_timeout=[60000]  for rsc is already running.
Mar  1 13:38:59 xm02 lrmd: [6296]: info: perform_op:2942: postponing 
all ops on resource VMSVN by 1000 ms
Mar  1 13:38:59 xm02 crmd: [6299]: info: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ 
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Mar  1 13:38:59 xm02 crmd: [6299]: info: unpack_graph: Unpacked 
transition 8: 128 actions in 128 synapses
Mar  1 13:38:59 xm02 crmd: [6299]: info: do_te_invoke: Processing 
graph 8 (ref=pe_calc-dc-1330619939-68) derived from 
/var/lib/pengine/pe-warn-311.bz2
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_pseudo_action: Pseudo 
action 15 fired and confirmed
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_pseudo_action: Pseudo 
action 42 fired and confirmed
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_pseudo_action: Pseudo 
action 73 fired and confirmed

Mar  1 13:38:59 xm02 crmd: [6299]: info: te_pseudo_action: Pseudo 
action 104 fired and confirmed
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_pseudo_action: Pseudo 
action 135 fired and confirmed
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_pseudo_action: Pseudo 
action 184 fired and confirmed
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_rsc_command: Initiating 
action 216: notify vmconfig:0_pre_notify_demote_0 on xm01
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_rsc_command: Initiating 
action 218: notify vmconfig:1_pre_notify_demote_0 on xm02 (local)
Mar  1 13:38:59 xm02 crmd: [6299]: info: do_lrm_rsc_op: Performing 
key=218:8:0:8b7a050b-901b-4db7-b1f7-c3c5dd8a9653 op=vmconfig:1_notify_0 )
Mar  1 13:38:59 xm02 lrmd: [6296]: info: rsc:vmconfig:1 notify[57] (pid 13343)
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_rsc_command: Initiating 
action 224: notify vmsvn-drbd:0_pre_notify_demote_0 on xm01
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_rsc_command: Initiating 
action 226: notify vmsvn-drbd:1_pre_notify_demote_0 on xm02 (local)
Mar  1 13:38:59 xm02 crmd: [6299]: info: do_lrm_rsc_op: Performing 
key=226:8:0:8b7a050b-901b-4db7-b1f7-c3c5dd8a9653 op=vmsvn-drbd:1_notify_0 )
Mar  1 13:38:59 xm02 lrmd: [6296]: info: rsc:vmsvn-drbd:1 notify[58] 
(pid 13344)
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_rsc_command: Initiating 
action 232: notify srvsvn1-drbd:0_pre_notify_demote_0 on xm01
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_rsc_command: Initiating 
action 234: notify srvsvn1-drbd:1_pre_notify_demote_0 on xm02 (local)
Mar  1 13:38:59 xm02 crmd: [6299]: info: do_lrm_rsc_op: Performing 
key=234:8:0:8b7a050b-901b-4db7-b1f7-c3c5dd8a9653 op=srvsvn1-drbd:1_notify_0 )
Mar  1 13:38:59 xm02 lrmd: [6296]: info: rsc:srvsvn1-drbd:1 
notify[59] (pid 13345)
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_rsc_command: Initiating 
action 240: notify srvsvn2-drbd:0_pre_notify_demote_0 on xm01
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_rsc_command: Initiating 
action 242: notify srvsvn2-drbd:1_pre_notify_demote_0 on xm02 (local)
Mar  1 13:38:59 xm02 crmd: [6299]: info: do_lrm_rsc_op: Performing 
key=242:8:0:8b7a050b-901b-4db7-b1f7-c3c5dd8a9653 op=srvsvn2-drbd:1_notify_0 )
Mar  1 13:38:59 xm02 crmd: [6299]: info: te_fence_node: Executing 
reboot fencing operation (186) on xm02 (timeout=60000)
Mar  1 13:38:59 xm02 stonith-ng: [6294]: info: 
initiate_remote_stonith_op: Initiating remote operation reboot for 
xm02: c1be22cc-e535-441c-a674-89551a2b9d4c
Mar  1 13:38:59 xm02 stonith-ng: [6294]: info: stonith_queryQuery 
<stonith_command t="stonith-ng" 
st_async_id="c1be22cc-e535-441c-a674-89551a2b9d4c" st_op="st_query" 
st_callid="0" st_callopt="0" 
st_remote_op="c1be22cc-e535-441c-a674-89551a2b9d4c" st_target="xm02" 
st_device_action="reboot" 
st_clientid="bb653c7a-6351-4517-ad06-6fb0e20fe375" st_timeout="6000" 
src="xm02" seq="5" />
Mar  1 13:38:59 xm02 pengine: [6298]: WARN: process_pe_message: 
Transition 8: WARNINGs found during PE processing. PEngine Input 
stored in: /var/lib/pengine/pe-warn-311.bz2
Mar  1 13:38:59 xm02 pengine: [6298]: notice: process_pe_message: 
Configuration WARNINGs found during PE processing.  Please run 
"crm_verify -L" to identify issues.
Mar  1 13:38:59 xm02 lrmd: [6296]: info: operation notify[58] on 
vmsvn-drbd:1 for client 6299: pid 13344 exited with return code 0
Mar  1 13:38:59 xm02 stonith-ng: [6294]: info: 
can_fence_host_with_device: Refreshing port list for ipmi-stonith-xm01
Mar  1 13:38:59 xm02 stonith-ng: [6294]: WARN: parse_host_line: Could 
not parse (0 0):
Mar  1 13:38:59 xm02 stonith-ng: [6294]: info: 
can_fence_host_with_device: ipmi-stonith-xm01 can not fence xm02: dynamic-list
Mar  1 13:38:59 xm02 stonith-ng: [6294]: info: stonith_query: Found 0 
matching devices for 'xm02'
Mar  1 13:38:59 xm02 stonith-ng: [6294]: info: stonith_command: 
Processed st_query from xm02: rc=0
Mar  1 13:38:59 xm02 crmd: [6299]: info: process_lrm_event: LRM 
operation vmsvn-drbd:1_notify_0 (call=58, rc=0, cib-update=130, 
confirmed=true) ok
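
The pengine output above is hard to read in wrapped form, and the plan is printed twice. As a reading aid, here is a minimal Python sketch that collapses the wrapped lines and groups the "LogActions" entries by action. The function name planned_actions and the regular expression are illustrative assumptions, not part of Pacemaker; the pattern is based only on the line format visible in this log.

```python
import re

# Matches entries such as:
#   LogActions: Demote  vmconfig:0    (Master -> Slave xm01)
# after whitespace has been collapsed to single spaces.
LOG_ACTION = re.compile(r"LogActions: (\w+) (\S+) \(([^)]*)\)")


def planned_actions(log_text):
    """Group pengine LogActions entries by action, dropping exact repeats."""
    # The mail client wrapped the syslog lines, so flatten whitespace first.
    flat = " ".join(log_text.split())
    plan = {}
    for action, resource, detail in LOG_ACTION.findall(flat):
        entries = plan.setdefault(action, [])
        if (resource, detail) not in entries:
            entries.append((resource, detail))
    return plan


def print_plan(log_text):
    """Print one block per action, in order of first appearance."""
    for action, entries in planned_actions(log_text).items():
        print(action)
        for resource, detail in entries:
            print(f"  {resource}: {detail}")
```

Feeding the 13:38:59 excerpt to print_plan should list every resource that the policy engine on xm02 wanted to move to xm01 or stop before fencing xm02.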

After the storm, both nodes came back online, DRBD is Master/Master, and 
VMSVN is also online. However, the cloned init-group in Pacemaker (dlm, 
o2cb, clvm) is not running on xm01.
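
One detail in the 13:36:17 kernel lines deserves a closer look: every fence-peer helper call returned 126 and DRBD logged "fence-peer helper broken". By shell convention, 126 usually means the command was found but could not be executed, and 127 means it was not found, so the handler script's permissions and interpreter line are probably worth checking. The sketch below pulls those results out of a wrapped log; find_helper_exits and explain are hypothetical names, and the hint texts are a reading of the shell convention rather than anything DRBD itself reports.

```python
import re

# Matches the kernel line DRBD logs when a handler returns, for example:
#   block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 126 (0x7e00)
HELPER_EXIT = re.compile(
    r"block drbd(\d+): helper command: \S+ fence-peer minor-\d+ "
    r"exit code (\d+) \(0x([0-9a-fA-F]+)\)"
)


def find_helper_exits(log_text):
    """Return (minor, exit_code, raw_status) for each fence-peer helper result."""
    # Long syslog lines were wrapped in the mail, so collapse whitespace first.
    flat = " ".join(log_text.split())
    return [
        (int(m.group(1)), int(m.group(2)), int(m.group(3), 16))
        for m in HELPER_EXIT.finditer(flat)
    ]


def explain(exit_code):
    """Give a hint for shell-convention codes; defer to the DRBD docs otherwise."""
    hints = {
        126: "handler found but not executable (check permissions and interpreter line)",
        127: "handler not found (check the path in the handlers section)",
    }
    return hints.get(exit_code, "see the fence-peer exit codes in the drbd.conf man page")


def report(log_text):
    """Print one line per helper result found in the log text."""
    for minor, code, raw in find_helper_exits(log_text):
        print(f"drbd{minor}: exit {code} (status 0x{raw:04x}): {explain(code)}")
```

In this log the hexadecimal value in parentheses is the exit code shifted left by eight bits (0x7e00 >> 8 is 126), so either field can be used to cross-check the other.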

Any feedback?

Thanks!
Daniel




