Using latest drbdmanage, I have a 3-node setup, but do auto-deploy with redundancy 2. I simply create a diskless assignment when I live migrate a VM to a node without assignment.

# drbdmanage a
+----------------------------------------------------------------------------------------------+
| Node | Resource | Vol ID | | State |
+----------------------------------------------------------------------------------------------+
| hatest1 | vm-100-disk-1 | * | | ok |
| hatest2 | vm-100-disk-1 | * | | ok |
+------------------------------------------

hatest1# drbdsetup show
...
resource vm-100-disk-1 {
    _this_host {
        node-id 0;
        volume 0 {
            device minor 10;
            disk "/dev/drbdpool/vm-100-disk-1_00";
            meta-disk internal;
            disk {
                size 2097152s; # bytes
            }
        }
    }
    connection {
        _peer_node_id 1;
        _this_host ipv4 192.168.3.201:7700;
        _remote_host ipv4 192.168.3.202:7700;
        net {
            allow-two-primaries yes;
            cram-hmac-alg "sha1";
            shared-secret "z2m7pQ+YULNJF4RlmXA0";
            _name "hatest2";
        }
    }
}

When I migrate the VM to node hatest3 I do:

($rc, $res) = $hdl->assign($nodename, $volname, { diskless => 'true' });

and wait until the new assignment is ready ("cstate:deploy" => "true").
But migration fails, and I get the following log on node hatest3:

[79781.422112] drbd .drbdctrl: Preparing cluster-wide state change 2357679533 (2->-1 3/1)
[79781.422356] drbd .drbdctrl: State change 2357679533: primary_nodes=4, weak_nodes=FFFFFFFFFFFFFFF8
[79781.422358] drbd .drbdctrl: Committing cluster-wide state change 2357679533 (0ms)
[79781.422367] drbd .drbdctrl: role( Secondary -> Primary )
[79781.444068] drbd vm-100-disk-1: Starting worker thread (from drbdsetup [5414])
[79781.446251] drbd vm-100-disk-1 hatest1: Starting sender thread (from drbdsetup [5418])
[79781.447353] drbd vm-100-disk-1 hatest2: Starting sender thread (from drbdsetup [5422])
[79781.448354] drbd vm-100-disk-1 hatest1: conn( StandAlone -> Unconnected )
[79781.448388] drbd vm-100-disk-1 hatest1: Starting receiver thread (from drbd_w_vm-100-d [5415])
[79781.448451] drbd vm-100-disk-1 hatest1: conn( Unconnected -> Connecting )
[79781.448928] drbd vm-100-disk-1 hatest2: conn( StandAlone -> Unconnected )
[79781.448989] drbd vm-100-disk-1 hatest2: Starting receiver thread (from drbd_w_vm-100-d [5415])
[79781.449050] drbd vm-100-disk-1 hatest2: conn( Unconnected -> Connecting )
[79781.500491] drbd .drbdctrl: role( Primary -> Secondary )
[79781.507598] drbd .drbdctrl hatest2: Preparing remote state change 488413481 (primary_nodes=0, weak_nodes=0)
[79781.507608] drbd .drbdctrl: State change failed: Peer may not become primary while device is opened read-only
[79781.509894] drbd .drbdctrl hatest2: Failed: peer( Secondary -> Primary )
[79781.509982] drbd .drbdctrl hatest2: Aborting remote state change 488413481
[79781.509993] drbd .drbdctrl hatest1: Preparing remote state change 1382890737 (primary_nodes=0, weak_nodes=0)
[79781.510004] drbd .drbdctrl: State change failed: Peer may not become primary while device is opened read-only
[79781.510601] drbd .drbdctrl hatest1: Failed: peer( Secondary -> Primary )
[79781.510749] drbd .drbdctrl hatest1: Aborting remote state change 1382890737
[79781.510771] drbd .drbdctrl hatest1: Preparing remote state change 2519845707 (primary_nodes=0, weak_nodes=0)
[79781.510778] drbd .drbdctrl: State change failed: Peer may not become primary while device is opened read-only
[79781.511348] drbd .drbdctrl hatest1: Failed: peer( Secondary -> Primary )
[79781.511468] drbd .drbdctrl hatest1: Aborting remote state change 2519845707
[79781.511484] drbd .drbdctrl hatest1: Preparing remote state change 3331576747 (primary_nodes=0, weak_nodes=0)
[79781.511497] drbd .drbdctrl: State change failed: Peer may not become primary while device is opened read-only
[79781.512094] drbd .drbdctrl hatest1: Failed: peer( Secondary -> Primary )
[79781.512343] drbd .drbdctrl hatest1: Aborting remote state change 3331576747
[79781.512355] drbd .drbdctrl hatest1: Preparing remote state change 876759415 (primary_nodes=0, weak_nodes=0)
[79781.512363] drbd .drbdctrl: State change failed: Peer may not become primary while device is opened read-only
[79781.513030] drbd .drbdctrl hatest1: Failed: peer( Secondary -> Primary )
[79781.513058] drbd .drbdctrl hatest2: Rejecting concurrent remote state change 55884480 because of state change 876759415
[79781.513066] drbd .drbdctrl hatest2: Ignoring P_TWOPC_ABORT packet 55884480.
[79781.513171] drbd .drbdctrl hatest1: Aborting remote state change 876759415
[79781.513218] drbd .drbdctrl hatest2: Preparing remote state change 998847790 (primary_nodes=0, weak_nodes=0)
[79781.513225] drbd .drbdctrl: State change failed: Peer may not become primary while device is opened read-only
[79781.513745] drbd .drbdctrl hatest2: Failed: peer( Secondary -> Primary )
[79781.513832] drbd .drbdctrl hatest2: Aborting remote state change 998847790
[79781.513843] drbd .drbdctrl hatest2: Preparing remote state change 2066673987 (primary_nodes=0, weak_nodes=0)
[79781.513849] drbd .drbdctrl: State change failed: Peer may not become primary while device is opened read-only
[79781.514353] drbd .drbdctrl hatest2: Failed: peer( Secondary -> Primary )
[79781.514484] drbd .drbdctrl hatest2: Aborting remote state change 2066673987
[79781.711795] drbd .drbdctrl hatest1: Preparing remote state change 3548726750 (primary_nodes=0, weak_nodes=0)
[79781.723310] drbd .drbdctrl hatest1: Committing remote state change 3548726750
[79781.723321] drbd .drbdctrl hatest1: peer( Secondary -> Primary )
[79781.723456] drbd .drbdctrl hatest2: Preparing remote state change 4174678771 (primary_nodes=0, weak_nodes=0)
[79781.724908] drbd .drbdctrl hatest2: Aborting remote state change 4174678771
[79781.757653] drbd .drbdctrl hatest1: peer( Primary -> Secondary )
[79781.758298] drbd .drbdctrl: Preparing cluster-wide state change 1613663172 (2->-1 3/1)
[79781.758763] drbd .drbdctrl: Aborting cluster-wide state change 1613663172 (0ms) rv = -10
[79781.758791] drbd .drbdctrl: Preparing cluster-wide state change 4196524879 (2->-1 3/1)
[79781.759225] drbd .drbdctrl: Aborting cluster-wide state change 4196524879 (0ms) rv = -10
[79781.759247] drbd .drbdctrl: Preparing cluster-wide state change 236135030 (2->-1 3/1)
[79781.759699] drbd .drbdctrl: Aborting cluster-wide state change 236135030 (0ms) rv = -10
[79781.759731] drbd .drbdctrl: Preparing cluster-wide state change 891650542 (2->-1 3/1)
[79781.760324] drbd .drbdctrl: Aborting cluster-wide state change 891650542 (4ms) rv = -10
[79781.960098] drbd .drbdctrl: Preparing cluster-wide state change 3806129081 (2->-1 3/1)
[79781.960396] drbd .drbdctrl: State change 3806129081: primary_nodes=4, weak_nodes=FFFFFFFFFFFFFFF8
[79781.960398] drbd .drbdctrl: Committing cluster-wide state change 3806129081 (0ms)
[79781.960406] drbd .drbdctrl: role( Secondary -> Primary )
[79781.967255] drbd .drbdctrl: role( Primary -> Secondary )
[79782.224808] drbd .drbdctrl hatest2: Preparing remote state change 1359815849 (primary_nodes=0, weak_nodes=0)
[79782.233123] drbd .drbdctrl hatest2: Committing remote state change 1359815849
[79782.233140] drbd .drbdctrl hatest2: peer( Secondary -> Primary )
[79782.240031] drbd vm-100-disk-1 hatest1: Handshake successful: Agreed network protocol version 110
[79782.240033] drbd vm-100-disk-1 hatest1: Agreed to support TRIM on protocol level
[79782.240108] drbd vm-100-disk-1 hatest1: Peer authenticated using 20 bytes HMAC
[79782.240116] drbd vm-100-disk-1 hatest1: Starting ack_recv thread (from drbd_r_vm-100-d [5426])
[79782.240202] drbd vm-100-disk-1 hatest1: incompatible allow-two-primaries settings
[79782.240700] drbd vm-100-disk-1 hatest1: conn( Connecting -> Disconnecting )
[79782.240711] drbd vm-100-disk-1 hatest1: error receiving P_PROTOCOL, e: -5 l: 1!
[79782.241201] drbd vm-100-disk-1 hatest1: ack_receiver terminated
[79782.241202] drbd vm-100-disk-1 hatest1: Terminating ack_recv thread
[79782.256166] drbd vm-100-disk-1 hatest1: Connection closed
[79782.256190] drbd vm-100-disk-1 hatest1: conn( Disconnecting -> StandAlone )
[79782.256199] drbd vm-100-disk-1 hatest1: Terminating receiver thread
[79782.259371] drbd .drbdctrl hatest2: peer( Primary -> Secondary )
[79782.259903] drbd .drbdctrl: Preparing cluster-wide state change 1478643619 (2->-1 3/1)
[79782.260596] drbd .drbdctrl: Aborting cluster-wide state change 1478643619 (4ms) rv = -10
[79782.260624] drbd .drbdctrl: Preparing cluster-wide state change 660226156 (2->-1 3/1)
[79782.260825] drbd .drbdctrl hatest1: Aborting local state change 660226156 to yield to remote state change 1270642965.
[79782.260838] drbd .drbdctrl: Aborting cluster-wide state change 660226156 (0ms) rv = -19
[79782.260852] drbd .drbdctrl hatest1: Preparing remote state change 1270642965 (primary_nodes=0, weak_nodes=0)
[79782.261615] drbd .drbdctrl hatest1: Aborting remote state change 1270642965
[79782.261637] drbd .drbdctrl: Preparing cluster-wide state change 2591013370 (2->-1 3/1)
[79782.261639] drbd .drbdctrl hatest1: Aborting local state change 2591013370 to yield to remote state change 3039198147.
[79782.261648] drbd .drbdctrl: Aborting cluster-wide state change 2591013370 (0ms) rv = -19
[79782.261675] drbd .drbdctrl: Preparing cluster-wide state change 1878541550 (2->-1 3/1)
[79782.261685] drbd .drbdctrl: Aborting cluster-wide state change 1878541550 (0ms) rv = -19
[79782.261698] drbd .drbdctrl hatest1: Preparing remote state change 3039198147 (primary_nodes=0, weak_nodes=0)
[79782.261700] drbd .drbdctrl: Auto-promote failed: Concurrent state changes detected and aborted
[79782.262088] drbd .drbdctrl hatest1: Aborting remote state change 3039198147
[79782.262148] drbd .drbdctrl hatest1: Preparing remote state change 3402814018 (primary_nodes=0, weak_nodes=0)
[79782.262505] drbd .drbdctrl hatest1: Aborting remote state change 3402814018
[79782.262521] drbd .drbdctrl hatest1: Preparing remote state change 1935329853 (primary_nodes=0, weak_nodes=0)
[79782.262855] drbd .drbdctrl hatest1: Aborting remote state change 1935329853
[79782.459853] drbd .drbdctrl hatest1: Preparing remote state change 2208552747 (primary_nodes=0, weak_nodes=0)
[79782.467375] drbd .drbdctrl hatest1: Committing remote state change 2208552747
[79782.467391] drbd .drbdctrl hatest1: peer( Secondary -> Primary )
[79782.476407] drbd .drbdctrl hatest1: peer( Primary -> Secondary )
[79782.740792] drbd vm-100-disk-1 hatest2: Handshake successful: Agreed network protocol version 110
[79782.740794] drbd vm-100-disk-1 hatest2: Agreed to support TRIM on protocol level
[79782.740951] drbd vm-100-disk-1 hatest2: Peer authenticated using 20 bytes HMAC
[79782.740969] drbd vm-100-disk-1 hatest2: Starting ack_recv thread (from drbd_r_vm-100-d [5428])
[79782.741007] drbd vm-100-disk-1 hatest2: incompatible allow-two-primaries settings
[79782.741536] drbd vm-100-disk-1 hatest2: conn( Connecting -> Disconnecting )
[79782.741548] drbd vm-100-disk-1 hatest2: error receiving P_PROTOCOL, e: -5 l: 1!
[79782.742070] drbd vm-100-disk-1 hatest2: ack_receiver terminated
[79782.742071] drbd vm-100-disk-1 hatest2: Terminating ack_recv thread
[79782.788093] drbd vm-100-disk-1 hatest2: Connection closed
[79782.788116] drbd vm-100-disk-1 hatest2: conn( Disconnecting -> StandAlone )
[79782.788124] drbd vm-100-disk-1 hatest2: Terminating receiver thread
[79782.817344] device tap100i0 entered promiscuous mode
[79782.821298] vmbr0: port 2(tap100i0) entered forwarding state
[79782.821302] vmbr0: port 2(tap100i0) entered forwarding state
[79782.827251] drbd vm-100-disk-1: State change failed: Need access to UpToDate data
[79782.827804] drbd vm-100-disk-1: Failed: role( Secondary -> Primary )
[79782.827807] drbd vm-100-disk-1: Auto-promote failed: Need access to UpToDate data
[79782.907617] vmbr0: port 2(tap100i0) entered disabled state
[79784.003668] drbd .drbdctrl: Preparing cluster-wide state change 990779869 (2->-1 3/1)
[79784.003974] drbd .drbdctrl: State change 990779869: primary_nodes=4, weak_nodes=FFFFFFFFFFFFFFF8
[79784.003976] drbd .drbdctrl: Committing cluster-wide state change 990779869 (0ms)
[79784.003984] drbd .drbdctrl: role( Secondary -> Primary )
[79784.009861] drbd .drbdctrl: role( Primary -> Secondary )

And I get the following info on the original node:

hatest2# drbdsetup show
...
resource vm-100-disk-1 {
    _this_host {
        node-id 1;
        volume 0 {
            device minor 10;
            disk "/dev/drbdpool/vm-100-disk-1_00";
            meta-disk internal;
            disk {
                size 2097152s; # bytes
            }
        }
    }
    connection {
        _peer_node_id 0;
        _this_host ipv4 192.168.3.202:7700;
        _remote_host ipv4 192.168.3.201:7700;
        net {
            allow-two-primaries yes;
            cram-hmac-alg "sha1";
            shared-secret "z2m7pQ+YULNJF4RlmXA0";
            _name "hatest1";
        }
    }
    connection {
        _peer_node_id 2;
        _this_host ipv4 192.168.3.202:7700;
        _remote_host ipv4 192.168.3.203:7700;
        _is_standalone;
        net {
            allow-two-primaries yes;
            cram-hmac-alg "sha1";
            shared-secret "z2m7pQ+YULNJF4RlmXA0";
            _name "hatest3";
        }
    }
}

The setup on node hatest3 differs:

hatest3# drbdsetup show
...
resource vm-100-disk-1 {
    _this_host {
        node-id 2;
        volume 0 {
            device minor 10;
        }
    }
    connection {
        _peer_node_id 0;
        _this_host ipv4 192.168.3.203:7700;
        _remote_host ipv4 192.168.3.201:7700;
        _is_standalone;
        net {
            cram-hmac-alg "sha1";
            shared-secret "z2m7pQ+YULNJF4RlmXA0";
            _name "hatest1";
        }
    }
    connection {
        _peer_node_id 1;
        _this_host ipv4 192.168.3.203:7700;
        _remote_host ipv4 192.168.3.202:7700;
        _is_standalone;
        net {
            cram-hmac-alg "sha1";
            shared-secret "z2m7pQ+YULNJF4RlmXA0";
            _name "hatest2";
        }
    }
}

What's wrong? I am not 100% sure, but AFAIR the same code worked with previous drbd9 releases.
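
For reference, the difference between the hatest2 and hatest3 dumps can be checked mechanically rather than by eye. The following is only a minimal sketch in Python, assuming the `drbdsetup show` output has been saved as text and that each net section is flat, as in the dumps above. The function names net_options and missing_options are hypothetical helpers, not part of drbdmanage or drbd-utils, and the sample strings are shortened excerpts of the dumps (shared-secret omitted).

```python
import re

def net_options(show_output):
    """Return {peer name: {option: value}} for every net { ... } block.

    Assumes each net block is flat (no nested braces), as in the dumps above.
    """
    peers = {}
    for body in re.findall(r"\bnet\s*\{([^{}]*)\}", show_output):
        opts = {}
        for stmt in body.split(";"):
            parts = stmt.split(None, 1)  # option name, then its value
            if len(parts) == 2:
                opts[parts[0]] = parts[1].strip().strip('"')
        # _name identifies the peer this connection points at
        peers[opts.pop("_name", "?")] = opts
    return peers

def missing_options(reference, candidate):
    """Option names set in reference but absent from candidate."""
    return sorted(set(reference) - set(candidate))

# Shortened excerpts: hatest2's connection to hatest3, and the reverse.
hatest2_view = (
    'connection { _peer_node_id 2; net { allow-two-primaries yes; '
    'cram-hmac-alg "sha1"; _name "hatest3"; } }'
)
hatest3_view = (
    'connection { _peer_node_id 1; net { cram-hmac-alg "sha1"; '
    '_name "hatest2"; } }'
)

on_hatest2 = net_options(hatest2_view)["hatest3"]
on_hatest3 = net_options(hatest3_view)["hatest2"]
print(missing_options(on_hatest2, on_hatest3))  # ['allow-two-primaries']
```

On these excerpts the only option missing on the hatest3 side is allow-two-primaries, which is the setting named in the "incompatible allow-two-primaries settings" log lines.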