[DRBD-user] incompatible allow-two-primaries settings

Dietmar Maurer dietmar at proxmox.com
Thu Jun 18 08:11:00 CEST 2015


Using the latest drbdmanage, I have a 3-node setup, but auto-deploy with
redundancy 2.
When I live-migrate a VM to a node without an assignment, I simply create a
diskless assignment on that node.

# drbdmanage a
+------------------------------------------+
| Node    | Resource      | Vol ID | State |
+------------------------------------------+
| hatest1 | vm-100-disk-1 |      * |    ok |
| hatest2 | vm-100-disk-1 |      * |    ok |
+------------------------------------------+


hatest1# drbdsetup show
...
resource vm-100-disk-1 {
    _this_host {
        node-id			0;
        volume 0 {
            device			minor 10;
            disk			"/dev/drbdpool/vm-100-disk-1_00";
            meta-disk			internal;
            disk {
                size            	2097152s; # bytes
            }
        }
    }
    connection {
        _peer_node_id 1;
        _this_host ipv4 192.168.3.201:7700;
        _remote_host ipv4 192.168.3.202:7700;
        net {
            allow-two-primaries	yes;
            cram-hmac-alg   	"sha1";
            shared-secret   	"z2m7pQ+YULNJF4RlmXA0";
            _name           	"hatest2";
        }
    }
}


When I migrate the VM to node hatest3, I run:

   ($rc, $res) = $hdl->assign($nodename, $volname, { diskless => 'true' });

and wait until the new assignment is ready ("cstate:deploy" => "true").
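
The wait step can be sketched roughly as below. This is Python rather than the
Perl of the call above, and get_state is a hypothetical callable standing in
for whatever query returns the assignment's properties; it is not a drbdmanage
function. Only the "cstate:deploy" => "true" condition is taken from the text
above. The timeout and interval values are arbitrary assumptions.

```python
import time


def wait_for_deploy(get_state, timeout=30.0, interval=0.5,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it reports cstate:deploy == "true".

    get_state is a hypothetical callable returning a dict of assignment
    properties. Returns True once deployed, False if the timeout expires.
    """
    deadline = clock() + timeout
    while True:
        state = get_state()
        if state.get("cstate:deploy") == "true":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Returning False on timeout, instead of looping forever, lets the caller abort
the migration cleanly when the assignment never becomes ready.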
But the migration fails, and I get the following log on node hatest3:
 
[79781.422112] drbd .drbdctrl: Preparing cluster-wide state change 2357679533
(2->-1 3/1)
[79781.422356] drbd .drbdctrl: State change 2357679533: primary_nodes=4,
weak_nodes=FFFFFFFFFFFFFFF8
[79781.422358] drbd .drbdctrl: Committing cluster-wide state change 2357679533
(0ms)
[79781.422367] drbd .drbdctrl: role( Secondary -> Primary )
[79781.444068] drbd vm-100-disk-1: Starting worker thread (from drbdsetup
[5414])
[79781.446251] drbd vm-100-disk-1 hatest1: Starting sender thread (from
drbdsetup [5418])
[79781.447353] drbd vm-100-disk-1 hatest2: Starting sender thread (from
drbdsetup [5422])
[79781.448354] drbd vm-100-disk-1 hatest1: conn( StandAlone -> Unconnected )
[79781.448388] drbd vm-100-disk-1 hatest1: Starting receiver thread (from
drbd_w_vm-100-d [5415])
[79781.448451] drbd vm-100-disk-1 hatest1: conn( Unconnected -> Connecting )
[79781.448928] drbd vm-100-disk-1 hatest2: conn( StandAlone -> Unconnected )
[79781.448989] drbd vm-100-disk-1 hatest2: Starting receiver thread (from
drbd_w_vm-100-d [5415])
[79781.449050] drbd vm-100-disk-1 hatest2: conn( Unconnected -> Connecting )
[79781.500491] drbd .drbdctrl: role( Primary -> Secondary )
[79781.507598] drbd .drbdctrl hatest2: Preparing remote state change 488413481
(primary_nodes=0, weak_nodes=0)
[79781.507608] drbd .drbdctrl: State change failed: Peer may not become primary
while device is opened read-only
[79781.509894] drbd .drbdctrl hatest2: Failed: peer( Secondary -> Primary )
[79781.509982] drbd .drbdctrl hatest2: Aborting remote state change 488413481
[79781.509993] drbd .drbdctrl hatest1: Preparing remote state change 1382890737
(primary_nodes=0, weak_nodes=0)
[79781.510004] drbd .drbdctrl: State change failed: Peer may not become primary
while device is opened read-only
[79781.510601] drbd .drbdctrl hatest1: Failed: peer( Secondary -> Primary )
[79781.510749] drbd .drbdctrl hatest1: Aborting remote state change 1382890737
[79781.510771] drbd .drbdctrl hatest1: Preparing remote state change 2519845707
(primary_nodes=0, weak_nodes=0)
[79781.510778] drbd .drbdctrl: State change failed: Peer may not become primary
while device is opened read-only
[79781.511348] drbd .drbdctrl hatest1: Failed: peer( Secondary -> Primary )
[79781.511468] drbd .drbdctrl hatest1: Aborting remote state change 2519845707
[79781.511484] drbd .drbdctrl hatest1: Preparing remote state change 3331576747
(primary_nodes=0, weak_nodes=0)
[79781.511497] drbd .drbdctrl: State change failed: Peer may not become primary
while device is opened read-only
[79781.512094] drbd .drbdctrl hatest1: Failed: peer( Secondary -> Primary )
[79781.512343] drbd .drbdctrl hatest1: Aborting remote state change 3331576747
[79781.512355] drbd .drbdctrl hatest1: Preparing remote state change 876759415
(primary_nodes=0, weak_nodes=0)
[79781.512363] drbd .drbdctrl: State change failed: Peer may not become primary
while device is opened read-only
[79781.513030] drbd .drbdctrl hatest1: Failed: peer( Secondary -> Primary )
[79781.513058] drbd .drbdctrl hatest2: Rejecting concurrent remote state change
55884480 because of state change 876759415
[79781.513066] drbd .drbdctrl hatest2: Ignoring P_TWOPC_ABORT packet 55884480.
[79781.513171] drbd .drbdctrl hatest1: Aborting remote state change 876759415
[79781.513218] drbd .drbdctrl hatest2: Preparing remote state change 998847790
(primary_nodes=0, weak_nodes=0)
[79781.513225] drbd .drbdctrl: State change failed: Peer may not become primary
while device is opened read-only
[79781.513745] drbd .drbdctrl hatest2: Failed: peer( Secondary -> Primary )
[79781.513832] drbd .drbdctrl hatest2: Aborting remote state change 998847790
[79781.513843] drbd .drbdctrl hatest2: Preparing remote state change 2066673987
(primary_nodes=0, weak_nodes=0)
[79781.513849] drbd .drbdctrl: State change failed: Peer may not become primary
while device is opened read-only
[79781.514353] drbd .drbdctrl hatest2: Failed: peer( Secondary -> Primary )
[79781.514484] drbd .drbdctrl hatest2: Aborting remote state change 2066673987
[79781.711795] drbd .drbdctrl hatest1: Preparing remote state change 3548726750
(primary_nodes=0, weak_nodes=0)
[79781.723310] drbd .drbdctrl hatest1: Committing remote state change 3548726750
[79781.723321] drbd .drbdctrl hatest1: peer( Secondary -> Primary )
[79781.723456] drbd .drbdctrl hatest2: Preparing remote state change 4174678771
(primary_nodes=0, weak_nodes=0)
[79781.724908] drbd .drbdctrl hatest2: Aborting remote state change 4174678771
[79781.757653] drbd .drbdctrl hatest1: peer( Primary -> Secondary )
[79781.758298] drbd .drbdctrl: Preparing cluster-wide state change 1613663172
(2->-1 3/1)
[79781.758763] drbd .drbdctrl: Aborting cluster-wide state change 1613663172
(0ms) rv = -10
[79781.758791] drbd .drbdctrl: Preparing cluster-wide state change 4196524879
(2->-1 3/1)
[79781.759225] drbd .drbdctrl: Aborting cluster-wide state change 4196524879
(0ms) rv = -10
[79781.759247] drbd .drbdctrl: Preparing cluster-wide state change 236135030
(2->-1 3/1)
[79781.759699] drbd .drbdctrl: Aborting cluster-wide state change 236135030
(0ms) rv = -10
[79781.759731] drbd .drbdctrl: Preparing cluster-wide state change 891650542
(2->-1 3/1)
[79781.760324] drbd .drbdctrl: Aborting cluster-wide state change 891650542
(4ms) rv = -10
[79781.960098] drbd .drbdctrl: Preparing cluster-wide state change 3806129081
(2->-1 3/1)
[79781.960396] drbd .drbdctrl: State change 3806129081: primary_nodes=4,
weak_nodes=FFFFFFFFFFFFFFF8
[79781.960398] drbd .drbdctrl: Committing cluster-wide state change 3806129081
(0ms)
[79781.960406] drbd .drbdctrl: role( Secondary -> Primary )
[79781.967255] drbd .drbdctrl: role( Primary -> Secondary )
[79782.224808] drbd .drbdctrl hatest2: Preparing remote state change 1359815849
(primary_nodes=0, weak_nodes=0)
[79782.233123] drbd .drbdctrl hatest2: Committing remote state change 1359815849
[79782.233140] drbd .drbdctrl hatest2: peer( Secondary -> Primary )
[79782.240031] drbd vm-100-disk-1 hatest1: Handshake successful: Agreed network
protocol version 110
[79782.240033] drbd vm-100-disk-1 hatest1: Agreed to support TRIM on protocol
level
[79782.240108] drbd vm-100-disk-1 hatest1: Peer authenticated using 20 bytes
HMAC
[79782.240116] drbd vm-100-disk-1 hatest1: Starting ack_recv thread (from
drbd_r_vm-100-d [5426])
[79782.240202] drbd vm-100-disk-1 hatest1: incompatible allow-two-primaries
settings
[79782.240700] drbd vm-100-disk-1 hatest1: conn( Connecting -> Disconnecting )
[79782.240711] drbd vm-100-disk-1 hatest1: error receiving P_PROTOCOL, e: -5 l:
1!
[79782.241201] drbd vm-100-disk-1 hatest1: ack_receiver terminated
[79782.241202] drbd vm-100-disk-1 hatest1: Terminating ack_recv thread
[79782.256166] drbd vm-100-disk-1 hatest1: Connection closed
[79782.256190] drbd vm-100-disk-1 hatest1: conn( Disconnecting -> StandAlone )
[79782.256199] drbd vm-100-disk-1 hatest1: Terminating receiver thread
[79782.259371] drbd .drbdctrl hatest2: peer( Primary -> Secondary )
[79782.259903] drbd .drbdctrl: Preparing cluster-wide state change 1478643619
(2->-1 3/1)
[79782.260596] drbd .drbdctrl: Aborting cluster-wide state change 1478643619
(4ms) rv = -10
[79782.260624] drbd .drbdctrl: Preparing cluster-wide state change 660226156
(2->-1 3/1)
[79782.260825] drbd .drbdctrl hatest1: Aborting local state change 660226156 to
yield to remote state change 1270642965.
[79782.260838] drbd .drbdctrl: Aborting cluster-wide state change 660226156
(0ms) rv = -19
[79782.260852] drbd .drbdctrl hatest1: Preparing remote state change 1270642965
(primary_nodes=0, weak_nodes=0)
[79782.261615] drbd .drbdctrl hatest1: Aborting remote state change 1270642965
[79782.261637] drbd .drbdctrl: Preparing cluster-wide state change 2591013370
(2->-1 3/1)
[79782.261639] drbd .drbdctrl hatest1: Aborting local state change 2591013370 to
yield to remote state change 3039198147.
[79782.261648] drbd .drbdctrl: Aborting cluster-wide state change 2591013370
(0ms) rv = -19
[79782.261675] drbd .drbdctrl: Preparing cluster-wide state change 1878541550
(2->-1 3/1)
[79782.261685] drbd .drbdctrl: Aborting cluster-wide state change 1878541550
(0ms) rv = -19
[79782.261698] drbd .drbdctrl hatest1: Preparing remote state change 3039198147
(primary_nodes=0, weak_nodes=0)
[79782.261700] drbd .drbdctrl: Auto-promote failed: Concurrent state changes
detected and aborted
[79782.262088] drbd .drbdctrl hatest1: Aborting remote state change 3039198147
[79782.262148] drbd .drbdctrl hatest1: Preparing remote state change 3402814018
(primary_nodes=0, weak_nodes=0)
[79782.262505] drbd .drbdctrl hatest1: Aborting remote state change 3402814018
[79782.262521] drbd .drbdctrl hatest1: Preparing remote state change 1935329853
(primary_nodes=0, weak_nodes=0)
[79782.262855] drbd .drbdctrl hatest1: Aborting remote state change 1935329853
[79782.459853] drbd .drbdctrl hatest1: Preparing remote state change 2208552747
(primary_nodes=0, weak_nodes=0)
[79782.467375] drbd .drbdctrl hatest1: Committing remote state change 2208552747
[79782.467391] drbd .drbdctrl hatest1: peer( Secondary -> Primary )
[79782.476407] drbd .drbdctrl hatest1: peer( Primary -> Secondary )
[79782.740792] drbd vm-100-disk-1 hatest2: Handshake successful: Agreed network
protocol version 110
[79782.740794] drbd vm-100-disk-1 hatest2: Agreed to support TRIM on protocol
level
[79782.740951] drbd vm-100-disk-1 hatest2: Peer authenticated using 20 bytes
HMAC
[79782.740969] drbd vm-100-disk-1 hatest2: Starting ack_recv thread (from
drbd_r_vm-100-d [5428])
[79782.741007] drbd vm-100-disk-1 hatest2: incompatible allow-two-primaries
settings
[79782.741536] drbd vm-100-disk-1 hatest2: conn( Connecting -> Disconnecting )
[79782.741548] drbd vm-100-disk-1 hatest2: error receiving P_PROTOCOL, e: -5 l:
1!
[79782.742070] drbd vm-100-disk-1 hatest2: ack_receiver terminated
[79782.742071] drbd vm-100-disk-1 hatest2: Terminating ack_recv thread
[79782.788093] drbd vm-100-disk-1 hatest2: Connection closed
[79782.788116] drbd vm-100-disk-1 hatest2: conn( Disconnecting -> StandAlone )
[79782.788124] drbd vm-100-disk-1 hatest2: Terminating receiver thread
[79782.817344] device tap100i0 entered promiscuous mode
[79782.821298] vmbr0: port 2(tap100i0) entered forwarding state
[79782.821302] vmbr0: port 2(tap100i0) entered forwarding state
[79782.827251] drbd vm-100-disk-1: State change failed: Need access to UpToDate
data
[79782.827804] drbd vm-100-disk-1: Failed: role( Secondary -> Primary )
[79782.827807] drbd vm-100-disk-1: Auto-promote failed: Need access to UpToDate
data
[79782.907617] vmbr0: port 2(tap100i0) entered disabled state
[79784.003668] drbd .drbdctrl: Preparing cluster-wide state change 990779869
(2->-1 3/1)
[79784.003974] drbd .drbdctrl: State change 990779869: primary_nodes=4,
weak_nodes=FFFFFFFFFFFFFFF8
[79784.003976] drbd .drbdctrl: Committing cluster-wide state change 990779869
(0ms)
[79784.003984] drbd .drbdctrl: role( Secondary -> Primary )
[79784.009861] drbd .drbdctrl: role( Primary -> Secondary )
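
Most of the log above is .drbdctrl traffic, so the lines for the VM's resource
are easy to miss. A minimal Python sketch for pulling them out is shown below.
It assumes the log was wrapped by the mail client, so that any line not
starting with a "[seconds.micros]" timestamp continues the previous one. The
function names and the keyword list are illustrative choices, not part of any
DRBD tool.

```python
import re

# A kernel log line starts with a timestamp such as "[79782.240202]".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]")
# Keywords treated as "interesting"; an assumption, adjust as needed.
PROBLEM = re.compile(r"incompatible|failed|error", re.IGNORECASE)


def rejoin(lines):
    """Merge mail-wrapped continuation lines back into single log lines."""
    joined = []
    for line in lines:
        line = line.rstrip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not joined:
            joined.append(line)
        else:
            joined[-1] += " " + line.strip()
    return joined


def resource_problems(lines, resource):
    """Return the problem lines that belong to one DRBD resource."""
    tag = re.compile(r"drbd " + re.escape(resource) + r"[ :]")
    return [l for l in rejoin(lines) if tag.search(l) and PROBLEM.search(l)]
```

Called as resource_problems(log_text.splitlines(), "vm-100-disk-1"), this
keeps the "incompatible allow-two-primaries settings", "error receiving
P_PROTOCOL" and "State change failed" lines for that resource and drops the
.drbdctrl lines.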

And I get the following configuration on the original node (hatest2):

hatest2# drbdsetup show
...
resource vm-100-disk-1 {
    _this_host {
        node-id			1;
        volume 0 {
            device			minor 10;
            disk			"/dev/drbdpool/vm-100-disk-1_00";
            meta-disk			internal;
            disk {
                size            	2097152s; # bytes
            }
        }
    }
    connection {
        _peer_node_id 0;
        _this_host ipv4 192.168.3.202:7700;
        _remote_host ipv4 192.168.3.201:7700;
        net {
            allow-two-primaries	yes;
            cram-hmac-alg   	"sha1";
            shared-secret   	"z2m7pQ+YULNJF4RlmXA0";
            _name           	"hatest1";
        }
    }
    connection {
        _peer_node_id 2;
        _this_host ipv4 192.168.3.202:7700;
        _remote_host ipv4 192.168.3.203:7700;
        _is_standalone;
        net {
            allow-two-primaries	yes;
            cram-hmac-alg   	"sha1";
            shared-secret   	"z2m7pQ+YULNJF4RlmXA0";
            _name           	"hatest3";
        }
    }
}

The setup on node hatest3 differs:

hatest3# drbdsetup show
...
resource vm-100-disk-1 {
    _this_host {
        node-id			2;
        volume 0 {
            device			minor 10;
        }
    }
    connection {
        _peer_node_id 0;
        _this_host ipv4 192.168.3.203:7700;
        _remote_host ipv4 192.168.3.201:7700;
        _is_standalone;
        net {
            cram-hmac-alg   	"sha1";
            shared-secret   	"z2m7pQ+YULNJF4RlmXA0";
            _name           	"hatest1";
        }
    }
    connection {
        _peer_node_id 1;
        _this_host ipv4 192.168.3.203:7700;
        _remote_host ipv4 192.168.3.202:7700;
        _is_standalone;
        net {
            cram-hmac-alg   	"sha1";
            shared-secret   	"z2m7pQ+YULNJF4RlmXA0";
            _name           	"hatest2";
        }
    }
}

What's wrong? I am not 100% sure, but AFAIR the same code worked
with previous drbd9 releases.
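
One visible difference in the outputs above is that the net sections on hatest3
carry no allow-two-primaries line, while those on hatest2 do. To compare such
dumps mechanically, a minimal Python sketch follows. It is not part of
drbdmanage or drbd-utils, the function names are made up for illustration, and
it assumes that an option missing from "drbdsetup show" is at its default of
"no". It only understands the flat net { ... } blocks shown in this mail.

```python
import re


def net_options(show_output):
    """Map peer name -> net options parsed from `drbdsetup show` text."""
    peers = {}
    # net { ... } blocks contain no nested braces in the dumps above.
    for block in re.findall(r"\bnet\s*\{(.*?)\}", show_output, re.DOTALL):
        opts = {}
        for m in re.finditer(r'^\s*([\w-]+)\s+"?([^";]*)"?;', block,
                             re.MULTILINE):
            opts[m.group(1)] = m.group(2)
        name = opts.pop("_name", None)
        if name is not None:
            peers[name] = opts
    return peers


def two_primaries_mismatches(outputs):
    """outputs: node name -> `drbdsetup show` text for that node.

    Returns (node, peer, node_value, peer_value) for every connection whose
    two ends disagree. Assumption: a missing option means "no".
    """
    parsed = {node: net_options(text) for node, text in outputs.items()}
    mismatches = []
    for node, peers in sorted(parsed.items()):
        for peer, opts in sorted(peers.items()):
            back = parsed.get(peer, {}).get(node)
            if back is None or not node < peer:
                continue  # other end unknown, or pair already reported
            mine = opts.get("allow-two-primaries", "no")
            theirs = back.get("allow-two-primaries", "no")
            if mine != theirs:
                mismatches.append((node, peer, mine, theirs))
    return mismatches
```

Fed the hatest2 and hatest3 dumps from this mail, the sketch reports the
hatest2/hatest3 connection as "yes" on one end and "no" on the other, which
matches the "incompatible allow-two-primaries settings" message in the log.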



