Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi, thanks for the reply. Here's the output of drbdsetup status:

node1:

root@deb1:~# drbdsetup status
.drbdctrl role:Primary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate

node2:

root@deb2:~# drbdsetup status
.drbdctrl role:Secondary
  volume:0 disk:Inconsistent
  volume:1 disk:Inconsistent
  deb1 connection:Connecting

I figured out that this problem only occurs when using dedicated interfaces for DRBD.
In a test setup it would be fine to use just one NIC, but I want to run DRBD for
production use. Here's the complete setup:

Node1:
  nic1:
    - ip: 192.168.2.103
    - netmask: 255.255.255.0
    - gateway: 192.168.2.1
  nic2:
    - ip: 10.0.0.11
    - netmask: 255.255.255.0
  hostname:
    - deb1
  dns (/etc/hosts):
    - 127.0.0.1 localhost
    - 10.0.0.11 deb1
    - 10.0.0.12 deb2
  volume group drbdpool:
    - /dev/sdb

Node2:
  nic1:
    - ip: 192.168.2.104
    - netmask: 255.255.255.0
    - gateway: 192.168.2.1
  nic2:
    - ip: 10.0.0.12
    - netmask: 255.255.255.0
  hostname:
    - deb2
  dns (/etc/hosts):
    - 127.0.0.1 localhost
    - 10.0.0.11 deb1
    - 10.0.0.12 deb2
  volume group drbdpool:
    - /dev/sdb
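A rough sanity check over the dedicated interfaces would be something like the
following (a sketch using generic Linux tools, assuming the /etc/hosts entries and
10.0.0.x addresses listed above; nothing here is drbdmanage-specific):

  # on deb1: the peer names should resolve to the replication IPs, not 192.168.2.x
  getent hosts deb1 deb2
  # on deb1: reach deb2 while forcing nic2's address (10.0.0.11) as the source
  ping -c 3 -I 10.0.0.11 10.0.0.12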
It seems DRBD cannot figure out who/what is Primary... DRBD drives me insane...
Sometimes it works and sometimes it doesn't... "drbdmanage init 10.0.0.11" got stuck
two times... and on the third try it worked like a charm. Huh!?!?

Here's the output after trying to add the second node:

------------------- 1st node ------------------------

root@deb1:~# drbdmanage add-node deb2 10.0.0.12
Operation completed successfully
Operation completed successfully
Executing join command using ssh.
IMPORTANT: The output you see comes from deb2
IMPORTANT: Your input is executed on deb2
You are going to join an existing drbdmanage cluster.
CAUTION! Note that:
  * Any previous drbdmanage cluster information may be removed
  * Any remaining resources managed by a previous drbdmanage installation
    that still exist on this system will no longer be managed by drbdmanage
Confirm:
  yes/no: yes
Operation completed successfully
root@deb1:~#
root@deb1:~# drbdsetup status
.drbdctrl role:Primary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
root@deb1:~#
root@deb1:~# drbdsetup status
.drbdctrl role:Primary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
  deb2 role:Secondary
    volume:0 replication:SyncSource peer-disk:Inconsistent done:15.78
    volume:1 replication:SyncSource peer-disk:Inconsistent done:15.78
root@deb1:~# drbdsetup status
.drbdctrl role:Primary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
  deb2 role:Secondary
    volume:0 peer-disk:UpToDate
    volume:1 peer-disk:UpToDate

------------------- 2nd node ------------------------

root@deb2:~# drbdsetup status
.drbdctrl role:Secondary
  volume:0 disk:Inconsistent
  volume:1 disk:Inconsistent
  deb1 role:Primary
    volume:0 replication:SyncTarget peer-disk:UpToDate done:81.46
    volume:1 replication:SyncTarget peer-disk:UpToDate done:81.46
root@deb2:~# drbdsetup status
.drbdctrl role:Secondary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
  deb1 role:Primary
    volume:0 peer-disk:UpToDate
    volume:1 peer-disk:UpToDate
root@deb2:~# drbdsetup status
.drbdctrl role:Secondary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
  deb1 role:Primary
    volume:0 peer-disk:UpToDate
    volume:1 peer-disk:UpToDate

What is going on there, and why does it sometimes work and sometimes not?

Best Regards,
Toni (Still a big fan)

2016-10-31 10:07 GMT+01:00 Roland Kammerer <roland.kammerer at linbit.com>:
> On Sat, Oct 29, 2016 at 08:55:45PM +0200, Toni Bolduan wrote:
> > Hi list,
> >
> > Today I've updated to drbdmanage 0.98 on my 2 Ubuntu server nodes.
> > After setting up the volume group on both nodes I started the
> > initialization on node 1. That worked fine.
> >
> > Then I tried to add the second node to my cluster with "drbdmanage
> > add-node node2 10.0.0.12" and drbdmanage got stuck after the confirmation.
>
> During startup drbdmanage now has to handle more things, so it might
> take longer (~15-30 seconds).
>
> > On the second node dmesg shows the following:
> >
> > [...]
> > [ 1103.413457] drbd .drbdctrl: Terminating worker thread
> > [ 1386.430669] drbd .drbdctrl: Starting worker thread (from drbdsetup [2142])
> > [ 1386.437482] drbd .drbdctrl node1: Starting sender thread (from drbdsetup [2155])
> > [ 1386.445330] drbd .drbdctrl/0 drbd0: disk( Diskless -> Attaching )
> > [ 1386.445340] drbd .drbdctrl/0 drbd0: Maximum number of peer devices = 31
> > [ 1386.445425] drbd .drbdctrl: Method to ensure write ordering: flush
> > [ 1386.445427] drbd .drbdctrl/0 drbd0 node1: node_id: 0 idx: 0 bm-uuid: 0x0 flags: 0x10 max_size: 0 (DUnknown)
> > [ 1386.445428] drbd .drbdctrl/0 drbd0: my node_id: 1
> > [ 1386.445433] drbd .drbdctrl/0 drbd0 node1: node_id: 0 idx: 0 bm-uuid: 0x0 flags: 0x10 max_size: 0 (DUnknown)
> > [ 1386.445434] drbd .drbdctrl/0 drbd0: my node_id: 1
> > [ 1386.445435] drbd .drbdctrl/0 drbd0: drbd_bm_resize called with capacity == 8112
> > [ 1386.445441] drbd .drbdctrl/0 drbd0: resync bitmap: bits=1014 words=496 pages=1
> > [ 1386.445442] drbd .drbdctrl/0 drbd0: size = 4056 KB (4056 KB)
> > [ 1386.446431] drbd .drbdctrl/0 drbd0: recounting of set bits took additional 0ms
> > [ 1386.446440] drbd .drbdctrl/0 drbd0: disk( Attaching -> Outdated )
> > [ 1386.446443] drbd .drbdctrl/0 drbd0: attached to current UUID: 120FE59FE04690DE
> > [ 1411.289042] drbd .drbdctrl: State change failed: Need access to UpToDate data
> > [ 1411.289066] drbd .drbdctrl: Failed: role( Secondary -> Primary )
> > [ 1434.136862] drbd .drbdctrl: State change failed: Need access to UpToDate data
> > [...]
> > [ 2033.117704] drbd .drbdctrl: Failed: role( Secondary -> Primary )
> >
> > How can I figure out what happened here and why?
>
> I guess that happened while the second node was in the leader
> election phase, where it tries to become DRBD Primary on the control
> volume (.drbdctrl). That is how leader election basically works: all
> nodes race to become Primary until one succeeds; the others then see a
> Primary, give up and become satellite nodes. The problem is that
> there is no UpToDate data.
>
> I would run "drbdsetup status" in a second window and check whether the
> resource (.drbdctrl) makes any progress. Does it sync up to the second
> node, or does it get stuck at some percentage? Or does it not start
> syncing at all? Are they in some strange network state? The output of
> "drbdsetup status" on both nodes would help a lot.
>
> Regards, rck
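One minimal way to do the "second window" check Roland describes (a sketch; it
simply polls the same drbdsetup status output shown above and follows the kernel log):

  # on each node, in a separate terminal, while add-node/join is running:
  watch -n1 drbdsetup status .drbdctrl
  # and follow dmesg for further "State change failed" messages
  # (dmesg -w follows the log; if unavailable, re-run plain dmesg instead):
  dmesg -w | grep drbd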