Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi, thanks for the reply.
Here's the output of drbdsetup status:

node1:
root@deb1:~# drbdsetup status
.drbdctrl role:Primary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate

node2:
root@deb2:~# drbdsetup status
.drbdctrl role:Secondary
  volume:0 disk:Inconsistent
  volume:1 disk:Inconsistent
  deb1 connection:Connecting
I figured out that this problem only occurs when using dedicated
interfaces for DRBD.
In a test setup it wouldn't matter if everything ran over a single NIC, but I
want to run DRBD in production.
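(Side note: to double-check which addresses .drbdctrl actually connects over,
something like the following should show it; just a sketch, output omitted:

    drbdsetup show .drbdctrl
    ip route get 10.0.0.12

The "show" output should only contain the 10.0.0.x addresses in its connection
section, and "ip route get" tells you which NIC the kernel would use to reach
the peer.)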
Here's the complete setup:
*Node1:*
*nic1*:
- ip: 192.168.2.103
- netmask: 255.255.255.0
- gateway: 192.168.2.1
*nic2*:
- ip: 10.0.0.11
- netmask: 255.255.255.0
*hostname*:
- deb1
*dns*:
- 127.0.0.1 localhost
- 10.0.0.11 deb1
- 10.0.0.12 deb2
*volume group drbdpool*:
- /dev/sdb
*Node2:*
*nic1*:
- ip: 192.168.2.104
- netmask: 255.255.255.0
- gateway: 192.168.2.1
*nic2*:
- ip: 10.0.0.12
- netmask: 255.255.255.0
*hostname*:
- deb2
*dns:*
- 127.0.0.1 localhost
- 10.0.0.11 deb1
- 10.0.0.12 deb2
*volume group drbdpool*:
- /dev/sdb
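Before running add-node it probably makes sense to sanity-check the dedicated
link and the name resolution, roughly like this (only a sketch; port 6999 is
just what I assume the control volume listens on, and nc has to be installed):

    getent hosts deb1 deb2              # must resolve to the 10.0.0.x addresses
    ping -c 3 -I 10.0.0.11 10.0.0.12    # from deb1, force the dedicated NIC
    nc -zv 10.0.0.12 6999               # assumed .drbdctrl port, adjust if needed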
It seems DRBD cannot figure out who/what is primary...
DRBD drives me insane... Sometimes it works and sometimes it doesn't...
"drbdmanage init 10.0.0.11" got stuck twice... and on the 3rd try it worked
like a charm. Huh!?
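If it hangs again, I assume the clean way to start over is roughly this (just a
sketch from the docs as I understand them; double-check "drbdmanage uninit"
before running it, since it throws away the local drbdmanage configuration):

    drbdmanage uninit        # drop the stuck local cluster state
    lvs drbdpool             # see what is left in the pool (control volume LVs etc.)
    drbdmanage init 10.0.0.11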
Here's the output after trying to add the second node:
------------------- 1st node ------------------------
root@deb1:~# drbdmanage add-node deb2 10.0.0.12
Operation completed successfully
Operation completed successfully
Executing join command using ssh.
IMPORTANT: The output you see comes from deb2
IMPORTANT: Your input is executed on deb2
You are going to join an existing drbdmanage cluster.
CAUTION! Note that:
* Any previous drbdmanage cluster information may be removed
* Any remaining resources managed by a previous drbdmanage installation
that still exist on this system will no longer be managed by drbdmanage
Confirm:
yes/no: yes
Operation completed successfully
root@deb1:~#
root@deb1:~# drbdsetup status
.drbdctrl role:Primary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
root@deb1:~#
root@deb1:~#
root@deb1:~# drbdsetup status
.drbdctrl role:Primary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
  deb2 role:Secondary
    volume:0 replication:SyncSource peer-disk:Inconsistent done:15.78
    volume:1 replication:SyncSource peer-disk:Inconsistent done:15.78
root@deb1:~# drbdsetup status
.drbdctrl role:Primary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
  deb2 role:Secondary
    volume:0 peer-disk:UpToDate
    volume:1 peer-disk:UpToDate
------------------- 2nd node ------------------------
root@deb2:~# drbdsetup status
.drbdctrl role:Secondary
  volume:0 disk:Inconsistent
  volume:1 disk:Inconsistent
  deb1 role:Primary
    volume:0 replication:SyncTarget peer-disk:UpToDate done:81.46
    volume:1 replication:SyncTarget peer-disk:UpToDate done:81.46
root@deb2:~# drbdsetup status
.drbdctrl role:Secondary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
  deb1 role:Primary
    volume:0 peer-disk:UpToDate
    volume:1 peer-disk:UpToDate
root@deb2:~# drbdsetup status
.drbdctrl role:Secondary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
  deb1 role:Primary
    volume:0 peer-disk:UpToDate
    volume:1 peer-disk:UpToDate
What is going on there, and why does it sometimes work and sometimes not?
Best Regards,
Toni (Still a big fan)
2016-10-31 10:07 GMT+01:00 Roland Kammerer <roland.kammerer@linbit.com>:
> On Sat, Oct 29, 2016 at 08:55:45PM +0200, Toni Bolduan wrote:
> > Hi list,
> >
> > Today I've updated to drbdmanage 0.98 on my 2 ubuntu server nodes.
> > After setting up the volume group on both nodes I started the
> > initialization on node 1. That worked fine.
> >
> >
> > Then I tried to add the second node to my cluster with "drbdmanage
> > add-node node2 10.0.0.12" and drbdmanage got stuck after confirmation.
>
> During startup drbdmanage now has to handle more things, so it might
> take longer (~15-30 seconds).
>
> >
> > On the second node dmesg shows the following:
> >
> > [...]
> > [ 1103.413457] drbd .drbdctrl: Terminating worker thread
> > [ 1386.430669] drbd .drbdctrl: Starting worker thread (from drbdsetup [2142])
> > [ 1386.437482] drbd .drbdctrl node1: Starting sender thread (from drbdsetup [2155])
> > [ 1386.445330] drbd .drbdctrl/0 drbd0: disk( Diskless -> Attaching )
> > [ 1386.445340] drbd .drbdctrl/0 drbd0: Maximum number of peer devices = 31
> > [ 1386.445425] drbd .drbdctrl: Method to ensure write ordering: flush
> > [ 1386.445427] drbd .drbdctrl/0 drbd0 node1: node_id: 0 idx: 0 bm-uuid: 0x0 flags: 0x10 max_size: 0 (DUnknown)
> > [ 1386.445428] drbd .drbdctrl/0 drbd0: my node_id: 1
> > [ 1386.445433] drbd .drbdctrl/0 drbd0 node1: node_id: 0 idx: 0 bm-uuid: 0x0 flags: 0x10 max_size: 0 (DUnknown)
> > [ 1386.445434] drbd .drbdctrl/0 drbd0: my node_id: 1
> > [ 1386.445435] drbd .drbdctrl/0 drbd0: drbd_bm_resize called with capacity == 8112
> > [ 1386.445441] drbd .drbdctrl/0 drbd0: resync bitmap: bits=1014 words=496 pages=1
> > [ 1386.445442] drbd .drbdctrl/0 drbd0: size = 4056 KB (4056 KB)
> > [ 1386.446431] drbd .drbdctrl/0 drbd0: recounting of set bits took additional 0ms
> > [ 1386.446440] drbd .drbdctrl/0 drbd0: disk( Attaching -> Outdated )
> > [ 1386.446443] drbd .drbdctrl/0 drbd0: attached to current UUID: 120FE59FE04690DE
> > [ 1411.289042] drbd .drbdctrl: State change failed: Need access to UpToDate data
> > [ 1411.289066] drbd .drbdctrl: Failed: role( Secondary -> Primary )
> > [ 1434.136862] drbd .drbdctrl: State change failed: Need access to UpToDate data
> > [...]
> > [ 2033.117704] drbd .drbdctrl: Failed: role( Secondary -> Primary )
> >
> > How can I figure what happened here and why?
> >
>
> I guess that that happened while the second node was in the leader
> election phase, where it tries to become DRBD Primary on the control
> volume (.drbdctrl). That is how leader election basically works. All
> nodes race to become Primary until one succeeds, the others then see a
> Primary and give up and become satellite nodes. The problem is that
> there is no UpToDate data.
>
> I would run "drbdsetup status" in a second window and check if the
> resource (.drbdctrl) makes any progress. Does it sync up to the second
> node or does it get stuck after some percentage? Or does it not start
> syncing at all? Are they in some strange network state,... The output of
> "drbdsetup status" of both nodes would help a lot.
>
> Regards, rck