[DRBD-user] drbd from port 7789 exit after 10 seconds at primary node

aTTi attiuj at gmail.com
Wed Oct 1 16:45:37 CEST 2014


Hi!

I fully reinstalled both servers: CentOS 7, default minimal install, fully
up to date, SELinux enabled (the default).

The nodes are drbdtest1 and drbdtest2.

The config on both nodes, in the drbd.d directory:


resource r0 {

        startup {
                wfc-timeout 30;
                outdated-wfc-timeout 20;
                degr-wfc-timeout 30;
        }

        net {
                cram-hmac-alg sha1;
                shared-secret "abc";
        }

        syncer {
                rate 100M;
                verify-alg sha1;
        }

        on drbdtest1 {
                device /dev/drbd0;
                disk /dev/sdb1;
                address 10.1.1.1:7789;
                meta-disk internal;
        }

        on drbdtest2 {
                device /dev/drbd0;
                disk /dev/sdb1;
                address 10.1.2.1:7789;
                meta-disk internal;
        }

        protocol C;

}



The same on both nodes:

yum install drbd84-utils kmod-drbd84 ntp ntpdate

modprobe drbd

 lsmod |grep drbd
drbd                  373504  1
libcrc32c              12644  2 xfs,drbd

firewall-cmd --zone=internal --add-port=7788-7799/tcp --permanent
firewall-cmd --zone=internal --add-port=7788-7799/udp --permanent
firewall-cmd --reload
but I disabled it for testing:
systemctl stop firewalld

Each node can ssh to the other on the default ssh port.

On both nodes /dev/sda holds the operating system, and the second disk, /dev/sdb, is for drbd.
fdisk /dev/sdb

I created a partition of the same size on both nodes: /dev/sdb1

drbdadm create-md r0
This is OK on both nodes.

drbdadm up r0
This seems OK (no error), but on the primary the listener on port 7789 disappears after about 10 seconds.

On the primary (first node) only:

drbdadm -- --overwrite-data-of-peer primary all

drbdtest1 node:

 drbd-overview
 0:r0/0  StandAlone Primary/Unknown UpToDate/DUnknown

 cat /proc/drbd

version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by mockbuild@, 2014-08-17 22:54:26
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r----s
    ns:0 nr:0 dw:0 dr:728 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:52427164

netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      2102/master
tcp        0      0 10.1.1.1:7789           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1198/sshd
tcp6       0      0 ::1:25                  :::*                    LISTEN      2102/master
tcp6       0      0 :::22                   :::*                    LISTEN      1198/sshd

But after 10 seconds:

netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      2102/master
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1198/sshd
tcp6       0      0 ::1:25                  :::*                    LISTEN      2102/master
tcp6       0      0 :::22                   :::*                    LISTEN      1198/sshd

The drbd port 7789 is no longer listening on the primary node! Is that normal?
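
To pin down exactly when the listener goes away, the two netstat snapshots above could be replaced by a small polling loop. This is only a rough sketch: the helper names is_listening and watch_listener are made up for illustration, and it assumes netstat prints each socket on a single line (the wrapping in this mail comes from the mail client).

```shell
#!/usr/bin/env bash

# is_listening ADDR:PORT
# Reads `netstat -tnl` style output on stdin and succeeds if any LISTEN
# line has ADDR:PORT as one of its fields (hypothetical helper).
is_listening() {
    local endpoint="$1"
    awk -v ep="$endpoint" '
        /LISTEN/ { for (i = 1; i <= NF; i++) if ($i == ep) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# watch_listener ADDR:PORT
# Polls once per second for 15 seconds and prints a timestamped status line.
watch_listener() {
    local endpoint="$1" i
    for i in $(seq 1 15); do
        if netstat -tnl | is_listening "$endpoint"; then
            echo "$(date +%T) $endpoint LISTEN"
        else
            echo "$(date +%T) $endpoint not listening"
        fi
        sleep 1
    done
}
```

Running `watch_listener 10.1.1.1:7789` in a second terminal right before `drbdadm connect r0` would show the second at which the socket disappears, which can then be compared with the timestamps in the kernel log.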


mount /dev/drbd0 /mnt/drbd

I can see the files that I copied there for testing.



drbdtest2 node:

 drbd-overview
 0:r0/0  WFConnection Secondary/Unknown Inconsistent/DUnknown

cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by mockbuild@, 2014-08-17 22:54:26
 0: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r----s
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1023932


 netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      2155/master
tcp        0      0 10.1.2.1:7789           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1335/sshd
tcp6       0      0 ::1:25                  :::*                    LISTEN      2155/master
tcp6       0      0 :::22                   :::*                    LISTEN      1335/sshd


I can ping both the IPs and the hostnames. uname -n matches the hostname on both nodes.
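
Ping only exercises ICMP, while the replication link is a TCP connection to the address and port from the config, so a separate TCP check may be worth doing. A minimal sketch, assuming bash with /dev/tcp support and the coreutils timeout command; tcp_reachable is a hypothetical helper name, not part of drbd-utils.

```shell
#!/usr/bin/env bash

# tcp_reachable HOST PORT [TIMEOUT_SECONDS]
# Succeeds if a TCP connection to HOST:PORT can be opened within the
# timeout (default 3 seconds). Hypothetical helper for illustration.
tcp_reachable() {
    local host="$1" port="$2" wait="${3:-3}"
    # Inside `bash -c`, $0 is the host and $1 is the port.
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, `tcp_reachable 10.1.2.1 7789 && echo open || echo closed` run on drbdtest1 while drbdtest2 is in WFConnection. Note that such a probe opens and immediately closes a connection, so the peer may log a failed handshake in its kernel log.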

How can I debug why the nodes cannot communicate?

Is it normal that drbd listens on port 7789 for only 5-10 seconds, and that
nothing is listening on port 7789 after that?


I ran this on the primary node:

# drbdadm connect r0 --verbose
drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789 --cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1 --protocol=C

If I run it again within 1-3 seconds, this happens:
# drbdadm connect r0 --verbose
drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789 --cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1 --protocol=C
r0: Failure: (102) Local address(port) already in use.

If I wait 5-10 seconds or more, I can run it again without error:
drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789 --cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1 --protocol=C

Btw, if I run it from the command line:

]# drbdsetup-84
-bash: drbdsetup-84: command not found

but

]# drbdsetup

exists and works.

How can I debug this error?
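
One possible starting point, as a rough sketch: record the connection state from /proc/drbd right after each connect attempt and read it next to the kernel log, since the DRBD kernel module reports its reason for dropping a connection there. The helper names drbd_states and drbd_kernel_log are made up for illustration; the parsing assumes the 8.4-style /proc/drbd layout shown earlier in this mail.

```shell
#!/usr/bin/env bash

# drbd_states
# Reads /proc/drbd style text on stdin and prints one line per device:
#   <minor> <connection state> <roles> <disk states>
drbd_states() {
    awk '/^ *[0-9]+: cs:/ {
        minor = $1; sub(/:$/, "", minor)
        cs = ""; ro = ""; ds = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^cs:/) cs = substr($i, 4)
            if ($i ~ /^ro:/) ro = substr($i, 4)
            if ($i ~ /^ds:/) ds = substr($i, 4)
        }
        print minor, cs, ro, ds
    }'
}

# drbd_kernel_log [LINES]
# Shows the most recent kernel messages mentioning drbd (default 40 lines).
drbd_kernel_log() {
    dmesg | grep -i drbd | tail -n "${1:-40}"
}
```

Usage on either node: `drbd_states < /proc/drbd; drbd_kernel_log 20`. If the primary falls back from a connecting state to StandAlone a few seconds after `drbdadm connect r0`, the messages printed around that moment should say why the connection was given up.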

What is the problem? What am I missing?

Please help me to fix this.
aTTi


