[DRBD-user] drbd from port 7789 exit after 10 seconds at primary node

Nelson Hicks nelsonh at socket.net
Wed Oct 1 17:00:12 CEST 2014


The next step is to check the logs on both nodes for the messages DRBD
generated after the drbdadm connect r0 command. I don't think I've used
a Red Hat derivative since EL5, but I think they'll be in
/var/log/messages, and they should also appear in the dmesg output.
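
Something like the following might help pull out the relevant lines.
It is only a sketch: the function name drbd_kernel_lines is made up
here, and it assumes the usual syslog layout where kernel messages
carry a "kernel:" tag after the hostname. Since both hostnames in this
thread contain "drbd", a plain grep for drbd over /var/log/messages
would match every line, so the filter only looks after the tag.

```shell
#!/bin/sh
# Read syslog-style lines on stdin and keep only kernel messages that
# mention drbd after the "kernel:" tag, so a hostname such as drbdtest1
# does not make every line match. (Hypothetical helper, not a DRBD tool.)
drbd_kernel_lines() {
    grep -iE ' kernel: .*drbd'
}
```

Possible usage, run on both nodes right after drbdadm connect r0:
drbd_kernel_lines < /var/log/messages | tail -n 50
The dmesg output has no hostname column, so there a plain
dmesg | grep -i drbd should be enough.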

Thanks,
- Nelson




On Wed, 2014-10-01 at 16:45 +0200, aTTi wrote:

> Hi!
> 
> I fully reinstalled both servers: CentOS 7, fully updated, SELinux
> enabled (the default), default minimal install.
> 
> The nodes are drbdtest1 and drbdtest2.
> 
> The same config is on both nodes, in the drbd.d directory:
> 
> 
> resource r0 {
> 
>         startup {
>                 wfc-timeout 30;
>                 outdated-wfc-timeout 20;
>                 degr-wfc-timeout 30;
>         }
> 
>         net {
>                 cram-hmac-alg sha1;
>                 shared-secret "abc";
>         }
> 
>         syncer {
>                 rate 100M;
>                 verify-alg sha1;
>         }
> 
> 
>         on drbdtest1 {
>                 device /dev/drbd0;
>                 disk /dev/sdb1;
>                 address 10.1.1.1:7789;
>                 meta-disk internal;
> 
>         }
> 
>         on drbdtest2 {
>                 device /dev/drbd0;
>                 disk /dev/sdb1;
>                 address 10.1.2.1:7789;
>                 meta-disk internal;
>         }
> 
>         protocol C;
> 
> }
> 
> 
> 
> On both nodes I ran the same commands:
> 
> yum install drbd84-utils kmod-drbd84 ntp ntpdate
> 
> modprobe drbd
> 
>  lsmod |grep drbd
> drbd                  373504  1
> libcrc32c              12644  2 xfs,drbd
> 
> firewall-cmd --zone=internal --add-port=7788-7799/tcp --permanent
> firewall-cmd --zone=internal --add-port=7788-7799/udp --permanent
> firewall-cmd --reload
> but I stopped the firewall for testing:
> systemctl stop firewalld
> 
> Each node can SSH to the other on the default SSH port.
> 
> On both nodes, /dev/sda holds the operating system and the second
> disk, /dev/sdb, is for DRBD.
> fdisk /dev/sdb
> 
> I created a partition of the same size on both nodes: /dev/sdb1
> 
> drbdadm create-md r0
> This succeeded on both nodes.
> 
> drbdadm up r0
> This seems OK, with no error, but on the primary the listener on
> port 7789 disappears after 10 seconds.
> 
> On the primary (first node) only:
> 
> drbdadm -- --overwrite-data-of-peer primary all
> 
> drbdtest1 node:
> 
>  drbd-overview
>  0:r0/0  StandAlone Primary/Unknown UpToDate/DUnknown
> 
>  cat /proc/drbd
> 
> version: 8.4.5 (api:1/proto:86-101)
> GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by
> mockbuild@, 2014-08-17 22:54:26
>  0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r----s
>     ns:0 nr:0 dw:0 dr:728 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:52427164
> 
> netstat -tnlp
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      2102/master
> tcp        0      0 10.1.1.1:7789           0.0.0.0:*               LISTEN      -
> tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1198/sshd
> tcp6       0      0 ::1:25                  :::*                    LISTEN      2102/master
> tcp6       0      0 :::22                   :::*                    LISTEN      1198/sshd
> 
> But after 10 seconds:
> 
> netstat -tnlp
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      2102/master
> tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1198/sshd
> tcp6       0      0 ::1:25                  :::*                    LISTEN      2102/master
> tcp6       0      0 :::22                   :::*                    LISTEN      1198/sshd
> 
> DRBD no longer listens on port 7789 on the primary node! Is that normal?
> 
> 
> mount /dev/drbd0 /mnt/drbd
> 
> I can see the files I copied there as a test.
> 
> 
> 
> drbdtest2 node:
> 
>  drbd-overview
>  0:r0/0  WFConnection Secondary/Unknown Inconsistent/DUnknown
> 
> cat /proc/drbd
> version: 8.4.5 (api:1/proto:86-101)
> GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by
> mockbuild@, 2014-08-17 22:54:26
>  0: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r----s
>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1023932
> 
> 
>  netstat -tnlp
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      2155/master
> tcp        0      0 10.1.2.1:7789           0.0.0.0:*               LISTEN      -
> tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1335/sshd
> tcp6       0      0 ::1:25                  :::*                    LISTEN      2155/master
> tcp6       0      0 :::22                   :::*                    LISTEN      1335/sshd
> 
> 
> I can ping both the IPs and the hostnames. On both nodes, uname -n
> matches the hostname.
> 
> How can I debug why the nodes cannot communicate?
> 
> Is it normal for DRBD to listen on port 7789 for only 5-10 seconds,
> with nothing serving that port afterwards?
> 
> 
> I ran this on the primary node:
> 
> # drbdadm connect r0 --verbose
> drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789
> --cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1
> --protocol=C
> 
> If I run it again within 1-3 seconds, this happens:
> # drbdadm connect r0 --verbose
> drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789
> --cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1
> --protocol=C
> r0: Failure: (102) Local address(port) already in use.
> 
> If I wait 5-10 seconds or more, I can run it again without error:
> drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789
> --cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1
> --protocol=C
> 
> By the way, if I run this from the command line:
> 
> ]# drbdsetup-84
> -bash: drbdsetup-84: command not found
> 
> but
> 
> ]# drbdsetup
> 
> exists and works.
> 
> How can I debug this error?
> 
> What is the problem? What am I missing?
> 
> Please help me to fix this.
> aTTi
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
> 
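
For reading the status output quoted above, here is a small sketch
that pulls the connection state (the cs: field) out of /proc/drbd and
prints a one-line hint for it. As far as I know, in DRBD 8.4 a node in
StandAlone is not trying to reach its peer at all, while WFConnection
means it is waiting for the peer to show up. The function names
drbd_cs and drbd_cs_hint are invented for this example, and the hint
texts are my own paraphrase, not DRBD output.

```shell
#!/bin/sh
# Print the connection state (cs: field) of every minor found in
# /proc/drbd-style text read from stdin, one state per line.
drbd_cs() {
    sed -n 's/^ *[0-9][0-9]*: cs:\([A-Za-z]*\).*/\1/p'
}

# Map a connection state name to a short hint. The wording is a
# paraphrase for illustration only; unknown states get a generic line.
drbd_cs_hint() {
    case "$1" in
        StandAlone)   echo "not connecting: never connected, disconnected, or connection dropped" ;;
        WFConnection) echo "waiting for the peer to become visible" ;;
        Connected)    echo "connected to the peer" ;;
        *)            echo "see the DRBD documentation for state $1" ;;
    esac
}
```

Possible usage on each node:
for s in $(drbd_cs < /proc/drbd); do echo "$s: $(drbd_cs_hint "$s")"; done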

