One thing I just came across: check the SELinux configuration, or turn it off. I was researching another SELinux-related issue, and sealert showed a message about daemons_enable_cluster_mode not being enabled.
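A minimal way to check and flip that boolean (assuming the standard policycoreutils tools are installed; `setenforce 0` is only meant as a temporary test, not a fix):

```shell
# Check the boolean sealert complained about:
getsebool daemons_enable_cluster_mode
# Enable it persistently (-P makes it survive reboots):
setsebool -P daemons_enable_cluster_mode on
# Or, to quickly rule SELinux in or out, switch to permissive mode:
setenforce 0
getenforce
```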
Klint.
-----Original Message-----
From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of aTTi
Sent: Thursday, 2 October 2014 12:46 AM
To: drbd-user at lists.linbit.com
Subject: [DRBD-user] drbd from port 7789 exit after 10 seconds at primary node
Hi!
I fully reinstalled both servers: CentOS 7, fully up to date, SELinux on by default, default minimal install.
The nodes are drbdtest1 and drbdtest2.
Config on both nodes, in the drbd.d directory:
resource r0 {
  startup {
    wfc-timeout 30;
    outdated-wfc-timeout 20;
    degr-wfc-timeout 30;
  }
  net {
    cram-hmac-alg sha1;
    shared-secret "abc";
  }
  syncer {
    rate 100M;
    verify-alg sha1;
  }
  on drbdtest1 {
    device /dev/drbd0;
    disk /dev/sdb1;
    address 10.1.1.1:7789;
    meta-disk internal;
  }
  on drbdtest2 {
    device /dev/drbd0;
    disk /dev/sdb1;
    address 10.1.2.1:7789;
    meta-disk internal;
  }
  protocol C;
}
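For what it's worth, DRBD 8.4 deprecates the `syncer` section (it still parses it for backward compatibility, so the config above is accepted); the same settings in native 8.4 style would look roughly like this (same values, just relocated):

```
net {
  protocol C;
  cram-hmac-alg sha1;
  shared-secret "abc";
  verify-alg sha1;
}
disk {
  resync-rate 100M;
}
```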
The same steps on both nodes:
yum install drbd84-utils kmod-drbd84 ntp ntpdate
modprobe drbd
lsmod |grep drbd
drbd 373504 1
libcrc32c 12644 2 xfs,drbd
firewall-cmd --zone=internal --add-port=7788-7799/tcp --permanent
firewall-cmd --zone=internal --add-port=7788-7799/udp --permanent
firewall-cmd --reload
but I disabled it for this test:
systemctl stop firewalld
Both nodes can ssh to each other on the default SSH port.
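A quick way to test raw TCP reachability of the replication port between the nodes, using bash's built-in /dev/tcp (no extra tools needed; the IP is drbdtest2's address from the config, and the peer must actually have a listener up at that moment):

```shell
# From drbdtest1, try to open a TCP connection to drbdtest2:7789.
# Succeeds only while the peer is really listening on that port.
timeout 3 bash -c 'echo > /dev/tcp/10.1.2.1/7789' && echo "port open"
```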
On both nodes /dev/sda holds the operating system; the second disk, /dev/sdb, is for DRBD.
fdisk /dev/sdb
created a partition on both nodes, same size: /dev/sdb1
drbdadm create-md r0
OK on both nodes.
drbdadm up r0
seems OK, no error, but on the primary the listener on port 7789 disappears after about 10 seconds.
on the primary (first) node only:
drbdadm -- --overwrite-data-of-peer primary all
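On drbd 8.4 the same forced initial promotion can also be written in the newer syntax, and /proc/drbd shows whether the initial sync actually starts:

```shell
# Equivalent 8.4-style invocation of the forced promotion above:
drbdadm primary --force r0
# Watch the connection and sync state; a working link shows
# cs:Connected or cs:SyncSource instead of cs:StandAlone:
watch -n1 cat /proc/drbd
```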
drbdtest1 node:
drbd-overview
0:r0/0 StandAlone Primary/Unknown UpToDate/DUnknown
cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by mockbuild@, 2014-08-17 22:54:26
0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----s
ns:0 nr:0 dw:0 dr:728 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:52427164
netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address    Foreign Address  State   PID/Program name
tcp        0      0 127.0.0.1:25     0.0.0.0:*        LISTEN  2102/master
tcp        0      0 10.1.1.1:7789    0.0.0.0:*        LISTEN  -
tcp        0      0 0.0.0.0:22       0.0.0.0:*        LISTEN  1198/sshd
tcp6       0      0 ::1:25           :::*             LISTEN  2102/master
tcp6       0      0 :::22            :::*             LISTEN  1198/sshd
but after 10 seconds:
netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address    Foreign Address  State   PID/Program name
tcp        0      0 127.0.0.1:25     0.0.0.0:*        LISTEN  2102/master
tcp        0      0 0.0.0.0:22       0.0.0.0:*        LISTEN  1198/sshd
tcp6       0      0 ::1:25           :::*             LISTEN  2102/master
tcp6       0      0 :::22            :::*             LISTEN  1198/sshd
DRBD port 7789 is no longer listening on the primary node! Is that normal?
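For what it's worth, while a resource is trying to connect DRBD both listens and dials out to its peer, but once it drops to StandAlone it stops listening entirely, which matches the netstat output above. A sketch of how to confirm and retry, using standard drbdadm subcommands:

```shell
# Show the connection state of r0 (StandAlone / WFConnection / Connected):
drbdadm cstate r0
# Tell a StandAlone node to start trying to connect again:
drbdadm connect r0
```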
mount /dev/drbd0 /mnt/drbd
I can see the files that I copied there for testing.
drbdtest2 node:
drbd-overview
0:r0/0 WFConnection Secondary/Unknown Inconsistent/DUnknown
cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by mockbuild@, 2014-08-17 22:54:26
0: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r----s
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1023932
netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address    Foreign Address  State   PID/Program name
tcp        0      0 127.0.0.1:25     0.0.0.0:*        LISTEN  2155/master
tcp        0      0 10.1.2.1:7789    0.0.0.0:*        LISTEN  -
tcp        0      0 0.0.0.0:22       0.0.0.0:*        LISTEN  1335/sshd
tcp6       0      0 ::1:25           :::*             LISTEN  2155/master
tcp6       0      0 :::22            :::*             LISTEN  1335/sshd
I can ping the IPs and the hostnames. `uname -n` matches the hostname on both nodes.
How can I debug why the nodes cannot communicate?
Is it normal that DRBD port 7789 is in use for only 5-10 seconds and then nothing listens on it any more?
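Some generic ways to watch what happens on the wire and in the kernel while the nodes try to connect (all stock CentOS 7 tools, though tcpdump may need installing):

```shell
# Watch handshake traffic on the replication port (run on both nodes):
tcpdump -ni any tcp port 7789
# Kernel-side DRBD messages around the failed connect:
dmesg | grep -i drbd
# Recent SELinux denials, if SELinux is enforcing:
ausearch -m avc -ts recent
```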
I ran this on the primary node:
# drbdadm connect r0 --verbose
drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789
--cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1 --protocol=C
and if I run it again within 1-3 seconds, this happens:
# drbdadm connect r0 --verbose
drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789
--cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1 --protocol=C
r0: Failure: (102) Local address(port) already in use.
If I wait 5-10 seconds or more, I can run it again without an error:
drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789
--cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1 --protocol=C
By the way, if I run it from the CLI:
# drbdsetup-84
-bash: drbdsetup-84: command not found
but
# drbdsetup
exists and works.
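That name is just how drbdadm prints its helper command; the ELRepo package installs the binary as plain `drbdsetup`. If you want the printed commands to be copy-pasteable, a symlink works (the /usr/sbin target path is an assumption; check `command -v drbdsetup` first):

```shell
# Convenience symlink (hypothetical) so the name drbdadm prints also resolves:
ln -s "$(command -v drbdsetup)" /usr/sbin/drbdsetup-84
```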
How can I debug this error?
What is the problem? What am I missing?
Please help me fix this.
aTTi
_______________________________________________
drbd-user mailing list
drbd-user at lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user