<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
<META NAME="GENERATOR" CONTENT="GtkHTML/4.6.6">
</HEAD>
<BODY>
The next step is to check the logs on both nodes to see what messages DRBD generated after the drbdadm connect r0 command. I haven't used a Red Hat derivative since EL5, but I think they will be in /var/log/messages, and they should also appear in the output of dmesg.<BR>
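<BR>
As a rough sketch of that check, something like the following on each node should pull out the relevant lines. The helper name drbd_log_lines and the 50-line default are placeholders of mine, and I'm assuming the kernel messages land in a syslog-style text file and contain the string "drbd":<BR>
<PRE>
```shell
# Hypothetical helper: print the last N DRBD-related lines from a log file.
# Assumes a syslog-style text file; the default path is a guess for CentOS 7.
drbd_log_lines() {
    logfile="${1:-/var/log/messages}"
    count="${2:-50}"
    # Match case-insensitively in case some lines are logged as "DRBD".
    grep -i 'drbd' "$logfile" | tail -n "$count"
}
```
</PRE>
Run drbd_log_lines /var/log/messages 50 right after the drbdadm connect r0 attempt, on both nodes, and compare the timestamps. For the kernel ring buffer, dmesg | grep -i drbd should show much the same messages.<BR>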
<BR>
Thanks,<BR>
- Nelson<BR>
<BR>
On Wed, 2014-10-01 at 16:45 +0200, aTTi wrote:
<BLOCKQUOTE TYPE=CITE>
<PRE>
Hi!
I fully reinstalled both servers: CentOS 7, fully up to date, SELinux
enabled by default, default minimal install.
The nodes are drbdtest1 and drbdtest2.
Config on both nodes, in the drbd.d directory:
resource r0 {
    startup {
        wfc-timeout 30;
        outdated-wfc-timeout 20;
        degr-wfc-timeout 30;
    }
    net {
        cram-hmac-alg sha1;
        shared-secret "abc";
    }
    syncer {
        rate 100M;
        verify-alg sha1;
    }
    on drbdtest1 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.1.1.1:7789;
        meta-disk internal;
    }
    on drbdtest2 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.1.2.1:7789;
        meta-disk internal;
    }
    protocol C;
}
The same on both nodes:
yum install drbd84-utils kmod-drbd84 ntp ntpdate
modprobe drbd
lsmod |grep drbd
drbd 373504 1
libcrc32c 12644 2 xfs,drbd
firewall-cmd --zone=internal --add-port=7788-7799/tcp --permanent
firewall-cmd --zone=internal --add-port=7788-7799/udp --permanent
firewall-cmd --reload
but I disabled it for testing:
systemctl stop firewalld
Each node can ssh to the other on the default ssh port.
On both nodes /dev/sda holds the operating system; the second disk, /dev/sdb, is for DRBD.
fdisk /dev/sdb
Created a partition of the same size on both nodes: /dev/sdb1
drbdadm create-md r0
This succeeds on both nodes.
drbdadm up r0
This seems to succeed with no error, but on the primary the listener on port 7789 disappears after about 10 seconds.
On the primary (first node) only:
drbdadm -- --overwrite-data-of-peer primary all
drbdtest1 node:
drbd-overview
0:r0/0 StandAlone Primary/Unknown UpToDate/DUnknown
cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by mockbuild@, 2014-08-17 22:54:26
0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----s
ns:0 nr:0 dw:0 dr:728 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:52427164
netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address     Foreign Address   State    PID/Program name
tcp        0      0 127.0.0.1:25      0.0.0.0:*         LISTEN   2102/master
tcp        0      0 10.1.1.1:7789     0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:22        0.0.0.0:*         LISTEN   1198/sshd
tcp6       0      0 ::1:25            :::*              LISTEN   2102/master
tcp6       0      0 :::22             :::*              LISTEN   1198/sshd
but after 10 seconds:
netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address     Foreign Address   State    PID/Program name
tcp        0      0 127.0.0.1:25      0.0.0.0:*         LISTEN   2102/master
tcp        0      0 0.0.0.0:22        0.0.0.0:*         LISTEN   1198/sshd
tcp6       0      0 ::1:25            :::*              LISTEN   2102/master
tcp6       0      0 :::22             :::*              LISTEN   1198/sshd
DRBD is no longer listening on port 7789 on the primary node! Is that normal?
mount /dev/drbd0 /mnt/drbd
I can see the files that I copied here as a test.
drbdtest2 node:
drbd-overview
0:r0/0 WFConnection Secondary/Unknown Inconsistent/DUnknown
cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by mockbuild@, 2014-08-17 22:54:26
0: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r----s
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1023932
netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address     Foreign Address   State    PID/Program name
tcp        0      0 127.0.0.1:25      0.0.0.0:*         LISTEN   2155/master
tcp        0      0 10.1.2.1:7789     0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:22        0.0.0.0:*         LISTEN   1335/sshd
tcp6       0      0 ::1:25            :::*              LISTEN   2155/master
tcp6       0      0 :::22             :::*              LISTEN   1335/sshd
I can ping both the IPs and the hostnames. uname -n matches the hostname on both nodes.
How can I debug why the nodes cannot communicate?
Is it normal for DRBD to listen on port 7789 for only 5-10 seconds and then
have no service on that port?
I ran this on the primary node:
# drbdadm connect r0 --verbose
drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789
--cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1
--protocol=C
and if I run it again within 1-3 seconds, this happens:
# drbdadm connect r0 --verbose
drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789
--cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1
--protocol=C
r0: Failure: (102) Local address(port) already in use.
if I wait for 5-10 seconds or more, I can run it again without error:
drbdsetup-84 connect r0 ipv4:10.1.1.1:7789 ipv4:10.1.2.1:7789
--cram-hmac-alg=sha1 --shared-secret=abc --verify-alg=sha1
--protocol=C
Btw, if I run this from the CLI:
]# drbdsetup-84
-bash: drbdsetup-84: command not found
but
]# drbdsetup
exists and works.
How can I debug this error?
What is the problem? What am I missing?
Please help me fix this.
aTTi
</PRE>
</BLOCKQUOTE>
<BR>
</BODY>
</HTML>