[DRBD-user] Mount problem at boot on multimaster config
Ervin Hegedüs
airween at gmail.com
Sun Sep 1 19:09:10 CEST 2019
Hi there,
I have an up-to-date Debian 10 system, with drbd 8.4.10, and ocfs2
1.8.5.
Looks like everything works, but when a node restarted, then my
/dev/drbd0 device doesn't mounted automatically. After the boot I
can log in to system, and can mount it by hand - the `mount -a`
command takes about 8 seconds...
# time mount -a
real 0m8,173s
user 0m0,009s
sys 0m0,019s
Before I mount the volume by hand, I checked the /proc/drbd, and
I think there is everything well:
# cat /proc/drbd
version: 8.4.10 (api:1/proto:86-101)
srcversion: 9B4D87C5E865DF526864868
0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
ns:124 nr:16511 dw:16635 dr:646 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
(this is what I see after the system booted and I logged in,
before I mount the device)
Here are the relevant configs:
# cat /etc/drbd.d/r0.res
resource r0 {
meta-disk internal;
device /dev/drbd0;
syncer {
verify-alg sha1;
}
net {
allow-two-primaries;
}
startup {
become-primary-on both;
}
on t2app1 {
disk /dev/vg-t2app1/lvdrbd0;
address 192.168.72.21:7789;
}
on t2app2 {
disk /dev/vg-t2app2/lvdrbd0;
address 192.168.72.22:7789;
}
}
(Note, that the configs are same on both nodes)
# cat /etc/ocfs2/cluster.conf
cluster:
node_count = 2
name = data
node:
ip_port = 7777
ip_address = 192.168.72.21
number = 1
name = t2app1
cluster = data
node:
ip_port = 7777
ip_address = 192.168.72.22
number = 2
name = t2app2
cluster = data
(also the configs are same on both nodes)
# grep drbd /etc/fstab
/dev/drbd0 /drbd ocfs2 _netdev,defaults 0 0
on both nodes.
I've found some interesting lines in syslog from the boot
intervall:
# grep "\(o2\|ocfs\|drbd\)" /var/log/syslog
Sep 1 18:47:37 t2app2 systemd[1]: Stopping Mount ocfs2 Filesystems...
Sep 1 18:47:52 t2app2 systemd-modules-load[361]: Inserted module 'drbd'
Sep 1 18:47:52 t2app2 systemd[1]: Starting Load o2cb Modules...
Sep 1 18:47:52 t2app2 kernel: [ 3.822815] drbd: initialized. Version: 8.4.10 (api:1/proto:86-101)
Sep 1 18:47:52 t2app2 kernel: [ 3.822816] drbd: srcversion: 9B4D87C5E865DF526864868
Sep 1 18:47:52 t2app2 kernel: [ 3.822817] drbd: registered as block device major 147
Sep 1 18:47:52 t2app2 o2cb[659]: checking debugfs...
Sep 1 18:47:53 t2app2 o2cb[659]: Loading filesystem "configfs": OK
Sep 1 18:47:53 t2app2 o2cb[659]: Mounting configfs filesystem at /sys/kernel/config: mount: /sys/kernel/config: configfs already mounted on /sys/kernel/config.
Sep 1 18:47:53 t2app2 o2cb[659]: Loading stack plugin "o2cb": OK
Sep 1 18:47:53 t2app2 kernel: [ 4.544893] ocfs2: Registered cluster interface o2cb
Sep 1 18:47:53 t2app2 o2cb[659]: Loading filesystem "ocfs2_dlmfs": OK
Sep 1 18:47:53 t2app2 o2cb[659]: Mounting ocfs2_dlmfs filesystem at /dlm: OK
Sep 1 18:47:53 t2app2 o2cb[659]: Setting cluster stack "o2cb": OK
Sep 1 18:47:53 t2app2 drbd[696]: Starting DRBD resources:[
Sep 1 18:47:53 t2app2 drbd[696]: create res: r0
Sep 1 18:47:53 t2app2 drbd[696]: prepare disk: r0
Sep 1 18:47:53 t2app2 systemd[1]: Found device /dev/drbd0.
Sep 1 18:47:53 t2app2 systemd[1]: Mounting /drbd...
Sep 1 18:47:53 t2app2 mount[726]: mount.ocfs2: I/O error on channel while opening device /dev/drbd0
Sep 1 18:47:53 t2app2 systemd[1]: drbd.mount: Mount process exited, code=exited, status=1/FAILURE
Sep 1 18:47:53 t2app2 systemd[1]: drbd.mount: Failed with result 'exit-code'.
Sep 1 18:47:53 t2app2 systemd[1]: Failed to mount /drbd.
Sep 1 18:47:53 t2app2 kernel: [ 4.581222] drbd r0: Starting worker thread (from drbdsetup-84 [729])
Sep 1 18:47:53 t2app2 drbd[696]: adjust disk: r0
Sep 1 18:47:53 t2app2 kernel: [ 4.581729] block drbd0: disk( Diskless -> Attaching )
Sep 1 18:47:53 t2app2 kernel: [ 4.581782] drbd r0: Method to ensure write ordering: flush
Sep 1 18:47:53 t2app2 kernel: [ 4.581784] block drbd0: max BIO size = 1048576
Sep 1 18:47:53 t2app2 kernel: [ 4.581786] block drbd0: drbd_bm_resize called with capacity == 20970808
Sep 1 18:47:53 t2app2 kernel: [ 4.581827] block drbd0: resync bitmap: bits=2621351 words=40959 pages=80
Sep 1 18:47:53 t2app2 kernel: [ 4.581828] block drbd0: size = 10 GB (10485404 KB)
Sep 1 18:47:53 t2app2 kernel: [ 4.583737] block drbd0: recounting of set bits took additional 0 jiffies
Sep 1 18:47:53 t2app2 kernel: [ 4.583738] block drbd0: 16 MB (4096 bits) marked out-of-sync by on disk bit-map.
Sep 1 18:47:53 t2app2 kernel: [ 4.583741] block drbd0: disk( Attaching -> UpToDate )
Sep 1 18:47:53 t2app2 kernel: [ 4.583743] block drbd0: attached to UUIDs 1E63CB670138DCE9:0000000000000000:A7B3A04F6A72E7D9:A7B2A04F6A72E7D9
Sep 1 18:47:53 t2app2 kernel: [ 4.586632] drbd r0: conn( StandAlone -> Unconnected )
Sep 1 18:47:53 t2app2 kernel: [ 4.586642] drbd r0: Starting receiver thread (from drbd_w_r0 [731])
Sep 1 18:47:53 t2app2 kernel: [ 4.586811] drbd r0: receiver (re)started
Sep 1 18:47:53 t2app2 kernel: [ 4.586817] drbd r0: conn( Unconnected -> WFConnection )
Sep 1 18:47:53 t2app2 drbd[696]: adjust net: r0
Sep 1 18:47:53 t2app2 drbd[696]: ]
Sep 1 18:47:53 t2app2 o2cb[659]: Registering O2CB cluster "data": OK
Sep 1 18:47:53 t2app2 o2cb[659]: Setting O2CB cluster timeouts : OK
Sep 1 18:47:53 t2app2 kernel: [ 4.634584] o2hb: Heartbeat mode set to local
Sep 1 18:47:53 t2app2 systemd[1]: Started Load o2cb Modules.
Sep 1 18:47:53 t2app2 o2hbmonitor: Starting
Sep 1 18:47:53 t2app2 systemd[1]: Starting Mount ocfs2 Filesystems...
Sep 1 18:47:53 t2app2 ocfs2[794]: Starting Oracle Cluster File System (OCFS2) mount.ocfs2: I/O error on channel while opening device /dev/drbd0
Sep 1 18:47:53 t2app2 ocfs2[794]: Failed
Sep 1 18:47:53 t2app2 systemd[1]: Started Mount ocfs2 Filesystems.
Sep 1 18:47:53 t2app2 kernel: [ 5.107027] drbd r0: Handshake successful: Agreed network protocol version 101
Sep 1 18:47:53 t2app2 kernel: [ 5.107028] drbd r0: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
Sep 1 18:47:53 t2app2 kernel: [ 5.107049] drbd r0: conn( WFConnection -> WFReportParams )
Sep 1 18:47:53 t2app2 kernel: [ 5.107050] drbd r0: Starting ack_recv thread (from drbd_r_r0 [738])
Sep 1 18:47:53 t2app2 kernel: [ 5.153635] block drbd0: drbd_sync_handshake:
Sep 1 18:47:53 t2app2 kernel: [ 5.153637] block drbd0: self 1E63CB670138DCE8:0000000000000000:A7B3A04F6A72E7D9:A7B2A04F6A72E7D9 bits:4096 flags:0
Sep 1 18:47:53 t2app2 kernel: [ 5.153638] block drbd0: peer DA6EC47089E7218F:1E63CB670138DCE9:A7B3A04F6A72E7D9:A7B2A04F6A72E7D9 bits:1 flags:0
Sep 1 18:47:53 t2app2 kernel: [ 5.153639] block drbd0: uuid_compare()=-1 by rule 50
Sep 1 18:47:53 t2app2 kernel: [ 5.153643] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
Sep 1 18:47:53 t2app2 kernel: [ 5.154072] block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 25(1), total 25; compression: 100.0%
Sep 1 18:47:53 t2app2 kernel: [ 5.154125] block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 44(1), total 44; compression: 100.0%
Sep 1 18:47:53 t2app2 kernel: [ 5.154127] block drbd0: conn( WFBitMapT -> WFSyncUUID )
Sep 1 18:47:53 t2app2 kernel: [ 5.155755] block drbd0: updated sync uuid 1E64CB670138DCE8:0000000000000000:A7B3A04F6A72E7D9:A7B2A04F6A72E7D9
Sep 1 18:47:53 t2app2 kernel: [ 5.156150] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
Sep 1 18:47:53 t2app2 kernel: [ 5.157047] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
Sep 1 18:47:53 t2app2 kernel: [ 5.157054] block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
Sep 1 18:47:53 t2app2 kernel: [ 5.157067] block drbd0: Began resync as SyncTarget (will sync 16384 KB [4096 bits set]).
Sep 1 18:47:53 t2app2 kernel: [ 5.157500] block drbd0: role( Secondary -> Primary )
Sep 1 18:47:53 t2app2 drbd[696]: WARN: stdin/stdout is not a TTY; using /dev/console.
Sep 1 18:48:00 t2app2 kernel: [ 11.508179] block drbd0: Resync done (total 6 sec; paused 0 sec; 2728 K/sec)
Sep 1 18:48:00 t2app2 kernel: [ 11.508183] block drbd0: updated UUIDs DA6EC47089E7218F:0000000000000000:1E64CB670138DCE9:1E63CB670138DCE9
Sep 1 18:48:00 t2app2 kernel: [ 11.508186] block drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
Sep 1 18:48:00 t2app2 kernel: [ 11.508516] block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0
Sep 1 18:48:00 t2app2 kernel: [ 11.509496] block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0 exit code 0 (0x0)
Sep 1 18:48:58 t2app2 kernel: [ 69.610107] o2net: Connected to node t2app1 (num 1) at 192.168.72.21:7777
Sep 1 18:49:02 t2app2 kernel: [ 73.701897] o2dlm: Joining domain 34CC3A6762474F669C72F63604E47EE4
Sep 1 18:49:02 t2app2 kernel: [ 73.706853] ocfs2: Mounting device (147,0) on (node 2, slot 1) with ordered data mode.
note, that the last 3 lines (18:48:58 - 18:49:02) comes when I
mounted the device by hand.
Looks like the /dev/drbd0 doesn't exists at boot time, but when I
log in, it builded.
Am I missing something? Or there is some network issue, eg. slow
network...?
This config (which is _same_) works on Debian 9 without any
issue.
Any helps are welcome,
thanks,
a.
More information about the drbd-user
mailing list