[DRBD-user] drbd9.1.5 ocf::Filesystem Couldn't mount DRBD device

Didier Tanti tantididier at spitch.ch
Fri Mar 18 12:20:15 CET 2022


Hi,

We have a fairly standard setup with 4 nodes, 1 primary and 3 secondaries,
arranged as 2 geo clusters of 2 nodes each.
The setup uses LVM volumes as the DRBD lower-level devices. Everything is
managed by Pacemaker using the LINBIT and Pacemaker OCF resource agents.

The DRBD kernel module version is 9.1.5; the utils version is 9.20.2.
Note that we deploy on AWS nodes, using EBS for the block devices.

Once DRBD is promoted on the active node, the filesystem is mounted there
(this is managed by the OCF Filesystem agent). The filesystem type is XFS.

Sometimes (maybe 1 time in 30, after failing over between the geo clusters,
so that the primary device is swapped), we observe an error from the
Filesystem OCF agent:
       stderr [ mount: mount /dev/drbd0 on /mnt/audio failed: Resource
temporarily unavailable ]

This happens even though DRBD has been promoted to primary. The logs are
included below. Does anyone know what the reason could be? If more verbose
logging can be enabled, we can do that.

Regards

Mar 13 09:59:18 ip-172-31-12-232 kernel: drbd audiodata: role( Secondary ->
Primary )
Mar 13 09:59:18 ip-172-31-12-232 kernel: drbd audiodata: Preparing
cluster-wide state change 2445677710 (1->3 499/145)
Mar 13 09:59:18 ip-172-31-12-232 crmd[1571]:  notice: Result of promote
operation for audiodata on ip-172-31-12-232: 0 (ok)
Mar 13 09:59:18 ip-172-31-12-232 crmd[1571]:  notice: Initiating notify
operation audiodata_post_notify_promote_0 on ip-172-31-12-173
Mar 13 09:59:18 ip-172-31-12-232 kernel: drbd audiodata ip-172-31-12-173:
Aborting local state change 2445677710 to yield to remote state change
1509202161.
Mar 13 09:59:20 ip-172-31-12-232 kernel: drbd audiodata: Aborting
cluster-wide state change 2445677710 (2054ms) rv = -19
Mar 13 09:59:20 ip-172-31-12-232 kernel: drbd audiodata ip-172-31-12-173:
Preparing remote state change 1509202161
Mar 13 09:59:20 ip-172-31-12-232 kernel: drbd audiodata ip-172-31-12-173:
Aborting remote state change 1509202161
Mar 13 09:59:20 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-173: repl( WFBitMapS -> SyncSource )
Mar 13 09:59:20 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-173: Began resync as SyncSource (will sync 2076 KB [519 bits
set]).
Mar 13 09:59:20 ip-172-31-12-232 kernel: drbd audiodata: Preparing
cluster-wide state change 522239102 (1->3 499/145)
Mar 13 09:59:21 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-56: drbd_sync_handshake:
Mar 13 09:59:21 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-56: self
9141832129BF9D9C:0000000000000000:FCAD090A6554F6EA:0000000000000000 bits:0
flags:20
Mar 13 09:59:21 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-56: peer
9141832129BF9D9C:0000000000000000:FCAD090A6554F6EA:0000000000000000 bits:0
flags:1120
Mar 13 09:59:21 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-56: uuid_compare()=no-sync by rule=lost-quorum
Mar 13 09:59:21 ip-172-31-12-232 kernel: drbd audiodata ip-172-31-12-173:
Aborting local state change 522239102 to yield to remote state change
2672355414.
Mar 13 09:59:21 ip-172-31-12-232 kernel: drbd audiodata: Aborting
cluster-wide state change 522239102 (96ms) rv = -19
Mar 13 09:59:21 ip-172-31-12-232 kernel: drbd audiodata ip-172-31-12-173:
Preparing remote state change 2672355414
Mar 13 09:59:21 ip-172-31-12-232 awsvip(audio-awsalias)[8149]: INFO:
secondary_private_ip has been successfully brought up (172.31.12.90)
Mar 13 09:59:21 ip-172-31-12-232 crmd[1571]:  notice: Result of start
operation for audio-awsalias on ip-172-31-12-232: 0 (ok)
Mar 13 09:59:21 ip-172-31-12-232 crmd[1571]:  notice: Initiating notify
operation audiodata_post_notify_promote_0 locally on ip-172-31-12-232
Mar 13 09:59:21 ip-172-31-12-232 crmd[1571]:  notice: Result of notify
operation for audiodata on ip-172-31-12-232: 0 (ok)
Mar 13 09:59:21 ip-172-31-12-232 pengine[1570]:  notice:  * Start
 audio-fs                 (                     ip-172-31-12-232 )
Mar 13 09:59:21 ip-172-31-12-232 pengine[1570]:  notice:  * Start
 audio-cleanup            (                     ip-172-31-12-232 )
Mar 13 09:59:21 ip-172-31-12-232 pengine[1570]:  notice:  * Start
 audio-nginx              (                     ip-172-31-12-232 )
Mar 13 09:59:21 ip-172-31-12-232 crmd[1571]:  notice: Initiating monitor
operation audiodata_monitor_5000 on ip-172-31-12-173
Mar 13 09:59:21 ip-172-31-12-232 crmd[1571]:  notice: Initiating start
operation audio-fs_start_0 locally on ip-172-31-12-232
Mar 13 09:59:21 ip-172-31-12-232 crmd[1571]:  notice: Initiating monitor
operation audio-awsalias_monitor_5000 locally on ip-172-31-12-232
Mar 13 09:59:22 ip-172-31-12-232 Filesystem(audio-fs)[8661]: INFO: Running
start for /dev/drbd0 on /mnt/audio
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata ip-172-31-12-173:
Aborting remote state change 2672355414
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata: Preparing
cluster-wide state change 145509007 (1->3 499/145)
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-56: drbd_sync_handshake:
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-56: self
9141832129BF9D9C:0000000000000000:FCAD090A6554F6EA:0000000000000000 bits:0
flags:20
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-56: peer
9141832129BF9D9C:0000000000000000:FCAD090A6554F6EA:0000000000000000 bits:0
flags:1120
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-56: uuid_compare()=no-sync by rule=lost-quorum
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata ip-172-31-12-173:
Aborting local state change 145509007 to yield to remote state change
1845370428.
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata: Aborting
cluster-wide state change 145509007 (91ms) rv = -19
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata ip-172-31-12-173:
Preparing remote state change 1845370428
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-173: updated UUIDs
9141832129BF9D9C:0000000000000000:4C4977DFD426BCE0:FCAD090A6554F6EA
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-173: Resync done (total 2 sec; paused 0 sec; 1036 K/sec)
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata/0 drbd0
ip-172-31-12-173: pdsk( Inconsistent -> UpToDate ) repl( SyncSource ->
Established )
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata ip-172-31-12-173:
helper command: /sbin/drbdadm unfence-peer
Mar 13 09:59:23 ip-172-31-12-232 kernel: drbd audiodata ip-172-31-12-173:
helper command: /sbin/drbdadm unfence-peer exit code 0
Mar 13 09:59:25 ip-172-31-12-232 Filesystem(audio-fs)[8661]: ERROR:
Couldn't mount device [/dev/drbd0] as /mnt/audio
Mar 13 09:59:25 ip-172-31-12-232 lrmd[1568]:  notice:
audio-fs_start_0:8661:stderr [ mount: mount /dev/drbd0 on /mnt/audio
failed: Resource temporarily unavailable ]
Mar 13 09:59:25 ip-172-31-12-232 lrmd[1568]:  notice:
audio-fs_start_0:8661:stderr [ ocf-exit-reason:Couldn't mount device
[/dev/drbd0] as /mnt/audio ]
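
A minimal sketch, not part of the original report, for pulling the aborted
cluster-wide state changes out of a log like the one above so they can be
lined up against the Filesystem start at 09:59:22 and the mount failure at
09:59:25. It assumes syslog-style timestamps and that long records were
wrapped across lines by the mail client. The names `unwrap`,
`aborted_state_changes` and `SAMPLE` are illustrative only; the sample lines
are copied from the log above.

```python
import re

# A syslog record starts with a timestamp such as "Mar 13 09:59:18 ".
RECORD_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")

# Matches the kernel message for an aborted cluster-wide state change.
ABORT = re.compile(
    r"^(?P<ts>\w{3} [ \d]\d \d{2}:\d{2}:\d{2}) .*"
    r"Aborting cluster-wide state change (?P<id>\d+) "
    r"\((?P<ms>\d+)ms\) rv = (?P<rv>-?\d+)"
)


def unwrap(text):
    """Rejoin records that were hard-wrapped across several lines."""
    records = []
    for line in text.splitlines():
        if RECORD_START.match(line) or not records:
            records.append(line.strip())
        else:
            # Continuation line: glue it onto the previous record.
            records[-1] += " " + line.strip()
    return records


def aborted_state_changes(text):
    """Return (timestamp, change id, duration in ms, rv) per aborted change."""
    found = []
    for record in unwrap(text):
        m = ABORT.match(record)
        if m:
            found.append((m["ts"], int(m["id"]), int(m["ms"]), int(m["rv"])))
    return found


# Two wrapped records taken verbatim from the log in this message.
SAMPLE = """\
Mar 13 09:59:20 ip-172-31-12-232 kernel: drbd audiodata: Aborting
cluster-wide state change 2445677710 (2054ms) rv = -19
Mar 13 09:59:21 ip-172-31-12-232 kernel: drbd audiodata: Aborting
cluster-wide state change 522239102 (96ms) rv = -19
"""

if __name__ == "__main__":
    for ts, change_id, ms, rv in aborted_state_changes(SAMPLE):
        print(ts, change_id, ms, rv)
```

Feeding the full log through `aborted_state_changes` instead of `SAMPLE`
would list every aborted attempt with its timestamp, which makes it easier to
see whether the mount attempt overlaps with them.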