[Drbd-dev] IO cannot be completed
Ming Li
ming.li at easystack.cn
Mon Mar 29 13:41:58 CEST 2021
Hi everyone,
I got a problem about a IO cannot be completed.
I used three nodes to compose a cluster, config is below:
resource drbd6 {
device /dev/drbd6;
meta-disk internal;
options {
quorum 2;
on-no-quorum suspend-io;
quorum-minimum-redundancy 2;
auto-promote no;
cpu-mask "fc0000";
}
disk {
on-io-error detach;
al-updates no;
}
on node-1 {
node-id 0;
address 192.168.170.2:7789;
disk "/dev/disk/by-path/pci-0000:1b:00.0-nvme-1-part1";
}
on node-2 {
node-id 1;
address 192.168.170.3:7789;
disk "/dev/disk/by-path/pci-0000:1b:00.0-nvme-1-part1";
}
on node-3 {
node-id 2;
address 192.168.170.4:7789;
disk "/dev/disk/by-path/pci-0000:1b:00.0-nvme-1-part1";
}
on hotspare {
node-id 3;
address 192.168.170.2:17789;
disk "/dev/disk/by-path/pci-0000:1b:00.0-nvme-1-part1";
}
connection {
host node-1 port 7789;
host node-2 port 7789;
disk {
c-max-rate 2048M;
c-min-rate 0;
}
net {
max-buffers 100000;
}
}
connection {
host node-1 port 7789;
host node-3 port 7789;
disk {
c-max-rate 2048M;
c-min-rate 0;
}
net {
max-buffers 100000;
}
}
connection {
host node-2 port 7789;
host node-3 port 7789;
disk {
c-max-rate 2048M;
c-min-rate 0;
}
net {
max-buffers 100000;
}
}
connection {
host node-1 port 7789;
host hotspare port 17789;
disk {
c-max-rate 4G;
c-min-rate 0;
}
net {
max-buffers 100000;
}
}
connection {
host node-2 port 17789;
host hotspare port 17789;
disk {
c-max-rate 4G;
c-min-rate 0;
}
net {
max-buffers 100000;
}
}
connection {
host node-3 port 17789;
host hotspare port 17789;
disk {
c-max-rate 4G;
c-min-rate 0;
}
net {
max-buffers 100000;
}
}
}
drbd6 is primary in node-3, and I have systemd-udevd hang in node-3 some
times.
like this:
[84157.492607] INFO: task systemd-udevd:16394 blocked for more than 120
seconds.
[84157.493321] Tainted: G OE --------- - -
4.18.0-147.5.1.el8_1.5es.14.x86_64 #1
[84157.494476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[84157.495647] systemd-udevd D 0 16394 1309 0x80000086
[84157.496302] Call Trace:
[84157.496932] ? __schedule+0x253/0x830
[84157.497581] schedule+0x28/0x70
[84157.498233] io_schedule+0x12/0x40
[84157.498878] __lock_page+0x141/0x240
[84157.499527] ? find_get_entries+0x192/0x200
[84157.500180] ? page_cache_tree_insert+0xe0/0xe0
[84157.500830] truncate_inode_pages_range+0x432/0x8a0
[84157.501481] ? find_get_pages_range_tag+0x192/0x270
[84157.502131] ? pagevec_lookup_range_tag+0x24/0x30
[84157.502782] ? __filemap_fdatawait_range+0x7a/0x140
[84157.503456] ? cpumask_next_and+0x1a/0x20
[84157.517718] ? smp_call_function_many+0x59/0x250
[84157.518354] ? __brelse+0x20/0x20
[84157.518999] ? on_each_cpu_mask+0x25/0x60
[84157.519651] ? proc_ns_fget+0x40/0x40
[84157.520299] ? on_each_cpu_cond+0xb4/0x140
[84157.520947] __blkdev_put+0x79/0x1e0
[84157.521594] blkdev_close+0x21/0x30
[84157.522248] __fput+0xb7/0x230
[84157.522894] task_work_run+0x8a/0xb0
[84157.523541] do_exit+0x2da/0xb40
[84157.524192] do_group_exit+0x3a/0xa0
[84157.524818] get_signal+0x14b/0x850
[84157.525464] do_signal+0x36/0x610
[84157.526109] ? __audit_syscall_exit+0x228/0x290
[84157.526762] exit_to_usermode_loop+0x89/0xf0
[84157.527413] do_syscall_64+0x182/0x1b0
drbd6`s log is as below:
[82895.329346] drbd drbd6: Starting worker thread (from drbdsetup [50352])
[82895.331193] drbd drbd6 node-1: Starting sender thread (from drbdsetup
[50356])
[82895.332421] drbd drbd6 node-2: Starting sender thread (from drbdsetup
[50359])
[82895.333607] drbd drbd6 hotspare: Starting sender thread (from
drbdsetup [50364])
[82895.391610] drbd drbd6/0 drbd6: meta-data IO uses: blk-bio
[82895.392850] drbd drbd6/0 drbd6: disk( Diskless -> Attaching )
[82895.393199] drbd drbd6/0 drbd6: Maximum number of peer devices = 4
[82895.393675] drbd drbd6: Method to ensure write ordering: flush
[82895.394007] drbd drbd6/0 drbd6: drbd_bm_resize called with capacity
== 312437432
[82895.407125] drbd drbd6/0 drbd6: resync bitmap: bits=39054679
words=2440920 pages=4768
[82895.407702] drbd drbd6/0 drbd6: size = 149 GB (156218716 KB)
[82895.456654] drbd drbd6/0 drbd6: recounting of set bits took
additional 7ms
[82895.457004] drbd drbd6/0 drbd6: disk( Attaching -> Inconsistent )
[82895.457336] drbd drbd6/0 drbd6: attached to current UUID:
0000000000000004
[82895.482294] drbd drbd6 node-1: conn( StandAlone -> Unconnected )
[82895.483237] drbd drbd6 node-2: conn( StandAlone -> Unconnected )
[82895.484130] drbd drbd6 hotspare: conn( StandAlone -> Unconnected )
[82895.506730] drbd drbd6 node-1: Starting receiver thread (from
drbd_w_drbd6 [50353])
[82895.507404] drbd drbd6 node-1: conn( Unconnected -> Connecting )
[82895.507407] drbd drbd6 node-2: Starting receiver thread (from
drbd_w_drbd6 [50353])
[82895.508391] drbd drbd6 hotspare: Starting receiver thread (from
drbd_w_drbd6 [50353])
[82895.508395] drbd drbd6 node-2: conn( Unconnected -> Connecting )
[82895.509356] drbd drbd6 hotspare: conn( Unconnected -> Connecting )
[82896.804073] drbd drbd6 node-1: Handshake to peer 0 successful: Agreed
network protocol version 116
[82896.805266] drbd drbd6 node-1: Feature flags enabled on protocol
level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
[82896.806502] drbd drbd6 node-1: Starting ack_recv thread (from
drbd_r_drbd6 [50424])
[82896.946491] drbd drbd6 node-1: Preparing remote state change 1407406125
[82896.948175] drbd drbd6 node-1: Committing remote state change
1407406125 (primary_nodes=0)
[82896.950013] drbd drbd6 node-1: conn( Connecting -> Connected ) peer(
Unknown -> Secondary )
[82896.969061] drbd drbd6/0 drbd6 node-2: drbd_sync_handshake:
[82896.969735] drbd drbd6/0 drbd6 node-2: self
0000000000000004:0000000000000000:0000000000000000:0000000000000000
bits:0 flags:24
[82896.971009] drbd drbd6/0 drbd6 node-2: peer
0000000000000004:0000000000000000:0000000000000000:0000000000000000
bits:0 flags:24
[82896.972178] drbd drbd6/0 drbd6 node-2: uuid_compare()=no-sync by rule 10
[82896.976105] drbd drbd6/0 drbd6 node-2: pdsk( DUnknown -> Inconsistent
) repl( Off -> Established )
[82900.660110] drbd drbd6 node-2: Handshake to peer 1 successful: Agreed
network protocol version 116
[82900.660727] drbd drbd6 node-2: Feature flags enabled on protocol
level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
[82900.661381] drbd drbd6 node-2: Starting ack_recv thread (from
drbd_r_drbd6 [50426])
[82900.668493] drbd drbd6 node-1: Preparing remote state change 2853674044
[82900.673102] drbd drbd6 node-1: Committing remote state change
2853674044 (primary_nodes=0)
[82900.777009] drbd drbd6 node-2: Preparing remote state change 4199642800
[82900.778671] drbd drbd6 node-2: Committing remote state change
4199642800 (primary_nodes=0)
[82900.779626] drbd drbd6 node-2: conn( Connecting -> Connected ) peer(
Unknown -> Secondary )
[82900.806092] drbd drbd6/0 drbd6 node-2: drbd_sync_handshake:
[82900.806435] drbd drbd6/0 drbd6 node-2: self
0000000000000004:0000000000000000:0000000000000000:0000000000000000
bits:0 flags:24
[82900.807039] drbd drbd6/0 drbd6 node-2: peer
0000000000000004:0000000000000000:0000000000000000:0000000000000000
bits:0 flags:24
[82900.807655] drbd drbd6/0 drbd6 node-2: uuid_compare()=no-sync by rule 10
[82900.811180] drbd drbd6/0 drbd6 node-2: pdsk( DUnknown -> Inconsistent
) repl( Off -> Established )
[82905.736502] drbd drbd6/0 drbd6: quorum( no -> yes )
[82905.739631] drbd drbd6: susp-io( user -> no)
[82905.742760] drbd drbd6: Preparing cluster-wide state change
1609921248 (2->-1 7683/4609)
[82905.742893] drbd drbd6: State change 1609921248: primary_nodes=4,
weak_nodes=FFFFFFFFFFFFFFF8
[82905.744054] drbd drbd6: Committing cluster-wide state change
1609921248 (1ms)
[82905.744400] drbd drbd6: role( Secondary -> Primary )
[82905.744729] drbd drbd6/0 drbd6: disk( Inconsistent -> UpToDate )
[82905.745077] drbd drbd6/0 drbd6: size = 149 GB (156218716 KB)
[82905.893708] drbd drbd6: Forced to consider local data as UpToDate!
[82905.908612] drbd drbd6/0 drbd6: new current UUID: 959F6C5C1257B5F9
weak: FFFFFFFFFFFFFFF8
[82906.301860] drbd drbd6/0 drbd6 node-2: drbd_sync_handshake:
[82906.302207] drbd drbd6/0 drbd6 node-2: self
959F6C5C1257B5F9:ADECA711C521507D:0000000000000000:0000000000000000
bits:0 flags:20
[82906.302813] drbd drbd6/0 drbd6 node-2: peer
0000000000000004:0000000000000000:0000000000000000:0000000000000000
bits:39054679 flags:4
[82906.303433] drbd drbd6/0 drbd6 node-2:
uuid_compare()=source-set-bitmap by rule 30
[82906.304041] drbd drbd6/0 drbd6 node-2: Writing the whole bitmap, full
sync required after drbd_sync_handshake.
[82907.111693] drbd drbd6/0 drbd6 node-1: drbd_sync_handshake:
[82907.112040] drbd drbd6/0 drbd6 node-1: self
959F6C5C1257B5F9:ADECA711C521507D:0000000000000000:0000000000000000
bits:0 flags:20
[82907.112648] drbd drbd6/0 drbd6 node-1: peer
0000000000000004:0000000000000000:0000000000000000:0000000000000000
bits:39054679 flags:4
[82907.113269] drbd drbd6/0 drbd6 node-1:
uuid_compare()=source-set-bitmap by rule 30
[82907.113885] drbd drbd6/0 drbd6 node-1: Writing the whole bitmap, full
sync required after drbd_sync_handshake.
[82907.114524] drbd drbd6/0 drbd6: drbd_r_drbd6[50424] going to
'set_n_write from sync_handshake' but bitmap already locked for
'set_n_write from sync_handshake' by drbd_r_drbd6[50426]
[82909.619044] drbd drbd6/0 drbd6: bitmap WRITE of 4768 pages took 3300 ms
[82909.646416] drbd drbd6/0 drbd6 node-2: repl( Established -> WFBitMapS )
[82910.918858] drbd drbd6/0 drbd6 node-1: resync-susp( no -> user )
[82912.457406] drbd drbd6/0 drbd6: bitmap WRITE of 4768 pages took 2787 ms
[82912.477458] drbd drbd6/0 drbd6 node-1: drbd_sync_handshake:
[82912.477801] drbd drbd6/0 drbd6 node-1: self
959F6C5C1257B5F9:ADECA711C521507D:0000000000000000:0000000000000000
bits:0 flags:20
[82912.478406] drbd drbd6/0 drbd6 node-1: peer
0000000000000004:0000000000000000:0000000000000000:0000000000000000
bits:39054679 flags:4
[82912.479054] drbd drbd6/0 drbd6 node-1:
uuid_compare()=source-set-bitmap by rule 30
[82912.479648] drbd drbd6/0 drbd6 node-1: Writing the whole bitmap, full
sync required after drbd_sync_handshake.
[82912.480257] drbd drbd6/0 drbd6: drbd_r_drbd6[50424] going to
'set_n_write from sync_handshake' but bitmap already locked for
'send_bitmap (WFBitMapS)' by drbd_w_drbd6[50353]
[82912.482390] drbd drbd6/0 drbd6 node-2: send bitmap stats
[Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
[82912.483822] drbd drbd6/0 drbd6 node-2: resync-susp( no -> user )
[82912.484782] drbd drbd6/0 drbd6 hotspare: resync-susp( no -> user )
[82912.730591] drbd drbd6/0 drbd6 node-1: repl( Established -> WFBitMapS )
[82912.733632] drbd drbd6/0 drbd6 node-2: receive bitmap stats
[Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
[82912.734253] drbd drbd6/0 drbd6 node-2: helper command: /sbin/drbdadm
before-resync-source
[82912.736384] drbd drbd6/0 drbd6 node-2: helper command: /sbin/drbdadm
before-resync-source exit code 0 (0x0)
[82912.736993] drbd drbd6/0 drbd6 node-2: repl( WFBitMapS -> PausedSyncS )
[82912.737418] drbd drbd6/0 drbd6 node-2: Began resync as PausedSyncS
(will sync 156218716 KB [39054679 bits set]).
[82934.663658] drbd drbd6/0 drbd6 node-1: resync-susp( user -> no )
[82934.664674] drbd drbd6/0 drbd6 node-2: repl( PausedSyncS ->
SyncSource ) resync-susp( user -> no )
[82934.666910] drbd drbd6/0 drbd6 node-2: Syncer continues.
[82934.667851] drbd drbd6/0 drbd6 hotspare: resync-susp( user -> no )
[82934.682319] drbd drbd6/0 drbd6 node-1: send bitmap stats
[Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
[82934.710854] drbd drbd6/0 drbd6 node-1: receive bitmap stats
[Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
[82934.711485] drbd drbd6/0 drbd6 node-1: helper command: /sbin/drbdadm
before-resync-source
[82934.713606] drbd drbd6/0 drbd6 node-1: helper command: /sbin/drbdadm
before-resync-source exit code 0 (0x0)
[82934.714222] drbd drbd6/0 drbd6 node-1: repl( WFBitMapS -> SyncSource )
[82934.714651] drbd drbd6/0 drbd6 node-1: Began resync as SyncSource
(will sync 156218716 KB [39054679 bits set]).
[82934.736660] drbd drbd6: susp-io( no -> user,quorum)
[82934.737003] drbd drbd6/0 drbd6: quorum( yes -> no )
[82934.775542] drbd drbd6/0 drbd6: helper command: /sbin/drbdadm
quorum-lost
[82934.777362] drbd drbd6/0 drbd6: helper command: /sbin/drbdadm
quorum-lost exit code 0 (0x0)
[82934.781031] drbd drbd6/0 drbd6 hotspare: ignore result of IO sector:
312437416
[82938.246555] drbd drbd6/0 drbd6 node-2: updated UUIDs
959F6C5C1257B5F9:0000000000000000:ADECA711C521507C:0000000000000000
[82938.284627] drbd drbd6/0 drbd6: bitmap page idx 8 changed during IO!
[82938.284993] drbd drbd6/0 drbd6: bitmap WRITE of 3 pages took 38 ms
[82938.318646] drbd drbd6/0 drbd6 node-2: Resync done (total 24 sec;
paused 21 sec; 52072904 K/sec)
[82938.319247] drbd drbd6/0 drbd6 node-2: 0 % had equal checksums,
eliminated: 9176K; transferred 156209540K total 156218716K
[82938.319862] drbd drbd6: susp-io( quorum -> no)
[82938.320200] drbd drbd6/0 drbd6: quorum( no -> yes )
[82938.320540] drbd drbd6/0 drbd6 node-2: pdsk( Inconsistent -> UpToDate
) repl( SyncSource -> Established )
[82943.490962] drbd drbd6 node-8: Preparing remote state change 1267890931
[82943.527852] drbd drbd6 node-8: Committing remote state change
1267890931 (primary_nodes=4)
[82943.528814] drbd drbd6/0 drbd6 node-2: repl( Established -> VerifyT )
[82943.529176] drbd drbd6/0 drbd6: Online Verify start sector: 0
[84086.817010] drbd drbd6/0 drbd6 node-1: updated UUIDs
959F6C5C1257B5F9:0000000000000000:ADECA711C521507C:0000000000000000
[84086.817909] drbd drbd6/0 drbd6 node-1: Resync done (total 1152 sec;
paused 0 sec; 135604 K/sec)
[84086.818508] drbd drbd6/0 drbd6 node-1: 99 % had equal checksums,
eliminated: 156217972K; transferred 744K total 156218716K
[84086.819120] drbd drbd6/0 drbd6 node-1: pdsk( Inconsistent -> UpToDate
) repl( SyncSource -> Established )
[84092.084394] drbd drbd6 node-2: Preparing remote state change 3770923910
[84092.120711] drbd drbd6 node-2: Committing remote state change
3770923910 (primary_nodes=4)
[84092.121664] drbd drbd6/0 drbd6 node-1: repl( Established -> VerifyT )
[84092.122048] drbd drbd6/0 drbd6: Online Verify start sector: 0
[85594.442961] drbd drbd6/0 drbd6 node-2: Online verify done (total 2650
sec; paused 0 sec; 58948 K/sec)
[85594.443573] drbd drbd6/0 drbd6 node-2: repl( VerifyT -> Established )
[85743.003811] drbd drbd6/0 drbd6 node-1: Online verify done (total 1650
sec; paused 0 sec; 94676 K/sec)
[85743.004422] drbd drbd6/0 drbd6 node-1: repl( VerifyT -> Established )
and I found that drbd has in flight read request in node-3 through
/sys/kernel/debug/drbd/resources/drbd6/volumes/0/oldest_requests
/sys/kernel/debug/drbd/resources/drbd6/volumes/0/oldest_requests:
v: 0
epoch sector size rw start in AL submit sent acked
done state
0x4cd 312437416 8 R 235609292 - - -
- [3]235609292 0x00403000 master: pending local: suspended
completed|ok net[3]: done : - net[1]: - : - net[0]: - : -
I checked drbd codes, it seems if local IO or peer IO is completed after
suspended state, it wouldn`t clean suspended state, so this IO will
always pending.
I don`t have too much experience of drbd, and I don`t know where the
problem is, so is it a bug? Can I clean suspended state when all peer IO
and local IO are completed, like this:
diff --git a/drbd/drbd_req.c b/drbd/drbd_req.c
index 7a39cf7e..a6ea737c 100644
--- a/drbd/drbd_req.c
+++ b/drbd/drbd_req.c
@@ -767,6 +767,10 @@ static void mod_rq_state(struct drbd_request *req,
struct bio_and_error *m,
if (req->i.waiting)
wake_up(&req->device->misc_wait);
+ if ((atomic_read(req->completion_ref) == (c_put + 1)) &&
req->local_rq_state & RQ_COMPLETION_SUSP) {
+ c_put++;
+ req->local_rq_state &= ~RQ_COMPLETION_SUSP;
+ }
drbd_req_put_completion_ref(req, m, c_put);
kref_put(&req->kref, drbd_req_destroy);
}
May I know if this patch can solve this problem(or workaround this
problem) and it will not bring some new problems?
Thanks.
More information about the drbd-dev
mailing list