[DRBD-user] Slow DRBD since Kernel 5 (PVE Kernel)
Alexander Karamanlidis
alexander.karamanlidis at lindenbaum.eu
Thu Jul 25 14:18:48 CEST 2019
Hi everyone,

we upgraded to PVE 6 (Proxmox VE 6) about a week ago to test it. Since
then our DRBD resource has become incredibly slow (around 105 Mbit/s).
If we boot into kernel 4.15, speeds return to normal (up to 15 Gbit/s).

Here's some data:
root@node1:~# cat /sys/kernel/debug/drbd/resources/r0/connections/node2/0/proc_drbd ; echo -e "\n\n" ; uname -a ; echo -e "\n\n" ; dpkg -l | grep 'pve-kernel\|drbd' ; echo -e "\n\n" ; drbdadm dump
0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:12242948 dw:12242948 dr:110085144 al:0 bm:0 lo:0 pe:[0;93] ua:0 ap:[0;0] ep:1 wo:2 oos:7352525548
    [>....................] sync'ed: 1.5% (7180200/7287584)M
    finish: 4:03:52 speed: 502,476 (530,020 -- 484,408) want: 2,000,000 K/sec
    1% sector pos: 297469952/15002273264
    resync: used:2/61 hits:214057 misses:1684 starving:0 locked:0 changed:842
    act_log: used:0/1237 hits:0 misses:0 starving:0 locked:0 changed:0
    blocked on activity log: 0/0/0
Linux node1 4.15.18-18-pve #1 SMP PVE 4.15.18-44 (Wed, 03 Jul 2019 11:19:13 +0200) x86_64 GNU/Linux
ii  drbd-dkms                  9.0.19-1    all    RAID 1 over TCP/IP for Linux module source
ii  drbd-utils                 9.10.0-1    amd64  RAID 1 over TCP/IP for Linux (user utilities)
ii  drbdtop                    0.2.1-1     amd64  like top, but for drbd
ii  pve-firmware               3.0-2       all    Binary firmware code for the pve-kernel
ii  pve-kernel-4.15            5.4-6       all    Latest Proxmox VE Kernel Image
ii  pve-kernel-4.15.18-12-pve  4.15.18-36  amd64  The Proxmox PVE Kernel Image
ii  pve-kernel-4.15.18-16-pve  4.15.18-41  amd64  The Proxmox PVE Kernel Image
ii  pve-kernel-4.15.18-18-pve  4.15.18-44  amd64  The Proxmox PVE Kernel Image
ii  pve-kernel-5.0             6.0-5       all    Latest Proxmox VE Kernel Image
ii  pve-kernel-5.0.15-1-pve    5.0.15-1    amd64  The Proxmox PVE Kernel Image
ii  pve-kernel-helper          6.0-5       all    Function for various kernel maintenance tasks.
# /etc/drbd.conf
# resource r0 on node1: not ignored, not stacked
# defined at /etc/drbd.d/r0.res:1
resource r0 {
    on node1 {
        node-id 1;
        volume 0 {
            device /dev/drbd0 minor 0;
            disk /dev/disk/by-uuid/8a879a82-3880-4998-b5cb-70a95ce4bf79;
            meta-disk internal;
        }
        address ipv4 192.168.99.1:7788;
    }
    on node2 {
        node-id 0;
        volume 0 {
            device /dev/drbd0 minor 0;
            disk /dev/disk/by-uuid/8a879a82-3880-4998-b5cb-70a95ce4bf79;
            meta-disk internal;
        }
        address ipv4 192.168.99.2:7788;
    }
    net {
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        csums-alg sha1;
        max-buffers 36864;
        max-epoch-size 20000;
        rcvbuf-size 2097152;
        sndbuf-size 1048576;
        verify-alg sha1;
    }
    disk {
        c-fill-target 10240;
        c-max-rate 2237280;
        c-min-rate 204800;
        c-plan-ahead 0;
        resync-rate 2000000;
    }
}
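For context, here is a rough conversion of the configured rate limits into Gbit/s, assuming the values carry DRBD's default unit of KiB/s when no suffix is given (which is my reading of drbd.conf, not something stated in the dump):

```shell
# Convert the configured DRBD rate limits (KiB/s) into Gbit/s,
# to compare against the 25 Gbit/s link:
for r in 2000000 2237280 204800; do   # resync-rate, c-max-rate, c-min-rate
    echo "$r" | awk '{printf "%d KiB/s = %.2f Gbit/s\n", $1, $1*1024*8/1e9}'
done
# prints:
# 2000000 KiB/s = 16.38 Gbit/s
# 2237280 KiB/s = 18.33 Gbit/s
# 204800 KiB/s = 1.68 Gbit/s
```

So none of the configured ceilings should cap the resync anywhere near the speeds observed below.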
If we put I/O on the DRBD resource, we get a maximum of 14.19 Gbit/s on our bond interface (we have a 25 Gbit/s direct-attached network) with the 4.15 kernel.
root@node1:~# cat /sys/kernel/debug/drbd/resources/r0/connections/node2/0/proc_drbd ; echo -e "\n\n" ; uname -a ; echo -e "\n\n" ; dpkg -l | grep 'pve-kernel\|drbd' ; echo -e "\n\n" ; drbdadm dump
0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:541700 dw:541700 dr:10270724 al:0 bm:0 lo:0 pe:[0;107] ua:0 ap:[0;0] ep:1 wo:2 oos:7334758124
    [>....................] sync'ed: 0.2% (7162848/7172768)M
    finish: 14:05:20 speed: 144,596 (154,112 -- 166,556) want: 2,000,000 K/sec
    2% sector pos: 332978176/15002273264
    resync: used:2/61 hits:19875 misses:162 starving:0 locked:0 changed:81
    act_log: used:0/1237 hits:0 misses:0 starving:0 locked:0 changed:0
    blocked on activity log: 0/0/0
Linux node1 5.0.15-1-pve #1 SMP PVE 5.0.15-1 (Wed, 03 Jul 2019 10:51:57 +0200) x86_64 GNU/Linux
ii  drbd-dkms                  9.0.19-1    all    RAID 1 over TCP/IP for Linux module source
ii  drbd-utils                 9.10.0-1    amd64  RAID 1 over TCP/IP for Linux (user utilities)
ii  drbdtop                    0.2.1-1     amd64  like top, but for drbd
ii  pve-firmware               3.0-2       all    Binary firmware code for the pve-kernel
ii  pve-kernel-4.15            5.4-6       all    Latest Proxmox VE Kernel Image
ii  pve-kernel-4.15.18-12-pve  4.15.18-36  amd64  The Proxmox PVE Kernel Image
ii  pve-kernel-4.15.18-16-pve  4.15.18-41  amd64  The Proxmox PVE Kernel Image
ii  pve-kernel-4.15.18-18-pve  4.15.18-44  amd64  The Proxmox PVE Kernel Image
ii  pve-kernel-5.0             6.0-5       all    Latest Proxmox VE Kernel Image
ii  pve-kernel-5.0.15-1-pve    5.0.15-1    amd64  The Proxmox PVE Kernel Image
ii  pve-kernel-helper          6.0-5       all    Function for various kernel maintenance tasks.
# /etc/drbd.conf
# resource r0 on node1: not ignored, not stacked
# defined at /etc/drbd.d/r0.res:1
resource r0 {
    on node1 {
        node-id 1;
        volume 0 {
            device /dev/drbd0 minor 0;
            disk /dev/disk/by-uuid/8a879a82-3880-4998-b5cb-70a95ce4bf79;
            meta-disk internal;
        }
        address ipv4 192.168.99.1:7788;
    }
    on node2 {
        node-id 0;
        volume 0 {
            device /dev/drbd0 minor 0;
            disk /dev/disk/by-uuid/8a879a82-3880-4998-b5cb-70a95ce4bf79;
            meta-disk internal;
        }
        address ipv4 192.168.99.2:7788;
    }
    net {
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        csums-alg sha1;
        max-buffers 36864;
        max-epoch-size 20000;
        rcvbuf-size 2097152;
        sndbuf-size 1048576;
        verify-alg sha1;
    }
    disk {
        c-fill-target 10240;
        c-max-rate 2237280;
        c-min-rate 204800;
        c-plan-ahead 0;
        resync-rate 2000000;
    }
}
If we put I/O on the DRBD resource, we get a maximum of 107 Mbit/s on the same bond interface (again, a 25 Gbit/s direct-attached network) with the 5.0.15 kernel.
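For comparison, converting the proc_drbd "speed" figures from the two dumps above (assuming, as with the config values, that they are in KiB/s) shows the resync throughput drop between kernels:

```shell
# Resync speed under the 5.0 kernel (from the second dump):
echo "144,596" | tr -d ',' | awk '{printf "%.2f Gbit/s\n", $1*1024*8/1e9}'
# prints: 1.18 Gbit/s

# Resync speed under the 4.15 kernel (from the first dump):
echo "502,476" | tr -d ',' | awk '{printf "%.2f Gbit/s\n", $1*1024*8/1e9}'
# prints: 4.12 Gbit/s
```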
Maybe someone has a clue what changed in kernel 5 that slows us down this much. Maybe someone even knows a solution.
Kind Regards,
Alexander Karamanlidis