[DRBD-user] Update DRBD in product
matthieu le roy
leroy.matthieu50 at gmail.com
Wed Mar 22 11:30:09 CET 2023
Hello,
I have two servers running in high availability.
Here is the info for the first server:
OS : Ubuntu 18.04.2 LTS
# drbdadm --version
DRBDADM_BUILDTAG=GIT-hash:\ 38a99411a8fcb883214a5300ad0ce1ef7ca37730\
build\ by\ buildd at lgw01-amd64-016\,\ 2019-05-27\ 12:45:18
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x090012
DRBD_KERNEL_VERSION=9.0.18
DRBDADM_VERSION_CODE=0x090900
DRBDADM_VERSION=9.9.0
here is the info of the second server after update :
OS : Ubuntu 20.04.6 LTS
# drbdadm --version
DRBDADM_BUILDTAG=GIT-hash:\ e267c4413f7cb3d8ec5e793c3fa7f518e95f23b1\
build\ by\ buildd at lcy02-amd64-101\,\ 2023-03-14\ 09:57:26
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x090202
DRBD_KERNEL_VERSION=9.2.2
DRBDADM_VERSION_CODE=0x091701
DRBDADM_VERSION=9.23.1
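(As a side note, and just my own sanity check rather than anything from a DRBD tool: the DRBD_*_VERSION_CODE values pack major/minor/patch as one byte each, so they can be decoded to confirm the versions printed above.)

```shell
#!/bin/sh
# Decode a DRBD version code (0xMMmmpp: one byte each for
# major, minor, patch) into a dotted version string.
decode_drbd_version() {
    code=$(( $1 ))
    echo "$(( (code >> 16) & 0xff )).$(( (code >> 8) & 0xff )).$(( code & 0xff ))"
}

decode_drbd_version 0x090012   # kernel module on storage1 -> 9.0.18
decode_drbd_version 0x090202   # kernel module on storage2 -> 9.2.2
```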
drbd config:
# cat /etc/drbd.d/alfresco.conf
resource alfresco {
    handlers {
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh";
        # after-resync-target "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh";
    }
    on storage1 {
        device /dev/drbd5;
        disk /dev/datavg/alfresco;
        node-id 10;
        address 10.50.20.1:7004;
        meta-disk internal;
    }
    on storage2 {
        device /dev/drbd5;
        disk /dev/datavg/appli;
        node-id 11;
        address 10.50.20.2:7004;
        meta-disk internal;
    }
}
# cat /etc/drbd.d/global_common.conf
global {
    usage-count yes;
    udev-always-use-vnr;
}
common {
    handlers {
        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    }
    net {
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        data-integrity-alg crc32c;
        timeout 90;
        ping-timeout 20;
        ping-int 15;
        connect-int 10;
    }
}
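(One note I made while re-reading this config, per my reading of drbd.conf(5): the net options timeout and ping-timeout are given in tenths of a second, while ping-int and connect-int are plain seconds. So the values above work out as follows; the variable names here are just mine for illustration.)

```shell
#!/bin/sh
# DRBD net options "timeout" and "ping-timeout" are in tenths of a
# second; "ping-int" and "connect-int" are in whole seconds.
timeout_ds=90       # from the config above
ping_timeout_ds=20  # from the config above

echo "timeout:      $(( timeout_ds / 10 )).$(( timeout_ds % 10 ))s"
echo "ping-timeout: $(( ping_timeout_ds / 10 )).$(( ping_timeout_ds % 10 ))s"
```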
Commands run after the update:
# drbdadm create-md appli
# drbdadm up appli
The sync started and I was able to follow its progress, but once it reached
100%, here is the status of the servers:
storage1:
# drbdadm status alfresco
alfresco role:Primary
  disk:UpToDate
  storage2 role:Secondary
    replication:SyncSource peer-disk:Inconsistent
# drbdsetup status --verbose --statistics alfresco
alfresco node-id:10 role:Primary suspended:no
    write-ordering:flush
  volume:0 minor:5 disk:UpToDate quorum:yes
      size:536854492 read:423078021 written:419423956 al-writes:9640 bm-writes:0 upper-pending:0 lower-pending:0 al-suspended:no blocked:no
  storage2 node-id:11 connection:Connected role:Secondary congested:no ap-in-flight:0 rs-in-flight:0
    volume:0 replication:SyncSource peer-disk:Inconsistent resync-suspended:no
        received:0 sent:421584224 out-of-sync:0 pending:0 unacked:0
storage2:
# drbdadm status alfresco
alfresco role:Secondary
  disk:Inconsistent
  storage1 role:Primary
    replication:SyncTarget peer-disk:UpToDate
# drbdsetup status --verbose --statistics alfresco
alfresco node-id:11 role:Secondary suspended:no force-io-failures:no
    write-ordering:flush
  volume:0 minor:5 disk:Inconsistent backing_dev:/dev/datavg/alfresco quorum:yes
      size:536854492 read:0 written:421584224 al-writes:14 bm-writes:6112 upper-pending:0 lower-pending:0 al-suspended:no blocked:no
  storage1 node-id:10 connection:Connected role:Primary congested:no ap-in-flight:0 rs-in-flight:0
    volume:0 replication:SyncTarget peer-disk:UpToDate resync-suspended:no
        received:421584224 sent:0 out-of-sync:0 pending:0 unacked:0
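(For what it's worth, if I compute the resync progress from the fields above the way I understand them, it really does look 100% done even though the peer disk stays Inconsistent. This is just my own back-of-the-envelope calculation, assuming size and out-of-sync are in the same units.)

```shell
#!/bin/sh
# Values taken from the drbdsetup output above.
size=536854492
out_of_sync=0

# Integer percentage of the device already in sync.
done_pct=$(( (size - out_of_sync) * 100 / size ))
echo "resync done: ${done_pct}%"
```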
While I have had no DRBD-related logs on storage1 since the start of the
sync, on storage2 I am getting these messages in a loop:
Mar 22 10:22:31 storage2 kernel: [ 4713.898381] INFO: task drbd_s_alfresco:2104 blocked for more than 120 seconds.
Mar 22 10:22:31 storage2 kernel: [ 4713.898465]       Tainted: G           OE     5.4.0-144-generic #161-Ubuntu
Mar 22 10:22:31 storage2 kernel: [ 4713.898530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 22 10:22:31 storage2 kernel: [ 4713.898604] drbd_s_alfresco D    0  2104      2 0x80004000
Mar 22 10:22:31 storage2 kernel: [ 4713.898609] Call Trace:
Mar 22 10:22:31 storage2 kernel: [ 4713.898624]  __schedule+0x2e3/0x740
Mar 22 10:22:31 storage2 kernel: [ 4713.898633]  ? update_load_avg+0x7c/0x670
Mar 22 10:22:31 storage2 kernel: [ 4713.898641]  ? sched_clock+0x9/0x10
Mar 22 10:22:31 storage2 kernel: [ 4713.898648]  schedule+0x42/0xb0
Mar 22 10:22:31 storage2 kernel: [ 4713.898656]  rwsem_down_write_slowpath+0x244/0x4d0
Mar 22 10:22:31 storage2 kernel: [ 4713.898663]  ? put_prev_entity+0x23/0x100
Mar 22 10:22:31 storage2 kernel: [ 4713.898675]  down_write+0x41/0x50
Mar 22 10:22:31 storage2 kernel: [ 4713.898703]  drbd_resync_finished+0x97/0x7c0 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898735]  ? drbd_cork+0x64/0x70 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898754]  ? wait_for_sender_todo+0x21e/0x240 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898777]  w_resync_finished+0x2c/0x40 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898795]  drbd_sender+0x13e/0x3d0 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898827]  drbd_thread_setup+0x87/0x1d0 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898836]  kthread+0x104/0x140
Mar 22 10:22:31 storage2 kernel: [ 4713.898861]  ? drbd_destroy_connection+0x150/0x150 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898866]  ? kthread_park+0x90/0x90
Mar 22 10:22:31 storage2 kernel: [ 4713.898873]  ret_from_fork+0x1f/0x40
I need help, please.