[DRBD-user] DRBD device hang issue

Rene Peinthor rene.peinthor at linbit.com
Mon Nov 23 07:35:47 CET 2020


You are using a DRBD version that is more than three years old, and there
have been numerous bug fixes to DRBD since then.
So first upgrade to the latest DRBD version and check whether you can still
reproduce your problem.
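
To make "can still reproduce" easy to check, here is a minimal sketch that flags the symptom from the report quoted below: a device at roughly 100% utilization with requests queued but nothing completing. It assumes the `iostat -x` column layout shown in the quoted output (sysstat on EL7); the function names and the threshold are illustrative only, and newer sysstat releases use different columns.

```python
# Column layout copied from the header in the quoted report (assumption:
# the iostat in use prints exactly these columns, in this order).
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rkB/s", "wkB/s", "avgrq-sz",
           "avgqu-sz", "await", "r_await", "w_await", "svctm", "%util"]


def parse_device_line(line):
    """Return (device, {column: value}), or None for non-device lines."""
    parts = line.split()
    if len(parts) != len(COLUMNS) + 1:
        return None
    try:
        values = [float(p) for p in parts[1:]]
    except ValueError:
        return None  # header line, not a device sample
    return parts[0], dict(zip(COLUMNS, values))


def is_stalled(stats, util_threshold=99.0):
    """True if the device is busy and has queued I/O, but completes none."""
    no_completions = stats["r/s"] == 0 and stats["w/s"] == 0
    return (stats["%util"] >= util_threshold
            and stats["avgqu-sz"] > 0
            and no_completions)


def stalled_devices(lines):
    """Names of all devices in the given iostat lines that look stalled."""
    found = []
    for line in lines:
        parsed = parse_device_line(line)
        if parsed and is_stalled(parsed[1]):
            found.append(parsed[0])
    return found


# Sample taken from the 06:13:47 PM interval in the quoted report.
sample = [
    "10/15/2020 06:13:47 PM",
    "sda 0.00 0.00 0.00 6.00 0.00 24.00 8.00 0.00 0.17 0.00 0.17 0.17 0.10",
    "drbd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 100.00",
]
print(stalled_devices(sample))  # ['drbd0']
```

Feeding it the live output of `iostat -td 1 -x` line by line after the upgrade would show whether the stall still appears.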

On Mon, Nov 23, 2020 at 7:20 AM 박기혁 <korea.oops at gmail.com> wrote:

> Hello, Community
>
> My system is using Pacemaker + DRBD + MySQL DB.
> There is something unusual about my system.
>
> kernel version: 3.10.0-693.el7.x86_64
> drbd version: drbd90-utils-9.0.0-1.el7.elrepo.x86_64
> kmod-drbd90-9.0.9-1.el7_4.elrepo.x86_64
> DB version: mariadb-10.3.22
>
> Issue Time: October 15, 2020, 18:13:47 to 18:13:55
>
>    - When monitoring with iostat, /dev/drbd0 shows 100% utilization even
>    though no I/O is being completed on the device.
>    command: iostat -td 1 -x
>
> 10/15/2020 06:13:47 PM
> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await
> w_await svctm %util
> sda 0.00 0.00 0.00 6.00 0.00 24.00 8.00 0.00 0.17 0.00 0.17 0.17 0.10
> sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> dm-0 0.00 0.00 0.00 6.00 0.00 24.00 8.00 0.00 0.17 0.00 0.17 0.17 0.10
> dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> dm-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> sdc 0.00 0.00 3.00 0.00 1.50 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00
> drbd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 100.00  <<-----
>
> 10/15/2020 06:13:52 PM
> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await
> w_await svctm %util
> drbd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 100.00
>
> 10/15/2020 06:13:53 PM
> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await
> w_await svctm %util
> drbd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 100.00
>
> 10/15/2020 06:13:54 PM
> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await
> w_await svctm %util
> sda 0.00 0.00 0.00 6.00 0.00 24.00 8.00 0.00 0.67 0.00 0.67 0.67 0.40
> sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> dm-0 0.00 0.00 0.00 6.00 0.00 24.00 8.00 0.00 0.67 0.00 0.67 0.67 0.40
> dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> dm-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> sdc 0.00 0.00 3.00 0.00 1.50 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00
> drbd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 100.00
>
>    - When monitoring the DRBD status, the device shows a non-zero
>    upper-pending count (upper-pending:1).
>    exists resource name:drbd01 role:Primary suspended:no
>    write-ordering:flush
>    exists connection name:drbd01 peer-node-id:2 conn-name:node2
>    connection:Connected role:Secondary congested:no
>    exists device name:drbd01 volume:0 minor:0 disk:UpToDate client:no
>    size:1610559452 read:17730265 written:69192955 al-writes:16100 bm-writes:0
>    upper-pending:1 lower-pending:0 al-suspended:no blocked:no
>    exists peer-device name:drbd01 peer-node-id:2 conn-name:node2 volume:0
>    replication:Established peer-disk:UpToDate peer-client:no
>    resync-suspended:no received:8483 sent:69184366 out-of-sync:0
>    *pending:1* unacked:0
>    exists -
>
>
>    - upper-pending (application pending) : Number of block I/O requests
>    forwarded to DRBD, but not yet answered by DRBD
>
>
>    - The MySQL slow query log shows a query that took about 9 seconds; the
>    response only arrived after the I/O hang had ended.
>    User at Host: nodeapp[nodeapp] @ [100.100.100.142]
>    Thread_id: 7879 Schema: MYMQDB QC_hit: No
>    Query_time: 9.492522 Lock_time: 0.000058 Rows_sent: 0 Rows_examined: 1
>    Rows_affected: 1 Bytes_sent: 52
>    use MYMQDB;
>    SET timestamp=1602753235;
>    UPDATE ACTIVEMQ_LOCK SET BROKER_NAME='node2', TIME=1602753250881 WHERE
>    BROKER_NAME='node2' AND ID = 1;
>    - drbd configuration
>    disk {
>    on-io-error detach;
>    no-disk-flushes ;
>    no-disk-barrier;
>    c-plan-ahead 0;
>    c-fill-target 24M;
>    c-min-rate 80M;
>    c-max-rate 720M;
>    }
>    net {
>    max-buffers 36k;
>    sndbuf-size 1024k ;
>    rcvbuf-size 2048k;
>    }
>
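
The status lines quoted above look like `drbdsetup events2 --now` output (an assumption, since the report does not name the command). As a rough sketch, the pending counters can be pulled out of such lines by splitting each `key:value` token; the function names are illustrative only.

```python
def parse_event_line(line):
    """Split an 'exists <kind> key:value ...' line into (kind, fields)."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "exists":
        return None
    fields = dict(t.split(":", 1) for t in tokens[2:] if ":" in t)
    return tokens[1], fields


def pending_counters(lines):
    """Collect the request counters of interest, keyed by (kind, counter)."""
    wanted = ("upper-pending", "lower-pending", "pending", "unacked")
    counters = {}
    for line in lines:
        parsed = parse_event_line(line)
        if parsed is None:
            continue
        kind, fields = parsed
        for key in wanted:
            if key in fields:
                counters[(kind, key)] = int(fields[key])
    return counters


# Lines re-joined from the quoted report (mail wrapping and emphasis
# markers removed).
status = [
    "exists device name:drbd01 volume:0 minor:0 disk:UpToDate client:no "
    "size:1610559452 read:17730265 written:69192955 al-writes:16100 "
    "bm-writes:0 upper-pending:1 lower-pending:0 al-suspended:no blocked:no",
    "exists peer-device name:drbd01 peer-node-id:2 conn-name:node2 volume:0 "
    "replication:Established peer-disk:UpToDate peer-client:no "
    "resync-suspended:no received:8483 sent:69184366 out-of-sync:0 "
    "pending:1 unacked:0",
    "exists -",
]
for (kind, name), value in pending_counters(status).items():
    print(kind, name, value)
```

For the quoted sample this reports upper-pending 1 on the device and pending 1 on the peer device, with lower-pending and unacked both 0.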
> In conclusion, %util on the DRBD device is at 100% although no reads or
> writes are taking place at that time, and the duration of the slow MySQL
> query matches the period of 100% utilization.
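
The timing match can be checked with a small sketch that converts the `SET timestamp=` value from the slow query log into local time and compares it with the reported hang window. The UTC+9 offset is an assumption (the report does not state a time zone), and the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the times in the report are local time at UTC+9.
KST = timezone(timedelta(hours=9))


def to_local(epoch_seconds, tz=KST):
    """Convert a Unix timestamp to an aware datetime in the given zone."""
    return datetime.fromtimestamp(epoch_seconds, tz=tz)


# "SET timestamp=1602753235" from the quoted slow query log entry.
slow_log_stamp = to_local(1602753235)

# "Issue Time: October 15, 2020, 18:13:47 to 18:13:55" from the report.
hang_start = datetime(2020, 10, 15, 18, 13, 47, tzinfo=KST)
hang_end = datetime(2020, 10, 15, 18, 13, 55, tzinfo=KST)

print(slow_log_stamp.isoformat())                  # 2020-10-15T18:13:55+09:00
print(hang_start <= slow_log_stamp <= hang_end)    # True
```

Under that time-zone assumption, the logged timestamp falls on the last second of the reported hang window, and the Query_time of about 9.5 seconds is close to the window's length.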
>
> Does anyone know of a similar case, or of a solution for this behavior?
>
> The hang does not occur when DRBD is operated as a single node.