[Drbd-dev] volume I/O hang on Ahead mode

Kim, SungEun sekim at mantech.co.kr
Tue Nov 14 05:54:53 CET 2017


I am resending this because my previous message contained a mistake:

"more than tens of drbd_req accumulated in trasfer_log"
==> more than 10,000 drbd_req in transfer_log

Technical Research Center / Dev3 Team

Principal Research Engineer

SungEun Kim <sekim at mantech.co.kr>

12F, Seoulforest Kolon digital Tower, 308-4, Seongsudong 2ga, Seongdong-gu,
Seoul, Korea

Tel : 02-2136-6913 / Fax : 02-575-4858 / Call Center : 1833-7790

http://www.mantech.co.kr

2017-11-14 13:21 GMT+09:00 Kim, SungEun <sekim at mantech.co.kr>:

> Hi,
>
> We tested the asynchronous replication Ahead mode of drbd9 (9.0.9) over a
> low-bandwidth network (1 to 10 Mbps) and found that I/O response time
> degraded during replication until I/O on the volume eventually hung.
> The following configuration and procedure reproduce the problem reliably.
>
>
> <drbd.conf>
>
> global {
>         disable-ip-verification;
>         usage-count     no;
> }
>
> common {
>         startup {
>                 wfc-timeout     1;
>         }
>         disk {
>                 resync-rate     100M;
>         }
>         net {
>                 on-congestion pull-ahead;
>                 congestion-fill 480M;
>                 verify-alg      md5;
>         }
>         proxy {
>                 memlimit 500M;
>         }
> }
>
> resource r0 {
>         protocol        A;
>         disk {
>                 on-io-error     detach;
>         }
>         device  /dev/drbd0;
>
>         floating 200.200.2.10:7788 {
>                 disk    /dev/sdd1;
>                 meta-disk internal;
>                 proxy on pm1 {
>                         inside 127.0.0.1:7789;
>                         outside 200.200.2.10:7790;
>                 }
>         }
>
>         floating 200.200.2.11:7788 {
>                 disk    /dev/sdc1;
>                 meta-disk internal;
>                 proxy on pm2 {
>                         inside 127.0.0.1:7789;
>                         outside 200.200.2.11:7790;
>                 }
>         }
> }
>
>
> <Test procedure>
> 1. Set the proxy buffer size to 500M and congestion-fill to 480M.
> 2. Limit the replication network bandwidth to 10 Mbps (using VMware's
> network bandwidth limiting feature).
> 3. Mount /data on the pm1 node.
> 4. Run: dd if=/dev/zero of=/data/test.out bs=100M count=40
> 5. pm1 enters Ahead mode.
> 6. Running ls -l in the /data directory on pm1 returns no output; the
> command appears to hang.
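To put the numbers in the procedure above into perspective, here is a rough back-of-the-envelope helper. It is only a sketch: it assumes "480M" and "bs=100M" mean MiB, that 10 Mbps means 10^6 bits per second times ten, and that there is no protocol overhead or compression. The function name drain_seconds is made up for this illustration.

```c
#include <assert.h>

/*
 * Seconds needed to push `bytes` through a link that carries
 * `bits_per_sec` bits per second (idealized: no overhead, no compression).
 */
double drain_seconds(double bytes, double bits_per_sec)
{
    assert(bits_per_sec > 0.0);
    return bytes * 8.0 / bits_per_sec;
}
```

Under these assumptions, drain_seconds(480.0 * 1024 * 1024, 10e6) comes to roughly 403 seconds for a backlog the size of congestion-fill, and the full 40 x 100 MiB written by dd would need roughly 3355 seconds (about 56 minutes), so the node can be expected to stay behind for a long time once the write burst starts.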
>
>
> The purpose of this test was to measure the behavior of the drbd9
> asynchronous Ahead mode and the I/O response time of the volume, regardless
> of the network bandwidth.
>
> In my opinion, when the number of drbd_req objects in drbd9 becomes very
> large (more than tens of drbd_req accumulated in trasfer_log in this test),
> the execution time of drbd_sender increases. In particular, traversing
> transfer_log in the following code takes a long time, which in turn seems
> to increase the request completion time.
>
> // The execution time of this function alone was measured at more than 5 ms
>
> static struct drbd_request *__next_request_for_connection(
>                 struct drbd_connection *connection, struct drbd_request *r)
> {
>         r = list_prepare_entry(r, &connection->resource->transfer_log,
>                                tl_requests);
>
>         list_for_each_entry_continue(r, &connection->resource->transfer_log,
>                                      tl_requests) {
>                 int vnr = r->device->vnr;
>                 struct drbd_peer_device *peer_device =
>                         conn_peer_device(connection, vnr);
>                 unsigned s = drbd_req_state_by_peer_device(r, peer_device);
>
>                 if (!(s & RQ_NET_QUEUED))
>                         continue;
>                 return r;
>         }
>         return NULL;
> }
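The cost pattern described above can be illustrated with a small user-space model. This is not DRBD code, only a sketch under simplifying assumptions: struct model_req, REQ_QUEUED, build_log and next_queued are hypothetical stand-ins for drbd_request, RQ_NET_QUEUED and the transfer_log traversal, and the log is built so that only the last entry is queued, which is the worst case for a linear search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define REQ_QUEUED 0x1u  /* hypothetical stand-in for RQ_NET_QUEUED */

struct model_req {
    unsigned state;             /* per-request state flags */
    struct model_req *next;     /* next entry in the modeled log */
};

/* Build a linked log of n requests in which only the last one is queued. */
struct model_req *build_log(size_t n)
{
    assert(n > 0);
    struct model_req *nodes = calloc(n, sizeof(*nodes));
    if (!nodes)
        return NULL;
    for (size_t i = 0; i + 1 < n; i++)
        nodes[i].next = &nodes[i + 1];
    nodes[n - 1].state = REQ_QUEUED;
    return nodes;               /* caller releases with free() */
}

/*
 * Return the first queued request at or after `start`, adding the number
 * of entries examined to *visited. Every non-queued entry costs one step.
 */
struct model_req *next_queued(struct model_req *start, size_t *visited)
{
    for (struct model_req *r = start; r; r = r->next) {
        (*visited)++;
        if (r->state & REQ_QUEUED)
            return r;
    }
    return NULL;
}
```

With a log from build_log(10000), a single next_queued() call starting at the head examines all 10,000 entries before it returns. In this model the work per lookup grows linearly with the number of non-queued entries it has to skip, which matches the kind of slowdown reported here when more than 10,000 drbd_req objects sit in transfer_log.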
>
>
> Please look into this issue. I hope the performance of asynchronous
> replication in drbd9 can be improved.
>
> Thanks.
>
>
> Best Regards
> from SungEun Kim