[DRBD-user] high io when diskless node added to the storage pool

Alex Kolesnik drbd-user at abisoft.biz
Tue Sep 10 17:40:15 CEST 2019


Hello Robert,

Any news on this issue?

>> On 9/3/19 2:01 PM, Alex Kolesnik wrote:
>>> moving a drive to drbdpool increases the nodes' IO enormously while nothing
>>> seems to be going on (well, the disk does seem to be moving, but VERY slowly).

>> Does writing anything else to the volume show normal performance, or is
>> the performance degraded as well?

> The performance becomes normal as soon as I delete the diskless node from the
> configuration.

> To experiment with writing to the volume, I decided to create a test volume,
> and here is what I got:

> root at linstor-controller:~# linstor rd create testvol
> SUCCESS:
> Description:
>     New resource definition 'testvol' created.
> Details:
>     Resource definition 'testvol' UUID is: dc16030a-3a1b-4d4d-876a-87f8713932cf

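 
>
> (The resource definition is only the cluster-wide object; the per-node
> resources and the volume definition are created next.)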

> root at linstor-controller:~# linstor resource create --diskless-on-remaining --storage-pool drbdpool vm-box-2 vm-box-4 testvol
> SUCCESS:
> Description:
>     New resource 'testvol' on node 'vm-box-2' registered.
> Details:
>     Resource 'testvol' on node 'vm-box-2' UUID is: 2c2119ce-cd74-4785-a147-dae0e2e90694
> SUCCESS:
> Description:
>     New resource 'testvol' on node 'vm-box-4' registered.
> Details:
>     Resource 'testvol' on node 'vm-box-4' UUID is: 296b8eb5-e0ec-40d4-85ad-2f786a2e65e0
> SUCCESS:
>     Created resource 'testvol' on 'vm-box-4'
> SUCCESS:
>     Created resource 'testvol' on 'vm-box-2'
> WARNING:
>     No volumes have been defined for resource 'testvol'
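>
> (That warning is expected at this point; the volume definition is only
> created in the next step.)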


> root at linstor-controller:~# linstor vd create --storage-pool drbdpool -n 0 testvol 10G
> SUCCESS:
>     New volume definition with number '0' of resource definition 'testvol' created.


> root at linstor-controller:~# linstor vd list
> ╭─────────────────────────────────────────────────────────╮
> ┊ ResourceName  ┊ VolumeNr ┊ VolumeMinor ┊ Size   ┊ State ┊
> ╞═════════════════════════════════════════════════════════╡
> ┊ testvol       ┊ 0        ┊ 1001        ┊ 10 GiB ┊ ok    ┊
> ┊ vm-115-disk-0 ┊ 0        ┊ 1000        ┊ 2 GiB  ┊ ok    ┊
> ╰─────────────────────────────────────────────────────────╯
> root at linstor-controller:~# linstor rd list
> ╭──────────────────────────────────────────────╮
> ┊ ResourceName  ┊ Port ┊ ResourceGroup ┊ State ┊
> ╞══════════════════════════════════════════════╡
> ┊ testvol       ┊ 7001 ┊ DfltRscGrp    ┊ ok    ┊
> ┊ vm-115-disk-0 ┊ 7000 ┊ DfltRscGrp    ┊ ok    ┊
> ╰──────────────────────────────────────────────╯
> root at linstor-controller:~# linstor r list
> ^C
> linstor: Client exiting (received SIGINT)

> I had to interrupt listing the resources after a minute or so of waiting.
> The reason for that was blocked IO on vm-box-4:
> root at vm-box-2:~# drbdadm status
> testvol role:Secondary
>   disk:UpToDate
>   vm-box-4 role:Secondary
>     peer-disk:UpToDate

> vm-115-disk-0 role:Secondary
>   disk:UpToDate
>   vm-box-4 role:Primary
>     peer-disk:UpToDate

> root at vm-box-2:~# ssh vm-box-4 drbdadm status
> testvol role:Secondary
>   disk:Negotiating blocked:upper
>   vm-box-2 role:Secondary
>     peer-disk:UpToDate

> vm-115-disk-0 role:Primary
>   disk:UpToDate
>   vm-box-2 role:Secondary
>     peer-disk:UpToDate

> root at vm-box-2:~# ssh vm-box-3 drbdadm status
> no resources defined!
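>
> (Note that vm-box-3 reports no resources at all, even though testvol was
> created with --diskless-on-remaining.)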

> root at vm-box-2:~# ssh vm-box-4 drbdadm resume-sync testvol
> testvol: Failure: (135) Sync-pause flag is already cleared
> Command 'drbdsetup resume-sync testvol 0 0' terminated with exit code 10
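>
> As I understand it, resume-sync only clears a sync-pause flag (which the
> error says is already clear), so I tried resume-io instead, against the
> volume's minor number (1001, per the vd list above):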

> root at vm-box-4:~# drbdsetup resume-io 1001
> root at vm-box-4:~# drbdadm status
> testvol role:Secondary
>   disk:Negotiating blocked:upper
>   vm-box-2 role:Secondary
>     peer-disk:UpToDate

> vm-115-disk-0 role:Primary
>   disk:UpToDate
>   vm-box-2 role:Secondary
>     peer-disk:UpToDate
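>
> ("blocked:upper" in the status output means that requests from the upper
> layers, i.e. the application side, are suspended; resume-io did not change
> anything, and the disk stays in Negotiating.)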

> I tried to remove the diskless node (vm-box-3) from the resources, but that
> didn't help unblock the volume. Since I wasn't able to find a way to unblock
> it, I just deleted it on the LINSTOR controller:
> root at linstor-controller:~# linstor vd delete testvol 0
> SUCCESS:
> Description:
>     Volume definition with number '0' of resource definition 'testvol' marked for deletion.
> Details:
>     Volume definition with number '0' of resource definition 'testvol' UUID is: 350a1cfb-590d-4cbc-b97c-59a4187562da
> SUCCESS:
>     Deleted volume 0 of 'testvol' on 'vm-box-2'
> SUCCESS:
>     Deleted volume 0 of 'testvol' on 'vm-box-4'
> SUCCESS:
> Description:
>     Volume definition with number '0' of resource definition 'testvol' deleted.
> Details:
>     Volume definition with number '0' of resource definition 'testvol' UUID was: 350a1cfb-590d-4cbc-b97c-59a4187562da
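>
> (To clean up completely, the resource definition itself can be removed as
> well, e.g. with "linstor rd delete testvol".)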

>>> The log displays just this, without any progress, so I had to stop the
>>> disk move:
>>> create full clone of drive scsi0 (LVM-Storage:126/vm-126-disk-0.qcow2)
>>> trying to acquire cfs lock 'storage-drbdpool' ...
>>> transferred: 0 bytes remaining: 10739277824 bytes total: 10739277824 bytes progression: 0.00 %

>> I cannot provide much help with those messages, since they originate
>> neither from LINSTOR nor from DRBD.
>> The "trying to acquire cfs lock" message appears to be issued by
>> Proxmox, and may be related to communication problems with Corosync's
>> cluster link.
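>
> (If needed, the state of the Corosync links can be checked with standard
> tools such as "pvecm status" or "corosync-cfgtool -s" on the nodes.)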

> Anyway, that does not look like a Proxmox issue.




-- 
Best regards,
Alex Kolesnik


