[DRBD-user] high io when diskless node added to the storage pool

Alex Kolesnik drbd-user at abisoft.biz
Thu Sep 5 16:44:06 CEST 2019


Hello Robert,

Thanks for your reply!

> On 9/3/19 2:01 PM, Alex Kolesnik wrote:
>> moving a drive to drbdpool increases the nodes' IO enormously while nothing
>> seems to be going on (well, the disk does seem to be moving, but VERY slowly).

> Does writing anything else to the volume show normal performance, or is
> the performance degraded as well?

The performance becomes normal as soon as I delete the diskless node from the
configuration.
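
(For reference, I removed the diskless assignment per node with something like
the following -- the syntax is "linstor resource delete <node> <resource>",
assuming vm-box-3 is the diskless node:

    linstor resource delete vm-box-3 vm-115-disk-0

)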

To experiment with writing to the volume, I decided to create a test volume,
and here is what I got:

root@linstor-controller:~# linstor rd create testvol
SUCCESS:
Description:
    New resource definition 'testvol' created.
Details:
    Resource definition 'testvol' UUID is: dc16030a-3a1b-4d4d-876a-87f8713932cf


root@linstor-controller:~# linstor resource create --diskless-on-remaining --storage-pool drbdpool vm-box-2 vm-box-4 testvol
SUCCESS:
Description:
    New resource 'testvol' on node 'vm-box-2' registered.
Details:
    Resource 'testvol' on node 'vm-box-2' UUID is: 2c2119ce-cd74-4785-a147-dae0e2e90694
SUCCESS:
Description:
    New resource 'testvol' on node 'vm-box-4' registered.
Details:
    Resource 'testvol' on node 'vm-box-4' UUID is: 296b8eb5-e0ec-40d4-85ad-2f786a2e65e0
SUCCESS:
    Created resource 'testvol' on 'vm-box-4'
SUCCESS:
    Created resource 'testvol' on 'vm-box-2'
WARNING:
    No volumes have been defined for resource 'testvol'


root@linstor-controller:~# linstor vd create --storage-pool drbdpool -n 0 testvol 10G
SUCCESS:
    New volume definition with number '0' of resource definition 'testvol' created.


root@linstor-controller:~# linstor vd list
╭─────────────────────────────────────────────────────────╮
┊ ResourceName  ┊ VolumeNr ┊ VolumeMinor ┊ Size   ┊ State ┊
╞═════════════════════════════════════════════════════════╡
┊ testvol       ┊ 0        ┊ 1001        ┊ 10 GiB ┊ ok    ┊
┊ vm-115-disk-0 ┊ 0        ┊ 1000        ┊ 2 GiB  ┊ ok    ┊
╰─────────────────────────────────────────────────────────╯
root@linstor-controller:~# linstor rd list
╭──────────────────────────────────────────────╮
┊ ResourceName  ┊ Port ┊ ResourceGroup ┊ State ┊
╞══════════════════════════════════════════════╡
┊ testvol       ┊ 7001 ┊ DfltRscGrp    ┊ ok    ┊
┊ vm-115-disk-0 ┊ 7000 ┊ DfltRscGrp    ┊ ok    ┊
╰──────────────────────────────────────────────╯
root@linstor-controller:~# linstor r list
^C
linstor: Client exiting (received SIGINT)

I had to interrupt listing the resources after a minute or so of waiting.
The reason was blocked IO on vm-box-4:
root@vm-box-2:~# drbdadm status
testvol role:Secondary
  disk:UpToDate
  vm-box-4 role:Secondary
    peer-disk:UpToDate

vm-115-disk-0 role:Secondary
  disk:UpToDate
  vm-box-4 role:Primary
    peer-disk:UpToDate

root@vm-box-2:~# ssh vm-box-4 drbdadm status
testvol role:Secondary
  disk:Negotiating blocked:upper
  vm-box-2 role:Secondary
    peer-disk:UpToDate

vm-115-disk-0 role:Primary
  disk:UpToDate
  vm-box-2 role:Secondary
    peer-disk:UpToDate

root@vm-box-2:~# ssh vm-box-3 drbdadm status
no resources defined!

root@vm-box-2:~# ssh vm-box-4 drbdadm resume-sync testvol
testvol: Failure: (135) Sync-pause flag is already cleared
Command 'drbdsetup resume-sync testvol 0 0' terminated with exit code 10
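
(As far as I understand, resume-sync only clears a sync-pause flag, which was
never set here, hence the error; suspended I/O is resumed per device minor
instead:

    drbdsetup resume-io <minor>

which is what I tried next.)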

root@vm-box-4:~# drbdsetup resume-io 1001
root@vm-box-4:~# drbdadm status
testvol role:Secondary
  disk:Negotiating blocked:upper
  vm-box-2 role:Secondary
    peer-disk:UpToDate

vm-115-disk-0 role:Primary
  disk:UpToDate
  vm-box-2 role:Secondary
    peer-disk:UpToDate

I tried to remove the diskless node (vm-box-3) from the resources, but that
didn't help unblock the volume. Since I wasn't able to find a way to unblock
it, I just deleted it on the LINSTOR controller:
root@linstor-controller:~# linstor vd delete testvol 0
SUCCESS:
Description:
    Volume definition with number '0' of resource definition 'testvol' marked for deletion.
Details:
    Volume definition with number '0' of resource definition 'testvol' UUID is: 350a1cfb-590d-4cbc-b97c-59a4187562da
SUCCESS:
    Deleted volume 0 of 'testvol' on 'vm-box-2'
SUCCESS:
    Deleted volume 0 of 'testvol' on 'vm-box-4'
SUCCESS:
Description:
    Volume definition with number '0' of resource definition 'testvol' deleted.
Details:
    Volume definition with number '0' of resource definition 'testvol' UUID was: 350a1cfb-590d-4cbc-b97c-59a4187562da
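
(To clean up completely afterwards, the now-empty resource definition can be
removed as well:

    linstor rd delete testvol

)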

>> The log displays
>> just this w/o any progress, so I had to stop the disk moving:
>> create full clone of drive scsi0 (LVM-Storage:126/vm-126-disk-0.qcow2)
>> trying to acquire cfs lock 'storage-drbdpool' ...
>> transferred: 0 bytes remaining: 10739277824 bytes total: 10739277824 bytes progression: 0.00 %

> I cannot provide much help with those messages, since they originate
> neither from LINSTOR nor from DRBD.
> The "trying to acquire cfs lock" message appears to be issued by
> Proxmox, and may be related to communication problems with Corosync's
> cluster link.

Anyway, that does not look like a Proxmox issue.

-- 
Best regards,
Alex Kolesnik


