[DRBD-user] high io when diskless node added to the storage pool
Alex Kolesnik
drbd-user at abisoft.biz
Tue Sep 10 17:40:15 CEST 2019
Hello Robert,
Any news on this issue?
>> On 9/3/19 2:01 PM, Alex Kolesnik wrote:
>>> moving a drive to drbdpool increases the nodes' IO enormously while nothing
>>> seems to be going on (well, the disk does seem to be moving, but VERY slowly).
>> Does writing anything else to the volume show normal performance, or is
>> the performance degraded as well?
> The performance becomes normal as soon as I delete the diskless node from the
> configuration.
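> For reference, removing the diskless assignment looked roughly like this (a
> sketch, assuming the diskless node is vm-box-3 and the resource is
> vm-115-disk-0):
> root at linstor-controller:~# linstor resource delete vm-box-3 vm-115-disk-0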
> To experiment with writing to the volume, I decided to create a test volume,
> and here is what I got:
> root at linstor-controller:~# linstor rd create testvol
> SUCCESS:
> Description:
> New resource definition 'testvol' created.
> Details:
> Resource definition 'testvol' UUID is: dc16030a-3a1b-4d4d-876a-87f8713932cf
> root at linstor-controller:~# linstor resource create --diskless-on-remaining
> --storage-pool drbdpool vm-box-2 vm-box-4 testvol
> SUCCESS:
> Description:
> New resource 'testvol' on node 'vm-box-2' registered.
> Details:
> Resource 'testvol' on node 'vm-box-2' UUID is: 2c2119ce-cd74-4785-a147-dae0e2e90694
> SUCCESS:
> Description:
> New resource 'testvol' on node 'vm-box-4' registered.
> Details:
> Resource 'testvol' on node 'vm-box-4' UUID is: 296b8eb5-e0ec-40d4-85ad-2f786a2e65e0
> SUCCESS:
> Created resource 'testvol' on 'vm-box-4'
> SUCCESS:
> Created resource 'testvol' on 'vm-box-2'
> WARNING:
> No volumes have been defined for resource 'testvol'
> root at linstor-controller:~# linstor vd create --storage-pool drbdpool -n 0 testvol 10G
> SUCCESS:
> New volume definition with number '0' of resource definition 'testvol' created.
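> As an aside, the WARNING above showed up because I created the resources
> before any volume definition existed. Creating the volume definition first
> avoids it; roughly (a sketch of the same setup in the usual order):
> linstor rd create testvol
> linstor vd create testvol 10G
> linstor r create --storage-pool drbdpool vm-box-2 vm-box-4 testvol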
> root at linstor-controller:~# linstor vd list
> ╭─────────────────────────────────────────────────────────╮
> ┊ ResourceName  ┊ VolumeNr ┊ VolumeMinor ┊ Size   ┊ State ┊
> ╞═════════════════════════════════════════════════════════╡
> ┊ testvol       ┊ 0        ┊ 1001        ┊ 10 GiB ┊ ok    ┊
> ┊ vm-115-disk-0 ┊ 0        ┊ 1000        ┊ 2 GiB  ┊ ok    ┊
> ╰─────────────────────────────────────────────────────────╯
> root at linstor-controller:~# linstor rd list
> ╭──────────────────────────────────────────────╮
> ┊ ResourceName  ┊ Port ┊ ResourceGroup ┊ State ┊
> ╞══════════════════════════════════════════════╡
> ┊ testvol       ┊ 7001 ┊ DfltRscGrp    ┊ ok    ┊
> ┊ vm-115-disk-0 ┊ 7000 ┊ DfltRscGrp    ┊ ok    ┊
> ╰──────────────────────────────────────────────╯
> root at linstor-controller:~# linstor r list
> ^C
> linstor: Client exiting (received SIGINT)
> I had to interrupt listing the resources after about a minute of waiting.
> The reason for that was blocked IO on vm-box-4:
> root at vm-box-2:~# drbdadm status
> testvol role:Secondary
>   disk:UpToDate
>   vm-box-4 role:Secondary
>     peer-disk:UpToDate
> vm-115-disk-0 role:Secondary
>   disk:UpToDate
>   vm-box-4 role:Primary
>     peer-disk:UpToDate
> root at vm-box-2:~# ssh vm-box-4 drbdadm status
> testvol role:Secondary
>   disk:Negotiating blocked:upper
>   vm-box-2 role:Secondary
>     peer-disk:UpToDate
> vm-115-disk-0 role:Primary
>   disk:UpToDate
>   vm-box-2 role:Secondary
>     peer-disk:UpToDate
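> To confirm that upper-layer IO was really stuck, listing processes in
> uninterruptible sleep is a quick check (a sketch; run on the blocked node):
> root at vm-box-4:~# ps axo pid,stat,cmd | awk '$2 ~ /^D/'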
> root at vm-box-2:~# ssh vm-box-3 drbdadm status
> no resources defined!
> root at vm-box-2:~# ssh vm-box-4 drbdadm resume-sync testvol
> testvol: Failure: (135) Sync-pause flag is already cleared
> Command 'drbdsetup resume-sync testvol 0 0' terminated with exit code 10
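> As far as I understand, resume-sync only clears a sync-pause flag, which was
> never set here; the volume was blocked with suspended IO, so resume-io
> against the volume's minor (1001, per 'linstor vd list' above) seemed the
> relevant command: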
> root at vm-box-4:~# drbdsetup resume-io 1001
> root at vm-box-4:~# drbdadm status
> testvol role:Secondary
>   disk:Negotiating blocked:upper
>   vm-box-2 role:Secondary
>     peer-disk:UpToDate
> vm-115-disk-0 role:Primary
>   disk:UpToDate
>   vm-box-2 role:Secondary
>     peer-disk:UpToDate
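> A renegotiation can presumably be forced by bouncing the resource, roughly
> like this (a sketch, not something I tried; down/up is disruptive and
> assumes nothing has the device open):
> root at vm-box-4:~# drbdadm down testvol
> root at vm-box-4:~# drbdadm up testvol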
> I tried to remove the diskless node (vm-box-3) from the resources, but that
> didn't help with unblocking the volume. Since I wasn't able to find a way to
> unblock the volume, I just deleted it on the LINSTOR controller:
> root at linstor-controller:~# linstor vd delete testvol 0
> SUCCESS:
> Description:
> Volume definition with number '0' of resource definition 'testvol' marked for deletion.
> Details:
> Volume definition with number '0' of resource definition 'testvol' UUID
> is: 350a1cfb-590d-4cbc-b97c-59a4187562da
> SUCCESS:
> Deleted volume 0 of 'testvol' on 'vm-box-2'
> SUCCESS:
> Deleted volume 0 of 'testvol' on 'vm-box-4'
> SUCCESS:
> Description:
> Volume definition with number '0' of resource definition 'testvol' deleted.
> Details:
> Volume definition with number '0' of resource definition 'testvol' UUID
> was: 350a1cfb-590d-4cbc-b97c-59a4187562da
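> To get rid of the leftover resource definition as well, deleting it on the
> controller should be enough (a sketch):
> root at linstor-controller:~# linstor rd delete testvol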
>>> The log displays just this without any progress, so I had to stop the disk
>>> move:
>>> create full clone of drive scsi0 (LVM-Storage:126/vm-126-disk-0.qcow2)
>>> trying to acquire cfs lock 'storage-drbdpool' ...
>>> transferred: 0 bytes remaining: 10739277824 bytes total: 10739277824 bytes progression: 0.00 %
>> I cannot provide much help with those messages, since they originate
>> neither from LINSTOR nor from DRBD.
>> The "trying to acquire cfs lock" message appears to be issued by
>> Proxmox, and may be related to communication problems with Corosync's
>> cluster link.
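>> If Corosync is the suspect, the state of its cluster links can be checked
>> directly on any node (a sketch):
>> corosync-cfgtool -s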
> Anyway, that does not look like a Proxmox issue.
--
Best regards,
Alex Kolesnik