[DRBD-user] linstor-gateway 1.3.0 on Debian 11.7

Nicolas Bélan nicolas.belan at gmail.com
Fri Oct 27 11:08:36 CEST 2023


Hello,

I am trying to deploy linstor-gateway on a 3-node cluster running Debian 11.7.

I added a "target id" parameter to linstor-gateway to set the "tid"
parameter on the OCF resources, because without it I got:

ocf-exit-reason:Missing resource parameter "tid"!

But I still get an error from tgt.

Well, here are the details:

root@linstor-01:~# cat /proc/drbd
version: 9.2.5 (api:2/proto:86-122)
GIT-hash: b44520271e63d4b6f359a6642eb4d475b7cc04e0 build by root@linstor-01, 2023-10-10 01:29:10
Transports (api:18): tcp (9.2.5)

root@linstor-01:~# drbdadm -V
DRBDADM_BUILDTAG=GIT-hash:\ bb297231c27690a31bf527e8bf77dca1fc2ce268\ build\ by\ root@linstor-01\,\ 2023-10-10\ 23:37:11
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x090205
DRBD_KERNEL_VERSION=9.2.5
DRBDADM_VERSION_CODE=0x091900
DRBDADM_VERSION=9.25.0

I am trying to provide a 10G iSCSI device with this command:

root@linstor-01:~# linstor-gateway iscsi create iqn.2023-10.com.example:test05 10.105.0.30/24 10G -r oneRessourceGroup --implementation tgt -t 2
Created iSCSI target 'iqn.2023-10.com.example:test05'

So, no error is reported on creation.

Beforehand, I created a DRBD device 'linstor_db', which is replicated
across all nodes and mounts successfully.

Here is some "linstor" output:

node
╭────────────────────────────────────────────────────────────╮
┊ Node       ┊ NodeType  ┊ Addresses                ┊ State  ┊
╞════════════════════════════════════════════════════════════╡
┊ linstor-01 ┊ SATELLITE ┊ 10.105.0.31:3366 (PLAIN) ┊ Online ┊
┊ linstor-02 ┊ SATELLITE ┊ 10.105.0.32:3366 (PLAIN) ┊ Online ┊
┊ linstor-03 ┊ SATELLITE ┊ 10.105.0.33:3366 (PLAIN) ┊ Online ┊
╰────────────────────────────────────────────────────────────╯
physical-storage
╭───────────────────────────╮
┊ Size ┊ Rotational ┊ Nodes ┊
╞═══════════════════════════╡
╰───────────────────────────╯
storage-pool
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node       ┊ Driver   ┊ PoolName ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName                      ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ linstor-01 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ linstor-01;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ linstor-02 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ linstor-02;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ linstor-03 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ linstor-03;DfltDisklessStorPool ┊
┊ storage              ┊ linstor-01 ┊ ZFS      ┊ storage  ┊     8.44 TiB ┊     10.91 TiB ┊ True         ┊ Ok    ┊ linstor-01;storage              ┊
┊ storage              ┊ linstor-02 ┊ ZFS      ┊ storage  ┊     8.42 TiB ┊     10.91 TiB ┊ True         ┊ Ok    ┊ linstor-02;storage              ┊
┊ storage              ┊ linstor-03 ┊ ZFS      ┊ storage  ┊     8.42 TiB ┊     10.91 TiB ┊ True         ┊ Ok    ┊ linstor-03;storage              ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
resource-group
╭────────────────────────────────────────────────────────────────────╮
┊ ResourceGroup     ┊ SelectFilter            ┊ VlmNrs ┊ Description ┊
╞════════════════════════════════════════════════════════════════════╡
┊ DfltRscGrp        ┊ PlaceCount: 2           ┊        ┊             ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ oneRessourceGroup ┊ PlaceCount: 2           ┊ 0      ┊             ┊
┊                   ┊ StoragePool(s): storage ┊        ┊             ┊
╰────────────────────────────────────────────────────────────────────╯
resource
╭────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node       ┊ Port ┊ Usage  ┊ Conns ┊    State ┊ CreatedOn           ┊
╞════════════════════════════════════════════════════════════════════════════════════╡
┊ linstor_db   ┊ linstor-01 ┊ 7001 ┊ InUse  ┊ Ok    ┊ UpToDate ┊ 2023-10-14 00:07:02 ┊
┊ linstor_db   ┊ linstor-02 ┊ 7001 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2023-10-14 00:07:02 ┊
┊ linstor_db   ┊ linstor-03 ┊ 7001 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2023-10-14 00:07:02 ┊
┊ test05       ┊ linstor-01 ┊ 7000 ┊ Unused ┊ Ok    ┊ Diskless ┊ 2023-10-27 10:54:47 ┊
┊ test05       ┊ linstor-02 ┊ 7000 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2023-10-27 10:54:58 ┊
┊ test05       ┊ linstor-03 ┊ 7000 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2023-10-27 10:54:58 ┊
╰────────────────────────────────────────────────────────────────────────────────────╯
volume-definition
╭─────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ VolumeNr ┊ VolumeMinor ┊ Size    ┊ Gross ┊ State ┊
╞═════════════════════════════════════════════════════════════════╡
┊ linstor_db   ┊ 0        ┊ 1001        ┊ 200 MiB ┊       ┊ ok    ┊
┊ test05       ┊ 0        ┊ 1000        ┊ 64 MiB  ┊       ┊ ok    ┊
┊ test05       ┊ 1        ┊ 1002        ┊ 10 GiB  ┊       ┊ ok    ┊
╰─────────────────────────────────────────────────────────────────╯
resource-definition
╭─────────────────────────────────────────────────╮
┊ ResourceName ┊ Port ┊ ResourceGroup     ┊ State ┊
╞═════════════════════════════════════════════════╡
┊ linstor_db   ┊ 7001 ┊ DfltRscGrp        ┊ ok    ┊
┊ test05       ┊ 7000 ┊ oneRessourceGroup ┊ ok    ┊
╰─────────────────────────────────────────────────╯
volume
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node       ┊ Resource   ┊ StoragePool          ┊ VolNr ┊ MinorNr ┊ DeviceName    ┊ Allocated ┊ InUse  ┊    State ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ linstor-01 ┊ linstor_db ┊ storage              ┊     0 ┊    1001 ┊ /dev/drbd1001 ┊ 18.61 MiB ┊ InUse  ┊ UpToDate ┊
┊ linstor-02 ┊ linstor_db ┊ storage              ┊     0 ┊    1001 ┊ /dev/drbd1001 ┊ 18.61 MiB ┊ Unused ┊ UpToDate ┊
┊ linstor-03 ┊ linstor_db ┊ storage              ┊     0 ┊    1001 ┊ /dev/drbd1001 ┊ 18.61 MiB ┊ Unused ┊ UpToDate ┊
┊ linstor-01 ┊ test05     ┊ DfltDisklessStorPool ┊     0 ┊    1000 ┊ /dev/drbd1000 ┊           ┊ Unused ┊ Diskless ┊
┊ linstor-01 ┊ test05     ┊ DfltDisklessStorPool ┊     1 ┊    1002 ┊ /dev/drbd1002 ┊           ┊ Unused ┊ Diskless ┊
┊ linstor-02 ┊ test05     ┊ storage              ┊     0 ┊    1000 ┊ /dev/drbd1000 ┊   204 KiB ┊ Unused ┊ UpToDate ┊
┊ linstor-02 ┊ test05     ┊ storage              ┊     1 ┊    1002 ┊ /dev/drbd1002 ┊  3.67 MiB ┊ Unused ┊ UpToDate ┊
┊ linstor-03 ┊ test05     ┊ storage              ┊     0 ┊    1000 ┊ /dev/drbd1000 ┊   204 KiB ┊ Unused ┊ UpToDate ┊
┊ linstor-03 ┊ test05     ┊ storage              ┊     1 ┊    1002 ┊ /dev/drbd1002 ┊  3.67 MiB ┊ Unused ┊ UpToDate ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

+--------------------------------+----------------+---------------+-----+---------------+
|              IQN               |   Service IP   | Service state | LUN | LINSTOR state |
+--------------------------------+----------------+---------------+-----+---------------+
| iqn.2023-10.com.example:test05 | 10.105.0.30/24 | Stopped       |   1 | OK            |
+--------------------------------+----------------+---------------+-----+---------------+

The service is stopped, but no error seems to be reported.

root@linstor-01:~# linstor-gateway iscsi start iqn.2023-10.com.example:test05
Started target "iqn.2023-10.com.example:test05"

root@linstor-01:~# linstor-gateway iscsi list
+--------------------------------+----------------+---------------+-----+---------------+
|              IQN               |   Service IP   | Service state | LUN | LINSTOR state |
+--------------------------------+----------------+---------------+-----+---------------+
| iqn.2023-10.com.example:test05 | 10.105.0.30/24 | Stopped       |   1 | OK            |
+--------------------------------+----------------+---------------+-----+---------------+

The service is still stopped...

If I "watch" drbdadm status, I see the "Primary" role cycle through
all the servers, each one falling back to Secondary.

(on the third node)

test05 role:Secondary
   volume:0 disk:UpToDate
   volume:1 disk:UpToDate
   linstor-01 role:Secondary
     volume:0 peer-disk:Diskless
     volume:1 peer-disk:Diskless
   linstor-02 role:Secondary
     volume:0 peer-disk:UpToDate
     volume:1 peer-disk:UpToDate
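
To follow this flapping without reading the full status each time, a small filter could reduce the output to one line per node. This is only a sketch: drbd_roles is a hypothetical helper name, and it assumes the "name role:State" layout shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of drbd-utils): reads `drbdadm status`
# output on stdin and prints "<name> <role>" for every line that
# carries a role: field. Volume lines have no role: field and are skipped.
drbd_roles() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^role:/) {
                sub(/^role:/, "", $i)
                print $1, $i
            }
        }
    }'
}

# Possible use on a node (assumes the resource name test05 from this post):
#   drbdadm status test05 | drbd_roles
```

Run under "watch", this would show which node currently holds the Primary role and when it drops back.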

So, digging into journalctl:

Oct 27 11:04:39 linstor-03 drbd-reactor[1731492]: INFO [drbd_reactor::plugin::promoter] systemd_start: systemctl start drbd-services@test05.target
Oct 27 11:04:39 linstor-03 systemd[1]: Starting Promotion of DRBD resource test05...
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Preparing cluster-wide state change 1823090526 (1->-1 3/1)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Aborting local state change 1823090526 to yield to remote state change 1553699760.
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Aborting cluster-wide state change 1823090526 (0ms) rv = -19
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Preparing remote state change 1553699760
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Aborting remote state change 1553699760
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: Preparing remote state change 2189658367
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: Committing remote state change 2189658367 (primary_nodes=4)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: peer( Secondary -> Primary )
Oct 27 11:04:40 linstor-03 kernel: drbd test05/0 drbd1000 linstor-01: received new current UUID: 1EF05D749E76B63D weak_nodes=FFFFFFFFFFFFFFFC
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Preparing remote state change 1032811290
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Committing remote state change 1032811290 (primary_nodes=5)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: peer( Secondary -> Primary )
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Preparing cluster-wide state change 4274765809 (1->-1 3/1)
Oct 27 11:04:40 linstor-03 kernel: drbd test05: State change 4274765809: primary_nodes=7, weak_nodes=FFFFFFFFFFFFFFF8
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Committing cluster-wide state change 4274765809 (0ms)
Oct 27 11:04:40 linstor-03 kernel: drbd test05: role( Secondary -> Primary )
Oct 27 11:04:40 linstor-03 systemd[1]: Finished Promotion of DRBD resource test05.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839779]: Oct 27 11:04:40 INFO: Running start for /dev/drbd/by-res/test05/0 on /srv/ha/internal/test05
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839775]: Filesystem: fs_cluster_private_test05: NOTIFY READY=1 STATUS=calling monitor every 30 seconds
Oct 27 11:04:40 linstor-03 kernel: EXT4-fs (drbd1000): recovery complete
Oct 27 11:04:40 linstor-03 kernel: EXT4-fs (drbd1000): mounted filesystem with ordered data mode. Opts: (null)
Oct 27 11:04:40 linstor-03 kernel: ext4 filesystem being mounted at /srv/ha/internal/test05 supports timestamps until 2038 (0x7fffffff)
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839857]: portblock: pblock0_test05: NOTIFY READY=1 STATUS=calling monitor every 30 seconds
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40 INFO: Adding inet address 10.105.0.30/24 with broadcast address 10.105.0.255 to device enp4s0f0
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40 INFO: Bringing device enp4s0f0 up
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40 INFO: /usr/lib/heartbeat/send_arp  -i 200 -r 5 -p /run/resource-agents/send_arp-10.105.0.30 enp4s0f0 10.105.0.30 auto not_used not_used
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839874]: IPaddr2: service_ip0_test05: NOTIFY READY=1 STATUS=calling monitor every 30 seconds
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839943]: Oct 27 11:04:40 WARNING: Configuration parameter "portals" is not supported by the iSCSI implementation and will be ignored.
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839963]: tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839943]: Oct 27 11:04:40 ERROR: tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected
Oct 27 11:04:40 linstor-03 systemd[1]: ocf.ra@target_test05.service: Main process exited, code=exited, status=1/FAILURE
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839990]: tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected
Oct 27 11:04:40 linstor-03 systemd[1]: ocf.ra@target_test05.service: Failed with result 'exit-code'.
Oct 27 11:04:40 linstor-03 systemd[1]: Failed to start drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Dependency failed for drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Dependency failed for drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 drbd-reactor[2839769]: A dependency job for drbd-services@test05.target failed. See 'journalctl -xe' for details.

The error occurs in the tgt start action, but I do not know how to fix it.

Launching it manually with "tgtd -f" changed nothing; the device is still
not available.

For example:

root@linstor-03:~# tgtd -f
tgtd: iser_ib_init(3431) Failed to initialize RDMA; load kernel modules?
tgtd: work_timer_start(146) use timer_fd based scheduler
tgtd: bs_init(387) use signalfd notification
tgtd: device_mgmt(246) sz:31 params:path=/dev/drbd/by-res/test05/1
tgtd: bs_thread_open(409) 16
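
Since every failed start in the journal carries the same tgtadm message, a filter like the one below could count those lines on each node. It is a sketch under assumptions: count_tgtd_unreachable is a hypothetical name, and the match string is copied from the log above. My reading that this message means tgtadm could not reach a running tgtd is an assumption, not something I have confirmed.

```shell
#!/bin/sh
# Hypothetical helper: reads journal text on stdin and prints how many
# lines contain the tgtadm error seen in the log above.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function usable under "set -e" while still printing 0.
count_tgtd_unreachable() {
    grep -c 'tgtadm: failed to send request hdr to tgt daemon' || true
}

# Possible use on each node:
#   journalctl -b | count_tgtd_unreachable
# A direct check while tgtd is supposed to be running would be:
#   tgtadm --lld iscsi --mode target --op show
```

A non-zero count on every node would suggest the problem is not specific to linstor-03.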

Do you have any idea how to bring it up? I have run out of ideas.

Thank you for any help you may provide.

Regards,

Nicolas.

PS: my "fix" is pushed to my fork:
https://github.com/nicolasb827/linstor-gateway/tree/target-id-parameter


