[DRBD-user] linstor-gateway 1.3.0 on Debian 11.7
Nicolas Bélan
nicolas.belan at gmail.com
Fri Oct 27 11:08:36 CEST 2023
Hello,
I am trying to deploy linstor-gateway on a three-node cluster running Debian 11.7.
I added a "target id" parameter to linstor-gateway to supply the "tid"
parameter to the OCF resources, because without it I got:
ocf-exit-reason:Missing resource parameter "tid"!
However, I still get an error from tgt.
Here are the details:
root@linstor-01:~# cat /proc/drbd
version: 9.2.5 (api:2/proto:86-122)
GIT-hash: b44520271e63d4b6f359a6642eb4d475b7cc04e0 build by root@linstor-01, 2023-10-10 01:29:10
Transports (api:18): tcp (9.2.5)
root@linstor-01:~# drbdadm -V
DRBDADM_BUILDTAG=GIT-hash:\ bb297231c27690a31bf527e8bf77dca1fc2ce268\ build\ by\ root@linstor-01\,\ 2023-10-10\ 23:37:11
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x090205
DRBD_KERNEL_VERSION=9.2.5
DRBDADM_VERSION_CODE=0x091900
DRBDADM_VERSION=9.25.0
I am trying to provide a 10G iSCSI device with the following command:
root@linstor-01:~# linstor-gateway iscsi create iqn.2023-10.com.example:test05 10.105.0.30/24 10G -r oneRessourceGroup --implementation tgt -t 2
Created iSCSI target 'iqn.2023-10.com.example:test05'
So no error is reported on creation.
Before that, I created a DRBD device 'linstor_db', which is replicated
across all nodes and mounts successfully.
Here is some "linstor" output:
node
╭────────────────────────────────────────────────────────────╮
┊ Node ┊ NodeType ┊ Addresses ┊ State ┊
╞════════════════════════════════════════════════════════════╡
┊ linstor-01 ┊ SATELLITE ┊ 10.105.0.31:3366 (PLAIN) ┊ Online ┊
┊ linstor-02 ┊ SATELLITE ┊ 10.105.0.32:3366 (PLAIN) ┊ Online ┊
┊ linstor-03 ┊ SATELLITE ┊ 10.105.0.33:3366 (PLAIN) ┊ Online ┊
╰────────────────────────────────────────────────────────────╯
physical-storage
╭───────────────────────────╮
┊ Size ┊ Rotational ┊ Nodes ┊
╞═══════════════════════════╡
╰───────────────────────────╯
storage-pool
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node       ┊ Driver   ┊ PoolName ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName                      ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ linstor-01 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ linstor-01;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ linstor-02 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ linstor-02;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ linstor-03 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ linstor-03;DfltDisklessStorPool ┊
┊ storage              ┊ linstor-01 ┊ ZFS      ┊ storage  ┊ 8.44 TiB     ┊ 10.91 TiB     ┊ True         ┊ Ok    ┊ linstor-01;storage              ┊
┊ storage              ┊ linstor-02 ┊ ZFS      ┊ storage  ┊ 8.42 TiB     ┊ 10.91 TiB     ┊ True         ┊ Ok    ┊ linstor-02;storage              ┊
┊ storage              ┊ linstor-03 ┊ ZFS      ┊ storage  ┊ 8.42 TiB     ┊ 10.91 TiB     ┊ True         ┊ Ok    ┊ linstor-03;storage              ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
resource-group
╭────────────────────────────────────────────────────────────────────╮
┊ ResourceGroup ┊ SelectFilter ┊ VlmNrs ┊ Description ┊
╞════════════════════════════════════════════════════════════════════╡
┊ DfltRscGrp ┊ PlaceCount: 2 ┊ ┊ ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ oneRessourceGroup ┊ PlaceCount: 2 ┊ 0 ┊ ┊
┊ ┊ StoragePool(s): storage ┊ ┊ ┊
╰────────────────────────────────────────────────────────────────────╯
resource
╭────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node       ┊ Port ┊ Usage  ┊ Conns ┊ State    ┊ CreatedOn           ┊
╞════════════════════════════════════════════════════════════════════════════════════╡
┊ linstor_db   ┊ linstor-01 ┊ 7001 ┊ InUse  ┊ Ok    ┊ UpToDate ┊ 2023-10-14 00:07:02 ┊
┊ linstor_db   ┊ linstor-02 ┊ 7001 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2023-10-14 00:07:02 ┊
┊ linstor_db   ┊ linstor-03 ┊ 7001 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2023-10-14 00:07:02 ┊
┊ test05       ┊ linstor-01 ┊ 7000 ┊ Unused ┊ Ok    ┊ Diskless ┊ 2023-10-27 10:54:47 ┊
┊ test05       ┊ linstor-02 ┊ 7000 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2023-10-27 10:54:58 ┊
┊ test05       ┊ linstor-03 ┊ 7000 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2023-10-27 10:54:58 ┊
╰────────────────────────────────────────────────────────────────────────────────────╯
volume-definition
╭─────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ VolumeNr ┊ VolumeMinor ┊ Size ┊ Gross ┊ State ┊
╞═════════════════════════════════════════════════════════════════╡
┊ linstor_db ┊ 0 ┊ 1001 ┊ 200 MiB ┊ ┊ ok ┊
┊ test05 ┊ 0 ┊ 1000 ┊ 64 MiB ┊ ┊ ok ┊
┊ test05 ┊ 1 ┊ 1002 ┊ 10 GiB ┊ ┊ ok ┊
╰─────────────────────────────────────────────────────────────────╯
resource-definition
╭─────────────────────────────────────────────────╮
┊ ResourceName ┊ Port ┊ ResourceGroup ┊ State ┊
╞═════════════════════════════════════════════════╡
┊ linstor_db ┊ 7001 ┊ DfltRscGrp ┊ ok ┊
┊ test05 ┊ 7000 ┊ oneRessourceGroup ┊ ok ┊
╰─────────────────────────────────────────────────╯
volume
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node       ┊ Resource   ┊ StoragePool          ┊ VolNr ┊ MinorNr ┊ DeviceName    ┊ Allocated ┊ InUse  ┊ State    ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ linstor-01 ┊ linstor_db ┊ storage              ┊ 0     ┊ 1001    ┊ /dev/drbd1001 ┊ 18.61 MiB ┊ InUse  ┊ UpToDate ┊
┊ linstor-02 ┊ linstor_db ┊ storage              ┊ 0     ┊ 1001    ┊ /dev/drbd1001 ┊ 18.61 MiB ┊ Unused ┊ UpToDate ┊
┊ linstor-03 ┊ linstor_db ┊ storage              ┊ 0     ┊ 1001    ┊ /dev/drbd1001 ┊ 18.61 MiB ┊ Unused ┊ UpToDate ┊
┊ linstor-01 ┊ test05     ┊ DfltDisklessStorPool ┊ 0     ┊ 1000    ┊ /dev/drbd1000 ┊           ┊ Unused ┊ Diskless ┊
┊ linstor-01 ┊ test05     ┊ DfltDisklessStorPool ┊ 1     ┊ 1002    ┊ /dev/drbd1002 ┊           ┊ Unused ┊ Diskless ┊
┊ linstor-02 ┊ test05     ┊ storage              ┊ 0     ┊ 1000    ┊ /dev/drbd1000 ┊ 204 KiB   ┊ Unused ┊ UpToDate ┊
┊ linstor-02 ┊ test05     ┊ storage              ┊ 1     ┊ 1002    ┊ /dev/drbd1002 ┊ 3.67 MiB  ┊ Unused ┊ UpToDate ┊
┊ linstor-03 ┊ test05     ┊ storage              ┊ 0     ┊ 1000    ┊ /dev/drbd1000 ┊ 204 KiB   ┊ Unused ┊ UpToDate ┊
┊ linstor-03 ┊ test05     ┊ storage              ┊ 1     ┊ 1002    ┊ /dev/drbd1002 ┊ 3.67 MiB  ┊ Unused ┊ UpToDate ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+--------------------------------+----------------+---------------+-----+---------------+
| IQN                            | Service IP     | Service state | LUN | LINSTOR state |
+--------------------------------+----------------+---------------+-----+---------------+
| iqn.2023-10.com.example:test05 | 10.105.0.30/24 | Stopped       | 1   | OK            |
+--------------------------------+----------------+---------------+-----+---------------+
The service is stopped, but it seems that there is no error reported.
root@linstor-01:~# linstor-gateway iscsi start iqn.2023-10.com.example:test05
Started target "iqn.2023-10.com.example:test05"
root@linstor-01:~# linstor-gateway iscsi list
+--------------------------------+----------------+---------------+-----+---------------+
| IQN                            | Service IP     | Service state | LUN | LINSTOR state |
+--------------------------------+----------------+---------------+-----+---------------+
| iqn.2023-10.com.example:test05 | 10.105.0.30/24 | Stopped       | 1   | OK            |
+--------------------------------+----------------+---------------+-----+---------------+
The service is still stopped.
If I run "drbdadm status" under watch, I see the Primary role cycle through
all the servers, each one then falling back to Secondary.
(on the third node)
test05 role:Secondary
volume:0 disk:UpToDate
volume:1 disk:UpToDate
linstor-01 role:Secondary
volume:0 peer-disk:Diskless
volume:1 peer-disk:Diskless
linstor-02 role:Secondary
volume:0 peer-disk:UpToDate
volume:1 peer-disk:UpToDate
So, digging into journalctl:
Oct 27 11:04:39 linstor-03 drbd-reactor[1731492]: INFO
[drbd_reactor::plugin::promoter] systemd_start: systemctl start
drbd-services@test05.target
Oct 27 11:04:39 linstor-03 systemd[1]: Starting Promotion of DRBD
resource test05...
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Preparing cluster-wide
state change 1823090526 (1->-1 3/1)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Aborting
local state change 1823090526 to yield to remote state change 1553699760.
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Aborting cluster-wide
state change 1823090526 (0ms) rv = -19
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Preparing
remote state change 1553699760
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Aborting
remote state change 1553699760
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: Preparing
remote state change 2189658367
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: Committing
remote state change 2189658367 (primary_nodes=4)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: peer(
Secondary -> Primary )
Oct 27 11:04:40 linstor-03 kernel: drbd test05/0 drbd1000 linstor-01:
received new current UUID: 1EF05D749E76B63D weak_nodes=FFFFFFFFFFFFFFFC
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Preparing
remote state change 1032811290
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Committing
remote state change 1032811290 (primary_nodes=5)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: peer(
Secondary -> Primary )
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Preparing cluster-wide
state change 4274765809 (1->-1 3/1)
Oct 27 11:04:40 linstor-03 kernel: drbd test05: State change 4274765809:
primary_nodes=7, weak_nodes=FFFFFFFFFFFFFFF8
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Committing cluster-wide
state change 4274765809 (0ms)
Oct 27 11:04:40 linstor-03 kernel: drbd test05: role( Secondary -> Primary )
Oct 27 11:04:40 linstor-03 systemd[1]: Finished Promotion of DRBD
resource test05.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled
ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839779]: Oct 27 11:04:40
INFO: Running start for /dev/drbd/by-res/test05/0 on /srv/ha/internal/test05
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839775]: Filesystem:
fs_cluster_private_test05: NOTIFY READY=1 STATUS=calling monitor every
30 seconds
Oct 27 11:04:40 linstor-03 kernel: EXT4-fs (drbd1000): recovery complete
Oct 27 11:04:40 linstor-03 kernel: EXT4-fs (drbd1000): mounted
filesystem with ordered data mode. Opts: (null)
Oct 27 11:04:40 linstor-03 kernel: ext4 filesystem being mounted at
/srv/ha/internal/test05 supports timestamps until 2038 (0x7fffffff)
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled
ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled
ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839857]: portblock:
pblock0_test05: NOTIFY READY=1 STATUS=calling monitor every 30 seconds
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled
ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled
ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40
INFO: Adding inet address 10.105.0.30/24 with broadcast address
10.105.0.255 to device enp4s0f0
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40
INFO: Bringing device enp4s0f0 up
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40
INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p
/run/resource-agents/send_arp-10.105.0.30 enp4s0f0 10.105.0.30 auto
not_used not_used
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839874]: IPaddr2:
service_ip0_test05: NOTIFY READY=1 STATUS=calling monitor every 30 seconds
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled
ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled
ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839943]: Oct 27 11:04:40
WARNING: Configuration parameter "portals" is not supported by the iSCSI
implementation and will be ignored.
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839963]: tgtadm: failed to
send request hdr to tgt daemon, Transport endpoint is not connected
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839943]: Oct 27 11:04:40
ERROR: tgtadm: failed to send request hdr to tgt daemon, Transport
endpoint is not connected
Oct 27 11:04:40 linstor-03 systemd[1]: ocf.ra@target_test05.service:
Main process exited, code=exited, status=1/FAILURE
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839990]: tgtadm: failed to
send request hdr to tgt daemon, Transport endpoint is not connected
Oct 27 11:04:40 linstor-03 systemd[1]: ocf.ra@target_test05.service:
Failed with result 'exit-code'.
Oct 27 11:04:40 linstor-03 systemd[1]: Failed to start drbd-reactor
controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Dependency failed for
drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Dependency failed for
drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 drbd-reactor[2839769]: A dependency job for
drbd-services@test05.target failed. See 'journalctl -xe' for details.
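Most of the log above is routine state-change noise. As a rough sketch, a small filter like the following could reduce such a journal excerpt to the lines that report a problem. The function name journal_failures and the pattern list are only illustrative assumptions, not part of any LINSTOR or drbd-reactor tooling.

```shell
# Keep only the journal lines that look like failures.
# The patterns are a guess based on the messages seen in this log.
journal_failures() {
    grep -E 'ERROR|[Ff]ailed|FAILURE'
}

# Possible usage (not executed here):
#   journalctl --since "2023-10-27 11:04:00" | journal_failures
```

Applied to the excerpt above, this would keep the tgtadm and systemd failure lines and drop the DRBD state-change messages.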
The error occurs in the tgt start action, but I do not know how to fix it.
Launching the daemon manually with "tgtd -f" changed nothing; the device is
still not available.
For example:
root@linstor-03:~# tgtd -f
tgtd: iser_ib_init(3431) Failed to initialize RDMA; load kernel modules?
tgtd: work_timer_start(146) use timer_fd based scheduler
tgtd: bs_init(387) use signalfd notification
tgtd: device_mgmt(246) sz:31 params:path=/dev/drbd/by-res/test05/1
tgtd: bs_thread_open(409) 16
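The "Transport endpoint is not connected" message from tgtadm suggests that it could not reach the tgtd management socket at the moment the resource agent ran. A minimal sketch for probing that by hand is below. The function name tgt_reachable and the TGTADM override variable are hypothetical helpers, and the unit name "tgt" in the hint is an assumption about the Debian package.

```shell
# Return success if the tgt daemon answers a management request.
# TGTADM may be overridden for testing; it defaults to the real tgtadm.
tgt_reachable() {
    "${TGTADM:-tgtadm}" --lld iscsi --mode target --op show >/dev/null 2>&1
}

if tgt_reachable; then
    echo "tgtd answers management requests"
else
    # Assumption: on Debian the daemon is managed by a unit named "tgt".
    echo "tgtd does not answer; try: systemctl status tgt"
fi
```

If the probe fails on the node where drbd-reactor tries to start the target, the daemon is probably not running there when the OCF agent calls tgtadm.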
Do you have any idea how to bring the target up? I have run out of ideas.
Thank you for any help you may provide.
Regards,
Nicolas.
PS: my "fix" is pushed to my fork:
https://github.com/nicolasb827/linstor-gateway/tree/target-id-parameter