[DRBD-user] Missing Linstor/DRBD resource files after reboot

Nicholas Morton kmorton at cancinc.com
Fri Mar 29 16:55:11 CET 2019

>> Can you provide more details on how is your linstor-controller configured ? For example, in the users guide it is recommended to setup a dedicated VM in Proxmox just for the linstor-controller. Is that the way you have configured yours ?

I do have the controller running on a dedicated Debian VM: Linux linstor-ctlr 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3.1 (2019-02-19) x86_64
Its storage is mounted from an NFS share on a FreeNAS server that is available to all the nodes in my cluster. The nodes are all connected to a 10G Quanta LB6M switch dedicated to DRBD traffic on a separate IP subnet; Proxmox and all other network traffic go through another 10G Quanta switch.

>> I have seen the same behaviour (all res files are being deleted from /var/lib/linstor.d only when all proxmox nodes went down and I have to restart them).

I somewhat get why it deletes the res files, but the weird thing is that after a reboot it restores some of the files and leaves others missing. I haven't seen a pattern in which resources fail to be restored, and the number of missing res files is different each time.
With the last restart only 13 of the 27 resources were restored, even after nearly 2 hours. I decided to do another restart and after that it eventually restored all of the resources. ¯\_(ツ)_/¯
I have started maintaining a copy of "/var/lib/linstor.d/" from each node so that I can simply copy the files back in manually if the controller fails to do it.
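
For anyone wanting to script the workaround above, here is a minimal Python sketch of the copy-and-restore step. It is only an illustration of what I do by hand, not anything LINSTOR provides: the function names and the backup location are made up for this example, and the only path taken from this thread is "/var/lib/linstor.d". It copies files and nothing else; it does not talk to the controller or to DRBD.

```python
import shutil
from pathlib import Path

# Directory named in this thread; the satellite keeps its .res files here.
RES_DIR = Path("/var/lib/linstor.d")
# Hypothetical backup location, chosen only for this example.
BACKUP_DIR = Path("/root/linstor.d-backup")


def backup_res_files(res_dir, backup_dir):
    """Copy every *.res file from res_dir into backup_dir.

    Returns the sorted list of file names that were copied.
    Files already in backup_dir that no longer exist in res_dir are
    left alone, so a backup taken after a bad reboot does not lose them.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(res_dir.glob("*.res")):
        shutil.copy2(src, backup_dir / src.name)
        copied.append(src.name)
    return copied


def restore_missing(res_dir, backup_dir):
    """Copy back only the *.res files that are missing from res_dir.

    Existing files in res_dir are never overwritten.
    Returns the sorted list of file names that were restored, which
    also serves as a record of what went missing on this reboot.
    """
    restored = []
    for src in sorted(backup_dir.glob("*.res")):
        dst = res_dir / src.name
        if not dst.exists():
            shutil.copy2(src, dst)
            restored.append(src.name)
    return restored
```

Calling backup_res_files(RES_DIR, BACKUP_DIR) while everything is healthy and restore_missing(RES_DIR, BACKUP_DIR) after a bad reboot mirrors the manual procedure. One caveat of this design: because old backups are never pruned, a resource that was deleted on purpose would also be copied back, so the backup directory needs cleaning by hand when resources are removed.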

Using the "--keep-res" method and matching all resources might do the trick. Does this only affect deletion during startup, though? The controller can still tell a satellite to delete a resource even if "--keep-res" is used, correct?
I didn't find direct documentation on overriding the service configuration, but I will try it in the next few days. I'll also do some node restarts and look more deliberately at any error logs that the linstor satellite and controller create.

--Extra info


