[DRBD-user] Missing Linstor/DRBD resource files after reboot
Gianni Milo
gianni.milo22 at gmail.com
Fri Mar 29 19:38:57 CET 2019
> I do have the controller running on a dedicated Debian VM ---- Linux
> linstor-ctlr 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3.1 (2019-02-19) x86_64
> Its storage is mounted from a NFS share on a FreeNAS server available to
> all the nodes in my cluster.
>
In my case, the controller VM is stored on DRBD itself, hence the need to
make sure that its resource file is available at all times.
> I somewhat get why it deletes the res files, but the weird thing is that it
> will restore some of the files and leave others missing after rebooting. I
> haven't seen a pattern of specific resources not being restored, and it's a
> different number of missing res files each time also.
>
Never seen this behavior with LINSTOR (used to see this when I was using
drbdmanage for managing DRBD though, sometimes). All resource files become
available on all nodes as soon as linstor-satellite service is able to
communicate properly with the controller from that node.
The only case where a resource file does not show up on a node is when
that resource is Diskless on that specific node. In that case, LINSTOR
creates a resource file only when that resource becomes Primary on that
specific node and then removes the resource file as soon as you relocate
the resource to the other node, but I guess this is not your case?
> With the last restart only 13 of the 27 resources were restored, even
> after nearly 2 hours. I decided to do another restart and after that it
> eventually restored all of the resources. ¯\_(ツ)_/¯
>
It might be worth checking the system logs and/or linstor specific logs to
check what's causing this behavior. Assuming that the controller VM is up
and running at all times, satellites should be able to export the res files
on the nodes.
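
To see exactly which files go missing after each reboot, something along
these lines could be run on each node. It is only a rough sketch: the
function name is made up, it assumes the files are named <resource>.res
inside the "/var/lib/linstor.d/" directory you mentioned, and the expected
resource names have to be supplied by hand (e.g. copied from the output of
"linstor resource list").

```shell
#!/bin/sh
# Hypothetical helper: print every expected resource name that has no
# matching <name>.res file in the given directory.
# Usage: missing_res_files DIR NAME...
missing_res_files() {
    dir=$1
    shift
    for name in "$@"; do
        # Assumption: LINSTOR names each file <resource>.res
        [ -f "$dir/$name.res" ] || printf '%s\n' "$name"
    done
}

# Example call (resource names are placeholders):
#   missing_res_files /var/lib/linstor.d vm-100-disk-1 vm-101-disk-1
```

Comparing that list with the output of "journalctl -u linstor-satellite -b"
for the same boot may show whether the satellite failed to reach the
controller at the time.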
> I have started maintaining a copy of the "/var/lib/linstor.d/" from each
> node so that I can simply copy them back in manually if the controller
> fails to do it.
>
That is a workaround, but it defeats the logic of having LINSTOR for the
management layer in the first place. Especially as your cluster scales up,
this can easily become a tedious task.
> Using the "--keep-res" method and matching all resources might do the
> trick.
No, I would not recommend this for preserving all resource files, but would
recommend it only in the case that you need to preserve just one single
resource file, that is the controller VM resource file, assuming that
controller VM is stored within DRBD (that's not your case, you are using
NFS for that). The LINSTOR controller should be able to export the resource
files properly on each node; if that's not happening, then something is not
functioning properly somewhere or you have misconfigured something.
> Does this only affect deleting during startup though? The controller can
> still tell a satellite to delete a resource even if the "--keep-res" is
> used, correct?
>
Yes, this is used only during startup. The controller can still delete a
resource when instructed to do so.
> I didn't find direct documentation on overriding the service configuration,
> but I will try it in the next few days.
>
Just to mention at this point that what worked for me was the following:
# On each node
$ systemctl edit linstor-satellite

# Add the following content
[Unit]
After=drbd.service

[Service]
# Type=oneshot is very important
Type=oneshot
ExecStart=/usr/share/linstor-server/bin/Satellite \
    --logs=/var/log/linstor-satellite --config-directory=/etc/linstor \
    --keep-res vm-100-disk-1
Assuming that vm-100-disk-1 is the resource of your controller VM. But again,
this is not your case.
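
If you would rather not run the interactive editor on every node, the same
override could be written from a script. This is just a sketch: the function
name is made up, and it assumes the default location where "systemctl edit"
stores its drop-in (/etc/systemd/system/linstor-satellite.service.d/override.conf).
The unit content is the same as above.

```shell
#!/bin/sh
# Hypothetical helper: write the satellite override shown above into a
# drop-in directory, with the resource name to keep passed as an argument.
# Usage: write_keep_res_override DROPIN_DIR RESOURCE
write_keep_res_override() {
    dropin_dir=$1
    pattern=$2
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/override.conf" <<EOF
[Unit]
After=drbd.service

[Service]
Type=oneshot
ExecStart=/usr/share/linstor-server/bin/Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor --keep-res $pattern
EOF
}

# Example call, as root (drop-in path assumed, see above):
#   write_keep_res_override /etc/systemd/system/linstor-satellite.service.d vm-100-disk-1
#   systemctl daemon-reload
#   systemctl cat linstor-satellite   # check that the override is picked up
```

Unlike "systemctl edit", writing the file by hand does not reload systemd, so
the "systemctl daemon-reload" step is needed before restarting the satellite.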
G.