<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>I'm having another crack at this; I think it will be worth it
      once it works.</p>
    <p>Firstly, another documentation error:</p>
    <p><a class="moz-txt-link-freetext" href="https://www.linbit.com/drbd-user-guide/linstor-guide-1_0-en/#s-using_the_linstor_client">https://www.linbit.com/drbd-user-guide/linstor-guide-1_0-en/#s-using_the_linstor_client</a><br>
      <blockquote type="cite">
        <div class="paragraph">
          <p>In case anything goes wrong with the storage pool’s
            VG/zPool, e.g. the VG having been renamed or somehow
            became invalid you can delete the storage pool in LINSTOR
            with the following command, given that only
            resources with all their volumes in the so-called ‘lost’
            storage pool are attached. This feature is available
            since LINSTOR v0.9.13.</p>
        </div>
        <div class="listingblock">
          <div class="content">
            <pre># linstor storage-pool lost alpha pool_ssd</pre>
          </div>
        </div>
      </blockquote>
      linstor storage-pool lost castle vg_hdd<br>
      usage: linstor storage-pool [-h]<br>
                                  {create, delete, list,
      list-properties,<br>
                                  set-property} ...<br>
      linstor storage-pool: error: argument {create, delete, list,
      list-properties, set-property}: invalid choice: 'lost' (choose
      from 'create', 'c', 'delete', 'd', 'list', 'l', 'list-properties',
      'lp', 'set-property', 'sp')<br>
    </p>
    <p>Changing to use delete instead of lost:</p>
    <p>castle:~# linstor storage-pool delete castle vg_hdd<br>
      ERROR:<br>
      Description:<br>
          Storage pool definition 'vg_hdd' not found.<br>
      Cause:<br>
          The specified storage pool definition 'vg_hdd' could not be
      found in the database<br>
      Correction:<br>
          Create a storage pool definition 'vg_hdd' first.<br>
      Details:<br>
          Node: castle, Storage pool name: vg_hdd<br>
      Show reports:<br>
          linstor error-reports show 5F0D500C-00000-000000<br>
      castle:~# linstor storage-pool list<br>
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────╮<br>
      ┊ StoragePool          ┊ Node   ┊ Driver   ┊ PoolName ┊
      FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊<br>
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════╡<br>
      ┊ DfltDisklessStorPool ┊ castle ┊ DISKLESS ┊         
      ┊              ┊               ┊ False        ┊ Ok    ┊<br>
      ┊ DfltDisklessStorPool ┊ san5   ┊ DISKLESS ┊         
      ┊              ┊               ┊ False        ┊ Ok    ┊<br>
      ┊ DfltDisklessStorPool ┊ san6   ┊ DISKLESS ┊         
      ┊              ┊               ┊ False        ┊ Ok    ┊<br>
      ┊ pool                 ┊ castle ┊ LVM      ┊ vg_hdd   ┊     2.95
      TiB ┊      3.44 TiB ┊ False        ┊ Ok    ┊<br>
      ┊ pool                 ┊ san5   ┊ LVM      ┊ vg_hdd   ┊     3.87
      TiB ┊      4.36 TiB ┊ False        ┊ Ok    ┊<br>
      ┊ pool                 ┊ san6   ┊ LVM      ┊ vg_ssd   ┊     1.26
      TiB ┊      1.75 TiB ┊ False        ┊ Ok    ┊<br>
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯<br>
    </p>
    <p>I was hoping I could just remove the storage pool from castle
      (since it doesn't seem to be working properly), then destroy
      it, re-create it, re-add it, and see if that solves the
      problem. However, although vg_hdd shows up in the list, LINSTOR
      reports it as not found when I try to delete it.</p>
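    <p>One thing that may be worth checking, going by the table above:
      the StoragePool column shows pool, while vg_hdd appears under
      PoolName, so the delete command may be expecting the former. The
      following is only a minimal sketch, assuming each table row arrives
      on one line in the format shown above; the helper name pool_for_vg
      is made up for illustration.</p>

```shell
# Hypothetical helper (not part of the linstor client): read the output of
# `linstor storage-pool list` on stdin and print the StoragePool name that is
# registered on the given node for the given backing VG (PoolName column).
pool_for_vg() {
    awk -F'┊' -v node="$1" -v vg="$2" '
        # Strip leading and trailing blanks from one table cell.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Data rows: $2 = StoragePool, $3 = Node, $4 = Driver, $5 = PoolName.
        NF >= 5 {
            if (trim($3) == node) { if (trim($5) == vg) print trim($2) }
        }'
}

# Example use (needs a reachable LINSTOR controller, so left as comments):
#   linstor storage-pool list | pool_for_vg castle vg_hdd
#   linstor storage-pool delete castle pool   # if the line above printed "pool"
```

    <p>Note that this only resolves the name; whether the delete then
      succeeds while a resource is still deployed in that pool is a
      separate question.</p>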
    <p>Possibly part of the cause of my original problem is that I have
      a script that automatically creates a snapshot for each LV, and
      this created a snapshot of testvm1_00000 named
      backup_testvm1_00000_blahblah.... I've now manually deleted that
      snapshot and fixed my script so it no longer touches the VG
      allocated to LINSTOR, but so far there is no change in the current
      status (as per below).</p>
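    <p>For anyone with a similar backup script, the exclusion could look
      roughly like this. It is a sketch only, not my actual script: the
      function name, the EXCLUDE_VGS variable and its default value are
      assumptions made up for illustration.</p>

```shell
# Hypothetical filter for a snapshot script: read "vg lv" pairs on stdin
# (for example from `lvs --noheadings -o vg_name,lv_name`) and print only the
# LVs whose volume group is NOT listed in EXCLUDE_VGS.
# Assumption: vg_hdd and vg_ssd are the VGs handed over to LINSTOR.
EXCLUDE_VGS="${EXCLUDE_VGS:-vg_hdd vg_ssd}"

skip_linstor_vgs() {
    while read -r vg lv; do
        # Ignore blank lines.
        [ -n "$vg" ] || continue
        skip=0
        for ex in $EXCLUDE_VGS; do
            if [ "$vg" = "$ex" ]; then skip=1; fi
        done
        if [ "$skip" -eq 0 ]; then printf '%s/%s\n' "$vg" "$lv"; fi
    done
}

# Example use (left as a comment because it needs LVM and root):
#   lvs --noheadings -o vg_name,lv_name | skip_linstor_vgs
```

    <p>The snapshot loop would then only iterate over the names this
      filter prints.</p>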
    <p>Would appreciate any suggestions on what might be going wrong,
      and/or how to fix it?</p>
    <p>Regards,<br>
      Adam</p>
    <div class="moz-cite-prefix">On 24/6/20 11:46, Adam Goryachev wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:5c4f368d-5d93-65f6-d70b-88d44d3636ec@websitemanagers.com.au">
      <div class="moz-cite-prefix">On 23/6/20 21:53, Gábor Hernádi
        wrote:<br>
      </div>
      <blockquote type="cite"
cite="mid:CAL16tNa88c+jOP8Uffbw08Sf2vVsg60sMVg5nR_cSdWz5iZSNQ@mail.gmail.com">
        <div dir="ltr">
          <div dir="ltr">
            <div>Hi,</div>
            <div><br>
            </div>
            <div>apparently something is quite broken... maybe it's
              somehow your setup or environment, I am not sure...<br>
            </div>
          </div>
          <br>
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              <div>
                <p>linstor resource list<br>
╭────────────────────────────────────────────────────────────────────────────╮<br>
                  ┊ ResourceName ┊ Node   ┊ Port ┊ Usage  ┊
                  Conns                   ┊    State ┊<br>
╞════════════════════════════════════════════════════════════════════════════╡<br>
                  ┊ testvm1      ┊ castle ┊ 7000 ┊       
                  ┊                         ┊  Unknown ┊<br>
                  ┊ testvm1      ┊ san5   ┊ 7000 ┊       
                  ┊                         ┊  Unknown ┊<br>
                  ┊ testvm1      ┊ san6   ┊ 7000 ┊ Unused ┊
                  Connecting(san5,castle) ┊ UpToDate ┊<br>
╰────────────────────────────────────────────────────────────────────────────╯<br>
                </p>
              </div>
            </blockquote>
            <div>This looks like some kind of network issue. <br>
            </div>
            <br>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              <div>
                <div>
                  <div>
                    <pre># linstor storage-pool list --groupby Size</pre>
                  </div>
                </div>
                <div>However, the second command produces a usage error
                  (documentation bug perhaps). </div>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>Thanks for reporting, we will look into this.<br>
            </div>
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              <div>
                <div> WARNING:<br>
                  Description:<br>
                      No active connection to satellite 'san5'<br>
                  Details:<br>
                      The controller is trying to (re-) establish a
                  connection to the satellite. The controller stored the
                  changes and as soon the satellite is connected, it
                  will receive this update.</div>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>So LINSTOR obviously has no connection to satellite
              'san5'.  <br>
            </div>
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              <div>
                <div> </div>
                [95078.599813] drbd testvm1 castle: conn( Unconnected
                -&gt; Connecting )<br>
                <div> [95078.604454] drbd testvm1 san5: conn(
                  Unconnected -&gt; Connecting )</div>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>... and DRBD apparently also has trouble connecting...</div>
            <div><br>
            </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              <div>
                <div>linstor n l<br>
╭───────────────────────────────────────────────────────────╮<br>
                  ┊ Node   ┊ NodeType  ┊ Addresses                  ┊
                  State   ┊<br>
╞═══════════════════════════════════════════════════════════╡<br>
                  ┊ castle ┊ SATELLITE ┊ 192.168.5.204:3366 (PLAIN) ┊ Unknown ┊<br>
                  ┊ san5   ┊ SATELLITE ┊ 192.168.5.205:3366 (PLAIN) ┊ Unknown ┊<br>
                  ┊ san6   ┊ SATELLITE ┊ 192.168.5.206:3366 (PLAIN) ┊ Unknown ┊<br>
╰───────────────────────────────────────────────────────────╯<br>
                </div>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>Now this is really strange. I will spare you the
              details, but I assume you have triggered some bad
              exception in LINSTOR which somehow killed a necessary
              thread. <br>
            </div>
            <div>You should check <br>
            </div>
            <div>   linstor err list</div>
            <div>and see if you can find some related error reports. <br>
            </div>
            <div>Also, restarting the controller might help you here.<br>
            </div>
          </div>
          <br>
        </div>
      </blockquote>
      <p>Thank you!<br>
      </p>
      <p>linstor err list showed a list of errors, but the contents
        didn't make a lot of sense to me. Let me know if you are
        interested in them, and I can send them.<br>
      </p>
      <p>I did a systemctl restart linstor-controller.service on san6,
        and things started looking much better.</p>
      <p>linstor n l<br>
        ╭──────────────────────────────────────────────────────────╮<br>
        ┊ Node   ┊ NodeType  ┊ Addresses                  ┊ State  ┊<br>
        ╞══════════════════════════════════════════════════════════╡<br>
        ┊ castle ┊ SATELLITE ┊ 192.168.5.204:3366 (PLAIN) ┊ Online ┊<br>
        ┊ san5   ┊ SATELLITE ┊ 192.168.5.205:3366 (PLAIN) ┊ Online ┊<br>
        ┊ san6   ┊ SATELLITE ┊ 192.168.5.206:3366 (PLAIN) ┊ Online ┊<br>
        ╰──────────────────────────────────────────────────────────╯<br>
      </p>
      <p>So, all nodes agree that they are now online and talking to
        each other. I assume this proves there are no network issues.</p>
      <p>linstor resource list<br>
╭─────────────────────────────────────────────────────────────────────────────────╮<br>
        ┊ ResourceName ┊ Node   ┊ Port ┊ Usage  ┊ Conns             
        ┊              State ┊<br>
╞═════════════════════════════════════════════════════════════════════════════════╡<br>
        ┊ testvm1      ┊ castle ┊ 7000 ┊        ┊                   
        ┊            Unknown ┊<br>
        ┊ testvm1      ┊ san5   ┊ 7000 ┊ Unused ┊ Connecting(castle) ┊
        SyncTarget(12.67%) ┊<br>
        ┊ testvm1      ┊ san6   ┊ 7000 ┊ Unused ┊ Connecting(castle)
        ┊           UpToDate ┊<br>
╰─────────────────────────────────────────────────────────────────────────────────╯<br>
      </p>
      <p>From this, it looks like san6 (the controller) thinks it has
        the up-to-date data, probably because the resource was created
        there first. The data is syncing to san5 (in
        progress, and progressing steadily), which is also good.
        However, castle doesn't seem to be syncing or connecting. <br>
      </p>
      <p>On castle, I see this:</p>
      <p>Jun 24 11:01:55 castle Satellite[7499]: 11:01:55.177
        [DeviceManager] ERROR LINSTOR/Satellite - SYSTEM - Failed to
        create meta-data for DRBD volume testvm1/0 [Report number
        5EF2A316-31431-000002]<br>
      </p>
      <p>linstor err show gives this:</p>
      <p>ERROR REPORT 5EF2A316-31431-000002<br>
        <br>
        ============================================================<br>
        <br>
        Application:                        LINBIT® LINSTOR<br>
        Module:                             Satellite<br>
        Version:                            1.7.1<br>
        Build ID:                          
        6760637d6fae7a5862103ced4ea0ab0a758861f9<br>
        Build time:                         2020-05-14T13:14:11+00:00<br>
        Error time:                         2020-06-24 11:01:55<br>
        Node:                               castle<br>
        <br>
        ============================================================<br>
        <br>
        Reported error:<br>
        ===============<br>
        <br>
        Description:<br>
            Failed to create meta-data for DRBD volume testvm1/0<br>
        <br>
        Category:                           LinStorException<br>
        Class name:                         VolumeException<br>
        Class canonical name:              
        com.linbit.linstor.storage.layer.exceptions.VolumeException<br>
        Generated at:                       Method 'createMetaData',
        Source file 'DrbdLayer.java', Line #995<br>
        <br>
        Error message:                      Failed to create meta-data
        for DRBD volume testvm1/0<br>
        <br>
        Error context:<br>
            An error occurred while processing resource 'Node: 'castle',
        Rsc: 'testvm1''<br>
        <br>
        Call backtrace:<br>
        <br>
            Method                                   Native Class:Line
        number<br>
            createMetaData                           N     
        com.linbit.linstor.storage.layer.adapter.drbd.DrbdLayer:995<br>
            adjustDrbd                               N     
        com.linbit.linstor.storage.layer.adapter.drbd.DrbdLayer:575<br>
            process                                  N     
        com.linbit.linstor.storage.layer.adapter.drbd.DrbdLayer:373<br>
            process                                  N     
        com.linbit.linstor.core.devmgr.DeviceHandlerImpl:731<br>
            processResourcesAndSnapshots             N     
        com.linbit.linstor.core.devmgr.DeviceHandlerImpl:300<br>
            dispatchResources                        N     
        com.linbit.linstor.core.devmgr.DeviceHandlerImpl:138<br>
            dispatchResources                        N     
        com.linbit.linstor.core.devmgr.DeviceManagerImpl:258<br>
            phaseDispatchDeviceHandlers              N     
        com.linbit.linstor.core.devmgr.DeviceManagerImpl:896<br>
            devMgrLoop                               N     
        com.linbit.linstor.core.devmgr.DeviceManagerImpl:618<br>
            run                                      N     
        com.linbit.linstor.core.devmgr.DeviceManagerImpl:535<br>
            run                                      N     
        java.lang.Thread:834<br>
        <br>
        Caused by:<br>
        ==========<br>
        <br>
        Description:<br>
            Execution of the external command 'drbdadm' failed.<br>
        Cause:<br>
            The external command exited with error code 1.<br>
        Correction:<br>
            - Check whether the external program is operating properly.<br>
            - Check whether the command line is correct.<br>
              Contact a system administrator or a developer if the
        command line is no longer valid<br>
              for the installed version of the external program.<br>
        Additional information:<br>
            The full command line executed was:<br>
            drbdadm -vvv --max-peers 7 -- --force create-md testvm1/0<br>
        <br>
            The external command sent the following output data:<br>
        <br>
        <br>
            The external command sent the following error information:<br>
            no resources defined!<br>
        <br>
        <br>
        Category:                           LinStorException<br>
        Class name:                         ExtCmdFailedException<br>
        Class canonical name:              
        com.linbit.extproc.ExtCmdFailedException<br>
        Generated at:                       Method 'execute', Source
        file 'DrbdAdm.java', Line #550<br>
        <br>
        Error message:                      The external command
        'drbdadm' exited with error code 1<br>
        <br>
        <br>
        Call backtrace:<br>
        <br>
            Method                                   Native Class:Line
        number<br>
            execute                                  N     
        com.linbit.linstor.storage.layer.adapter.drbd.utils.DrbdAdm:550<br>
            simpleAdmCommand                         N     
        com.linbit.linstor.storage.layer.adapter.drbd.utils.DrbdAdm:495<br>
            createMd                                 N     
        com.linbit.linstor.storage.layer.adapter.drbd.utils.DrbdAdm:262<br>
            createMetaData                           N     
        com.linbit.linstor.storage.layer.adapter.drbd.DrbdLayer:923<br>
            adjustDrbd                               N     
        com.linbit.linstor.storage.layer.adapter.drbd.DrbdLayer:575<br>
            process                                  N     
        com.linbit.linstor.storage.layer.adapter.drbd.DrbdLayer:373<br>
            process                                  N     
        com.linbit.linstor.core.devmgr.DeviceHandlerImpl:731<br>
            processResourcesAndSnapshots             N     
        com.linbit.linstor.core.devmgr.DeviceHandlerImpl:300<br>
            dispatchResources                        N     
        com.linbit.linstor.core.devmgr.DeviceHandlerImpl:138<br>
            dispatchResources                        N     
        com.linbit.linstor.core.devmgr.DeviceManagerImpl:258<br>
            phaseDispatchDeviceHandlers              N     
        com.linbit.linstor.core.devmgr.DeviceManagerImpl:896<br>
            devMgrLoop                               N     
        com.linbit.linstor.core.devmgr.DeviceManagerImpl:618<br>
            run                                      N     
        com.linbit.linstor.core.devmgr.DeviceManagerImpl:535<br>
            run                                      N     
        java.lang.Thread:834<br>
        <br>
        <br>
        END OF ERROR REPORT.<br>
      </p>
      <p>Indeed, re-running the same command from the CLI produces the
        same error message:</p>
      <p>drbdadm -vvv --max-peers 7 -- --force create-md testvm1/0<br>
        no resources defined!<br>
      </p>
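      <p>As far as I understand it, drbdadm prints "no resources
        defined!" when the configuration it loads contains no resource
        sections at all, so one thing to look at is whether the satellite
        has written a resource file for testvm1. The sketch below assumes
        the generated files live in /var/lib/linstor.d and are pulled in
        by an include under /etc/drbd.d; both paths are assumptions about
        a default install and should be verified locally. The function
        name is made up for illustration.</p>

```shell
# Hypothetical check: is there a generated resource file for the given
# resource in the given directory? The default directory is an assumption
# about where the LINSTOR satellite writes its DRBD resource files.
has_res_file() {
    res="$1"
    dir="${2:-/var/lib/linstor.d}"
    if [ -f "$dir/$res.res" ]; then
        echo "found $dir/$res.res"
    else
        echo "missing $dir/$res.res"
        return 1
    fi
}

# Example use on the affected node (left as comments):
#   has_res_file testvm1
#   drbdadm dump testvm1    # shows what drbdadm actually parses, if anything
```

      <p>If the file is missing, or present but not included from the
        main DRBD configuration, that would be consistent with the error
        above.</p>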
      <p>Some other random status information which may or may not be
        relevant...</p>
      <p>linstor storage-pool list<br>
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────╮<br>
        ┊ StoragePool          ┊ Node   ┊ Driver   ┊ PoolName ┊
        FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊<br>
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════╡<br>
        ┊ DfltDisklessStorPool ┊ castle ┊ DISKLESS ┊         
        ┊              ┊               ┊ False        ┊ Ok    ┊<br>
        ┊ DfltDisklessStorPool ┊ san5   ┊ DISKLESS ┊         
        ┊              ┊               ┊ False        ┊ Ok    ┊<br>
        ┊ DfltDisklessStorPool ┊ san6   ┊ DISKLESS ┊         
        ┊              ┊               ┊ False        ┊ Ok    ┊<br>
        ┊ pool                 ┊ castle ┊ LVM      ┊ vg_hdd   ┊     2.95
        TiB ┊      3.44 TiB ┊ False        ┊ Ok    ┊<br>
        ┊ pool                 ┊ san5   ┊ LVM      ┊ vg_hdd   ┊     3.87
        TiB ┊      4.36 TiB ┊ False        ┊ Ok    ┊<br>
        ┊ pool                 ┊ san6   ┊ LVM      ┊ vg_ssd   ┊     1.26
        TiB ┊      1.75 TiB ┊ False        ┊ Ok    ┊<br>
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯<br>
      </p>
      <p>I tried restarting the linstor-satellite service on castle, but
        it didn't make any difference. <br>
      </p>
      <p>After a reboot of castle, I now get this:</p>
      <p>linstor resource list<br>
╭────────────────────────────────────────────────────────────────────╮<br>
        ┊ ResourceName ┊ Node   ┊ Port ┊ Usage  ┊ Conns ┊             
        State ┊<br>
╞════════════════════════════════════════════════════════════════════╡<br>
        ┊ testvm1      ┊ castle ┊ 7000 ┊ Unused ┊ Ok    ┊          
        Diskless ┊<br>
        ┊ testvm1      ┊ san5   ┊ 7000 ┊ Unused ┊ Ok    ┊
        SyncTarget(55.99%) ┊<br>
        ┊ testvm1      ┊ san6   ┊ 7000 ┊ Unused ┊ Ok    ┊          
        UpToDate ┊<br>
╰────────────────────────────────────────────────────────────────────╯<br>
      </p>
      <p>However, looking at the error reports, I see the exact same
        error about creating the metadata on castle.</p>
      <p>One interesting thing is that the LV seems to have been
        created:</p>
      <p>lvs<br>
          /dev/drbd0: open failed: Wrong medium type<br>
          /dev/drbd1: open failed: Wrong medium type<br>
          LV                            VG      Attr       LSize    Pool
        Origin Data%  Meta%  Move Log Cpy%Sync Convert<br>
          backup_system_20200624_062513 storage swi-a-s---    4.00g     
        system 3.06                                   <br>
          system                        storage owi-aos---   
        5.00g                                                    <br>
          testvm1_00000                 vg_hdd  -wi-a-----
        &lt;500.11g                                                    <br>
      </p>
      <p>Any suggestions on where to look next? Or what I might have
        done wrong now?<br>
      </p>
      <p>Regards,<br>
        Adam<br>
      </p>
    </blockquote>
  </body>
</html>