<div dir="ltr"><div class="gmail_default" style="font-size:small">We plan to move from DRBD 8 to drbd-9.0.16. I will post here how it handles the resync problem.</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">Thanks!</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Oct 17, 2019 at 10:21 AM Digimer &lt;<a href="mailto:lists@alteeve.ca">lists@alteeve.ca</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">8.4.4 is old. Can you upgrade to the latest 8.4.11? I believe 8.4.4 was<br>
older than the other reporters with a similar issue, so this may be<br>
fixed. Upgrading to .11 should not cause any issues.<br>
<br>
PS - Please keep replies on the list. These discussions help others by<br>
being in the archives.<br>
<br>
digimer<br>
<br>
On 2019-10-17 9:54 a.m., Paras pradhan wrote:<br>
&gt; The DRBD version is drbd-8.4.4-0.27.4.2, and yes, we are upgrading to version<br>
&gt; 9 in the near future.<br>
&gt; <br>
&gt; No, it is not a live snapshot. Both DRBD nodes were shut down, and I used a<br>
&gt; Clonezilla bootable image to take the backup and also to restore.<br>
&gt; <br>
&gt; Thanks<br>
&gt; Paras.<br>
&gt; <br>
&gt; On Wed, Oct 16, 2019 at 7:26 PM Digimer &lt;<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>&gt; wrote:<br>
&gt; <br>
&gt;     I don&#39;t see the version, but looking in the mailing list archives, the<br>
&gt;     common recommendation is to upgrade. What version of DRBD 8 are you<br>
&gt;     using, exactly?<br>
&gt; <br>
&gt;     Does the resync happen only after recovery? Are the backups of the nodes<br>
&gt;     done via live snapshot? If so, then if there is _any_ time between the<br>
&gt;     two nodes being snapped, the UUIDs will differ and could be causing this.<br>
&gt; <br>
&gt;     digimer<br>
&gt; <br>
&gt;     On 2019-10-16 10:04 a.m., Paras pradhan wrote:<br>
&gt;     &gt; Hi<br>
&gt;     &gt;<br>
&gt;     &gt; Here is the log for one of the DRBD resources (which is 300 GB).<br>
&gt;     &gt;<br>
&gt;     &gt; --<br>
&gt;     &gt; [  194.780377] block drbd1: disk( Diskless -&gt; Attaching )<br>
&gt;     &gt; [  194.780536] block drbd1: max BIO size = 1048576<br>
&gt;     &gt; [  194.780548] block drbd1: drbd_bm_resize called with capacity == 629126328<br>
&gt;     &gt; [  194.783069] block drbd1: resync bitmap: bits=78640791 words=1228763 pages=2400<br>
&gt;     &gt; [  194.783077] block drbd1: size = 300 GB (314563164 KB)<br>
&gt;     &gt; [  194.793958] block drbd1: bitmap READ of 2400 pages took 3 jiffies<br>
&gt;     &gt; [  194.796342] block drbd1: recounting of set bits took additional 1 jiffies<br>
&gt;     &gt; [  194.796348] block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.<br>
&gt;     &gt; [  194.796359] block drbd1: disk( Attaching -&gt; Outdated )<br>
&gt;     &gt; [  194.796366] block drbd1: attached to UUIDs 56E4CF14A115440C:0000000000000000:02DCAB23D758DA48:02DBAB23D758DA49<br>
&gt;     &gt; [  475.740272] block drbd1: drbd_sync_handshake:<br>
&gt;     &gt; [  475.740280] block drbd1: self 56E4CF14A115440C:0000000000000000:02DCAB23D758DA48:02DBAB23D758DA49 bits:0 flags:0<br>
&gt;     &gt; [  475.740288] block drbd1: peer F5A226CE3F2DA2F2:0000000000000000:56E5CF14A115440D:56E4CF14A115440D bits:0 flags:0<br>
&gt;     &gt; [  475.740295] block drbd1: uuid_compare()=-2 by rule 60<br>
&gt;     &gt; [  475.740299] block drbd1: Writing the whole bitmap, full sync required after drbd_sync_handshake.<br>
&gt;     &gt; [  475.757877] block drbd1: bitmap WRITE of 2400 pages took 4 jiffies<br>
&gt;     &gt; [  475.757888] block drbd1: 300 GB (78640791 bits) marked out-of-sync by on disk bit-map.<br>
&gt;     &gt; [  475.758018] block drbd1: peer( Unknown -&gt; Secondary ) conn( WFReportParams -&gt; WFBitMapT ) pdsk( DUnknown -&gt; UpToDate )<br>
&gt;     &gt; [  475.800134] block drbd1: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%<br>
&gt;     &gt; [  475.802697] block drbd1: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%<br>
&gt;     &gt; [  475.802717] block drbd1: conn( WFBitMapT -&gt; WFSyncUUID )<br>
&gt;     &gt; [  475.815155] block drbd1: updated sync uuid CEF5B26573C154CC:0000000000000000:02DCAB23D758DA48:02DBAB23D758DA49<br>
&gt;     &gt; [  475.815377] block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1<br>
&gt;     &gt; [  475.820270] block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)<br>
&gt;     &gt; [  475.820293] block drbd1: conn( WFSyncUUID -&gt; SyncTarget ) disk( Outdated -&gt; Inconsistent )<br>
&gt;     &gt; [  475.820306] block drbd1: Began resync as SyncTarget (will sync 314563164 KB [78640791 bits set]).<br>
&gt;     &gt; [  538.518371] block drbd1: peer( Secondary -&gt; Primary )<br>
&gt;     &gt; [  538.548954] block drbd1: role( Secondary -&gt; Primary )<br>
&gt;     &gt; [ 2201.521232] block drbd1: conn( SyncTarget -&gt; PausedSyncT ) user_isp( 0 -&gt; 1 )<br>
&gt;     &gt; [ 2201.521237] block drbd1: Resync suspended<br>
&gt;     &gt; [ 2301.930484] block drbd1: conn( PausedSyncT -&gt; SyncTarget ) user_isp( 1 -&gt; 0 )<br>
&gt;     &gt; [ 2301.930490] block drbd1: Syncer continues.<br>
&gt;     &gt; [ 5216.750314] block drbd1: Resync done (total 4740 sec; paused 100 sec; 67792 K/sec)<br>
&gt;     &gt; [ 5216.750323] block drbd1: 98 % had equal checksums, eliminated: 311395164K; transferred 3168000K total 314563164K<br>
&gt;     &gt; [ 5216.750333] block drbd1: updated UUIDs F5A226CE3F2DA2F3:0000000000000000:CEF5B26573C154CD:56E5CF14A115440D<br>
&gt;     &gt; [ 5216.750343] block drbd1: conn( SyncTarget -&gt; Connected ) disk( Inconsistent -&gt; UpToDate )<br>
&gt;     &gt; [ 5216.750518] block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1<br>
&gt;     &gt; [ 5216.845211] block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1 exit code 0 (0x0)<br>
&gt;     &gt; ---<br>
&gt;     &gt;<br>
&gt;     &gt;<br>
&gt;     &gt; Thanks!<br>
&gt;     &gt;<br>
&gt;     &gt; On Wed, Oct 16, 2019 at 12:46 AM Digimer &lt;<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>&gt; wrote:<br>
&gt;     &gt;<br>
&gt;     &gt;     On 2019-10-15 4:58 p.m., Paras pradhan wrote:<br>
&gt;     &gt;     &gt; Hi<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt; I have a two-node DRBD 8 cluster. We are doing some tests, and while the<br>
&gt;     &gt;     &gt; DRBD resources were consistent/synced on both nodes, I powered off both<br>
&gt;     &gt;     &gt; nodes and took a bare-metal backup using Clonezilla.<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt; Then I restored both nodes from the backup and started DRBD on both<br>
&gt;     &gt;     &gt; nodes. On one of the nodes it starts to sync all over again.<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt; My question is: the DRBD resources were synced when I took the backup,<br>
&gt;     &gt;     &gt; so why is it starting all over again? I hope I explained clearly.<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt; Thanks in advance !<br>
&gt;     &gt;     &gt; Paras.<br>
&gt;     &gt;<br>
&gt;     &gt;     Do you have the system logs from when you started DRBD on the nodes<br>
&gt;     &gt;     post-recovery? There should be DRBD log entries on both nodes as DRBD<br>
&gt;     &gt;     started. The reason/trigger of the resync will likely be explained in<br>
&gt;     &gt;     there. If not, please share the logs.<br>
&gt;     &gt;<br>
&gt;     &gt;     --<br>
&gt;     &gt;     Digimer<br>
&gt;     &gt;     Papers and Projects: <a href="https://alteeve.com/w/" rel="noreferrer" target="_blank">https://alteeve.com/w/</a><br>
&gt;     &gt;     &quot;I am, somehow, less interested in the weight and convolutions of<br>
&gt;     &gt;     Einstein’s brain than in the near certainty that people of equal talent<br>
&gt;     &gt;     have lived and died in cotton fields and sweatshops.&quot; - Stephen Jay<br>
&gt;     &gt;     Gould<br>
&gt;     &gt;<br>
&gt; <br>
&gt; <br>
&gt; <br>
<br>
<br>
</blockquote></div>
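<div>The handshake and resync figures in the quoted log can be checked by hand. The sketch below is only an illustration, not DRBD code. It assumes that the four colon-separated fields are the current, bitmap, and two history UUIDs, that "rule 60" fires when the local current UUID equals one of the peer's history UUIDs with the lowest bit ignored, and that each bitmap bit covers 4 KB (which is what 314563164 KB / 78640791 bits works out to). All function names are made up for this example.</div>

```python
# Illustrative sketch only. It mirrors my reading of the DRBD 8.4 log above,
# not the kernel implementation. All names here are hypothetical.

def parse_uuids(line):
    """Split 'CURRENT:BITMAP:HISTORY1:HISTORY2' into a list of integers."""
    return [int(field, 16) for field in line.strip().split(":")]


def strip_low_bit(uuid):
    """Clear the lowest bit (assumed to be ignored when UUIDs are compared)."""
    return uuid & ~1


def self_current_in_peer_history(self_uuids, peer_uuids):
    """Assumed 'rule 60' condition: our current UUID is in the peer's history."""
    current = strip_low_bit(self_uuids[0])
    return any(strip_low_bit(h) == current for h in peer_uuids[2:])


def resync_rate_kb_per_sec(bits, total_sec, paused_sec, kb_per_bit=4):
    """Whole bits per second of active sync time, then converted to KB."""
    return (bits // (total_sec - paused_sec)) * kb_per_bit


# Values copied from the "self" and "peer" lines of the log.
SELF_UUIDS = parse_uuids(
    "56E4CF14A115440C:0000000000000000:02DCAB23D758DA48:02DBAB23D758DA49")
PEER_UUIDS = parse_uuids(
    "F5A226CE3F2DA2F2:0000000000000000:56E5CF14A115440D:56E4CF14A115440D")

BITS = 78640791            # "resync bitmap: bits=78640791"
TOTAL_KB = BITS * 4        # assumed 4 KB per bit
TRANSFERRED_KB = 3168000   # "transferred 3168000K"
ELIMINATED_KB = TOTAL_KB - TRANSFERRED_KB
EQUAL_CHECKSUM_PERCENT = ELIMINATED_KB * 100 // TOTAL_KB

if __name__ == "__main__":
    print("self current found in peer history:",
          self_current_in_peer_history(SELF_UUIDS, PEER_UUIDS))
    print("total KB:", TOTAL_KB)
    print("eliminated KB:", ELIMINATED_KB)
    print("equal checksums (%):", EQUAL_CHECKSUM_PERCENT)
    print("rate (K/sec):", resync_rate_kb_per_sec(BITS, 4740, 100))
```

<div>If those assumptions hold, the node that resynced was holding a current UUID that its peer already lists as history, which would be consistent with the peer's metadata being newer than this node's at the time of the handshake. The checksum figures also suggest that only about 3 GB of the 300 GB was actually transferred.</div>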