[DRBD-user] Repeated out-of-sync blocks

Jeffrey Froman drbd.tcijf at olympus.net
Tue Dec 9 00:37:10 CET 2008


Hello,

We run an online-verify from cron nightly, and certain blocks are 
frequently turning up out-of-sync. Notice how the same blocks show up 
repeatedly:

Nov  9 20:27:34: Out of sync: start=182520104, size=8 (sectors)
Nov  9 21:52:30: Out of sync: start=182520104, size=8 (sectors)
Nov 10 21:28:25: Out of sync: start=109249744, size=8 (sectors)
Nov 10 21:46:23: Out of sync: start=182520104, size=8 (sectors)
Nov 11 20:28:48: Out of sync: start=182520104, size=8 (sectors)
Nov 12 20:11:09: Out of sync: start=109249744, size=8 (sectors)
Nov 12 20:28:47: Out of sync: start=182520104, size=8 (sectors)
Nov 13 20:28:54: Out of sync: start=182520104, size=8 (sectors)
Nov 14 20:10:24: Out of sync: start=109249744, size=8 (sectors)
Nov 14 20:28:03: Out of sync: start=182520104, size=8 (sectors)
Nov 15 20:26:27: Out of sync: start=182520104, size=8 (sectors)
Nov 16 20:10:23: Out of sync: start=109249744, size=8 (sectors)
Nov 16 20:27:56: Out of sync: start=182520104, size=8 (sectors)
Nov 17 20:11:05: Out of sync: start=109249744, size=8 (sectors)
Nov 17 20:29:20: Out of sync: start=182520104, size=8 (sectors)
Nov 18 20:28:36: Out of sync: start=182520104, size=8 (sectors)
Nov 24 20:35:25: Out of sync: start=182520104, size=8 (sectors)
Dec  2 20:04:52: Out of sync: start=109249744, size=8 (sectors)
Dec  2 20:17:30: Out of sync: start=182520104, size=8 (sectors)
Dec  3 20:04:35: Out of sync: start=109249744, size=8 (sectors)
Dec  6 20:03:47: Out of sync: start=109249744, size=8 (sectors)
Dec  6 20:16:15: Out of sync: start=182520104, size=8 (sectors)
Dec  7 20:03:36: Out of sync: start=109249744, size=8 (sectors)
Dec  7 20:16:00: Out of sync: start=182520104, size=8 (sectors)
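
To make the repetition easier to see, the log lines can be tallied by start sector. This is a minimal sketch, assuming the log format shown above; the function name tally_out_of_sync and the short sample list are only for illustration and are not part of DRBD.

```python
import re
from collections import Counter

# Matches "Out of sync" lines in the format shown in the log excerpt above.
LINE_RE = re.compile(r"Out of sync: start=(\d+), size=(\d+) \(sectors\)")


def tally_out_of_sync(lines):
    """Count how often each (start, size) pair is reported."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            key = (int(match.group(1)), int(match.group(2)))
            counts[key] += 1
    return counts


# A few lines copied from the log above (hypothetical sample, not the full log).
sample = [
    "Nov  9 20:27:34: Out of sync: start=182520104, size=8 (sectors)",
    "Nov  9 21:52:30: Out of sync: start=182520104, size=8 (sectors)",
    "Nov 10 21:28:25: Out of sync: start=109249744, size=8 (sectors)",
    "Nov 10 21:46:23: Out of sync: start=182520104, size=8 (sectors)",
]

for (start, size), n in tally_out_of_sync(sample).most_common():
    print(f"start={start} size={size} sectors: {n} report(s)")
```

Run against the full log, only two distinct start sectors should appear: 182520104 and 109249744.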

What does it mean that the same blocks are marked out-of-sync 
regularly? Each night this happens, we manually repair by running:

drbdadm disconnect <resource>
drbdadm connect <resource>

When we run another online verify immediately after the repair, the 
nodes are in sync again, so the repair seems to work, at least 
temporarily.

(Both the online verify and the drbdadm commands above are run from 
the Secondary node, though it's my understanding that this doesn't 
matter.)

I've read some threads on this list regarding the possibility of race 
conditions during online-verification, but this seems unlikely in my 
case since the out-of-sync blocks are so regularly the same blocks.

We are using DRBD Protocol C on LVM on RAID1, hosting an ext3 
filesystem. Any ideas about the cause of this behavior would be 
appreciated.
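
In case it helps anyone map these reports onto the filesystem, here is a rough sketch of the arithmetic. It assumes the reported sectors are 512 bytes each and that the ext3 filesystem uses 4 KiB blocks and sits directly on the DRBD device; I have not confirmed the block size on this system, and the function name sector_to_fs_block is made up for this example.

```python
SECTOR_BYTES = 512     # assumption: sectors are reported in 512-byte units
FS_BLOCK_BYTES = 4096  # assumption: ext3 block size is 4 KiB


def sector_to_fs_block(start_sector, size_sectors):
    """Return (filesystem block index, length in bytes, block-aligned?)."""
    byte_offset = start_sector * SECTOR_BYTES
    byte_len = size_sectors * SECTOR_BYTES
    fs_block, remainder = divmod(byte_offset, FS_BLOCK_BYTES)
    return fs_block, byte_len, remainder == 0


# The two start sectors that keep appearing in the log.
for start in (182520104, 109249744):
    block, length, aligned = sector_to_fs_block(start, 8)
    print(f"start={start}: fs block {block}, {length} bytes, aligned={aligned}")
```

Under those assumptions, each report of size=8 sectors covers exactly one aligned 4 KiB filesystem block.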


Thank you,
Jeffrey



