On Mon, Dec 08, 2008 at 03:37:10PM -0800, Jeffrey Froman wrote:
> Hello,
>
> We run an online-verify from cron nightly, and certain blocks are
> frequently turning up out-of-sync. Notice how the same blocks show up
> repeatedly:
>
> Nov 9 20:27:34: Out of sync: start=182520104, size=8 (sectors)
> Nov 9 21:52:30: Out of sync: start=182520104, size=8 (sectors)
> Nov 10 21:28:25: Out of sync: start=109249744, size=8 (sectors)
> Nov 10 21:46:23: Out of sync: start=182520104, size=8 (sectors)
> Nov 11 20:28:48: Out of sync: start=182520104, size=8 (sectors)
> Nov 12 20:11:09: Out of sync: start=109249744, size=8 (sectors)
> Nov 12 20:28:47: Out of sync: start=182520104, size=8 (sectors)
> Nov 13 20:28:54: Out of sync: start=182520104, size=8 (sectors)
> Nov 14 20:10:24: Out of sync: start=109249744, size=8 (sectors)
> Nov 14 20:28:03: Out of sync: start=182520104, size=8 (sectors)
> Nov 15 20:26:27: Out of sync: start=182520104, size=8 (sectors)
> Nov 16 20:10:23: Out of sync: start=109249744, size=8 (sectors)
> Nov 16 20:27:56: Out of sync: start=182520104, size=8 (sectors)
> Nov 17 20:11:05: Out of sync: start=109249744, size=8 (sectors)
> Nov 17 20:29:20: Out of sync: start=182520104, size=8 (sectors)
> Nov 18 20:28:36: Out of sync: start=182520104, size=8 (sectors)
> Nov 24 20:35:25: Out of sync: start=182520104, size=8 (sectors)
> Dec 2 20:04:52: Out of sync: start=109249744, size=8 (sectors)
> Dec 2 20:17:30: Out of sync: start=182520104, size=8 (sectors)
> Dec 3 20:04:35: Out of sync: start=109249744, size=8 (sectors)
> Dec 6 20:03:47: Out of sync: start=109249744, size=8 (sectors)
> Dec 6 20:16:15: Out of sync: start=182520104, size=8 (sectors)
> Dec 7 20:03:36: Out of sync: start=109249744, size=8 (sectors)
> Dec 7 20:16:00: Out of sync: start=182520104, size=8 (sectors)
>
> What does it mean that the same blocks are marked out-of-sync
> regularly?
> Each night this happens, we manually repair by running:
>
>   drbdadm disconnect <resource>
>   drbdadm connect <resource>
>
> And when we have run another online verify immediately following the
> repair, the nodes are in sync again; so it seems the repair is
> working ... at least temporarily.
>
> (Both the online-verify and above drbdadm commands are run from the
> Secondary node, though it's my understanding that this doesn't
> matter.)
>
> I've read some threads on this list regarding the possibility of race
> conditions during online-verification, but this seems unlikely in my
> case since the out-of-sync blocks are so regularly the same blocks.

file system issues?
maybe indirect blocks of already deleted temporary files?

> We are using DRBD Protocol C on LVM on raid1, hosting an ext3
> filesystem. Any ideas as to the reasons for this behavior are
> appreciated.

and your drbd version is...?

see also:

What causes nodes to become out-of-sync?
http://thread.gmane.org/gmane.linux.network.drbd/15430/

Behaviour of verify: false positives -> true positives
http://thread.gmane.org/gmane.linux.kernel.drbd.devel/790

tons of out-of-sync sectors detected
http://thread.gmane.org/gmane.linux.network.drbd/15537

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
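
To see how often each extent recurs in verify logs like the ones quoted in this thread, a small tally script may help. This is only a minimal sketch: the function names `tally_out_of_sync` and `describe` are made up for illustration, the regular expression assumes the exact "Out of sync: start=..., size=... (sectors)" wording shown in the quoted log, and the 4 KiB filesystem block size is an assumption about the ext3 setup, not something stated in the thread. Sectors are taken to be 512 bytes.

```python
import re
from collections import Counter

SECTOR_BYTES = 512  # the log reports extents in 512-byte sectors

# Matches lines such as:
#   "Nov 9 20:27:34: Out of sync: start=182520104, size=8 (sectors)"
PATTERN = re.compile(r"Out of sync: start=(\d+), size=(\d+) \(sectors\)")


def tally_out_of_sync(lines):
    """Count how often each (start, size) extent is reported out of sync."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts


def describe(start, size, fs_block_bytes=4096):
    """Return (byte offset, byte length, filesystem block index) of an extent.

    The block index is only meaningful if the filesystem starts at offset 0
    of the device and uses fs_block_bytes-sized blocks (assumed here).
    """
    offset = start * SECTOR_BYTES
    return offset, size * SECTOR_BYTES, offset // fs_block_bytes


if __name__ == "__main__":
    sample = [
        "Nov 9 20:27:34: Out of sync: start=182520104, size=8 (sectors)",
        "Nov 10 21:28:25: Out of sync: start=109249744, size=8 (sectors)",
        "Nov 10 21:46:23: Out of sync: start=182520104, size=8 (sectors)",
    ]
    for (start, size), hits in tally_out_of_sync(sample).most_common():
        offset, length, block = describe(start, size)
        print(f"start={start} hits={hits} offset={offset} "
              f"length={length} fs_block={block}")
```

Each reported extent is 8 sectors, i.e. 4096 bytes, which would correspond to exactly one filesystem block under the 4 KiB assumption. That is at least consistent with the suggestion that a couple of specific filesystem blocks are involved, though it does not prove it.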