Hello,

We run an online-verify from cron nightly, and certain blocks are frequently turning up out-of-sync. Notice how the same blocks show up repeatedly:

Nov 9 20:27:34: Out of sync: start=182520104, size=8 (sectors)
Nov 9 21:52:30: Out of sync: start=182520104, size=8 (sectors)
Nov 10 21:28:25: Out of sync: start=109249744, size=8 (sectors)
Nov 10 21:46:23: Out of sync: start=182520104, size=8 (sectors)
Nov 11 20:28:48: Out of sync: start=182520104, size=8 (sectors)
Nov 12 20:11:09: Out of sync: start=109249744, size=8 (sectors)
Nov 12 20:28:47: Out of sync: start=182520104, size=8 (sectors)
Nov 13 20:28:54: Out of sync: start=182520104, size=8 (sectors)
Nov 14 20:10:24: Out of sync: start=109249744, size=8 (sectors)
Nov 14 20:28:03: Out of sync: start=182520104, size=8 (sectors)
Nov 15 20:26:27: Out of sync: start=182520104, size=8 (sectors)
Nov 16 20:10:23: Out of sync: start=109249744, size=8 (sectors)
Nov 16 20:27:56: Out of sync: start=182520104, size=8 (sectors)
Nov 17 20:11:05: Out of sync: start=109249744, size=8 (sectors)
Nov 17 20:29:20: Out of sync: start=182520104, size=8 (sectors)
Nov 18 20:28:36: Out of sync: start=182520104, size=8 (sectors)
Nov 24 20:35:25: Out of sync: start=182520104, size=8 (sectors)
Dec 2 20:04:52: Out of sync: start=109249744, size=8 (sectors)
Dec 2 20:17:30: Out of sync: start=182520104, size=8 (sectors)
Dec 3 20:04:35: Out of sync: start=109249744, size=8 (sectors)
Dec 6 20:03:47: Out of sync: start=109249744, size=8 (sectors)
Dec 6 20:16:15: Out of sync: start=182520104, size=8 (sectors)
Dec 7 20:03:36: Out of sync: start=109249744, size=8 (sectors)
Dec 7 20:16:00: Out of sync: start=182520104, size=8 (sectors)

What does it mean that the same blocks are marked out-of-sync regularly?
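For anyone who wants to check the recurrence on their own logs, the lines can be tallied per start sector. The following is only a minimal sketch: it assumes the log lines have exactly the format shown above, and the names `tally_out_of_sync`, `OOS_RE`, `SECTOR_BYTES` and `sample` are hypothetical, made up for this illustration. The byte offset it prints assumes 512-byte sectors, which would make each 8-sector extent 4096 bytes.

```python
import re
from collections import Counter

# Matches "Out of sync" lines in the format shown in the log excerpt above.
OOS_RE = re.compile(r"Out of sync: start=(\d+), size=(\d+) \(sectors\)")

# Assumption: the reported sectors are 512 bytes each.
SECTOR_BYTES = 512


def tally_out_of_sync(lines):
    """Count how often each (start, size) extent is reported out of sync."""
    counts = Counter()
    for line in lines:
        match = OOS_RE.search(line)
        if match:
            start, size = int(match.group(1)), int(match.group(2))
            counts[(start, size)] += 1
    return counts


# A few lines copied from the excerpt above, used as sample input.
sample = [
    "Nov 9 20:27:34: Out of sync: start=182520104, size=8 (sectors)",
    "Nov 9 21:52:30: Out of sync: start=182520104, size=8 (sectors)",
    "Nov 10 21:28:25: Out of sync: start=109249744, size=8 (sectors)",
]

# Print the extents from most to least frequently reported.
for (start, size), hits in tally_out_of_sync(sample).most_common():
    offset = start * SECTOR_BYTES
    print(f"start={start} size={size} sectors "
          f"(byte offset {offset}, {size * SECTOR_BYTES} bytes): {hits} reports")
```

Feeding the whole log file to `tally_out_of_sync` (for example via `open(path)`) instead of `sample` gives the per-extent counts for every night.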
Each night this happens, we manually repair by running:

    drbdadm disconnect <resource>
    drbdadm connect <resource>

When we run another online verify immediately after the repair, the nodes are in sync again, so the repair seems to be working, at least temporarily. (Both the online-verify and the drbdadm commands above are run from the Secondary node, though it's my understanding that this doesn't matter.)

I've read some threads on this list regarding the possibility of race conditions during online verification, but this seems unlikely in my case, since it is so consistently the same blocks that are out of sync.

We are using DRBD Protocol C on LVM on raid1, hosting an ext3 filesystem.

Any ideas as to the reasons for this behavior are appreciated.

Thank you,
Jeffrey