Hi,
Thank you for your reply.
I'm not using a virtual machine (I only used a VM to check whether I got the
same issue, and then went back to the physical server).
Are you suggesting that I should disable swap on my server? I'm using ext4.
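For what it's worth, the mechanism behind the swap / file system suggestion can be illustrated with a small simulation. This is only a sketch, not DRBD code: send_block, receive_block and scribble are made-up names, and it uses zlib.crc32 from the Python standard library as a stand-in for crc32c. The digest is computed first, the "upper layer" then changes the buffer, and the receiver gets a payload that no longer matches the digest.

```python
import zlib

BLOCK_SIZE = 4096  # same size as the "+4096" in the log message


def send_block(buf, modify_in_flight=None):
    """Simulate a sender: digest first, then copy the payload to the wire."""
    digest = zlib.crc32(bytes(buf))
    if modify_in_flight is not None:
        # Hypothetical upper layer touching the page while the write is pending.
        modify_in_flight(buf)
    payload = bytes(buf)
    return digest, payload


def receive_block(digest, payload):
    """Simulate a receiver: recompute the digest and compare."""
    return zlib.crc32(payload) == digest


def scribble(buf):
    """Flip the bits of the first byte, as a rewrite of the page would."""
    buf[0] ^= 0xFF


buf = bytearray(BLOCK_SIZE)
ok_stable = receive_block(*send_block(buf))
ok_modified = receive_block(*send_block(buf, scribble))
print("stable buffer passes:", ok_stable)
print("modified buffer passes:", ok_modified)
```

The stable buffer passes the check and the modified one fails it, which is the situation the "Digest mismatch" log line describes. From the application's point of view nothing need be wrong (it simply rewrote the page), which is why such a check cannot tell a harmless rewrite from real corruption.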
How can I check online whether my primary and my secondary are synchronized,
using a command like fsck?
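As a rough idea of what an online check at the DRBD level (not a file system check like fsck) could look like, the sketch below parses text in the /proc/drbd layout and reads the oos: counter, assumed here to be in KiB as on DRBD 8.4. The sample text and the helper names parse_proc_drbd and looks_in_sync are made up for illustration, and the counter only reflects what the last verify found.

```python
import re

# Made-up sample in the /proc/drbd layout; on a real node this text
# would come from reading /proc/drbd instead.
SAMPLE = """\
version: 8.4.4 (api:1/proto:86-101)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:1024 nr:0 dw:2048 dr:4096 al:5 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:400
"""


def parse_proc_drbd(text):
    """Return {minor: {"cs", "ro", "ds", "oos_kib"}} for each device found."""
    status = {}
    minor = None
    for line in text.splitlines():
        head = re.match(r"\s*(\d+):\s+cs:(\S+)\s+ro:(\S+)\s+ds:(\S+)", line)
        if head:
            minor = int(head.group(1))
            status[minor] = {
                "cs": head.group(2),
                "ro": head.group(3),
                "ds": head.group(4),
                "oos_kib": None,
            }
            continue
        oos = re.search(r"\boos:(\d+)", line)
        if oos and minor is not None:
            status[minor]["oos_kib"] = int(oos.group(1))
    return status


def looks_in_sync(entry):
    """True only if connected, both disks UpToDate, and nothing marked oos."""
    return (
        entry["cs"] == "Connected"
        and entry["ds"] == "UpToDate/UpToDate"
        and entry["oos_kib"] == 0
    )


for minor, entry in parse_proc_drbd(SAMPLE).items():
    blocks = entry["oos_kib"] // 4  # assuming 4 KiB blocks
    print(f"minor {minor}: ds={entry['ds']} oos={blocks} blocks "
          f"in_sync={looks_in_sync(entry)}")
```

Since the counter is only refreshed by a verify run, a value of 0 means "nothing flagged by the last verify", not a proof that the file system on top is consistent.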
I found explanations about data being modified "in flight", but no
"workaround" or anything on how I can avoid getting out-of-sync blocks.
2014-10-10 18:44 GMT+11:00 Lionel Sausin <ls at numerigraphe.com>:
> "buffer modified by upper layers during write" means whatever sits on
> top of drbd changes data "in flight".
> Please search the list archives, this is a FAQ.
> Swap and some file systems do that - usually it's some kind of
> optimization. I suspect VMWare VMs hosted in ext4 do that too.
> There's probably nothing wrong, but DRBD can't know. You should do your
> data integrity checking on some higher level (fsck for example)
> Lionel.
>
> On 10/10/2014 01:45, aurelien panizza wrote:
>
> Hi all,
>
> I've got a problem in my environment.
> I set up my primary server (pacemaker + drbd) which ran alone for a while,
> and then I added the second server (currently only DRBD).
> Both servers can see each other and /proc/drbd reports "uptodate/uptodate".
> If I run a verify on that resource (right after the full resync), it
> reports some blocks out of sync (generally from 100 to 1500 on my 80 GB LVM
> partition).
> So I disconnect/connect the slave and oos reports 0 blocks.
> When I run a verify again, some blocks are still out of sync. What I've
> noticed is that it seems to be almost always the same blocks that are out
> of sync.
> I tried to do a full resync multiple times but had the same issue.
> I also tried replacing the physical secondary server with a virtual machine
> (in order to check whether the issue came from the secondary server) but had
> the same issue.
>
> I then activated "data-integrity-alg crc32c" and got a couple of "Digest
> mismatch, buffer modified by upper layers during write: 167134312s +4096"
> in the primary log.
>
> I tried on a different network card but got the same errors.
>
> My full configuration file:
>
> protocol C;
> meta-disk internal;
> device /dev/drbd0;
> disk /dev/sysvg/drbd;
>
> handlers {
> split-brain "/usr/lib/drbd/notify-split-brain.sh xxx at xxx";
> out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh xxx at xxx";
> fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
> after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
> }
>
> net {
> cram-hmac-alg "sha1";
> shared-secret "drbd";
> sndbuf-size 512k;
> max-buffers 8000;
> max-epoch-size 8000;
> verify-alg md5;
> after-sb-0pri disconnect;
> after-sb-1pri disconnect;
> after-sb-2pri disconnect;
> data-integrity-alg crc32c;
> }
>
> disk {
> al-extents 3389;
> fencing resource-only;
> }
>
> syncer {
> rate 90M;
> }
> on host1 {
> address 10.110.1.71:7799;
> }
> on host2 {
> address 10.110.1.72:7799;
> }
> }
>
> My OS : Redhat6 2.6.32-431.20.3.el6.x86_64
> DRBD version : drbd84-8.4.4-1
>
> ethtool -k eth0
> Features for eth0:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp-segmentation-offload: on
> udp-fragmentation-offload: off
> generic-segmentation-offload: on
> generic-receive-offload: off
> large-receive-offload: off
> ntuple-filters: off
> receive-hashing: off
>
>
> The secondary server is currently not in the HA (pacemaker) but I don't
> think this is the problem.
> I have got another HA setup on 2 physical hosts with the exact same
> configuration and drbd/os version (but not the same server model) and
> everything's OK.
>
> As the primary server is in production, I can't stop the application
> (database) to check whether the alerts are false positives.
>
> Do you have any advice?
> Could it be that the primary server has corrupted blocks or wrong
> metadata?
>
> Regards,
>
>
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user