[DRBD-user] data integrity in drbd

Lyre 4179e1 at gmail.com
Fri Aug 26 17:44:35 CEST 2011

Maybe I should have mounted them read-only. But I couldn't try again, since
I've just flown back home.

I checked the log file just now and noticed that on the secondary, md5sum
reported Input/output errors, while the primary was OK. Does this mean I've
messed up the metadata?
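
A minimal sketch of the file-level comparison described in this thread (md5sum
over every file on both nodes, then a diff of the results), written so that a
read error on one file is recorded instead of aborting the run. The function
names `hash_tree` and `compare_trees` are made up for illustration; nothing
here is part of DRBD.

```python
import hashlib
import os


def hash_tree(root):
    """Return ({relative_path: md5_hex}, {relative_path: error_text}) for files under root."""
    digests, errors = {}, {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            md5 = hashlib.md5()
            try:
                with open(full, "rb") as fh:
                    # Read in 1 MiB chunks so large database files fit in memory.
                    for chunk in iter(lambda: fh.read(1 << 20), b""):
                        md5.update(chunk)
            except OSError as exc:
                # e.g. "Input/output error": note it and keep going.
                errors[rel] = str(exc)
                continue
            digests[rel] = md5.hexdigest()
    return digests, errors


def compare_trees(left, right):
    """Return (paths whose digests differ, paths present on only one side)."""
    differing = sorted(p for p in left.keys() & right.keys() if left[p] != right[p])
    only_one_side = sorted(left.keys() ^ right.keys())
    return differing, only_one_side
```

One way to use it: run `hash_tree` against the mount point on each node, copy
the resulting dictionaries to one machine (as JSON, for example), and pass
both to `compare_trees`. Mounting read-only for the run, as considered above,
avoids changing the devices while they are being compared.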

On Fri, Aug 26, 2011 at 11:33 PM, Lyre <4179e1 at gmail.com> wrote:

> Hi Lars:
> Yesterday, on the secondary, I shut down drbd and tried to use rsync to
> recover the snapshot, but it failed. Then I started drbd to resync from the
> primary node. That was the only time I bypassed drbd; however, the
> application seemed fine at that time.
> This afternoon we continued our experiment: disconnected drbd on the
> secondary, changed something, and resynced from the primary. Oracle was
> unable to start. The drbd status suggested that both nodes were up to date,
> and drbdadm verify didn't report any errors. After disconnecting one side
> and mounting the drbd device on both nodes, md5sum showed that some files
> were different.
> On Fri, Aug 26, 2011 at 8:28 PM, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
>> On Fri, Aug 26, 2011 at 05:50:59PM +0800, Lyre wrote:
>> > Hi all:
>> >
>> >     Is there a way to check the data integrity on both nodes? I've
>> > encountered a confusing problem.
>> >
>> > We have two drbd devices, a 20GB one for the application and a 300GB one
>> > for the oracle database (oracle itself was installed in a different
>> > location, which is not replicated). csums-alg & verify-alg were both
>> > configured to crc32c, and bandwidth was 100M.
>> I suggest not using the same algorithm for csums-alg and verify-alg,
>> so that the verify can detect differences which the csums-based resync
>> thought were identical due to identical checksums, i.e. a hash collision.
>> I further suggest that you use stronger hash algorithms for both,
>> like md5 and sha1, or similar.
>> > I tried to upgrade our app & database on the secondary (node2) by
>> > disconnecting and promoting the 2 drbd devices to primary and then
>> > performing the upgrade; it was fine.
>> > Then I tried to roll the secondary back to the original version with
>> > drbdadm -- --discard-my-data connect drbdX. After the drbd sync, oracle
>> > was unable to start up; it reported data corruption. So I issued drbdadm
>> > verify, but it didn't report anything; everything seemed good.
>> >
>> > I disconnected the drbds, mounted them on both sides, and ran md5sum on
>> > all files on the disk. I diffed the output from both sides and found that
>> > several database files' md5 values weren't identical. Then I connected
>> > the drbds and found that they began to sync about 30GB of data. I didn't
>> > mount the devices read-only, but had all applications stopped.
>> >
>> >
>> > BTW, earlier I tried to recover the underlying lvm devices from a
>> > snapshot, but got an IO error, so I just canceled and resynced. Does it
>> > matter, since I had bypassed drbd there?
>> Can you give more detail there?
>> What did you do?
>> Did that manipulate DRBD meta data as well?
>> Did you bypass DRBD at some point during the process?
>> --
>> : Lars Ellenberg
>> : LINBIT | Your Way to High Availability
>> : DRBD/HA support and consulting http://www.linbit.com
>> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
>> __
>> please don't Cc me, but send to list   --   I'm subscribed
>> _______________________________________________
>> drbd-user mailing list
>> drbd-user at lists.linbit.com
>> http://lists.linbit.com/mailman/listinfo/drbd-user
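
As a rough, idealized illustration of the advice quoted above about checksum
strength: if a checksum behaved like a uniformly random value, two different
blocks would share it with probability 2^-bits, so the expected number of
differing blocks that a checksum-based comparison wrongly treats as identical
is (differing blocks) / 2^bits. The sketch below only does that arithmetic.
The 4096-byte block size is an assumption chosen for illustration, not a DRBD
parameter taken from this thread, and real CRCs are not uniformly random (they
are guaranteed to catch some classes of small differences), so this is an
order-of-magnitude estimate at best.

```python
# Digest sizes in bits for the algorithms named in the thread.
DIGEST_BITS = {"crc32c": 32, "md5": 128, "sha1": 160}


def blocks_in(device_bytes, block_bytes):
    """Number of blocks needed to cover device_bytes (ceiling division)."""
    return -(-device_bytes // block_bytes)


def expected_undetected_blocks(differing_blocks, checksum_bits):
    """Expected count of differing blocks whose checksums match anyway,
    under the idealized uniform-random checksum model."""
    return differing_blocks / 2 ** checksum_bits


if __name__ == "__main__":
    # Worst case for illustration: every block of a 300 GB device differs.
    # ASSUMPTION: 4096-byte blocks.
    n = blocks_in(300 * 10**9, 4096)
    for alg, bits in DIGEST_BITS.items():
        print(alg, expected_undetected_blocks(n, bits))
```

Whatever the numbers come out to, the point made in the quoted reply still
holds: a verify run that uses the same algorithm as the resync sees the same
matching checksums, so it cannot flag a block the resync skipped because of a
collision, while a different algorithm can.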