<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Tahoma
}
--></style>
</head>
<body class='hmmessage'>
Thanks for responding,<BR>
<BR>
FYI: I ran the stat command to get details of the files whose data is criss-crossing, meaning the content of one file shows up in another. The snapshot enclosed at the end was taken when the corruption occurred.<BR>
The affected files all report the same I/O block size, <STRONG> IO Block: 4096 </STRONG><BR>
<STRONG>In every corruption seen, the content of /repl/firewall/sysconfig/iptables appears in /repl/snmpagent/data/snmpd.conf</STRONG><BR>
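<BR>
A minimal sketch of how this kind of cross-file contamination could be checked for automatically after each failover. It is only an illustration, not part of our setup: the function names and the 64-byte probe window are assumptions, and it expects a snapshot of the files taken while they were known to be good.<BR>
<pre>
```python
from pathlib import Path

PROBE_LEN = 64  # assumed window size; tune to the files being watched


def snapshot(paths):
    """Read every file fully; fine for small config files like the ones here."""
    return {str(p): Path(p).read_bytes() for p in paths}


def foreign_chunks(baseline, current, probe_len=PROBE_LEN):
    """Return (victim, source) pairs where a changed file now contains a
    chunk of another file's known-good content that it did not hold before."""
    hits = []
    for victim, now in current.items():
        before = baseline.get(victim, b"")
        if now == before:
            continue  # unchanged since the baseline, nothing to report
        for source, src in baseline.items():
            if source == victim:
                continue
            # Slide over the source's baseline content in fixed-size windows.
            for i in range(0, len(src) - probe_len + 1, probe_len):
                chunk = src[i:i + probe_len]
                if chunk in now and chunk not in before:
                    hits.append((victim, source))
                    break
    return hits
```
</pre>
Take snapshot() once on the healthy primary, take it again after each failover, and pass both to foreign_chunks(); a non-empty result names the file that picked up foreign content and the file it came from.<BR>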
<BR>
How much is "few"?<BR>
Today, after 12 failovers. In the previous run, similar corruption was seen after 80 failovers.<BR>
<BR> What is the IO load?<BR>
Not exactly sure. When SIGTERM is received, there are 2 processes that write config data to the DRBD partition.<BR>
<BR> How do you trigger the failover?<BR>
Using the reboot command.<BR>
<BR>DRBD version, kernel version, file system type?<BR>
DRBD-8.0.16, 2.6.14.7, EXT3-FS<BR>
<BR> Volatile caches involved?<BR>
No.<BR>
<BR> How often/when do you fsck?<BR>
Every time the DRBD-GO-Primary script is called. Before mounting the DRBD partition, we invoke fsck -fy.<BR><BR>
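For reference, the go-primary order described above can be sketched roughly as follows. This is not the actual script: the resource name "r0" and device "/dev/drbd0" used in the usage note are placeholders, and treating only fsck exit status 4 and above as fatal is an assumption. Recording the fsck status on every failover would show whether fsck had to repair anything before the mount.<BR>
<pre>
```python
import subprocess


def go_primary_commands(resource, device, mountpoint):
    """Build the promote / check / mount sequence as argument lists."""
    return [
        ["drbdadm", "primary", resource],
        ["fsck", "-f", "-y", device],
        ["mount", device, mountpoint],
    ]


def run_sequence(commands, runner=subprocess.run):
    """Run the commands in order and return (program, exit status) pairs.

    Stops at the first fatal status so a failed step is never followed
    by the mount.
    """
    results = []
    for cmd in commands:
        status = runner(cmd).returncode
        results.append((cmd[0], status))
        # fsck returns 1 when it corrected errors; assume 4 and above is fatal.
        fatal_from = 4 if cmd[0] == "fsck" else 1
        if status >= fatal_from:
            break
    return results
```
</pre>
For example, run_sequence(go_primary_commands("r0", "/dev/drbd0", "/repl")) returns the exit status of each step, and an fsck status of 1 means the file system was already damaged before this mount.<BR>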
File: `/repl/ipsec/ipsec_xml'<BR>
 Size: 0 Blocks: 2 <STRONG>IO Block: 4096 </STRONG>regular empty file<BR>
Device: fe03h/65027d Inode: 6404 Links: 1<BR>
Access: (0640/-rw-r-----) Uid: ( 0/ root) Gid: ( 201/ admin)<BR>
Access: 2010-09-07 10:06:55.000000000 +0000<BR>
Modify: 2010-09-07 10:07:12.000000000 +0000<BR>
Change: 2010-09-07 10:07:12.000000000 +0000<BR>
 File: `/repl/ipsec/psk.txt'<BR>
 Size: 242 Blocks: 4 <STRONG>IO Block: 4096 </STRONG>regular file<BR>
Device: fe03h/65027d Inode: 6397 Links: 1<BR>
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root)<BR>
Access: 2006-08-03 17:03:19.000000000 +0000<BR>
Modify: 2010-09-07 10:07:12.000000000 +0000<BR>
Change: 2010-09-07 10:07:12.000000000 +0000<BR>
 File: `/repl/ipsec/racoon.conf'<BR>
 Size: 1793 Blocks: 6 <STRONG>IO Block: 4096 </STRONG>regular file<BR>
Device: fe03h/65027d Inode: 6391 Links: 1<BR>
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)<BR>
Access: 2010-09-07 10:02:49.000000000 +0000<BR>
Modify: 2010-09-07 10:07:12.000000000 +0000<BR>
Change: 2010-09-07 10:07:12.000000000 +0000<BR>
 File: `/repl/ipsec/setkey.conf'<BR>
 Size: 121 Blocks: 4 <STRONG>IO Block: 4096 </STRONG>regular file<BR>
Device: fe03h/65027d Inode: 6398 Links: 1<BR>
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)<BR>
Access: 2006-08-03 17:03:16.000000000 +0000<BR>
Modify: 2010-09-07 10:07:12.000000000 +0000<BR>
Change: 2010-09-07 10:07:12.000000000 +0000<BR>
 File: `/repl/firewall/sysconfig/iptables'<BR>
 Size: 1797 Blocks: 6 <STRONG> IO Block: 4096 </STRONG>regular file<BR>
Device: fe03h/65027d Inode: 14461 Links: 1<BR>
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root)<BR>
Access: 2010-09-07 10:02:51.000000000 +0000<BR>
Modify: 2010-09-07 10:07:13.000000000 +0000<BR>
Change: 2010-09-07 10:07:13.000000000 +0000<BR>
 File: `/repl/snmpdagent/data/snmpd.conf'<BR>
 Size: 683 Blocks: 4 <STRONG> IO Block: 4096 </STRONG>regular file<BR>
Device: fe03h/65027d Inode: 20744 Links: 1<BR>
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 601/usergroup)<BR>
Access: 2010-09-07 10:07:14.000000000 +0000<BR>
Modify: 2010-09-07 10:07:14.000000000 +0000<BR>
Change: 2010-09-07 10:07:14.000000000 +0000<BR>
<BR>
Appreciate your help,<BR>
Lak.<BR><BR> <BR>
> Date: Tue, 7 Sep 2010 12:16:59 +0200<BR>
> From: lars.ellenberg@linbit.com<BR>
> To: drbd-user@lists.linbit.com<BR>
> Subject: Re: [DRBD-user] File corruption in drbd partition<BR>
> <BR>
> On Tue, Sep 07, 2010 at 09:35:48AM +0000, putcha narayana wrote:<BR>
> > <BR>
> > Hi,<BR>
> > <BR>
> > We are running continuous failovers on a redundant setup (Active / Standby).<BR>
> > After few failovers we observe content of file x appears inside file y.<BR>
> <BR>
> How much is "few"?<BR>
> What is the IO load?<BR>
> How do you trigger the failover?<BR>
> DRBD version, kernel version, file system type?<BR>
> Volatile caches involved?<BR>
> How often/when do you fsck?<BR>
> <BR>
> > In one particular case we observed inode corruption, when fsck command is run on /repl partition.<BR>
> > Multiply-claimed block(s) in inode 28: 1233 1249 1251 1252<BR>
> > Multiply-claimed block(s) in inode 1183: 1251 1252<BR>
> > Multiply-claimed block(s) in inode 1184: 1233<BR>
> > Multiply-claimed block(s) in inode 1185: 1249<BR>
> > <BR>
> > When fsck -fy is run on /repl partition then the end result is content of file x is seen in file y.<BR>
> <BR>
> -- <BR>
> : Lars Ellenberg<BR>
> : LINBIT | Your Way to High Availability<BR>
> : DRBD/HA support and consulting http://www.linbit.com<BR>
</body>
</html>