[DRBD-user] Drbd i-node error issue.

Sunil Varma sunil.sayyaparaju at netenrich.com
Mon Sep 1 11:37:28 CEST 2008

Hello,

We are running DRBD 8.2.6 on 64-bit CentOS.

CentOS kernel:
2.6.18-92.1.6.el5.centos.plus (x86_64)

DRBD RPMs:
drbd82-8.2.6-1.el5.centos
kmod-drbd82-8.2.6-1.2.6.18_92.1.6.el5.centos.plus

These RPMs are the ones provided in the DRBD repositories.

drbd.conf:

global {
  usage-count no;
}

resource drbd0 {
  protocol C;

  syncer
  {
    rate 600M;
  }

  device    /dev/drbd0;
  disk      /dev/sda6;
  meta-disk internal;

  on ha42.netenrich.com
  {
    address   192.168.10.42:7789;
  }

  on ha43.netenrich.com
  {
    address   192.168.10.43:7789;
  }

  handlers
  {
    split-brain "/home/bin/ResloveSplitBrain.pl SetSplitBrain";
  }
}

We built a two-node HA cluster using Heartbeat and DRBD.

Heartbeat RPMs:
heartbeat-2.1.3-3.el5.centos
heartbeat-pils-2.1.3-3.el5.centos
heartbeat-stonith-2.1.3-3.el5.centos

Issue 1:

For a long time the HA cluster worked fine, but one day we found that one
node (the primary) was reporting inode errors, while the other node (the
secondary) had none.

Why does the primary node give these errors while the secondary node looks
fine?

dmesg output from the node with the inode errors:

____________________________________________________________________________
drbd0: Handshake successful: Agreed network protocol version 88
drbd0: conn( WFConnection -> WFReportParams )
drbd0: Starting asender thread (from drbd0_receiver [3070])
drbd0: data-integrity-alg: <not-used>
drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
drbd0: Writing meta data super block now.
drbd0: conn( WFBitMapT -> WFSyncUUID )
drbd0: helper command: /sbin/drbdadm before-resync-target
drbd0: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
drbd0: Began resync as SyncTarget (will sync 6388 KB [1597 bits set]).
drbd0: Writing meta data super block now.
drbd0: Resync done (total 1 sec; paused 0 sec; 6388 K/sec)
drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
drbd0: helper command: /sbin/drbdadm after-resync-target
drbd0: Writing meta data super block now.
drbd0: role( Secondary -> Primary )
drbd0: Writing meta data super block now.
kjournald starting.  Commit interval 5 seconds
EXT3 FS on drbd0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev drbd0, type ext3), uses xattr
FS-Cache: Loaded
FS-Cache: netfs 'nfs' registered for caching
SELinux: initialized (dev 0:16, type nfs), uses genfs_contexts

EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393660 in dir #394183
Aborting journal on device drbd0.
ext3_abort called.
EXT3-fs error (device drbd0): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393660 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393660 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393666 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393668 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393667 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393663 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393665 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393664 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393658 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393660 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393657 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393662 in dir #394183
printk: 29 messages suppressed.
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393660 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393660 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393666 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393668 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393667 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393663 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393665 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393664 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393658 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393660 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393657 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393662 in dir #394183
printk: 29 messages suppressed.
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393660 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393660 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393666 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393668 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393667 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393663 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393665 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393664 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393658 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393660 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393657 in dir #394183
EXT3-fs error (device drbd0): ext3_lookup: unlinked inode 393662 in dir #394183
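
Since the same few messages repeat many times in the log above, here is a
minimal Python sketch for condensing them. It is only an illustration and not
part of our setup: the function name summarize_unlinked_inodes is hypothetical,
and the pattern assumes the message format exactly as it appears in this dmesg
output.

```python
import re
from collections import Counter

# Matches the ext3_lookup error as printed in the dmesg output above.
# \s* tolerates a line wrap between "in dir" and "#<directory inode>".
PATTERN = re.compile(r"ext3_lookup: unlinked inode (\d+) in dir\s*#(\d+)")


def summarize_unlinked_inodes(log_text):
    """Return {directory inode: Counter mapping unlinked inode -> message count}."""
    summary = {}
    for inode, directory in PATTERN.findall(log_text):
        summary.setdefault(int(directory), Counter())[int(inode)] += 1
    return summary
```

Fed the text of the log (for example, read from a saved copy of the dmesg
output), it returns one entry per directory inode, each with the affected
inode numbers and how often each one was reported.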

 

Issue 2:

We use the /dev/sda6 partition as the backing device for the DRBD resource.
While the nodes were in the Primary/Secondary state, we found that one file
on this device was corrupted on the primary, whereas on the secondary it is
fine.

Even though both nodes report that they are in sync, why do they show
different content in the file?

Can somebody please shed some light on the issues we are facing?

Regards,
Sunil Varma
