[DRBD-user] Need help, we want invalidate if role=Secondary and uuid_compare rule==4

putcha narayana putcha_laks at hotmail.com
Thu Nov 4 13:25:31 CET 2010



Hi,
 
We would like to force a resync on the standby node ONLY when uuid_compare() returns 0 by rule 4.
Why? We are consistently seeing files on DRBD get corrupted, and every time corruption is seen the rule is 4. The mail chain below has full details.
Yes, this is a workaround, because we are on DRBD-8.0.16.
An snmpd version upgrade is strictly not an option.
 
Whenever we observed the following two lines, we recreated the metadata on the secondary node, and no file corruption was seen.
File corruption: the content of other files shows up in the snmpd.conf file. The corrupted file is always snmpd.conf.
 
--> uuid_compare()=0 by rule 4
--> No resync, but 78 bits in bitmap!                      <<<< Number of bits is variable. >>>>

The invalidate and invalidate-remote options are available in the drbdadm tool, but can they only be invoked externally?
We would like to kick off a sync only when uuid_compare() returns 0 by rule 4.
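
Since drbdadm is invoked from outside the kernel module, one option is to spot the two-line pattern above from user space. The following is only a minimal sketch in C: the function names are made up for illustration, and it assumes the two messages appear on consecutive log lines, as they do in the dmesg excerpts in this thread.

```c
#include <stdio.h>
#include <string.h>

/* Returns the bit count from a "No resync, but N bits in bitmap!" line,
 * or -1 if the line does not contain that message. */
long parse_bitmap_bits(const char *line)
{
    const char *p = strstr(line, "No resync, but ");
    long bits;

    if (p == NULL)
        return -1;
    if (sscanf(p, "No resync, but %ld bits in bitmap!", &bits) != 1)
        return -1;
    return bits;
}

/* Nonzero if the line reports uuid_compare()=0 decided by rule 4. */
int is_rule4_zero(const char *line)
{
    return strstr(line, "uuid_compare()=0 by rule 4") != NULL;
}

/* Scans n log lines. Returns the bit count of the first rule-4 line that is
 * immediately followed by a "No resync" line, or -1 if that never happens. */
long find_rule4_no_resync(const char *const *lines, size_t n)
{
    size_t i;

    for (i = 0; i + 1 < n; i++) {
        if (is_rule4_zero(lines[i])) {
            long bits = parse_bitmap_bits(lines[i + 1]);
            if (bits >= 0)
                return bits;
        }
    }
    return -1;
}
```

A wrapper script could feed dmesg lines to find_rule4_no_resync() and, on a non-negative result, leave it to the operator (or to the cluster manager) to decide whether to run drbdadm invalidate on the standby.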
 
The workaround below needs tweaking:
                case 0:
                      INFO("Lak: 0 by Rule 4, current state = %d current role = %d ! \n",
                           mdev->state.conn, mdev->state.role);
                      /* !self_pri && !peer_pri */
                      if (mdev->state.conn == WFBitMapT) {
                              drbd_start_resync(mdev, SyncTarget);
                      } else if (mdev->state.conn == WFBitMapS) {
                              drbd_start_resync(mdev, SyncSource);
                      } else if (mdev->state.conn == SyncTarget) {
                              drbd_start_resync(mdev, SyncTarget);
                      } else if (mdev->state.conn == SyncSource) {
                              drbd_start_resync(mdev, SyncSource);
                      } else if (mdev->state.role == Secondary) {
                              drbd_start_resync(mdev, SyncTarget);
                      }
                      return 0;

dmesg output on Standby Controller:
drbd0: Lak: 0 by Rule 4, current state = 9 current role = 2 ! 
drbd0: State change failed: Refusing to be inconsistent on both nodes
drbd0:   state = { cs:WFReportParams st:Secondary/Unknown ds:UpToDate/DUnknown r--- }
drbd0:  wanted = { cs:SyncTarget st:Secondary/Unknown ds:Inconsistent/DUnknown r--- }
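
One way to read this output: the INFO line prints state = 9 while the state dump shows cs:WFReportParams, so none of the four conn checks in the workaround can match at this point, and only the final role == Secondary branch fires. That is consistent with the "wanted = { cs:SyncTarget ..." line, requested while the peer disk is still DUnknown. The small user-space model below is only a sketch of that if/else chain. The enums are hypothetical stand-ins: only 9 and 2 are taken from the dmesg line, every other value is a placeholder and not the kernel's numbering.

```c
/* Hypothetical stand-ins for the kernel's connection states and roles.
 * Only WFReportParams = 9 and Secondary = 2 come from the dmesg line
 * "current state = 9 current role = 2"; all other values are placeholders. */
enum conn_state {
    NoResync = 0,        /* placeholder meaning "no branch matched" */
    WFReportParams = 9,
    WFBitMapS = 100,
    WFBitMapT,
    SyncSource,
    SyncTarget
};

enum node_role { Primary = 1, Secondary = 2 };

/* Mirrors the if/else chain of the posted workaround and returns the side
 * that would be passed to drbd_start_resync(), or NoResync if none is. */
enum conn_state workaround_choice(enum conn_state conn, enum node_role role)
{
    if (conn == WFBitMapT)
        return SyncTarget;
    else if (conn == WFBitMapS)
        return SyncSource;
    else if (conn == SyncTarget)
        return SyncTarget;
    else if (conn == SyncSource)
        return SyncSource;
    else if (role == Secondary)
        return SyncTarget;
    return NoResync;
}
```

With the values from the log, workaround_choice(WFReportParams, Secondary) yields SyncTarget, which matches the rejected state change shown above.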

LAK
 


From: putcha_laks at hotmail.com
To: drbd-user at lists.linbit.com
Date: Wed, 3 Nov 2010 06:50:46 +0000
Subject: [DRBD-user] Issue with uuid_compare by rule 4




Hi,

DRBD version 8.0.16 (the code for uuid_compare rule 4 is the same in all DRBD-8.x.y versions)
 
We are consistently seeing the content of snmpd.conf get corrupted: sometimes it contains iptables content and sometimes it has some weird binary data.
 
In all the instances where we have seen corruption, the following pattern was observed in dmesg:
uuid_compare()=0 by rule 4
No resync, but 78 bits in bitmap!                      <<<< Number of bits is variable. >>>>
 
From the DRBD changelog history we see that the UUID comparison algorithm was improved.
 
* Sat Apr 07 2007 21:32:39 +0200 Philipp Reisner <phil(at)linbit.com>
  - drbd (8.0.2-1)
   * Improved the robustness of the UUID based algorithm that decides about the resync direction.

We would like to force a sync in the rule 4 case by doing the following, and need your help in this regard.
                case 0:
                      /* !self_pri && !peer_pri */
                      if (mdev->state.conn == WFBitMapT) {
                              drbd_start_resync(mdev, SyncTarget);
                      } else if (mdev->state.conn == WFBitMapS) {
                              drbd_start_resync(mdev, SyncSource);
                      } else if (mdev->state.conn == SyncTarget) {
                              drbd_start_resync(mdev, SyncTarget);
                      } else if (mdev->state.conn == SyncSource) {
                              drbd_start_resync(mdev, SyncSource);
                      }
                      return 0;
 
Test case we are running:
Reset the active board every 5 minutes on a redundant setup.
 
dmesg output:
11-02 22:10:23 unknown kernel drbd0: drbd_sync_handshake:
11-02 22:10:23 unknown kernel drbd0: self 52A974873622A8A8:0000000000000000:D1A184CD02C8EE0D:BD9572546D8C332D
11-02 22:10:23 unknown kernel drbd0: peer 77F8DC91C89BA0F9:52A974873622A8A8:D1A184CD02C8EE0C:BD9572546D8C332D
11-02 22:10:23 unknown kernel drbd0: uuid_compare()=-1 by rule 5
11-02 22:10:23 unknown kernel drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate ) 
11-02 22:10:23 unknown kernel drbd0: conn( WFBitMapT -> WFSyncUUID ) 
11-02 22:10:23 unknown kernel drbd0: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent ) 
11-02 22:10:23 unknown kernel drbd0: Began resync as SyncTarget (will sync 324 KB [81 bits set]).
11-02 22:10:23 unknown kernel drbd0: Resync done (total 1 sec; paused 0 sec; 324 K/sec)
11-02 22:10:23 unknown kernel drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate ) 
11-02 22:10:24 unknown kernel drbd0: local disk flush failed with status -95, disabling disk-flushes
11-02 22:14:13 unknown kernel drbd0: peer( Primary -> Secondary ) 
11-02 22:14:14 unknown kernel drbd0: role( Secondary -> Primary ) 
11-02 22:14:14 unknown kernel EXT3 FS on drbd0, internal journal
11-02 22:14:14 unknown kernel SELinux: initialized (dev drbd0, type ext3), uses xattr
11-02 22:14:18 unknown kernel drbd0: peer( Secondary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown ) 
11-02 22:14:18 unknown kernel drbd0: Creating new current UUID
11-02 22:14:18 unknown kernel drbd0: meta connection shut down by peer.
11-02 22:14:18 unknown kernel drbd0: asender terminated
11-02 22:14:18 unknown kernel drbd0: Terminating asender thread
11-02 22:14:19 unknown kernel drbd0: Connection closed
11-02 22:14:19 unknown kernel drbd0: conn( TearDown -> Unconnected ) 
11-02 22:14:19 unknown kernel drbd0: receiver terminated
11-02 22:14:19 unknown kernel drbd0: Restarting receiver thread
11-02 22:14:19 unknown kernel drbd0: receiver (re)started
11-02 22:14:19 unknown kernel drbd0: conn( Unconnected -> WFConnection ) 
11-02 22:15:16 unknown kernel drbd0: Handshake successful: DRBD Network Protocol version 86
11-02 22:15:16 unknown kernel drbd0: conn( WFConnection -> WFReportParams ) 
11-02 22:15:16 unknown kernel drbd0: Starting asender thread (from drbd0_receiver [1495])
11-02 22:15:16 unknown kernel drbd0: Considerable difference in lower level device sizes: 18768s vs. 32176s
11-02 22:15:16 unknown kernel drbd0: drbd_sync_handshake:
11-02 22:15:16 unknown kernel drbd0: self 77F8DC91C89BA0F9:77F8DC91C89BA0F9:A1454CD240FF75F4:52A974873622A8A8
11-02 22:15:16 unknown kernel drbd0: peer 77F8DC91C89BA0F8:0000000000000000:A1454CD240FF75F4:52A974873622A8A8
11-02 22:15:16 unknown kernel drbd0: uuid_compare()=0 by rule 4
11-02 22:15:16 unknown kernel drbd0: No resync, but 78 bits in bitmap!
11-02 22:15:16 unknown kernel drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate ) 
11-02 22:19:20 unknown kernel drbd0: role( Primary -> Secondary ) 
11-02 22:19:20 unknown kernel drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown ) 
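
The two handshakes in this log can be checked by hand from the self/peer UUID lines. The sketch below is a partial user-space model, not the driver's code. It assumes the four printed fields are current:bitmap:history:history in that order, and that the comparison ignores the lowest bit of each UUID. Both are assumptions, although they agree with the values logged here: the current UUIDs ...A0F9 and ...A0F8 are treated as equal (rule 4), and in the first handshake our current UUID equals the peer's bitmap UUID (rule 5). The struct and function names are made up for illustration, and no other rules are modelled.

```c
#include <stdio.h>

/* Field order is an assumption based on the logged "self"/"peer" lines. */
struct uuid_set {
    unsigned long long current;
    unsigned long long bitmap;
    unsigned long long hist1;
    unsigned long long hist2;
};

/* Parses "A:B:C:D" (four 64-bit hex fields). Returns 0 on success, -1 otherwise. */
int parse_uuids(const char *s, struct uuid_set *out)
{
    int n = sscanf(s, "%llx:%llx:%llx:%llx",
                   &out->current, &out->bitmap, &out->hist1, &out->hist2);
    return n == 4 ? 0 : -1;
}

/* Partial model of two checks only:
 *   returns 4 if the current UUIDs match once the lowest bit is masked off,
 *   returns 5 if our current UUID matches the peer's bitmap UUID,
 *   returns 0 if neither check applies. */
int matching_rule(const struct uuid_set *self, const struct uuid_set *peer)
{
    const unsigned long long mask = ~1ULL;

    if ((self->current & mask) == (peer->current & mask))
        return 4;
    if ((self->current & mask) == (peer->bitmap & mask))
        return 5;
    return 0;
}
```

Feeding in the 22:10:23 pair gives 5 and the 22:15:16 pair gives 4, the same rule numbers the kernel printed.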
 

LAK 



