Hi,

We would like to force a resync on the standby if uuid_compare rule == 4 ONLY.

Why? We are seeing DRBD files corrupted consistently, and each time corruption is seen the rule is 4. The mail chain below has full details.

Yes, this is a workaround, because we are on DRBD-8.0.16; snmpd version upgrade is a strict NO, NO.

Whenever we observed the following two lines, we recreated the metadata on the secondary node and no file corruption was seen.

File corruption: content of other files is seen in the snmpd.conf file. The corrupted file is always snmpd.conf.

--> uuid_compare()=0 by rule 4
--> No resync, but 78 bits in bitmap!   <<<< Number of bits is variable. >>>>

The invalidate and invalidate-remote options are available under the drbdadm tool, but can they only be invoked externally? We would like to kick off a sync only when we hit uuid_compare() by rule 4.

The workaround below needs tweaking:

    case 0:
        INFO("Lak: 0 by Rule 4, current state = %d current role = %d ! \n",
             mdev->state.conn, mdev->state.role);
        /* !self_pri && !peer_pri */
        if (mdev->state.conn == WFBitMapT) {
            drbd_start_resync(mdev, SyncTarget);
        } else if (mdev->state.conn == WFBitMapS) {
            drbd_start_resync(mdev, SyncSource);
        } else if (mdev->state.conn == SyncTarget) {
            drbd_start_resync(mdev, SyncTarget);
        } else if (mdev->state.conn == SyncSource) {
            drbd_start_resync(mdev, SyncSource);
        } else if (mdev->state.role == Secondary) {
            drbd_start_resync(mdev, SyncTarget);
        }
        return 0;

dmesg output on Standby Controller:

drbd0: Lak: 0 by Rule 4, current state = 9 current role = 2 !
drbd0: State change failed: Refusing to be inconsistent on both nodes
drbd0: state = { cs:WFReportParams st:Secondary/Unknown ds:UpToDate/DUnknown r--- }
drbd0: wanted = { cs:SyncTarget st:Secondary/Unknown ds:Inconsistent/DUnknown r--- }

LAK

From: putcha_laks at hotmail.com
To: drbd-user at lists.linbit.com
Date: Wed, 3 Nov 2010 06:50:46 +0000
Subject: [DRBD-user] Issue with uuid_compare by rule 4

Hi,

DRBD version 8.0.16 (the code for uuid_compare by rule 4 is the same in all DRBD-8.x.y versions).

We are consistently seeing the content of snmpd.conf get corrupted: sometimes it shows iptables and sometimes it has some weird binary data.

In all the instances where we have seen corruption, the pattern observed in dmesg is:

uuid_compare()=0 by rule 4
No resync, but 78 bits in bitmap!   <<<< Number of bits is variable. >>>>

From the drbd change log history we see that the UUID comparison algorithm was improved:

* Sat Apr 07 2007 21:32:39 +0200 Philipp Reisner <phil(at)linbit.com>
- drbd (8.0.2-1)
  * Improved the robustness of the UUID based algorithm that decides about the resync direction.

We would like to force a sync in the rule 4 case by doing the following, and need your help in this regard.

    case 0: /* !self_pri && !peer_pri */
        if (mdev->state.conn == WFBitMapT) {
            drbd_start_resync(mdev, SyncTarget);
        } else if (mdev->state.conn == WFBitMapS) {
            drbd_start_resync(mdev, SyncSource);
        } else if (mdev->state.conn == SyncTarget) {
            drbd_start_resync(mdev, SyncTarget);
        } else if (mdev->state.conn == SyncSource) {
            drbd_start_resync(mdev, SyncSource);
        }
        return 0;

Test case we are running: reset the active board every 5 minutes on a redundant setup.
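
To see which branch of the patch can fire at handshake time, here is a minimal stand-alone model of the if/else chain. It is only a sketch, not DRBD code: pick_resync() and the action enum are hypothetical, and the enum values are placeholders except WFReportParams = 9 and Secondary = 2, which are read off the "current state = 9 current role = 2" debug line and the cs:WFReportParams st:Secondary state dump.

```c
#include <stdbool.h>

/* Placeholder model of the connection states named in the patch.
 * Only WFReportParams = 9 is taken from the debug output; the other
 * values are arbitrary and are NOT claimed to match the drbd headers. */
enum conn_state {
    WFReportParams = 9,
    WFBitMapS = 100,
    WFBitMapT,
    SyncSource,
    SyncTarget
};

enum role { Secondary = 2 };  /* value taken from "current role = 2" */

/* Hypothetical result type standing in for the drbd_start_resync() calls. */
enum action { NoResync, StartAsSource, StartAsTarget };

/* Hypothetical helper mirroring the if/else chain of the patch.
 * with_role_fallback selects the variant that ends with the extra
 * "else if (role == Secondary)" branch. */
enum action pick_resync(enum conn_state conn, enum role role,
                        bool with_role_fallback)
{
    if (conn == WFBitMapT || conn == SyncTarget)
        return StartAsTarget;
    if (conn == WFBitMapS || conn == SyncSource)
        return StartAsSource;
    if (with_role_fallback && role == Secondary)
        return StartAsTarget;
    return NoResync;
}
```

With the logged values, pick_resync(WFReportParams, Secondary, false) gives NoResync: none of the four conn checks match while the connection is still in WFReportParams. With the fallback enabled it gives StartAsTarget, which corresponds to the "wanted = { cs:SyncTarget ... }" line of the refused state change.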

dmesg output:

11-02 22:10:23 unknown kernel drbd0: drbd_sync_handshake:
11-02 22:10:23 unknown kernel drbd0: self 52A974873622A8A8:0000000000000000:D1A184CD02C8EE0D:BD9572546D8C332D
11-02 22:10:23 unknown kernel drbd0: peer 77F8DC91C89BA0F9:52A974873622A8A8:D1A184CD02C8EE0C:BD9572546D8C332D
11-02 22:10:23 unknown kernel drbd0: uuid_compare()=-1 by rule 5
11-02 22:10:23 unknown kernel drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
11-02 22:10:23 unknown kernel drbd0: conn( WFBitMapT -> WFSyncUUID )
11-02 22:10:23 unknown kernel drbd0: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
11-02 22:10:23 unknown kernel drbd0: Began resync as SyncTarget (will sync 324 KB [81 bits set]).
11-02 22:10:23 unknown kernel drbd0: Resync done (total 1 sec; paused 0 sec; 324 K/sec)
11-02 22:10:23 unknown kernel drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
11-02 22:10:24 unknown kernel drbd0: local disk flush failed with status -95, disabling disk-flushes
11-02 22:14:13 unknown kernel drbd0: peer( Primary -> Secondary )
11-02 22:14:14 unknown kernel drbd0: role( Secondary -> Primary )
11-02 22:14:14 unknown kernel EXT3 FS on drbd0, internal journal
11-02 22:14:14 unknown kernel SELinux: initialized (dev drbd0, type ext3), uses xattr
11-02 22:14:18 unknown kernel drbd0: peer( Secondary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
11-02 22:14:18 unknown kernel drbd0: Creating new current UUID
11-02 22:14:18 unknown kernel drbd0: meta connection shut down by peer.
11-02 22:14:18 unknown kernel drbd0: asender terminated
11-02 22:14:18 unknown kernel drbd0: Terminating asender thread
11-02 22:14:19 unknown kernel drbd0: Connection closed
11-02 22:14:19 unknown kernel drbd0: conn( TearDown -> Unconnected )
11-02 22:14:19 unknown kernel drbd0: receiver terminated
11-02 22:14:19 unknown kernel drbd0: Restarting receiver thread
11-02 22:14:19 unknown kernel drbd0: receiver (re)started
11-02 22:14:19 unknown kernel drbd0: conn( Unconnected -> WFConnection )
11-02 22:15:16 unknown kernel drbd0: Handshake successful: DRBD Network Protocol version 86
11-02 22:15:16 unknown kernel drbd0: conn( WFConnection -> WFReportParams )
11-02 22:15:16 unknown kernel drbd0: Starting asender thread (from drbd0_receiver [1495])
11-02 22:15:16 unknown kernel drbd0: Considerable difference in lower level device sizes: 18768s vs. 32176s
11-02 22:15:16 unknown kernel drbd0: drbd_sync_handshake:
11-02 22:15:16 unknown kernel drbd0: self 77F8DC91C89BA0F9:77F8DC91C89BA0F9:A1454CD240FF75F4:52A974873622A8A8
11-02 22:15:16 unknown kernel drbd0: peer 77F8DC91C89BA0F8:0000000000000000:A1454CD240FF75F4:52A974873622A8A8
11-02 22:15:16 unknown kernel drbd0: uuid_compare()=0 by rule 4
11-02 22:15:16 unknown kernel drbd0: No resync, but 78 bits in bitmap!
11-02 22:15:16 unknown kernel drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )
11-02 22:19:20 unknown kernel drbd0: role( Primary -> Secondary )
11-02 22:19:20 unknown kernel drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )

LAK
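
The current UUIDs in the two handshakes of the log can be checked by hand. A minimal sketch, assuming the comparison ignores the lowest bit of the UUID (in the rule 4 handshake, self ends in ...A0F9 and peer in ...A0F8, so they differ only in that bit). same_current_uuid() is a hypothetical helper and is not taken from the DRBD source:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compare two current UUIDs while ignoring the
 * lowest bit. The masking is an assumption inferred from the log,
 * where the rule 4 handshake pairs ...A0F9 with ...A0F8. */
bool same_current_uuid(uint64_t self, uint64_t peer)
{
    const uint64_t mask = ~(uint64_t)1;  /* all bits except bit 0 */
    return (self & mask) == (peer & mask);
}
```

Under that assumption, same_current_uuid(0x77F8DC91C89BA0F9ULL, 0x77F8DC91C89BA0F8ULL) is true for the 22:15:16 handshake that reported rule 4, while same_current_uuid(0x52A974873622A8A8ULL, 0x77F8DC91C89BA0F9ULL) is false for the 22:10:23 handshake that reported rule 5.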