[DRBD-user] Resync Happened Automatically after Online Verify? Can the Resync Be Trusted?

Robinson, Eric eric.robinson at psmnv.com
Tue May 28 11:42:28 CEST 2013


According to the DRBD User Guide, after an online verify it is necessary to trigger the resync manually by disconnecting and reconnecting the resource (http://www.drbd.org/users-guide/s-use-online-verify.html).

However, we ran a verify yesterday after a RAID array failure on the standby node, and a resync was triggered automatically. Is this normal?

During the resync, several troubling log messages were generated, starting at 23:06:17. It looks like the verify failed to complete and a resync was automatically triggered. Can the resync be trusted, or is another online verify required?

-- log snippet from primary (non-failed) node --

May 27 18:32:54 ha10a kernel: block drbd0: conn( Connected -> VerifyS )
May 27 18:32:54 ha10a kernel: block drbd0: Starting Online Verify from sector 0
May 27 18:46:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 19:01:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 19:06:10 ha10a kernel: block drbd0: Out of sync: start=242225192, size=8 (sectors)
May 27 19:06:10 ha10a kernel: block drbd0: Out of sync: start=242225200, size=8 (sectors)
May 27 19:09:31 ha10a kernel: block drbd0: Out of sync: start=267001248, size=8 (sectors)
May 27 19:09:59 ha10a kernel: block drbd0: Out of sync: start=270432752, size=16 (sectors)
May 27 19:12:01 ha10a kernel: block drbd0: Out of sync: start=285172888, size=8 (sectors)
May 27 19:12:01 ha10a kernel: block drbd0: Out of sync: start=285172920, size=8 (sectors)
May 27 19:12:01 ha10a kernel: block drbd0: Out of sync: start=285172744, size=48 (sectors)
May 27 19:12:01 ha10a kernel: block drbd0: Out of sync: start=285172792, size=8 (sectors)
<more of these removed>

May 27 19:13:24 ha10a kernel: block drbd0: Out of sync: start=295106080, size=8 (sectors)
May 27 19:13:24 ha10a kernel: block drbd0: Out of sync: start=295106192, size=8 (sectors)
May 27 19:13:24 ha10a kernel: block drbd0: Out of sync: start=295106208, size=8 (sectors)
May 27 19:13:24 ha10a kernel: block drbd0: Out of sync: start=295106216, size=16 (sectors)
May 27 19:13:24 ha10a kernel: block drbd0: Out of sync: start=295114664, size=24 (sectors)
May 27 19:13:24 ha10a kernel: block drbd0: Out of sync: start=295115456, size=8 (sectors)
May 27 19:13:26 ha10a kernel: block drbd0: Out of sync: start=295299816, size=24 (sectors)
May 27 19:16:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 19:31:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 19:46:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 20:01:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 20:16:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 20:31:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 20:46:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 21:01:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 21:16:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 21:31:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 21:46:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 22:01:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 22:16:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 22:31:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 22:46:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 23:01:38 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 23:05:41 ha10a kernel: d-con ha01_mysql: [drbd_w_ha01_mys/4088] sock_sendmsg time expired, ko = 6
May 27 23:05:47 ha10a kernel: d-con ha01_mysql: [drbd_w_ha01_mys/4088] sock_sendmsg time expired, ko = 5
May 27 23:05:53 ha10a kernel: d-con ha01_mysql: [drbd_w_ha01_mys/4088] sock_sendmsg time expired, ko = 4
May 27 23:05:59 ha10a kernel: d-con ha01_mysql: [drbd_w_ha01_mys/4088] sock_sendmsg time expired, ko = 3
May 27 23:06:05 ha10a kernel: d-con ha01_mysql: [drbd_w_ha01_mys/4088] sock_sendmsg time expired, ko = 2
May 27 23:06:11 ha10a kernel: d-con ha01_mysql: [drbd_w_ha01_mys/4088] sock_sendmsg time expired, ko = 1
May 27 23:06:17 ha10a kernel: block drbd0: Remote failed to finish a request within ko-count * timeout
May 27 23:06:17 ha10a kernel: block drbd0: peer( Secondary -> Unknown ) conn( VerifyS -> Timeout ) pdsk( UpToDate -> DUnknown )
May 27 23:06:17 ha10a kernel: block drbd0: Online Verify reached sector 1980019736
May 27 23:06:17 ha10a kernel: d-con ha01_mysql: Terminating drbd_a_ha01_mys
May 27 23:06:23 ha10a kernel: block drbd0: md_sync_timer expired! Worker calls drbd_md_sync().
May 27 23:06:23 ha10a kernel: block drbd0: drbd_rs_complete_io() called, but extent not found
May 27 23:06:23 ha10a kernel: block drbd0: drbd_rs_complete_io() called, but extent not found
May 27 23:06:23 ha10a kernel: block drbd0: drbd_rs_complete_io() called, but extent not found
May 27 23:06:23 ha10a kernel: block drbd0: drbd_rs_complete_io() called, but extent not found
May 27 23:06:23 ha10a kernel: block drbd0: drbd_rs_complete_io() called, but extent not found
May 27 23:06:23 ha10a kernel: block drbd0: new current UUID 9E6336C074323FD7:F8B5F4D0D7E2DF75:A4E3656E5427C691:A4E2656E5427C691
May 27 23:06:23 ha10a kernel: d-con ha01_mysql: helper command: /sbin/drbdadm fence-peer ha01_mysql
May 27 23:06:24 ha10a cibadmin[1176]:   notice: crm_log_args: Invoked: cibadmin -C -o constraints -X <rsc_location rsc="ms_drbd0" id="drbd-fence-by-handler-ha01_mysql-ms_drbd0">#012  <rule role="Master" score="-INFINITY" id="drbd-fence-by-handler-ha01_mysql-rule-ms_drbd0">#012    <expression attribute="#uname" operation="ne" value="ha10a" id="drbd-fence-by-handler-ha01_mysql-expr-ms_drbd0"/>#012  </rule>#012</rsc_location>
May 27 23:06:24 ha10a crm-fence-peer.sh[1132]: INFO peer is reachable, my disk is UpToDate: placed constraint 'drbd-fence-by-handler-ha01_mysql-ms_drbd0'
May 27 23:06:24 ha10a kernel: d-con ha01_mysql: helper command: /sbin/drbdadm fence-peer ha01_mysql exit code 4 (0x400)
May 27 23:06:24 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 23:06:26 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 23:06:27 ha10a kernel: d-con ha01_mysql: Starting asender thread (from drbd_r_ha01_mys [4132])
May 27 23:06:27 ha10a kernel: block drbd0: drbd_sync_handshake:
May 27 23:06:27 ha10a kernel: block drbd0: self 9E6336C074323FD7:F8B5F4D0D7E2DF75:A4E3656E5427C691:A4E2656E5427C691 bits:3182 flags:0
May 27 23:06:27 ha10a kernel: block drbd0: peer F8B5F4D0D7E2DF74:0000000000000000:A4E3656E5427C690:A4E2656E5427C691 bits:1581 flags:0
May 27 23:06:27 ha10a kernel: block drbd0: uuid_compare()=1 by rule 70
May 27 23:06:27 ha10a kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( Outdated -> Consistent )
May 27 23:06:27 ha10a kernel: block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 330(1), total 330; compression: 100.0%
May 27 23:06:27 ha10a kernel: block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 330(1), total 330; compression: 100.0%
May 27 23:06:27 ha10a kernel: block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
May 27 23:06:27 ha10a kernel: block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
May 27 23:06:27 ha10a kernel: block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
May 27 23:06:27 ha10a kernel: block drbd0: Began resync as SyncSource (will sync 12728 KB [3182 bits set]).
May 27 23:06:27 ha10a kernel: block drbd0: updated sync UUID 9E6336C074323FD7:F8B6F4D0D7E2DF75:F8B5F4D0D7E2DF75:A4E3656E5427C691
May 27 23:06:27 ha10a kernel: block drbd0: Resync done (total 1 sec; paused 0 sec; 12728 K/sec)
May 27 23:06:27 ha10a kernel: block drbd0: updated UUIDs 9E6336C074323FD7:0000000000000000:F8B6F4D0D7E2DF75:F8B5F4D0D7E2DF75
May 27 23:06:27 ha10a kernel: block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
May 27 23:06:27 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 23:06:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 23:21:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 23:36:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 27 23:51:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 28 00:06:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 28 00:21:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 28 00:36:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 28 00:51:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 28 01:06:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 28 01:21:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 28 01:36:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 28 01:51:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
May 28 02:06:57 ha10a pengine[17218]:   notice: unpack_rsc_op: Operation monitor found resource p_drbd0:0 active in master mode on ha10a
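
As a rough cross-check of the numbers in the log above, the following is a minimal Python sketch, not an official DRBD tool. It assumes that the "Out of sync" lines count 512-byte sectors and that one bitmap bit covers 4 KiB, which is consistent with the "will sync 12728 KB [3182 bits set]" line. The function names and the sample list are hypothetical, and because some "Out of sync" lines were removed from the snippet, the tally from the pasted lines cannot be expected to match the resync size.

```python
import re

SECTOR_BYTES = 512    # assumption: log sizes are given in 512-byte sectors
BITMAP_BIT_KIB = 4    # assumption: one bitmap bit covers 4 KiB

OOS_LINE_RE = re.compile(r"Out of sync: start=(\d+), size=(\d+) \(sectors\)")
RESYNC_RE = re.compile(r"will sync (\d+) KB \[(\d+) bits set\]")


def out_of_sync_kib(lines):
    """Sum the sizes of all 'Out of sync' log lines and return KiB."""
    total_sectors = 0
    for line in lines:
        match = OOS_LINE_RE.search(line)
        if match:
            total_sectors += int(match.group(2))
    return total_sectors * SECTOR_BYTES // 1024


def resync_matches_bitmap(line):
    """Return True if the reported KB equals bits * 4 KiB, None if no match."""
    match = RESYNC_RE.search(line)
    if match is None:
        return None
    kib, bits = int(match.group(1)), int(match.group(2))
    return kib == bits * BITMAP_BIT_KIB


# Three lines copied from the log snippet, for illustration only.
sample = [
    "May 27 19:06:10 ha10a kernel: block drbd0: Out of sync: start=242225192, size=8 (sectors)",
    "May 27 19:09:59 ha10a kernel: block drbd0: Out of sync: start=270432752, size=16 (sectors)",
    "May 27 19:12:01 ha10a kernel: block drbd0: Out of sync: start=285172744, size=48 (sectors)",
]
resync_line = (
    "May 27 23:06:27 ha10a kernel: block drbd0: Began resync as SyncSource "
    "(will sync 12728 KB [3182 bits set])."
)

print(out_of_sync_kib(sample))             # 72 sectors -> 36 KiB
print(resync_matches_bitmap(resync_line))  # 3182 * 4 == 12728 -> True
```

Run against the full, unabridged syslog, a tally like this would show how much of the 12728 KB resync is accounted for by blocks the verify flagged before it stopped at sector 1980019736.
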
[root@ha10a ~]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@ha10a.mycharts.md, 2013-04-25 16:27:09
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:72109544 nr:168 dw:74947448 dr:1004787613 al:7721 bm:371 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:4104 nr:28 dw:40 dr:1073715946 al:1 bm:2 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
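
The status output above can also be checked programmatically. This is a minimal Python sketch that assumes the DRBD 8.4 /proc/drbd layout shown here; parse_proc_drbd and looks_in_sync are hypothetical helper names. Note that oos:0 only means no blocks are currently marked out of sync in the bitmap. It says nothing about regions the interrupted verify never reached.

```python
import re

DEVICE_RE = re.compile(r"^\s*(\d+): cs:(\S+) ro:(\S+) ds:(\S+)")
OOS_RE = re.compile(r"\boos:(\d+)")


def parse_proc_drbd(text):
    """Return {minor: {"cs", "ro", "ds", "oos"}} parsed from /proc/drbd text."""
    devices = {}
    minor = None
    for line in text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            minor = int(match.group(1))
            devices[minor] = {
                "cs": match.group(2),
                "ro": match.group(3),
                "ds": match.group(4),
                "oos": None,
            }
            continue
        match = OOS_RE.search(line)
        if match and minor is not None:
            devices[minor]["oos"] = int(match.group(1))
    return devices


def looks_in_sync(device):
    """True if connected, both disks UpToDate, and nothing marked out of sync."""
    return (
        device["cs"] == "Connected"
        and device["ds"] == "UpToDate/UpToDate"
        and device["oos"] == 0
    )


# Status text copied from the output above.
status = """version: 8.4.3 (api:1/proto:86-101)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:72109544 nr:168 dw:74947448 dr:1004787613 al:7721 bm:371 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:4104 nr:28 dw:40 dr:1073715946 al:1 bm:2 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
"""

for minor, device in sorted(parse_proc_drbd(status).items()):
    print(minor, device["cs"], device["ds"], device["oos"], looks_in_sync(device))
```

On a live node the same function could be fed the contents of /proc/drbd read from disk instead of the pasted string.
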

--
Eric Robinson
Director of Information Technology
Physician Select Management, LLC
775.885.2211 x 111




