[DRBD-user] tons of out-of-sync sectors detected

GAUTIER Hervé herve.gautier at thalesgroup.com
Wed Jul 30 16:22:02 CEST 2008

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hi !

Well, can you explain in which context do you use the drbdadm verify 
command ? As you have created it, you certainly have a way to use it.
If file system may "re-use" an in-flight data buffer even before the 
write has been completed, is there any file system option to prevent this ?

Thank for this great tool anyway.

Lars Ellenberg a écrit :
> On Wed, Jul 30, 2008 at 03:35:08PM +0200, Eric Marin wrote:
>   
>> Hello,
>>
>> as a follow-up to mails I wrote to this list on 2008.06.23 ["drbdadm 
>> verify all seems to produce false positives on ext3 and crash the 
>> server"] and 2008.07.09, where I said I had problems with the command 
>> 'drbdadm verify all' (it regularly found out-of-sync sectors without any 
>> apparent reason), today I've decided to finally enable integrity checking 
>> :
>> net { data-integrity-alg "crc32c"; }
>>
>> Then, looking in /var/log/kern.log, I noticed lots of messages like these :
>> ---------------------------------------------------------------------
>> Jul 30 14:37:53 ldap-a kernel: [779687.127945] drbd0: meta connection shut down by peer.
>> Jul 30 14:37:53 ldap-a kernel: [779687.127980] drbd0: peer( Secondary -> 
>> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown 
>> )
>> Jul 30 14:37:53 ldap-a kernel: [779687.128040] drbd0: asender terminated
>> Jul 30 14:37:53 ldap-a kernel: [779687.128064] drbd0: Terminating asender thread
>> Jul 30 14:37:53 ldap-a kernel: [779687.128108] drbd0: Creating new current UUID
>> Jul 30 14:37:53 ldap-a kernel: [779687.128147] drbd0: Writing meta data super block now.
>> Jul 30 14:37:53 ldap-a kernel: [779687.128196] drbd0: sock was reset by peer
>> Jul 30 14:37:53 ldap-a kernel: [779687.128223] drbd0: short read expecting header on sock: r=-104
>> Jul 30 14:37:53 ldap-a kernel: [779687.128269] drbd0: tl_clear()
>> Jul 30 14:37:53 ldap-a kernel: [779687.128425] drbd0: Connection closed
>> Jul 30 14:37:53 ldap-a kernel: [779687.128452] drbd0: helper command: /sbin/drbdadm outdate-peer
>> Jul 30 14:37:55 ldap-a kernel: [779688.330184] drbd0: outdate-peer helper returned 4 (peer is outdated)
>> Jul 30 14:37:55 ldap-a kernel: [779688.330224] drbd0: pdsk( DUnknown -> Outdated )
>> Jul 30 14:37:55 ldap-a kernel: [779688.330262] drbd0: Writing meta data super block now.
>> Jul 30 14:37:55 ldap-a kernel: [779688.330360] drbd0: conn( NetworkFailure -> Unconnected )
>> Jul 30 14:37:55 ldap-a kernel: [779688.330390] drbd0: receiver terminated
>> Jul 30 14:37:55 ldap-a kernel: [779688.330418] drbd0: receiver (re)started
>> Jul 30 14:37:55 ldap-a kernel: [779688.330445] drbd0: conn( Unconnected -> WFConnection )
>> Jul 30 14:37:55 ldap-a kernel: [779688.429930] drbd0: Handshake 
>> successful: Agreed network protocol version 88
>> Jul 30 14:37:55 ldap-a kernel: [779688.472427] drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
>> Jul 30 14:37:55 ldap-a kernel: [779688.472458] drbd0: conn( WFConnection -> WFReportParams )
>> Jul 30 14:37:55 ldap-a kernel: [779688.472493] drbd0: Starting asender 
>> thread (from drbd0_receiver [23077])
>> Jul 30 14:37:55 ldap-a kernel: [779688.472566] drbd0: data-integrity-alg: crc32c
>> Jul 30 14:37:55 ldap-a kernel: [779688.473066] drbd0: peer( Unknown -> 
>> Secondary ) conn( WFReportParams -> WFBitMapS )
>> Jul 30 14:37:55 ldap-a kernel: [779688.473115] drbd0: Writing meta data super block now.
>> Jul 30 14:37:55 ldap-a kernel: [779688.479730] drbd0: conn( WFBitMapS -> 
>> SyncSource ) pdsk( Outdated -> Inconsistent )
>> Jul 30 14:37:55 ldap-a kernel: [779688.479781] drbd1: aftr_isp( 0 -> 1 )
>> Jul 30 14:37:55 ldap-a kernel: [779688.479808] drbd0: Began resync as 
>> SyncSource (will sync 896 KB [224 bits set]).
>> Jul 30 14:37:55 ldap-a kernel: [779688.479855] drbd0: Writing meta data super block now.
>> Jul 30 14:37:55 ldap-a kernel: [779688.483240] drbd1: peer_isp( 0 -> 1 )
>> Jul 30 14:37:56 ldap-a kernel: [779688.488330] drbd0: Resync done (total 1 sec; paused 0 sec; 896 K/sec)
>> Jul 30 14:37:56 ldap-a kernel: [779688.488330] drbd0: conn( SyncSource -> 
>> Connected ) pdsk( Inconsistent -> UpToDate )
>> Jul 30 14:37:56 ldap-a kernel: [779688.488330] drbd1: aftr_isp( 1 -> 0 )
>> Jul 30 14:37:56 ldap-a kernel: [779688.488330] drbd0: Writing meta data super block now.
>> Jul 30 14:37:56 ldap-a kernel: [779688.488330] drbd1: peer_isp( 1 -> 0 )
>> Jul 30 14:38:59 ldap-a kernel: [779728.867597] drbd0: sock_sendmsg returned -104
>> Jul 30 14:38:59 ldap-a kernel: [779728.867632] drbd0: peer( Secondary -> 
>> Unknown ) conn( Connected -> BrokenPipe ) pdsk( UpToDate -> DUnknown )
>> Jul 30 14:38:59 ldap-a kernel: [779728.867799] drbd0: Creating new current UUID
>> Jul 30 14:38:59 ldap-a kernel: [779728.867841] drbd0: Writing meta data super block now.
>> Jul 30 14:38:59 ldap-a kernel: [779728.867912] drbd0: meta connection shut down by peer.
>> Jul 30 14:38:59 ldap-a kernel: [779728.867942] drbd0: asender terminated
>> Jul 30 14:38:59 ldap-a kernel: [779728.867967] drbd0: Terminating asender thread
>> Jul 30 14:38:59 ldap-a kernel: [779728.870276] drbd0: sock was shut down by peer
>> Jul 30 14:38:59 ldap-a kernel: [779728.870305] drbd0: short read expecting header on sock: r=0
>> Jul 30 14:38:59 ldap-a kernel: [779728.870356] drbd0: tl_clear()
>> Jul 30 14:38:59 ldap-a kernel: [779728.870466] drbd0: Connection closed
>> Jul 30 14:38:59 ldap-a kernel: [779728.870491] drbd0: helper command: /sbin/drbdadm outdate-peer
>> Jul 30 14:39:01 ldap-a kernel: [779729.521649] drbd0: outdate-peer helper returned 4 (peer is outdated)
>> Jul 30 14:39:01 ldap-a kernel: [779729.521690] drbd0: pdsk( DUnknown -> Outdated )
>> Jul 30 14:39:01 ldap-a kernel: [779729.521755] drbd0: Writing meta data super block now.
>> Jul 30 14:39:01 ldap-a kernel: [779729.521849] drbd0: conn( BrokenPipe -> Unconnected )
>> Jul 30 14:39:01 ldap-a kernel: [779729.521878] drbd0: receiver terminated
>> Jul 30 14:39:01 ldap-a kernel: [779729.521903] drbd0: receiver (re)started
>> Jul 30 14:39:01 ldap-a kernel: [779729.521930] drbd0: conn( Unconnected -> WFConnection )
>> Jul 30 14:39:01 ldap-a kernel: [779729.558903] drbd0: Handshake 
>> successful: Agreed network protocol version 88
>> Jul 30 14:39:01 ldap-a kernel: [779729.563009] drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
>> Jul 30 14:39:01 ldap-a kernel: [779729.563047] drbd0: conn( WFConnection -> WFReportParams )
>> Jul 30 14:39:01 ldap-a kernel: [779729.563088] drbd0: Starting asender 
>> thread (from drbd0_receiver [23077])
>> Jul 30 14:39:01 ldap-a kernel: [779729.563158] drbd0: data-integrity-alg: crc32c
>> Jul 30 14:39:01 ldap-a kernel: [779729.563656] drbd0: peer( Unknown -> 
>> Secondary ) conn( WFReportParams -> WFBitMapS )
>> Jul 30 14:39:01 ldap-a kernel: [779729.563705] drbd0: Writing meta data super block now.
>> Jul 30 14:39:01 ldap-a kernel: [779729.566904] drbd0: conn( WFBitMapS -> 
>> SyncSource ) pdsk( Outdated -> Inconsistent )
>> Jul 30 14:39:01 ldap-a kernel: [779729.566904] drbd1: aftr_isp( 0 -> 1 )
>> Jul 30 14:39:01 ldap-a kernel: [779729.566904] drbd0: Began resync as 
>> SyncSource (will sync 1936 KB [484 bits set]).
>> Jul 30 14:39:01 ldap-a kernel: [779729.566904] drbd0: Writing meta data super block now.
>> Jul 30 14:39:01 ldap-a kernel: [779729.566904] drbd1: peer_isp( 0 -> 1 )
>> Jul 30 14:39:01 ldap-a kernel: [779729.578905] drbd0: Resync done (total 
>> 1 sec; paused 0 sec; 1936 K/sec)
>> Jul 30 14:39:01 ldap-a kernel: [779729.578905] drbd0: conn( SyncSource -> 
>> Connected ) pdsk( Inconsistent -> UpToDate )
>> Jul 30 14:39:01 ldap-a kernel: [779729.578905] drbd1: aftr_isp( 1 -> 0 )
>> Jul 30 14:39:01 ldap-a kernel: [779729.578905] drbd0: Writing meta data super block now.
>> Jul 30 14:39:01 ldap-a kernel: [779729.578905] drbd1: peer_isp( 1 -> 0 )
>> Jul 30 14:41:04 ldap-a kernel: [779796.779854] drbd0: meta connection shut down by peer.
>> (...)
>> ---------------------------------------------------------------------
>>
>> It seems data-integrity-alg does its job and notices out-of-sync errors, 
>> and proceeds to immediately resynchronize the two nodes. However I get 
>> LOTS of these messages (I'd say every two minutes on average). So I also 
>> disabled offloading on the network card (on both nodes) :
>> # ethtool -k eth1
>> Offload parameters for eth1:
>> rx-checksumming: on
>> tx-checksumming: on
>> scatter-gather: on
>> tcp segmentation offload: on
>> udp fragmentation offload: off
>> generic segmentation offload: off
>> # ethtool -K eth1 rx off tx off
>> # ethtool -k eth1
>> Offload parameters for eth1:
>> rx-checksumming: off
>> tx-checksumming: off
>> scatter-gather: off
>> tcp segmentation offload: off
>> udp fragmentation offload: off
>> generic segmentation offload: off
>>
>> But I still get these messages !
>>
>> I had updated the kernel to version : "linux-image-2.6.25-2-686-bigmem   
>> 2.6.25-6~bpo40+1" on 2008.07.09 and haven't _noticed_ any corruption 
>> since then, even though DRBD seems to find lots and lots of errors. I 
>> really get the impression these are false positives... but should I worry 
>> ? What could I do now ?
>>     
>
>
> did you memtest recently?
>
>
> as I wrote before in other threads,
> there are various ways to make these checksums fail.
>
> e.g. (not exactly technically correct, but to get the picture)
>  application writes some data, using normal writes to the file system.
>  file system receives that data, puts it into the page cache.
>  page cache decides (or is forced) to write out some pages.
>  pages is submitted to "disk" (in this case DRBD)
>  
>  drbd receives that data,
>    calculates a checksum over that data buffer,
>    [A]
>    submits it to local disk,
>    and queues it for sending over tcp.
>
>      local disk write completion comes in,
>      remote disk write completion comes in
>      in any order.
>
>    once both these completion events are there,
>    drbd completes the request to "upper layers",
>    that is page cache in this example.
>    [B]
>
>
> if your application uses direct io, leave out the page cache.
> if it happens to be a file system meta data block,
> leave out page cache and application.
>
>
> now, if the data buffer is _modified_ (by any component or bad ram)
> while being in flight (that is after being submitted and the checksum
> has been calculated at [A], but before DRBD has completed the write
> at [B]), then the checksum calculated on the originally submitted data
> would have a hard time matching the modified data that is being sent
> later.
>
> drbd does expect in-flight data buffers to not be modified until it
> completes the request.
>
> there may be applications/subsystems out there that modify in-flight
> data buffers "accidentally", risking data integrity.
> it also may be "legal" from the application/file system/swap subsystem
> point of view to "re-use" an in-flight data buffer even before the write
> has been completed. the "on disk" result would then be undefined on
> _any_ io stack (either it reached "disk" before or after this
> modification), meaning that can only be "legal" if the user actually
> does not care what is written there.
>
> with DRBD, the result is doubly undefined, because it may have reached
> local disk before or after that modification, and it may have been sent
> before or after the modification, which may result in differing data
> being written on both nodes.
>
> this may or may not be what causes your observations.
>
> there is not much that DRBD could do about this, besides doing a full
> memcpy of the data buffer to some "drbd private" memory, before even
> calculating any checksums or submitting/queueing for tcp.
>
>
>   

-- 
Hervé GAUTIER




More information about the drbd-user mailing list