[DRBD-user] DRBD not syncing with new secondary

Christian Koschmieder ck at peira-kollektiv.de
Mon Aug 25 16:33:45 CEST 2014



Hello Roland,

Sorry, I didn't attach it because it does not seem to have any relevant 
information in it. But of course, here it is:

Aug 24 21:59:14 www1 kernel: [37868732.971832] block drbd1: conn( StandAlone -> Unconnected ).
Aug 24 21:59:14 www1 kernel: [37868732.971887] block drbd1: Starting receiver thread (from drbd1_worker [1733])
Aug 24 21:59:14 www1 kernel: [37868732.972202] block drbd1: receiver (re)started
Aug 24 21:59:14 www1 kernel: [37868732.972222] block drbd1: conn( Unconnected -> WFConnection ).
Aug 24 21:59:15 www1 kernel: [37868733.471248] block drbd1: Handshake successful: Agreed network protocol version 91
Aug 24 21:59:15 www1 kernel: [37868733.471284] block drbd1: conn( WFConnection -> WFReportParams ).
Aug 24 21:59:15 www1 kernel: [37868733.471344] block drbd1: Starting asender thread (from drbd1_receiver [22671])
Aug 24 21:59:15 www1 kernel: [37868733.471571] block drbd1: data-integrity-alg: <not-used>
Aug 24 21:59:15 www1 kernel: [37868733.471623] block drbd1: conn( WFReportParams -> Disconnecting ).
Aug 24 21:59:15 www1 kernel: [37868733.471680] block drbd1: asender terminated
Aug 24 21:59:15 www1 kernel: [37868733.471699] block drbd1: Terminating drbd1_asender
Aug 24 21:59:15 www1 kernel: [37868733.471901] block drbd1: Connection closed
Aug 24 21:59:15 www1 kernel: [37868733.471926] block drbd1: conn( Disconnecting -> StandAlone ).
Aug 24 21:59:15 www1 kernel: [37868733.471967] block drbd1: receiver terminated
Aug 24 21:59:15 www1 kernel: [37868733.471982] block drbd1: Terminating drbd1_receiver
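
In case it helps to compare the two nodes side by side, here is a minimal sketch that pulls the conn( A -> B ) state transitions out of a log excerpt like the one above. The function name conn_transitions and the regular expression are my own assumptions, based only on the message format visible in these logs; this is not a DRBD tool.

```python
import re

# Assumed message shape, taken from the excerpts in this thread:
#   "block drbd1: conn( WFConnection -> WFReportParams )."
# Some lines carry a "peer( X -> Y )" change before conn(), so allow that.
CONN_RE = re.compile(
    r"block (drbd\d+): (?:peer\( \w+ -> \w+ \) )?conn\( (\w+) -> (\w+) \)"
)

def conn_transitions(log_text):
    """Return (device, old_state, new_state) tuples in log order."""
    # Mail clients may wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [m.groups() for m in CONN_RE.finditer(flat)]

sample = """
Aug 24 21:59:15 www1 kernel: [37868733.471284] block drbd1: conn(
WFConnection -> WFReportParams ).
Aug 24 21:59:15 www1 kernel: [37868733.471571] block drbd1:
data-integrity-alg: <not-used>
Aug 24 21:59:15 www1 kernel: [37868733.471623] block drbd1: conn(
WFReportParams -> Disconnecting ).
"""

transitions = conn_transitions(sample)
for dev, old, new in transitions:
    print(dev, old, "->", new)
```

Run against both logs, this makes it easy to see that the primary goes from WFReportParams straight to Disconnecting, while the secondary reports NetworkFailure at the same moment.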


Kind regards,

Koschi

On 25.08.2014 at 14:23, Roland Friedwagner wrote:
> Hi,
>
> can you provide the log (from the same connection attempt) from
> the other node (primary) also?
>
> regards roland
>
> On Sunday, 24 August 2014 22:09:44, Christian Koschmieder wrote:
>> I have two servers to host a website.
>> Only one is actively used at a time, the other one acts as hot standby.
>> All data is replicated via DRBD from the currently active server
>> (primary) to the backup server (secondary).
>>
>> I recently had to set up a new secondary, because the original one had
>> hardware problems.
>> So I followed the instructions in the documentation
>> (http://www.drbd.org/users-guide-8.3/s-node-failure.html#s-perm-node-failure).
>>
>> The status of the primary node:
>> version: 8.3.7 (api:88/proto:86-91)
>> srcversion: EE47D8BF18AC166BE219757
>>
>>    1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r----
>>       ns:0 nr:0 dw:202926340 dr:247194962 al:2452 bm:757 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:215272
>>
>> The status of the secondary node:
>> version: 8.3.11 (api:88/proto:86-96)
>> srcversion: F937DCB2E5D83C6CCE4A6C9
>>
>>    1: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r-----
>>       ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:242699884
>>
>> This seems to be all right.
>> But when issuing a connect on the primary it immediately disconnects again.
>> The log on the secondary has the following entries:
>>
>> Aug 24 21:59:15 www2 kernel: [ 3780.076072] block drbd1: Handshake successful: Agreed network protocol version 91
>> Aug 24 21:59:15 www2 kernel: [ 3780.076122] block drbd1: conn( WFConnection -> WFReportParams )
>> Aug 24 21:59:15 www2 kernel: [ 3780.076180] block drbd1: Starting asender thread (from drbd1_receiver [2502])
>> Aug 24 21:59:15 www2 kernel: [ 3780.077178] block drbd1: data-integrity-alg: <not-used>
>> Aug 24 21:59:15 www2 kernel: [ 3780.077235] block drbd1: drbd_sync_handshake:
>> Aug 24 21:59:15 www2 kernel: [ 3780.077272] block drbd1: self 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:60674971 flags:0
>> Aug 24 21:59:15 www2 kernel: [ 3780.077328] block drbd1: peer E2227E948E7B07CD:4445769C1EF0ADCC:B744D0729CC042CC:5AD0061929ED5B9D bits:53813 flags:0
>> Aug 24 21:59:15 www2 kernel: [ 3780.077343] block drbd1: conn( WFReportParams -> NetworkFailure )
>> Aug 24 21:59:15 www2 kernel: [ 3780.077349] block drbd1: asender terminated
>> Aug 24 21:59:15 www2 kernel: [ 3780.077351] block drbd1: Terminating drbd1_asender
>> Aug 24 21:59:15 www2 kernel: [ 3780.077509] block drbd1: uuid_compare()=-2 by rule 20
>> Aug 24 21:59:15 www2 kernel: [ 3780.077549] block drbd1: Becoming sync target due to disk states.
>> Aug 24 21:59:15 www2 kernel: [ 3780.077586] block drbd1: Writing the whole bitmap, full sync required after drbd_sync_handshake.
>> Aug 24 21:59:15 www2 kernel: [ 3780.162981] block drbd1: bitmap WRITE of 1852 pages took 10 jiffies
>> Aug 24 21:59:15 www2 kernel: [ 3780.224437] block drbd1: 231 GB (60674971 bits) marked out-of-sync by on disk bit-map.
>> Aug 24 21:59:15 www2 kernel: [ 3780.232894] block drbd1: drbd_sync_handshake:
>> Aug 24 21:59:15 www2 kernel: [ 3780.232932] block drbd1: self 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:60674971 flags:0
>> Aug 24 21:59:15 www2 kernel: [ 3780.232975] block drbd1: peer E2227E948E7B07CD:4445769C1EF0ADCC:B744D0729CC042CC:5AD0061929ED5B9D bits:53813 flags:0
>> Aug 24 21:59:15 www2 kernel: [ 3780.233017] block drbd1: uuid_compare()=-2 by rule 20
>> Aug 24 21:59:15 www2 kernel: [ 3780.233053] block drbd1: Becoming sync target due to disk states.
>> Aug 24 21:59:15 www2 kernel: [ 3780.233091] block drbd1: Writing the whole bitmap, full sync required after drbd_sync_handshake.
>> Aug 24 21:59:15 www2 kernel: [ 3780.287424] block drbd1: bitmap WRITE of 1852 pages took 10 jiffies
>> Aug 24 21:59:15 www2 kernel: [ 3780.348835] block drbd1: 231 GB (60674971 bits) marked out-of-sync by on disk bit-map.
>> Aug 24 21:59:15 www2 kernel: [ 3780.357295] block drbd1: peer( Unknown -> Primary ) conn( NetworkFailure -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
>> Aug 24 21:59:15 www2 kernel: [ 3780.365646] block drbd1: Connection closed
>> Aug 24 21:59:15 www2 kernel: [ 3780.365688] block drbd1: peer( Primary -> Unknown ) conn( WFBitMapT -> Unconnected ) pdsk( UpToDate -> DUnknown )
>> Aug 24 21:59:15 www2 kernel: [ 3780.365731] block drbd1: receiver terminated
>> Aug 24 21:59:15 www2 kernel: [ 3780.365771] block drbd1: Restarting drbd1_receiver
>> Aug 24 21:59:15 www2 kernel: [ 3780.365808] block drbd1: receiver (re)started
>> Aug 24 21:59:15 www2 kernel: [ 3780.365871] block drbd1: conn( Unconnected -> WFConnection )
>> Aug 24 21:59:15 www2 kernel: [ 3780.373914] block drbd1: bitmap WRITE of 0 pages took 0 jiffies
>> Aug 24 21:59:15 www2 kernel: [ 3780.374072] block drbd1: 231 GB (60674971 bits) marked out-of-sync by on disk bit-map.
>>
>>
>> As far as I understand it, they establish a connection, agree on a
>> protocol, notice that the secondary needs a full sync, and then just
>> drop the connection for no apparent reason.
>>
>> Can you tell me why this might be, or where I can find further
>> information about why the connection is being dropped?
>>
>>
>> Thanks a lot
>>
>> Koschi
>> _______________________________________________
>> drbd-user mailing list
>> drbd-user at lists.linbit.com
>> http://lists.linbit.com/mailman/listinfo/drbd-user
>>