Hi,

Based on the story below, why are so many bits set when the devices were perfectly in sync just moments before the local IO failure? I would really like to avoid these semi-full syncs. Please help.

The story goes like this:

The left-side server is Secondary and has Heartbeat running, ready to take over.

13:14:45 drbd0: Secondary/Secondary --> Secondary/Primary
13:28:34 drbd0: PARTNER DISKLESS
13:28:50 drbd0: PingAck did not arrive in time.
13:28:50 drbd0: drbd0_asender [24044]: cstate Connected --> NetworkFailure
13:28:50 drbd0: asender terminated
13:28:50 drbd0: drbd0_receiver [18435]: cstate NetworkFailure --> BrokenPipe
13:28:50 drbd0: short read expecting header on sock: r=-512
13:28:50 drbd0: worker terminated
13:28:50 drbd0: drbd0_receiver [18435]: cstate BrokenPipe --> Unconnected
13:28:50 drbd0: Connection lost.
13:28:50 drbd0: drbd0_receiver [18435]: cstate Unconnected --> WFConnection
13:29:00 drbd0: Secondary/Unknown --> Primary/Unknown
13:29:01 EXT3 FS on drbd0, internal journal
13:56:55 drbd0: drbd0_receiver [18435]: cstate WFConnection --> WFReportParams
13:56:55 drbd0: Handshake successful: DRBD Network Protocol version 74
13:56:55 drbd0: Connection established.
13:56:55 drbd0: I am(P): 1:00000003:00000001:00000040:0000000b:10
13:56:55 drbd0: Peer(S): 1:00000003:00000001:0000003f:0000000a:11
13:56:55 drbd0: drbd0_receiver [18435]: cstate WFReportParams --> WFBitMapS
13:57:36 drbd0: 1453882468 KB now marked out-of-sync by on disk bit-map.
13:57:37 drbd0: Primary/Unknown --> Primary/Secondary
13:57:37 drbd0: drbd0_receiver [18435]: cstate WFBitMapS --> SyncSource
13:57:37 drbd0: Resync started as SyncSource (need to sync 1453882468 KB [363470617 bits set]).

The right-side server is Primary and has Heartbeat running, plus the nfsd and smbd services. To see what would happen, I disconnected the SCSI subsystem; a mon monitor then kills Heartbeat and tries to reboot.

13:14:45 drbd0: Secondary/Secondary --> Primary/Secondary
13:14:46 EXT3 FS on drbd0, internal journal
13:28:34 drbd0: Local IO failed. Detaching...
13:28:35 drbd0: Notified peer that my disk is broken.
13:55:50 drbd0: resync bitmap: bits=363470617 words=11358458
13:55:50 drbd0: size = 1386 GB (1453882468 KB)
13:56:09 drbd0: 0 KB marked out-of-sync by on disk bit-map.
13:56:09 drbd0: Found 10 transactions (592 active extents) in activity log.
13:56:09 drbd0: Marked additional 260 MB as out-of-sync based on AL.
13:56:09 drbd0: drbdsetup [2957]: cstate Unconfigured --> StandAlone
13:56:55 drbd0: drbdsetup [3010]: cstate StandAlone --> Unconnected
13:56:55 drbd0: drbd0_receiver [3011]: cstate Unconnected --> WFConnection
13:56:55 drbd0: drbd0_receiver [3011]: cstate WFConnection --> WFReportParams
13:56:55 drbd0: Handshake successful: DRBD Network Protocol version 74
13:56:55 drbd0: Connection established.
13:56:55 drbd0: I am(S): 1:00000003:00000001:0000003f:0000000a:11
13:56:55 drbd0: Peer(P): 1:00000003:00000001:00000040:0000000b:10
13:56:55 drbd0: drbd0_receiver [3011]: cstate WFReportParams --> WFBitMapT
13:56:55 drbd0: Secondary/Unknown --> Secondary/Primary
13:57:37 drbd0: drbd0_receiver [3011]: cstate WFBitMapT --> SyncTarget
13:57:37 drbd0: Resync started as SyncTarget (need to sync 1453882468 KB [363470617 bits set]).

DRBD configuration (/etc/drbd.conf):

resource drbd0 {
  protocol C;
  incon-degr-cmd "logger '!DRBD! pri on incon-degr'";
  on left {
    device /dev/drbd0;
    disk /dev/sdc1;
    address 192.168.0.3:7788;
    meta-disk /dev/sdc2[0];
  }
  on right {
    device /dev/drbd0;
    disk /dev/sdc1;
    address 192.168.0.4:7788;
    meta-disk /dev/sdc2[0];
  }
  disk { on-io-error detach; }
  syncer { rate 99M; al-extents 521; }
  startup { degr-wfc-timeout 300; }
}

resource drbd1 {
  protocol C;
  incon-degr-cmd "logger '!DRBD! pri on incon-degr'";
  on left {
    device /dev/drbd1;
    disk /dev/sdd1;
    address 192.168.0.3:7888;
    meta-disk /dev/sdd2[0];
  }
  on right {
    device /dev/drbd1;
    disk /dev/sdd1;
    address 192.168.0.4:7888;
    meta-disk /dev/sdd2[0];
  }
  disk { on-io-error detach; }
  syncer { rate 99M; al-extents 521; }
  startup { degr-wfc-timeout 300; }
}

resource drbd2 {
  protocol C;
  incon-degr-cmd "logger '!DRBD! pri on incon-degr'";
  on left {
    device /dev/drbd2;
    disk /dev/sde1;
    address 192.168.0.3:7988;
    meta-disk /dev/sde2[0];
  }
  on right {
    device /dev/drbd2;
    disk /dev/sde1;
    address 192.168.0.4:7988;
    meta-disk /dev/sde2[0];
  }
  disk { on-io-error detach; }
  syncer { rate 99M; al-extents 521; }
  startup { degr-wfc-timeout 300; }
}

Many thanks,
Leroy

PS: This isn't the full /var/log/messages; it doesn't include the two other DRBD devices [1,2], which synced quickly.

PS2: I'm asking the list because this has happened twice now.
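
For reference, the numbers in the logs can be cross-checked with a small script. This is only a minimal sketch, not part of DRBD: the function name `sync_summary` and the two regular expressions are my own assumptions, written against the exact log wording quoted above. It compares the "bits set" figure from the "Resync started" line with the total bitmap size from the "resync bitmap" line. With the values from this post, every bit in the bitmap is set and each bit works out to 4 KB, so what I called a semi-full sync is in fact a complete one.

```python
import re

# Patterns match the log wording quoted in this post; other DRBD
# versions may phrase these lines differently (assumption).
BITMAP_RE = re.compile(r"resync bitmap: bits=(\d+)")
SYNC_RE = re.compile(r"need to sync (\d+) KB \[(\d+) bits set\]")


def sync_summary(log_text):
    """Summarise how much of the device a resync covers."""
    total_bits = int(BITMAP_RE.search(log_text).group(1))
    kb_to_sync, bits_set = map(int, SYNC_RE.search(log_text).groups())
    return {
        # Granularity derived from the log itself, not hard-coded.
        "kb_per_bit": kb_to_sync / bits_set,
        "fraction_out_of_sync": bits_set / total_bits,
        "full_sync": bits_set == total_bits,
    }


# Two lines copied from the right-side log above.
log = """\
13:55:50 drbd0: resync bitmap: bits=363470617 words=11358458
13:57:37 drbd0: Resync started as SyncTarget (need to sync 1453882468 KB [363470617 bits set]).
"""

if __name__ == "__main__":
    print(sync_summary(log))
```

Note that the right side reported "0 KB marked out-of-sync by on disk bit-map" plus 260 MB from the activity log before connecting, so the full bitmap only appears after the handshake.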