On Sun, 07 Aug 2011 22:23:34 +0000, Alessandro Bono wrote:
> Hi all,
>
> I'm having a problem connecting two DRBD machines. This is an old cluster
> migrated from drbd8.0/heartbeat2 to drbd8.3/corosync some months ago.
> This combination has worked until now, but after a primary/secondary swap
> it is no longer possible to reconnect.
>
> Any ideas?
Switching the Xen VM to KVM solved the problem.
Mixing two different types of VM is not a good idea.
>
> ga2-srv is a 64-bit KVM VM, ga1-srv is a 64-bit Xen VM; kernel 2.6.32-33,
> Ubuntu Lucid, DRBD 8.3.11 from git.
>
> ga2-srv machine
>
> [ 118.275923] drbd: initialized. Version: 8.3.11 (api:88/proto:86-96)
> [ 118.275927] drbd: GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 debian/changelog debian/control build by root at nebbiolo-dev, 2011-07-21 12:29:37
> [ 118.275930] drbd: registered as block device major 147
> [ 118.275932] drbd: minor_table @ 0xffff880119989a00
> [ 118.512597] block drbd0: Starting worker thread (from cqueue [1343])
> [ 118.516728] block drbd0: disk( Diskless -> Attaching )
> [ 118.517163] block drbd0: ASSERT( from_tnr - cnr + i - from == mx+1 ) in /usr/src/modules/drbd/drbd/drbd_actlog.c:514
> [ 118.541669] block drbd0: ASSERT( from_tnr - cnr + i - from == mx+1 ) in /usr/src/modules/drbd/drbd/drbd_actlog.c:514
> [ 118.556393] block drbd0: Found 3 transactions (72 active extents) in activity log.
> [ 118.556417] block drbd0: Method to ensure write ordering: barrier
> [ 118.556425] block drbd0: max BIO size = 4294966784
> [ 118.556445] block drbd0: drbd_bm_resize called with capacity == 3221127096
> [ 118.660476] block drbd0: resync bitmap: bits=402640887 words=6291264 pages=12288
> [ 118.660483] block drbd0: size = 1536 GB (1610563548 KB)
> [ 118.705459] block drbd0: bitmap READ of 12288 pages took 4 jiffies
> [ 118.776528] block drbd0: recounting of set bits took additional 7 jiffies
> [ 118.776544] block drbd0: 78 MB (20094 bits) marked out-of-sync by on disk bit-map.
> [ 118.776567] block drbd0: disk( Attaching -> UpToDate ) pdsk( DUnknown -> Outdated )
> [ 118.776571] block drbd0: attached to UUIDs B567B75768F63E0C:1F2B1CCF41A2DA74:1F2A1CCF41A2DA74:1F291CCF41A2DA74
> [ 118.874708] block drbd0: conn( StandAlone -> Unconnected )
> [ 118.876675] block drbd0: Starting receiver thread (from drbd0_worker [1373])
> [ 118.886220] block drbd0: receiver (re)started
> [ 118.886228] block drbd0: conn( Unconnected -> WFConnection )
> [ 119.520504] block drbd0: role( Secondary -> Primary )
> [ 119.994679] XFS mounting filesystem drbd0
> [ 120.038783] Ending clean XFS mount for filesystem: drbd0
> [ 2615.120092] block drbd0: Handshake successful: Agreed network protocol version 96
> [ 2615.120105] block drbd0: conn( WFConnection -> WFReportParams )
> [ 2615.120139] block drbd0: Starting asender thread (from drbd0_receiver [1394])
> [ 2615.120729] block drbd0: data-integrity-alg: sha1
> [ 2615.121283] block drbd0: drbd_sync_handshake:
> [ 2615.121287] block drbd0: self B567B75768F63E0D:1F2B1CCF41A2DA74:1F2A1CCF41A2DA74:1F291CCF41A2DA74 bits:20525 flags:0
> [ 2615.121296] block drbd0: peer 1F2B1CCF41A2DA74:0000000000000000:EEC3A25A1800DCBC:EEC2A25A1800DCBC bits:20094 flags:0
> [ 2615.121326] block drbd0: uuid_compare()=1 by rule 70
> [ 2615.121328] block drbd0: Becoming sync source due to disk states.
> [ 2615.121334] block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( Outdated -> Inconsistent )
> [ 2616.515283] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
> [ 2616.522098] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
> [ 2616.522107] block drbd0: conn( WFBitMapS -> SyncSource )
> [ 2616.522113] block drbd0: Began resync as SyncSource (will sync 82100 KB [20525 bits set]).
> [ 2616.522128] block drbd0: updated sync UUID B567B75768F63E0D:1F2C1CCF41A2DA74:1F2B1CCF41A2DA74:1F2A1CCF41A2DA74
> [ 2616.629374] block drbd0: /usr/src/modules/drbd/drbd/drbd_receiver.c:2204: sector: 1023242240s, size: 262144
> [ 2616.655139] block drbd0: error receiving CsumRSRequest, l: 44!
> [ 2616.662718] block drbd0: peer( Secondary -> Unknown ) conn( SyncSource -> ProtocolError )
> [ 2616.699875] block drbd0: asender terminated
> [ 2616.699892] block drbd0: Terminating asender thread
> [ 2616.701435] block drbd0: bitmap WRITE of 12284 pages took 4 jiffies
> [ 2616.701440] block drbd0: 80 MB (20514 bits) marked out-of-sync by on disk bit-map.
> [ 2616.701461] block drbd0: Connection closed
> [ 2616.701470] block drbd0: conn( ProtocolError -> Unconnected )
> [ 2616.701481] block drbd0: receiver terminated
> [ 2616.701483] block drbd0: Restarting receiver thread
> [ 2616.701486] block drbd0: receiver (re)started
> [ 2616.701490] block drbd0: conn( Unconnected -> WFConnection )
>
> and so on
>
>
> on ga1-srv machine
>
> [ 11.390918] drbd: initialized. Version: 8.3.11 (api:88/proto:86-96)
> [ 11.390923] drbd: GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 debian/changelog debian/control build by root at nebbiolo-dev, 2011-07-21 12:29:37
> [ 11.390926] drbd: registered as block device major 147
> [ 11.390929] drbd: minor_table @ 0xffff880001e75b00
> [ 11.642435] block drbd0: Starting worker thread (from cqueue [988])
> [ 11.644385] block drbd0: disk( Diskless -> Attaching )
> [ 11.652930] block drbd0: Found 6 transactions (324 active extents) in activity log.
> [ 11.652935] block drbd0: Method to ensure write ordering: barrier
> [ 11.652939] block drbd0: max BIO size = 4096
> [ 11.652947] block drbd0: drbd_bm_resize called with capacity == 3221127096
> [ 11.675027] block drbd0: resync bitmap: bits=402640887 words=6291264 pages=12288
> [ 11.675035] block drbd0: size = 1536 GB (1610563548 KB)
> [ 12.067368] block drbd0: bitmap READ of 12288 pages took 39 jiffies
> [ 12.116378] block drbd0: recounting of set bits took additional 5 jiffies
> [ 12.116386] block drbd0: 39 MB (9988 bits) marked out-of-sync by on disk bit-map.
> [ 12.116397] block drbd0: disk( Attaching -> Inconsistent )
> [ 12.116402] block drbd0: attached to UUIDs 1EE21CCF41A2DA74:0000000000000000:EEC3A25A1800DCBC:EEC2A25A1800DCBC
> [ 12.152591] block drbd0: conn( StandAlone -> Unconnected )
> [ 12.155071] block drbd0: Starting receiver thread (from drbd0_worker [1025])
> [ 12.157579] block drbd0: receiver (re)started
> [ 12.157587] block drbd0: conn( Unconnected -> WFConnection )
> [ 15.161152] block drbd0: Handshake successful: Agreed network protocol version 96
> [ 15.162211] block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
> [ 15.162223] block drbd0: conn( WFConnection -> WFReportParams )
> [ 15.162251] block drbd0: Starting asender thread (from drbd0_receiver [1071])
> [ 15.163958] block drbd0: data-integrity-alg: sha1
> [ 15.163973] block drbd0: max BIO size = 4294966784
> [ 15.163986] block drbd0: drbd_sync_handshake:
> [ 15.163991] block drbd0: self 1EE21CCF41A2DA74:0000000000000000:EEC3A25A1800DCBC:EEC2A25A1800DCBC bits:9988 flags:0
> [ 15.163995] block drbd0: peer B567B75768F63E0D:1EE21CCF41A2DA74:1EE11CCF41A2DA74:1EE01CCF41A2DA74 bits:10588 flags:0
> [ 15.164000] block drbd0: uuid_compare()=-1 by rule 50
> [ 15.164003] block drbd0: Becoming sync target due to disk states.
> [ 15.164011] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
> [ 17.826566] block drbd0: conn( WFBitMapT -> WFSyncUUID )
> [ 17.898960] block drbd0: updated sync uuid 1EE31CCF41A2DA74:0000000000000000:EEC3A25A1800DCBC:EEC2A25A1800DCBC
> [ 17.899256] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
> [ 17.903674] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
> [ 17.903683] block drbd0: conn( WFSyncUUID -> SyncTarget )
> [ 17.903691] block drbd0: Began resync as SyncTarget (will sync 42356 KB [10589 bits set]).
> [ 18.737711] block drbd0: sock was shut down by peer
> [ 18.737723] block drbd0: peer( Primary -> Unknown ) conn( SyncTarget -> BrokenPipe ) pdsk( UpToDate -> DUnknown )
> [ 18.737735] block drbd0: short read expecting header on sock: r=0
> [ 18.737986] block drbd0: asender terminated
> [ 18.737991] block drbd0: Terminating asender thread
> [ 19.456024] block drbd0: bitmap WRITE of 12285 pages took 72 jiffies
> [ 19.456030] block drbd0: 41 MB (10586 bits) marked out-of-sync by on disk bit-map.
> [ 19.456047] block drbd0: Connection closed
> [ 19.456056] block drbd0: conn( BrokenPipe -> Unconnected )
> [ 19.456064] block drbd0: receiver terminated
> [ 19.456067] block drbd0: Restarting receiver thread
> [ 19.456071] block drbd0: receiver (re)started
> [ 19.456077] block drbd0: conn( Unconnected -> WFConnection )
>
>
> configuration
>
> resource r0 {
>   syncer {
>     rate 25M;
>     csums-alg sha1;
>     verify-alg sha1;
>   }
>
>   net {
>     data-integrity-alg sha1;
>     cram-hmac-alg "sha1";
>     shared-secret "xxxxxxxxxx";
>   }
>
>   disk {
>     no-disk-flushes;
>     no-md-flushes;
>   }
>
>   on ga1-srv {
>     device /dev/drbd0;
>     disk /dev/xvda4;
>     address 10.12.24.206:7788;
>     meta-disk internal;
>   }
>
>   on ga2-srv {
>     device /dev/drbd0;
>     disk /dev/vdd;
>     address 10.12.24.207:7788;
>     meta-disk internal;
>   }
> }
--
Cordiali saluti
Alessandro Bono
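
For anyone comparing the two kernel logs quoted above, the connection-state changes ( conn( A -> B ) ) are easier to follow once they are pulled out of the log text. The following is only a rough sketch, not part of the original report: it assumes the log is pasted as mail-quoted text that may be wrapped mid-entry, and the helper names split_entries and conn_transitions are made up for illustration.

```python
import re

# Kernel timestamp such as "[ 2616.522107]"; bracketed PIDs like "[1343]"
# have no decimal point and are therefore not matched.
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]")
# DRBD connection-state change such as "conn( WFBitMapS -> SyncSource )".
CONN_CHANGE = re.compile(r"conn\(\s*(\w+)\s*->\s*(\w+)\s*\)")


def split_entries(raw):
    """Return (timestamp, message) pairs from wrapped, '>'-quoted dmesg text."""
    # Drop the mail quote marker at the start of each line.
    unquoted = re.sub(r"^> ?", "", raw, flags=re.MULTILINE)
    # Undo the line wrapping by collapsing all whitespace to single spaces.
    flat = " ".join(unquoted.split())
    # re.split with one capture group yields [prefix, ts1, msg1, ts2, msg2, ...].
    parts = TIMESTAMP.split(flat)
    return [(float(ts), msg.strip()) for ts, msg in zip(parts[1::2], parts[2::2])]


def conn_transitions(raw):
    """Return (timestamp, old_state, new_state) for every conn( A -> B ) change."""
    result = []
    for ts, msg in split_entries(raw):
        for old, new in CONN_CHANGE.findall(msg):
            result.append((ts, old, new))
    return result


if __name__ == "__main__":
    import sys

    # Usage: python conn_states.py < quoted_log.txt
    for ts, old, new in conn_transitions(sys.stdin.read()):
        print(f"{ts:12.6f}  {old} -> {new}")
```

Run against each node's log separately, this lists where one side goes to ProtocolError and the other to BrokenPipe, which makes the two timelines easier to line up.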