[DRBD-user] meta data flush failed with status -95, disabling md-flushes

Ravi Kanth raveeknth at gmail.com
Thu Sep 23 16:59:45 CEST 2010



Hi,

I have three machines, A, B and C, with very similar configurations. A and B
form my two-node setup. In the past I connected C to B (by replacing B's
drbd.conf file) to copy the state of the disks onto C. When I now make changes
on A-B, then disconnect B and connect it to node C, those changes are not
seen by C. Both machines show UpToDate/UpToDate as their status.

##### Conf files #####

A - B conf file


global {

#minor-count 10;

usage-count no;

}

common { syncer { rate 1G; } }

resource r7 {

protocol C;

handlers {

pri-on-incon-degr "echo '!DRBD! pri on incon-degr' | wall;
/etc/init.d/heartbeat stop ";

outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";

}

on ninja1 {

device /dev/drbd7;

disk /dev/loop0;

address 10.0.2.150:7788;

meta-disk internal;

}

on ninja2 {

device /dev/drbd7;

disk /dev/loop0;

address 10.0.2.151:7788;

meta-disk internal;

}

 net {

allow-two-primaries;

sndbuf-size 512k;

timeout 60;

connect-int 10;

ping-int 10;

ping-timeout 5;

#max-buffers 256;

#on-disconnect reconnect;

ko-count 0;

#max-epoch-size 128;

max-epoch-size 8096;

}

}

##########

B - C conf file





global {

#minor-count 10;

usage-count no;

}

common { syncer { rate 1G; } }

resource r7 {

protocol C;

handlers {

pri-on-incon-degr "echo '!DRBD! pri on incon-degr' | wall;
/etc/init.d/heartbeat stop ";

outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";

}

on ninja2 {

device /dev/drbd7;

disk /dev/loop0;

address 10.0.2.151:7788;

meta-disk internal;

}

on ninja3 {

device /dev/drbd7;

disk /dev/loop0;

address 10.0.2.152:7788;

meta-disk internal;

}

 net {

allow-two-primaries;

sndbuf-size 512k;

timeout 60;

connect-int 10;

ping-int 10;

ping-timeout 5;

#max-buffers 256;

#on-disconnect reconnect;

ko-count 0;

#max-epoch-size 128;

max-epoch-size 8096;

}

}

##############

When I make some changes on A-B and then connect B-C, those changes are not
shown on C.

Here are the dmesg logs from the three machines:

###### A ########

Sep 21 15:04:03 ninja1 kernel: block drbd7: Starting worker thread (from
cqueue/0 [176])

Sep 21 15:04:03 ninja1 kernel: block drbd7: disk( Diskless -> Attaching )

Sep 21 15:04:03 ninja1 kernel: block drbd7: Found 4 transactions (192 active
extents) in activity log.

Sep 21 15:04:03 ninja1 kernel: block drbd7: Method to ensure write ordering:
barrier

Sep 21 15:04:03 ninja1 kernel: block drbd7: max_segment_size ( = BIO size )
= 32768

Sep 21 15:04:03 ninja1 kernel: block drbd7: drbd_bm_resize called with
capacity == 102396800

Sep 21 15:04:03 ninja1 kernel: block drbd7: resync bitmap: bits=12799600
words=199994

Sep 21 15:04:03 ninja1 kernel: block drbd7: size = 49 GB (51198400 KB)

Sep 21 15:04:03 ninja1 kernel: block drbd7: recounting of set bits took
additional 1 jiffies

Sep 21 15:04:03 ninja1 kernel: block drbd7: 0 KB (0 bits) marked out-of-sync
by on disk bit-map.

Sep 21 15:04:03 ninja1 kernel: block drbd7: disk( Attaching -> UpToDate )

Sep 21 15:04:03 ninja1 kernel: block drbd7: conn( StandAlone -> Unconnected
)

Sep 21 15:04:03 ninja1 kernel: block drbd7: Starting receiver thread (from
drbd7_worker [17300])

Sep 21 15:04:03 ninja1 kernel: block drbd7: receiver (re)started

Sep 21 15:04:03 ninja1 kernel: block drbd7: conn( Unconnected ->
WFConnection )

Sep 21 15:04:04 ninja1 kernel: block drbd7: Handshake successful: Agreed
network protocol version 94

Sep 21 15:04:04 ninja1 kernel: block drbd7: conn( WFConnection ->
WFReportParams )

Sep 21 15:04:04 ninja1 kernel: block drbd7: Starting asender thread (from
drbd7_receiver [17308])

Sep 21 15:04:04 ninja1 kernel: block drbd7: data-integrity-alg: <not-used>

Sep 21 15:04:04 ninja1 kernel: block drbd7: drbd_sync_handshake:

Sep 21 15:04:04 ninja1 kernel: block drbd7: self
14DBCB9EF0296C64:0000000000000000:A74BC6FF259D3C18:FA8BE08ADA36FF1F bits:0
flags:0

Sep 21 15:04:04 ninja1 kernel: block drbd7: peer
14DBCB9EF0296C64:0000000000000000:5859AAC8839A86AD:A74BC6FF259D3C18 bits:0
flags:0

Sep 21 15:04:04 ninja1 kernel: block drbd7: uuid_compare()=0 by rule 40

Sep 21 15:04:04 ninja1 kernel: block drbd7: peer( Unknown -> Secondary )
conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )

Sep 21 15:04:31 ninja1 kernel: block drbd7: role( Secondary -> Primary )

Sep 21 15:04:36 ninja1 ntpd[3219]: synchronized to 208.75.88.4, stratum 2

Sep 21 15:04:41 ninja1 kernel: block drbd7: role( Primary -> Secondary )

Sep 21 15:04:41 ninja1 kernel: block drbd7: meta data flush failed with
status -95, disabling md-flushes

  Sep 21 15:05:58 ninja1 kernel: block drbd7: peer( Secondary -> Unknown )
conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )

Sep 21 15:05:58 ninja1 kernel: block drbd7: short read expecting header on
sock: r=-512

Sep 21 15:05:58 ninja1 kernel: block drbd7: asender terminated

Sep 21 15:05:58 ninja1 kernel: block drbd7: Terminating asender thread

Sep 21 15:05:58 ninja1 kernel: block drbd7: Connection closed

Sep 21 15:05:58 ninja1 kernel: block drbd7: conn( Disconnecting ->
StandAlone )

Sep 21 15:05:58 ninja1 kernel: block drbd7: receiver terminated

Sep 21 15:05:58 ninja1 kernel: block drbd7: Terminating receiver thread

Sep 21 15:05:58 ninja1 kernel: block drbd7: disk( UpToDate -> Diskless )

Sep 21 15:05:58 ninja1 kernel: block drbd7: drbd_bm_resize called with
capacity == 0

Sep 21 15:05:58 ninja1 kernel: block drbd7: worker terminated

Sep 21 15:05:58 ninja1 kernel: block drbd7: Terminating worker thread

########## B ############

Sep 21 15:02:40 ninja2 kernel: block drbd7: Starting worker thread (from
cqueue/2 [178])

Sep 21 15:02:40 ninja2 kernel: block drbd7: disk( Diskless -> Attaching )

Sep 21 15:02:40 ninja2 kernel: block drbd7: Found 4 transactions (192 active
extents) in activity log.

Sep 21 15:02:40 ninja2 kernel: block drbd7: Method to ensure write ordering:
barrier

Sep 21 15:02:40 ninja2 kernel: block drbd7: max_segment_size ( = BIO size )
= 32768

Sep 21 15:02:40 ninja2 kernel: block drbd7: drbd_bm_resize called with
capacity == 102396800

Sep 21 15:02:40 ninja2 kernel: block drbd7: resync bitmap: bits=12799600
words=199994

Sep 21 15:02:40 ninja2 kernel: block drbd7: size = 49 GB (51198400 KB)

Sep 21 15:02:40 ninja2 kernel: block drbd7: recounting of set bits took
additional 1 jiffies

Sep 21 15:02:40 ninja2 kernel: block drbd7: 0 KB (0 bits) marked out-of-sync
by on disk bit-map.

Sep 21 15:02:40 ninja2 kernel: block drbd7: disk( Attaching -> UpToDate )

Sep 21 15:02:40 ninja2 kernel: block drbd7: conn( StandAlone -> Unconnected
)

Sep 21 15:02:40 ninja2 kernel: block drbd7: Starting receiver thread (from
drbd7_worker [14200])

Sep 21 15:02:40 ninja2 kernel: block drbd7: receiver (re)started

Sep 21 15:02:40 ninja2 kernel: block drbd7: conn( Unconnected ->
WFConnection )

Sep 21 15:02:40 ninja2 kernel: block drbd7: Handshake successful: Agreed
network protocol version 94

Sep 21 15:02:40 ninja2 kernel: block drbd7: conn( WFConnection ->
WFReportParams )

Sep 21 15:02:40 ninja2 kernel: block drbd7: Starting asender thread (from
drbd7_receiver [14208])

Sep 21 15:02:40 ninja2 kernel: block drbd7: data-integrity-alg: <not-used>

Sep 21 15:02:40 ninja2 kernel: block drbd7: drbd_sync_handshake:

Sep 21 15:02:40 ninja2 kernel: block drbd7: self
14DBCB9EF0296C64:0000000000000000:5859AAC8839A86AD:A74BC6FF259D3C18 bits:0
flags:0

Sep 21 15:02:40 ninja2 kernel: block drbd7: peer
14DBCB9EF0296C64:0000000000000000:A74BC6FF259D3C18:FA8BE08ADA36FF1F bits:0
flags:0

Sep 21 15:02:40 ninja2 kernel: block drbd7: uuid_compare()=0 by rule 40

Sep 21 15:02:40 ninja2 kernel: block drbd7: peer( Unknown -> Secondary )
conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )

Sep 21 15:03:07 ninja2 kernel: block drbd7: peer( Secondary -> Primary )

Sep 21 15:03:17 ninja2 kernel: block drbd7: peer( Primary -> Secondary )

Sep 21 15:04:34 ninja2 kernel: block drbd7: peer( Secondary -> Unknown )
conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )

Sep 21 15:04:34 ninja2 kernel: block drbd7: asender terminated

Sep 21 15:04:34 ninja2 kernel: block drbd7: Terminating asender thread

Sep 21 15:04:34 ninja2 kernel: block drbd7: Connection closed

Sep 21 15:04:34 ninja2 kernel: block drbd7: conn( TearDown -> Unconnected )

Sep 21 15:04:34 ninja2 kernel: block drbd7: receiver terminated

Sep 21 15:04:34 ninja2 kernel: block drbd7: Restarting receiver thread

Sep 21 15:04:34 ninja2 kernel: block drbd7: receiver (re)started

Sep 21 15:04:34 ninja2 kernel: block drbd7: conn( Unconnected ->
WFConnection )

Sep 21 15:04:43 ninja2 kernel: block drbd7: conn( WFConnection ->
Disconnecting )

Sep 21 15:04:43 ninja2 kernel: block drbd7: Discarding network
configuration.

Sep 21 15:04:43 ninja2 kernel: block drbd7: Connection closed

Sep 21 15:04:43 ninja2 kernel: block drbd7: conn( Disconnecting ->
StandAlone )

Sep 21 15:04:43 ninja2 kernel: block drbd7: receiver terminated

Sep 21 15:04:43 ninja2 kernel: block drbd7: Terminating receiver thread

Sep 21 15:04:43 ninja2 kernel: block drbd7: disk( UpToDate -> Diskless )

Sep 21 15:04:43 ninja2 kernel: block drbd7: drbd_bm_resize called with
capacity == 0

Sep 21 15:04:43 ninja2 kernel: block drbd7: worker terminated

Sep 21 15:04:43 ninja2 kernel: block drbd7: Terminating worker thread

 Sep 21 15:06:44 ninja2 kernel: block drbd7: Starting worker thread (from
cqueue/0 [176])

Sep 21 15:06:44 ninja2 kernel: block drbd7: disk( Diskless -> Attaching )

Sep 21 15:06:44 ninja2 kernel: block drbd7: Found 4 transactions (192 active
extents) in activity log.

Sep 21 15:06:44 ninja2 kernel: block drbd7: Method to ensure write ordering:
barrier

Sep 21 15:06:44 ninja2 kernel: block drbd7: max_segment_size ( = BIO size )
= 32768

Sep 21 15:06:44 ninja2 kernel: block drbd7: drbd_bm_resize called with
capacity == 102396800

Sep 21 15:06:44 ninja2 kernel: block drbd7: resync bitmap: bits=12799600
words=199994

Sep 21 15:06:44 ninja2 kernel: block drbd7: size = 49 GB (51198400 KB)

Sep 21 15:06:44 ninja2 kernel: block drbd7: recounting of set bits took
additional 1 jiffies

Sep 21 15:06:44 ninja2 kernel: block drbd7: 0 KB (0 bits) marked out-of-sync
by on disk bit-map.

Sep 21 15:06:44 ninja2 kernel: block drbd7: disk( Attaching -> UpToDate )

Sep 21 15:06:44 ninja2 kernel: block drbd7: conn( StandAlone -> Unconnected
)

Sep 21 15:06:44 ninja2 kernel: block drbd7: Starting receiver thread (from
drbd7_worker [15182])

Sep 21 15:06:44 ninja2 kernel: block drbd7: receiver (re)started

Sep 21 15:06:44 ninja2 kernel: block drbd7: conn( Unconnected ->
WFConnection )

Sep 21 15:06:45 ninja2 kernel: block drbd7: Handshake successful: Agreed
network protocol version 94

Sep 21 15:06:45 ninja2 kernel: block drbd7: conn( WFConnection ->
WFReportParams )

Sep 21 15:06:45 ninja2 kernel: block drbd7: Starting asender thread (from
drbd7_receiver [15190])

Sep 21 15:06:45 ninja2 kernel: block drbd7: data-integrity-alg: <not-used>

Sep 21 15:06:45 ninja2 kernel: block drbd7: drbd_sync_handshake:

Sep 21 15:06:45 ninja2 kernel: block drbd7: self
14DBCB9EF0296C64:0000000000000000:5859AAC8839A86AD:A74BC6FF259D3C18 bits:0
flags:0

Sep 21 15:06:45 ninja2 kernel: block drbd7: peer
14DBCB9EF0296C64:0000000000000000:5859AAC8839A86AC:A74BC6FF259D3C18 bits:0
flags:0

Sep 21 15:06:45 ninja2 kernel: block drbd7: uuid_compare()=0 by rule 40

Sep 21 15:06:45 ninja2 kernel: block drbd7: peer( Unknown -> Secondary )
conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )

  Sep 21 15:13:20 ninja2 kernel: block drbd7: role( Secondary -> Primary )

Sep 21 15:13:40 ninja2 kernel: block drbd7: role( Primary -> Secondary )

Sep 21 15:13:40 ninja2 kernel: block drbd7: meta data flush failed with
status -95, disabling md-flushes

  Sep 21 15:15:51 ninja2 kernel: block drbd7: conn( WFConnection ->
Disconnecting )

Sep 21 15:15:51 ninja2 kernel: block drbd7: Discarding network
configuration.

Sep 21 15:15:51 ninja2 kernel: block drbd7: Connection closed

Sep 21 15:15:51 ninja2 kernel: block drbd7: conn( Disconnecting ->
StandAlone )

Sep 21 15:15:51 ninja2 kernel: block drbd7: receiver terminated

Sep 21 15:15:51 ninja2 kernel: block drbd7: Terminating receiver thread

Sep 21 15:15:51 ninja2 kernel: block drbd7: disk( UpToDate -> Diskless )

Sep 21 15:15:51 ninja2 kernel: block drbd7: drbd_bm_resize called with
capacity == 0

Sep 21 15:15:51 ninja2 kernel: block drbd7: worker terminated

Sep 21 15:15:51 ninja2 kernel: block drbd7: Terminating worker thread



######### C #########

Sep 21 23:04:37 ninja3 kernel: block drbd7: Starting worker thread (from
cqueue/2 [178])

Sep 21 23:04:37 ninja3 kernel: block drbd7: disk( Diskless -> Attaching )

Sep 21 23:04:37 ninja3 kernel: block drbd7: Found 2 transactions (2 active
extents) in activity log.

Sep 21 23:04:37 ninja3 kernel: block drbd7: Method to ensure write ordering:
barrier

Sep 21 23:04:37 ninja3 kernel: block drbd7: max_segment_size ( = BIO size )
= 32768

Sep 21 23:04:37 ninja3 kernel: block drbd7: drbd_bm_resize called with
capacity == 102396800

Sep 21 23:04:37 ninja3 kernel: block drbd7: resync bitmap: bits=12799600
words=199994

Sep 21 23:04:37 ninja3 kernel: block drbd7: size = 49 GB (51198400 KB)

Sep 21 23:04:37 ninja3 kernel: block drbd7: recounting of set bits took
additional 1 jiffies

Sep 21 23:04:37 ninja3 kernel: block drbd7: 0 KB (0 bits) marked out-of-sync
by on disk bit-map.

Sep 21 23:04:37 ninja3 kernel: block drbd7: disk( Attaching -> UpToDate )

Sep 21 23:04:37 ninja3 kernel: block drbd7: conn( StandAlone -> Unconnected
)

Sep 21 23:04:37 ninja3 kernel: block drbd7: Starting receiver thread (from
drbd7_worker [15612])

Sep 21 23:04:37 ninja3 kernel: block drbd7: receiver (re)started

Sep 21 23:04:37 ninja3 kernel: block drbd7: conn( Unconnected ->
WFConnection )

Sep 21 23:04:37 ninja3 kernel: block drbd7: Handshake successful: Agreed
network protocol version 94

Sep 21 23:04:37 ninja3 kernel: block drbd7: conn( WFConnection ->
WFReportParams )

Sep 21 23:04:37 ninja3 kernel: block drbd7: Starting asender thread (from
drbd7_receiver [15620])

Sep 21 23:04:37 ninja3 kernel: block drbd7: data-integrity-alg: <not-used>

Sep 21 23:04:37 ninja3 kernel: block drbd7: drbd_sync_handshake:

Sep 21 23:04:37 ninja3 kernel: block drbd7: self
14DBCB9EF0296C64:0000000000000000:5859AAC8839A86AC:A74BC6FF259D3C18 bits:0
flags:0

Sep 21 23:04:37 ninja3 kernel: block drbd7: peer
14DBCB9EF0296C64:0000000000000000:5859AAC8839A86AD:A74BC6FF259D3C18 bits:0
flags:0

Sep 21 23:04:37 ninja3 kernel: block drbd7: uuid_compare()=0 by rule 40

Sep 21 23:04:37 ninja3 kernel: block drbd7: peer( Unknown -> Secondary )
conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )

  Sep 21 23:11:12 ninja3 kernel: block drbd7: peer( Secondary -> Primary )

Sep 21 23:11:32 ninja3 kernel: block drbd7: peer( Primary -> Secondary )

 Sep 21 23:12:33 ninja3 kernel: block drbd7: role( Secondary -> Primary )

Sep 21 23:12:41 ninja3 kernel: block drbd7: role( Primary -> Secondary )

Sep 21 23:12:41 ninja3 kernel: block drbd7: meta data flush failed with
status -95, disabling md-flushes

 Sep 21 23:13:16 ninja3 kernel: block drbd7: peer( Secondary -> Unknown )
conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )

Sep 21 23:13:16 ninja3 kernel: block drbd7: short read expecting header on
sock: r=-512

Sep 21 23:13:16 ninja3 kernel: block drbd7: asender terminated

Sep 21 23:13:16 ninja3 kernel: block drbd7: Terminating asender thread

Sep 21 23:13:16 ninja3 kernel: block drbd7: Connection closed

Sep 21 23:13:16 ninja3 kernel: block drbd7: conn( Disconnecting ->
StandAlone )

Sep 21 23:13:16 ninja3 kernel: block drbd7: receiver terminated

Sep 21 23:13:16 ninja3 kernel: block drbd7: Terminating receiver thread

Sep 21 23:13:16 ninja3 kernel: block drbd7: disk( UpToDate -> Diskless )

Sep 21 23:13:16 ninja3 kernel: block drbd7: drbd_bm_resize called with
capacity == 0

Sep 21 23:13:16 ninja3 kernel: block drbd7: worker terminated

Sep 21 23:13:16 ninja3 kernel: block drbd7: Terminating worker thread

##############
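
To make the drbd_sync_handshake lines above easier to compare, here is a small
sketch that splits the "self" and "peer" UUID strings from B's log at 15:06:45
(B connecting to C) and reports which fields differ. The field names
(current, bitmap, history1, history2) are my assumption about the order of the
four values, and the helper functions are made up for illustration; this is
not a DRBD tool.

```python
import re

# Handshake lines copied from the ninja2 (B) log at 15:06:45,
# rejoined after the mail client wrapped them.
LOG = """\
block drbd7: self 14DBCB9EF0296C64:0000000000000000:5859AAC8839A86AD:A74BC6FF259D3C18 bits:0 flags:0
block drbd7: peer 14DBCB9EF0296C64:0000000000000000:5859AAC8839A86AC:A74BC6FF259D3C18 bits:0 flags:0
"""

# Assumed meaning of the four colon-separated values (not stated in the log).
FIELDS = ("current", "bitmap", "history1", "history2")

# Matches "self <4 UUIDs> bits:N" or "peer <4 UUIDs> bits:N".
UUID_LINE = re.compile(r"\b(self|peer) ((?:[0-9A-F]{16}:){3}[0-9A-F]{16}) bits:(\d+)")


def parse_handshake(text):
    """Return {'self': {...}, 'peer': {...}} with the UUID fields and bit count."""
    result = {}
    for who, uuids, bits in UUID_LINE.findall(text):
        entry = dict(zip(FIELDS, uuids.split(":")))
        entry["bits"] = int(bits)
        result[who] = entry
    return result


def differing_fields(handshake):
    """List the UUID fields whose values differ between self and peer."""
    return [f for f in FIELDS if handshake["self"][f] != handshake["peer"][f]]


hs = parse_handshake(LOG)
print("current UUIDs equal:", hs["self"]["current"] == hs["peer"]["current"])
print("fields that differ:", differing_fields(hs))
```

For the two lines above, the first value is identical on both sides and only
the third value differs (…86AD on B, …86AC on C). As far as I understand, that
fits with the "uuid_compare()=0 by rule 40" line and with no resync being
started, but I am not certain of that reading.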

What I am trying to achieve here is to copy the changes made on A-B onto C.
I don't want to use the stacked-resources method, so I switch the conf files,
and I get these errors. Please let me know what I might be doing wrong, or
whether this approach itself is wrong.

Thanking you

Ravee