Hi,
I use DRBD version 8.3.15 with the configuration
"net - on-congestion = pull-ahead" and
"protocol C".
The connection state becomes "Ahead" and never resyncs again.
I call this the "Ahead stuck" problem.
Fortunately, I can reproduce this problem within 3 minutes using two
"fio" I/O generators running on virtual machines on
VMFS5 (ESXi Server v5.0u1).
So I analyzed the Ahead -> SyncSource mechanism by running the code.
The function got_BarrierAck (drbd_receiver.c) tries
to start the resync again, but only if mdev->ap_in_flight is exactly 0.
When mdev->ap_in_flight is negative, the resync can never start.
I modified drbd_receiver.c, and the problem no longer occurs with
this I/O pattern.
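To make the difference between the two checks concrete, here is a minimal user-space sketch. It is not DRBD code: struct fake_dev and both function names are hypothetical stand-ins, a plain int models atomic_read(&mdev->ap_in_flight), and a bool models the AHEAD_TO_SYNC_SOURCE flag. It only mirrors the condition in got_BarrierAck before and after the change.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state; not the real DRBD types. */
enum conn_state { C_CONNECTED, C_AHEAD };

struct fake_dev {
    enum conn_state conn;
    int ap_in_flight;           /* models atomic_read(&mdev->ap_in_flight) */
    bool ahead_to_sync_source;  /* models the AHEAD_TO_SYNC_SOURCE flag */
};

/* Original check: the counter must be exactly zero. */
static bool should_start_resync_strict(struct fake_dev *d)
{
    if (d->conn == C_AHEAD &&
        d->ap_in_flight == 0 &&
        !d->ahead_to_sync_source) {
        d->ahead_to_sync_source = true;  /* test-and-set of the flag */
        return true;                     /* caller would arm the resync timer */
    }
    return false;
}

/* Modified check: any value <= 0 counts as drained, then reset to zero. */
static bool should_start_resync_clamped(struct fake_dev *d)
{
    if (d->conn == C_AHEAD &&
        d->ap_in_flight <= 0 &&
        !d->ahead_to_sync_source) {
        d->ahead_to_sync_source = true;  /* test-and-set of the flag */
        d->ap_in_flight = 0;             /* reset the negative counter */
        return true;                     /* caller would arm the resync timer */
    }
    return false;
}
```

With conn == C_AHEAD and ap_in_flight == -8, the strict version returns false on every call, which matches the stuck state I see. The clamped version returns true once and leaves the counter at 0. This sketch does not explain why the counter goes negative in the first place.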
The value "mdev->ap_in_flight" is used only for
"congestion-fill", I think. Is this right?
Please check the diff below.
==[[Difference]]=================================
# diff -c drbd_receiver.c.old drbd_receiver.c
*** drbd_receiver.c.old
--- drbd_receiver.c
***************
*** 4815,4831 ****
      return true;
  }
  STATIC int got_BarrierAck(struct drbd_conf *mdev, struct p_header80 *h)
  {
      struct p_barrier_ack *p = (struct p_barrier_ack *)h;
      tl_release(mdev, p->barrier, be32_to_cpu(p->set_size));
      if (mdev->state.conn == C_AHEAD &&
!         atomic_read(&mdev->ap_in_flight) == 0 &&
          !drbd_test_and_set_flag(mdev, AHEAD_TO_SYNC_SOURCE)) {
          mdev->start_resync_timer.expires = jiffies + HZ;
          add_timer(&mdev->start_resync_timer);
      }
      return true;
--- 4815,4846 ----
      return true;
  }
+ /* Fix Ahead Stuck Problem */
  STATIC int got_BarrierAck(struct drbd_conf *mdev, struct p_header80 *h)
  {
      struct p_barrier_ack *p = (struct p_barrier_ack *)h;
      tl_release(mdev, p->barrier, be32_to_cpu(p->set_size));
+     /*[Debug] Output Detail */
+     dev_info( DEV, "got_BarrierACK(state.conn=%d,ap_in_flight=%d)\n",
+         mdev->state.conn, atomic_read(&mdev->ap_in_flight) );
+
      if (mdev->state.conn == C_AHEAD &&
!         /* ap_in_flight is sometime less than zero */
!         /* atomic_read(&mdev->ap_in_flight) == 0 && */
!         atomic_read(&mdev->ap_in_flight) <= 0 &&
          !drbd_test_and_set_flag(mdev, AHEAD_TO_SYNC_SOURCE)) {
+
+         /* Reset ap_in_flight into zero */
+         atomic_set(&mdev->ap_in_flight, 0);
+
          mdev->start_resync_timer.expires = jiffies + HZ;
          add_timer(&mdev->start_resync_timer);
+
+         /*[Debug] resync timer start*/
+         dev_info(DEV,"start resync timer!!\n" );
+
      }
      return true;
Regards