[DRBD-user] speed of fail-over..

Tue Oct 21 06:54:25 CEST 2008

On Mon, Oct 20, 2008 at 4:23 PM, Florian Haas <florian.haas at linbit.com> wrote:
> Little, Kevin wrote:
>> I searched the archives for this, but found nothing.
>
> Well if you search _this_ list's archives, that's not exactly surprising
> as what you are after is more of a Heartbeat issue than a DRBD one.
> Searching the linux-ha archives is likely to yield better results.
>
>> I've seen mentioned
>> (http://fghaas.wordpress.com/2007/06/26/when-not-to-use-drbd/) that the
>> fail-over time for DRBD+Heartbeat is on the order of 20 seconds.
>
> Well it's configurable (via the deadtime config entry in
> /etc/ha.d/ha.cf), but somewhere between 15 and 30 seconds is what
> people usually pick.

We use the following settings:

keepalive 75ms
deadtime 300ms
warntime 200ms

These seem to work OK. The link between the two nodes is a 10GE
direct connection, so there shouldn't be any issues with the network
delaying packets. When I first configured this, I kept an eye out for
the 'warntime' messages in the logs for a week or so, and didn't see
any.
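
As a rough sanity check on the three timing values above, here is a
minimal Python sketch. It only does arithmetic on the numbers quoted in
this mail; the helper names (intervals, ordering_ok) are made up for
illustration and are not part of Heartbeat.

```python
# Timing values from the ha.cf settings quoted above, in milliseconds.
KEEPALIVE_MS = 75
WARNTIME_MS = 200
DEADTIME_MS = 300


def intervals(threshold_ms: float, keepalive_ms: float = KEEPALIVE_MS) -> float:
    """Return how many keepalive intervals fit inside a threshold."""
    return threshold_ms / keepalive_ms


def ordering_ok(keepalive_ms: float, warntime_ms: float, deadtime_ms: float) -> bool:
    """Check the ordering keepalive < warntime < deadtime.

    Assumption: this ordering is simply what makes the warning useful
    (it fires before the peer is declared dead), not a rule taken from
    the Heartbeat documentation.
    """
    return keepalive_ms < warntime_ms < deadtime_ms


if __name__ == "__main__":
    print("keepalive intervals before warntime:", round(intervals(WARNTIME_MS), 2))
    print("keepalive intervals before deadtime:", round(intervals(DEADTIME_MS), 2))
    print("ordering ok:", ordering_ok(KEEPALIVE_MS, WARNTIME_MS, DEADTIME_MS))
```

With these numbers, a peer has to stay silent for four keepalive
intervals before it is declared dead, and a late-heartbeat warning
would show up a little under three intervals in.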

Our application just exports LVM LUNs over iSCSI from the DRBD
volumes, so there's no issue with recovery time - failover of the
entire stack (IP, DRBD, LVM, iSCSI) is at most two seconds. I've
tested it, and I'm able to play video off the iSCSI volume, hard-reset
the primary node, and have the video continue playing after a 5-odd
second pause (presumably as iSCSI times out and retries).
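
The figures above can be put into a rough time budget. The sketch
below is only back-of-the-envelope arithmetic in Python, under two
labelled assumptions: that the two-second stack failover starts after
deadtime has expired, and that "5-odd seconds" is about 5.0. Neither
was measured precisely, and the function name is hypothetical.

```python
# Figures taken from this mail, in seconds.
DEADTIME_S = 0.3            # deadtime 300ms from ha.cf
STACK_FAILOVER_MAX_S = 2.0  # "at most two seconds" for ip, drbd, lvm, iscsi
OBSERVED_PAUSE_S = 5.0      # assumption: "5-odd second pause" taken as 5.0


def unexplained_pause(observed_s: float, deadtime_s: float, failover_s: float) -> float:
    """Return the part of the observed pause not covered by detection
    plus stack failover.

    Assumption: failover begins only once deadtime has expired, so the
    two durations simply add up.
    """
    return observed_s - (deadtime_s + failover_s)


if __name__ == "__main__":
    rest = unexplained_pause(OBSERVED_PAUSE_S, DEADTIME_S, STACK_FAILOVER_MAX_S)
    print(f"pause not explained by detection + failover: ~{rest:.1f}s")
```

Under those assumptions, a bit under three seconds of the pause is
left over, which would fit the guess that the rest is spent on the
iSCSI side timing out and retrying.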

-Patrick

-- 
http://www.labyrinthdata.net.au - WA Backup, Web and VPS Hosting