[DRBD-user] Cluster filesystem question

Thu Dec 1 20:47:11 CET 2011

I got it now, I think... 

So, no matter what, multipathing two separate iSCSI targets is bad... This is just because of how iSCSI works.

Is there another transport I could use to multipath safely to a dual-primary DRBD? CMAN with GNBD running on each DRBD node? Any other alternatives?

Also, an ESX-specific question... the software iSCSI initiator on ESX does not allow me to change the default time to retain or the default time to wait (both are grayed out; the max is 60 but they are set to 0). So in a scenario where I have to wait for a floating IP to fail over (like Digimer's example), I end up with the ESX host freezing up and not recovering. Any recommendations?

Thanks,
Mike

-----Original Message-----
From: Lars Ellenberg [mailto:lars.ellenberg at linbit.com] 
Sent: Thursday, December 01, 2011 2:32 PM
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] Cluster filesystem question

On Thu, Dec 01, 2011 at 02:18:42PM -0500, Digimer wrote:
> On 12/01/2011 02:13 PM, Lars Ellenberg wrote:
> > On Thu, Dec 01, 2011 at 01:58:15PM -0500, Kushnir, Michael (NIH/NLM/LHC) [C] wrote:
> >> Hi Lars,
> >>
> >> I'm a bit confused by this discussion. Can you please clarify the difference?
> >>
> >> What I think you are saying is:
> >>
> >> OK:
> >> Dual-primary DRBD -> cluster aware something (OCFS, GFS, clvmd, 
> >> etc...) -> exported via iSCSI on both nodes -> multipathed on the 
> >> client
> > 
> > No.
> > 
> > OK:
> > Dual-primary DRBD (done right) -> cluster aware something (OCFS, 
> > GFS, clvmd, etc...)
> > 
> > NOT OK:
> > -> exported via iSCSI on both nodes -> multipathed on the client
> > 
> > NOT OK:
> > anything non-cluster-aware using it "concurrently" on both nodes.
> 
> What I've done in the past, and perhaps it isn't the wisest (Lars, 
> Florian?), is to create a Dual-primary DRBD (with fencing!), then 
> export it as-is to my nodes using a floating/virtual IP address 
> managed by a simple cluster.
> 
> Then on the clients (all of whom are in the same cluster), I mount the 
> iSCSI target and set it up as a clustered LVM PV/VG/LVs. If you need a 
> normal FS, then format one or more of the LVs using a cluster-aware FS.
> 
> When the primary node (the one with the floating IP) fails, all the 
> cluster has to do is move the IP down to the backup node and it's 
> ready to go.

And that's where you made it "OK" again: you arbitrate which side you talk to by having the IP available on one node only.
The targets are not used "concurrently".

> I suppose you could just as easily do Primary/Secondary and include 
> the promotion of the backup to primary as part of the failover, too.

Yes, and that would be the recommended approach, obviously.

Depending on how you configure your iSCSI targets, the way you do (did?) it could even run into cache inconsistencies: if the target goes through the page cache/buffer cache, you need some layer responsible for cache coherence, and this setup has none.
(in ietd speak: only blockio allowed; similar for other targets).
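
To make the cache coherence point concrete, here is a minimal toy model in Python. It is only a sketch: the class and function names are made up for illustration, it is not how ietd or DRBD are implemented, and it assumes a simple write-through cache. It shows that when each target node keeps its own private cache in front of the replicated device, a read through one node can return stale data after a write through the other, while unbuffered access (the blockio case) cannot.

```python
# Toy model only: all names here are hypothetical, not ietd or DRBD APIs.

class ReplicatedDisk:
    """Stands in for the dual-primary DRBD device: one logical set of blocks."""

    def __init__(self):
        self.blocks = {}

    def read(self, lba):
        return self.blocks.get(lba, b"\0")

    def write(self, lba, data):
        self.blocks[lba] = data


class Target:
    """One iSCSI target node. buffered=True mimics going through the page cache."""

    def __init__(self, disk, buffered):
        self.disk = disk
        self.buffered = buffered
        self.cache = {}  # private to this node, invisible to the peer

    def read(self, lba):
        if not self.buffered:
            return self.disk.read(lba)
        if lba not in self.cache:
            self.cache[lba] = self.disk.read(lba)
        return self.cache[lba]

    def write(self, lba, data):
        if self.buffered:
            self.cache[lba] = data
        self.disk.write(lba, data)  # write-through, to keep the model simple


def stale_read(buffered):
    """Return True if node B serves old data after a write through node A."""
    disk = ReplicatedDisk()
    a, b = Target(disk, buffered), Target(disk, buffered)
    disk.write(7, b"old")
    b.read(7)            # node B caches block 7 (only if buffered)
    a.write(7, b"new")   # the update arrives through node A
    return b.read(7) != b"new"


if __name__ == "__main__":
    print("buffered targets serve stale data:", stale_read(True))
    print("unbuffered targets serve stale data:", stale_read(False))
```

Nothing in the buffered case tells node B that its cached copy is out of date, which is the missing coherence layer.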

> In my case, knowing I had fencing in place already, I went for the 
> "simpler" cluster config of managing an IP only.
> 
> Caveat - I did not read the thread before now. If this is totally out 
> to left field, my apologies. :)

The original question was somehow cluster file system related. Someone suggested that dual-primary DRBD + two independent iSCSI targets + multipath or MC/S on the initiator side might be an option.

What we try to explain here,
and apparently fail to explain well enough, is that an initiator, regardless of multipath or MC/S, assumes (and relies upon) that it talks to *ONE AND THE SAME* target (via multiple paths), but now in fact talks to two different, independent targets that do not know about each other.

And that cannot work.
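
A small sketch of what goes wrong, as a toy model in Python. Assumptions: SCSI reservations are used here as one example of state that lives in the target and is not replicated by DRBD (the thread does not single them out), and every class and function name below is hypothetical. Two hosts round-robin their commands over two paths. If both paths lead to one target, a reservation taken by host1 is honoured; if the paths lead to two independent targets, host2 can reach the target that never saw the reservation.

```python
# Toy model only: hypothetical names, not a real iSCSI or multipath implementation.

class Target:
    def __init__(self, name):
        self.name = name
        self.reserved_by = None  # per-target state, not replicated by DRBD

    def reserve(self, initiator):
        if self.reserved_by not in (None, initiator):
            return False
        self.reserved_by = initiator
        return True

    def write_allowed(self, initiator):
        return self.reserved_by in (None, initiator)


class MultipathInitiator:
    """Round-robins commands over its paths, assuming they all reach one target."""

    def __init__(self, name, paths):
        self.name = name
        self.paths = paths
        self._next = 0

    def _pick(self):
        target = self.paths[self._next % len(self.paths)]
        self._next += 1
        return target

    def reserve(self):
        return self._pick().reserve(self.name)

    def write_allowed(self):
        return self._pick().write_allowed(self.name)


def reservation_respected(paths_host1, paths_host2):
    """host1 reserves; return True if host2 is then refused on its next command."""
    host1 = MultipathInitiator("host1", paths_host1)
    host2 = MultipathInitiator("host2", paths_host2)
    host1.reserve()
    return not host2.write_allowed()


if __name__ == "__main__":
    one = Target("single")
    print("one target, two paths:", reservation_respected([one, one], [one, one]))

    a, b = Target("node-a"), Target("node-b")
    print("two independent targets:", reservation_respected([a, b], [b, a]))
```

With the floating IP setup discussed earlier in the thread, all initiators reach the same target at any given time, so this particular split does not arise.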

> --
> Digimer
> E-Mail:              digimer at alteeve.com
> Freenode handle:     digimer
> Papers and Projects: http://alteeve.com
> Node Assassin:       http://nodeassassin.org
> "omg my singularity battery is dead again.
> stupid hawking radiation." - epitron

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
_______________________________________________
drbd-user mailing list
drbd-user at lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user