[DRBD-user] DRBD and iSCSI (which? ^o^) versus scalability

Fri Jul 27 12:04:04 CEST 2012

Hello,

On Fri, Jul 27, 2012 at 4:32 AM, Christian Balzer <chibi at gol.com> wrote:
>
> Hello,
>
> I'm pondering a HA iSCSI (really iSER or SRP, Infiniband backend) storage
> cluster based on DRBD and Pacemaker. So something that has been documented
> and implemented numerous times.
>
> However, while setting things up on one of my test clusters, it became
> clear to me that this is probably not all that rosy.
>
> Issues:
>
> 1. Which bloody iSCSI stack?

Sincere apologies on behalf of the open source community for
offering you too much choice. :)

Really though, you get to pick and choose. IET and SCST have the
greatest longevity, STGT happens to be the only target supported on
RHEL, LIO is the current upstream default.

> The obvious choice would be LIO, being the
> official stack and certainly having the least "fend for yourself and use
> the source Luke" homepage. Alas that requires at least a 3.4 kernel (3.3
> really but that's EOL) if one wants SRP. A bit on the cutting edge, esp.
> considering stable user land distributions, Debian in my case. Also what I
> really want is iSER, being more feature rich and a real [tm] standard.

iSER is supported in STGT, which you do have available on Debian. Not
sure about the others.

> But for the sake of going with the times, I used LIO for the testbed,
> foregoing SRP and going with plain iSCSI (no Infiniband on that test
> cluster anyway ^o^)
>
> 2. House of cards. Setting this up I ran into several issues that boil
> down to: "if anything goes wrong, wipe the slate". As in, reboot or
> manually clean up anything left behind by either LIO (LUNs/block device
> attachments from failed attempts or unclean shut down RAs) or LVM (still
> active LVs due to LIO still hogging them or Pacemaker otherwise failing
> and leaving crud behind).

Can we have slightly more useful details please, more than "leaving
crud behind"? Like logs and your configuration, perhaps?

> The Debian sid (bleeding edge) pacemaker seems
> to be either not quite up to date or nobody ever uses LIO, this warning
> every 10 seconds doesn't instill confidence either:
> ---
> Jul 27 10:52:41 borg00b iSCSILogicalUnit[27911]: WARNING: Configuration parameter "scsi_id" is not supported by the iSCSI implementation and will be ignored.

Um, patches accepted?

> And before anybody asks, I followed the Linbit guide.
> I simply cannot believe that a setup this fragile will survive normal
> operations like adding additional targets or LUNs, let alone a real incident.

Again, how about if you shared your configuration?

> Especially not with 1000 targets/LUNs/LVs.

That would make about 4000 resources in Pacemaker, not something that
I would attempt lightly.
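
As a rough sketch of where a figure like 4000 can come from: the
per-target breakdown below (one target resource, two LUN resources, one
backing-storage resource) is an illustrative assumption, not taken from
any actual configuration in this thread. Only the 1000 targets and the
2 LUNs per target come from the original post.

```python
# Back-of-the-envelope count of Pacemaker resources for the proposed setup.
# ASSUMPTION: the per-target breakdown is hypothetical; adjust it to match
# the resources your own cluster configuration actually defines.
RESOURCES_PER_TARGET = {
    "iSCSITarget": 1,        # one target per VM
    "iSCSILogicalUnit": 2,   # "probably 2 LUNs" per target
    "backing storage": 1,    # e.g. one LVM-style resource (assumed)
}


def total_resources(targets: int, per_target: dict) -> int:
    """Return the number of cluster resources for `targets` targets."""
    return targets * sum(per_target.values())


if __name__ == "__main__":
    print(total_resources(1000, RESOURCES_PER_TARGET))
```

Under these assumptions every additional VM adds four resources that
Pacemaker has to start, stop and monitor.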

> Also, from what others have found out about SRP with LIO, it isn't as
> mature as one would wish for; a case in point was the lack of support for
> disconnection. If that works both ways, it would result in lingering
> targets/LUNs and the impact described above.

Logs please?

> I am looking at about 1000 VMs connecting to that storage cluster, meaning
> 1000 targets, each with probably 2 LUNs. Doing this in pacemaker is a
> divine punishment and I can see it taking a loooong time getting these
> started/stopped (with all the problems that can entail in the pacemaker
> logic).

If we're talking 1,000 VMs and as many block devices, may I suggest
OpenStack and Ceph (RBD) for you. Have you considered those?

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now