Hello,

I'm pondering an HA iSCSI (really iSER or SRP, Infiniband backend) storage
cluster based on DRBD and Pacemaker. So something that has been documented
and implemented numerous times. However, setting things up on one of my
test clusters, it became clear to me that this is probably not all that
rosy.

Issues:

1. Which bloody iSCSI stack?
The obvious choice would be LIO, being the official stack and certainly
having the least "fend for yourself and use the source, Luke" homepage.
Alas, that requires at least a 3.4 kernel (3.3 really, but that's EOL) if
one wants SRP. A bit on the cutting edge, especially considering stable
userland distributions, Debian in my case. Also, what I really want is
iSER, being more feature-rich and a real [tm] standard. But for the sake
of going with the times, I used LIO for the testbed, foregoing SRP and
going with plain iSCSI (no Infiniband on that test cluster anyway ^o^).

2. House of cards.
Setting this up, I ran into several issues that boil down to: "if anything
goes wrong, wipe the slate". As in, reboot or manually clean up anything
left behind by either LIO (LUNs/block device attachments from failed
attempts or uncleanly shut down RAs) or LVM (still-active LVs due to LIO
still hogging them, or Pacemaker otherwise failing and leaving crud
behind). The Debian sid (bleeding edge) Pacemaker seems to be either not
quite up to date or nobody ever uses LIO; this warning every 10 seconds
doesn't instill confidence either:
---
Jul 27 10:52:41 borg00b iSCSILogicalUnit[27911]: WARNING: Configuration
parameter "scsi_id" is not supported by the iSCSI implementation and will
be ignored.
---
And before anybody asks, I followed the Linbit guide. I simply cannot
believe that a setup this fragile will survive normal operations like
adding additional targets or LUNs, least of all a real incident.
Especially not with 1000 targets/LUNs/LVs.
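For what it's worth, the manual "wipe the slate" dance can at least be
scripted. A rough sketch of what I end up doing by hand (untested as a
script; the IQN, backstore name and LV path are made-up examples, and it
only prints the commands unless you set DRY_RUN=0):

```shell
#!/bin/sh
# Sketch: clean up what a failed/unclean RA stop leaves behind.
# DRY_RUN=1 (default) only prints the commands it would run.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

cleanup_lun() {
    # $1 = target IQN, $2 = block backstore name, $3 = LV device path
    # Tear down the target first so LIO releases the block device...
    run targetcli /iscsi delete "$1"
    run targetcli /backstores/block delete "$2"
    # ...otherwise lvchange fails with "Logical volume ... in use".
    run lvchange -an "$3"
}

# Example invocation with made-up names:
cleanup_lun "iqn.2012-07.com.example:vm0001" \
    "vm0001_disk0" "/dev/vg0/vm0001_disk0"
```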
Also, reading what others have found out about SRP with LIO, it isn't as
mature as one would wish for; a case in point was the lack of support for
disconnection. If that works both ways, it would result in lingering
targets/LUNs and the impact described above.

3. Objects in the rear view mirror.
Has anybody here deployed more than 10 targets/LUNs? And done so w/o going
crazy or running into the issues mentioned in 2)? How? Self-made
scripts/Puppet? I am looking at about 1000 VMs connecting to that storage
cluster, meaning 1000 targets, each with probably 2 LUNs. Doing this in
Pacemaker is a divine punishment and I can see it taking a loooong time
getting these started/stopped (with all the problems that can entail in
the Pacemaker logic).

I'm not asking for free counseling, I just would like to hear if anybody
has climbed those heights before w/o falling off the cliff or succumbing
to hypoxia. ^o^

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer
chibi at gol.com        Global OnLine Japan/Fusion Communications
http://www.gol.com/
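P.S.: the only way I can see to avoid hand-writing 1000 primitives is to
generate the crm configuration with a loop and feed it to "crm configure
load update -". A sketch of what I have in mind, assuming the
ocf:heartbeat:iSCSITarget / iSCSILogicalUnit agents from the Linbit guide;
the IQN prefix, VG name and one-group-per-VM layout are just my own
made-up conventions:

```shell
#!/bin/sh
# Sketch: emit crm configure syntax for one VM's target plus 2 LUNs.
# Pipe the output of a loop over all VMs into
#   crm configure load update -
gen_vm() {
    vm="$1"                                    # e.g. vm0001
    iqn="iqn.2012-07.com.example:${vm}"        # made-up IQN prefix
    cat <<EOF
primitive p_target_${vm} ocf:heartbeat:iSCSITarget \\
    params implementation=lio iqn=${iqn} \\
    op monitor interval=10s
primitive p_lu_${vm}_0 ocf:heartbeat:iSCSILogicalUnit \\
    params target_iqn=${iqn} lun=0 path=/dev/vg0/${vm}_disk0
primitive p_lu_${vm}_1 ocf:heartbeat:iSCSILogicalUnit \\
    params target_iqn=${iqn} lun=1 path=/dev/vg0/${vm}_disk1
group g_${vm} p_target_${vm} p_lu_${vm}_0 p_lu_${vm}_1
EOF
}

# Example: generate config for three VMs.
for i in 1 2 3; do
    gen_vm "$(printf 'vm%04d' "$i")"
done
```

Whether Pacemaker can actually digest ~3000 resources in a sane time is of
course exactly the question I'm asking.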