[DRBD-user] VMware ESX 3 + iSCSI Enterprise Target/DRBD gone terribly wrong - help!

Wed Aug 1 22:52:30 CEST 2007

> -----Original Message-----
> From: drbd-user-bounces at lists.linbit.com
> [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Lars Ellenberg
> Sent: Wednesday, August 01, 2007 4:18 PM
> To: drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] VMware ESX 3 + iSCSI Enterprise Target/DRBD gone terribly wrong - help!
> 
> On Tue, Jul 03, 2007 at 12:38:23AM -0700, Jakobsen wrote:
> > I have some critical issues with three ESX 3.0.1 servers that access
> > an iSCSI Enterprise Target. The iSCSI target is replicating with
> > another server with DRBD, and everything works fine WITHOUT DRBD
> > ENABLED.
> > 
> > When I enable DRBD, it starts a full sync that takes about 1 hour to
> > complete, and everything seems fine. After the full sync, DRBD is not
> > under heavy load anymore. Suddenly - without any errors on the DRBD
> > servers - the VMware guests start throwing I/O errors at me, and
> > everything goes read-only.
> > 
> > Have any of you guys got the same problem?
> > I have no clue what the problem can be.
> 
> meanwhile...  they hired me for consulting.
> 
> what we found is basically that it is only the shifted timing
> when you add drbd into the picture that makes it more likely
> to _trigger_ the issue.
> 
> when you stress the vm clients (or the iscsi server, or both),
> you will hit the very same problem anyways,
> without drbd being involved.
> 
> it is actually a problem combining (certain versions of) the linux guest
> kernel scsi drivers (mptscsih) with the ESX initiator and some "hiccups"
> (scsi timeouts) on the iscsi side, and it is independent of whether you
> use software or hardware initiator, or what sort of iSCSI target you use
> (ietd-based linux stuff, or EMC SAN or any other SAN box).
> you only change the likelihood of triggering the problem.
> 
> the issue (and workaround/fix) is well documented in at least these
> forum threads, blogs, vmware advisory and redhat bugzillas:
> [1] http://www.vmware.com/community/thread.jspa?threadID=58121&tstart=0
> [2] http://www.vmware.com/community/thread.jspa?threadID=58081&tstart=0
> [3] http://www.tuxyturvy.com/blog/index.php?/archives/31-VMware-ESX-and-ext3-journal-aborts.html
> [4] https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=197158
> [5] https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228108
> [6] http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=51306
> 
> the solution is to change the guest linux kernel, or at least
> patch its mptscsih driver module as explained in [3] and [1].

So would you say the core problem is how well the mptscsih driver
handles SCSI timeouts? It sounds like it from the description.

Also, I have found that Ethernet flow control can wreak havoc with
iSCSI during heavy I/O, which shows up as a series of timeouts.

Ethernet flow control should be avoided in favor of the newer TCP
scaling in most recent OSes where possible, and avoided completely
when using jumbo MTUs, as most switches these days just do not have
the buffering and pipelining for these large frames.
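
To check whether flow control is currently enabled on an interface,
something like the sketch below may help. It is only an illustration,
not a tested tool: it assumes a Linux host with ethtool installed and
that `ethtool -a` prints its usual "RX: on/off" style lines. The
interface name eth0, the SAMPLE text and the function names are made
up for the example.

```python
import subprocess


def parse_pause_params(text):
    """Parse the text printed by `ethtool -a <iface>` into a dict of booleans."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip().lower()
        # The "Pause parameters for eth0:" header has no on/off value; skip it.
        if sep and value in ("on", "off"):
            params[key.strip().lower()] = (value == "on")
    return params


def flow_control_enabled(params):
    """True if pause frames are enabled in either direction."""
    return params.get("rx", False) or params.get("tx", False)


def read_pause_params(iface):
    """Run `ethtool -a` for one interface (may need root)."""
    out = subprocess.run(["ethtool", "-a", iface], check=True,
                         capture_output=True, text=True).stdout
    return parse_pause_params(out)


# Hypothetical sample output, used so the parser can be tried without a NIC.
SAMPLE = """Pause parameters for eth0:
Autonegotiate:\ton
RX:\t\toff
TX:\t\toff
"""

if __name__ == "__main__":
    print(flow_control_enabled(parse_pause_params(SAMPLE)))
```

On a real host you would call read_pause_params("eth0") instead of
using SAMPLE. If it reports flow control as enabled, it can usually be
switched off with `ethtool -A eth0 rx off tx off`, provided the NIC
driver supports changing pause parameters.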

-Ross
