Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi Sunny, and thanks for your input.

> -----Original Message-----
> From: Sunny [mailto:sunyucong at gmail.com]
> Sent: jeudi 7 octobre 2010 18:41
> To: Patrick Zwahlen
> Cc: drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] Effects of zeroing a DRBD device before use
>
> For option 2) I am having the same problem. I roughly found that it
> is related to the fact that I enabled "Storage IO Control" on the
> parent VMFS store, which has a congestion control threshold of 30ms
> to 100ms; that's not necessarily enough in this setup.

My current semi-prod setup is running ESX 4.0u2, so I haven't
touched/enabled "Storage IO Control". I will make sure it remains
disabled when going to ESX 4.1.

> So when the VM writes to iSCSI -> the iSCSI target writes to VMFS ->
> the VMFS writes get throttled -> the iSCSI target stalls -> the VM
> hangs.
>
> And in my case, ESXi itself also locks up because of the iSCSI IO
> queue overflow, which is very bad :-(
>
> So disabling Storage IO Control on the VMFS store where your iSCSI
> target host resides may help. Also, moving to a dedicated machine
> with a backup plan is not that expensive, and is more reliable.
>
> On Thu, Oct 7, 2010 at 7:43 AM, Patrick Zwahlen <paz at navixia.com>
> wrote:
> > Hi all,
> >
> > I'm looking for some input from the experts!
> >
> > Short story
> > -----------
> > Zeroing my DRBD device before using it turns a non-working system
> > into a working one, and I'm trying to figure out why. I'm also
> > trying to understand whether I will have other problems down the
> > road.
> >
> > Long story
> > ----------
> > I am building a pair of redundant iSCSI targets for VMware ESX 4.1,
> > using the following software components:
> > - Fedora 12 x86_64
> > - DRBD 8.3.8.1
> > - pacemaker 1.0.9
> > - corosync 1.2.8
> > - SCST iSCSI target (using SVN trunk, almost 2.0)
> >
> > SCST isn't cluster-aware, so I'm using DRBD in primary/secondary
> > mode. I'm creating two iSCSI targets, one on each node, with mutual
> > failover and no multipath. As a reference for the discussion, I'm
> > attaching my resource agent, my CIB and my DRBD config files. The
> > resource agent is a modification of iSCSITarget/iSCSILun with some
> > SCST specifics.
> >
> > When running this setup on a pair of physical hosts, everything
> > works fine. However, my interest is in small setups and I want to
> > run the two targets in VMs, hosted on the ESX hosts that will be
> > the iSCSI initiators. The market calls this a virtual SAN... I
> > know, I know, this is not recommended, but commercial solutions
> > definitely exist, and it makes a lot of sense for small setups. I'm
> > not looking for performance, but for high availability.
> >
> > That being said, I have two ways to present (physical) disk space
> > to DRBD (it shows up as /dev/sdb and /dev/sdc in the VMs):
> >
> > 1) Map RAID volumes to the Fedora VMs using RDM (Raw Device
> >    Mapping).
> > 2) Format the RAID volumes with VMFS, and create virtual disks
> >    (VMDKs) in that datastore for the Fedora VMs.
> >
> > Option 1) obviously works better, but is not always possible (many
> > restrictions on RAID controllers, for instance).
> >
> > Option 2) works fine until I put iSCSI WRITE load on my Fedora VM.
> > When using large blocks, I quickly end up with stalled VMs. The
> > iSCSI target complains that the backend device doesn't respond, and
> > the kernel gives me 120-second timeouts for the DRBD threads. The
> > DRBD backend devices appear dead.
> > At this stage, there is no iSCSI traffic anymore, CPU usage is
> > null, memory is fine, starvation... Rebooting the Fedora VM solves
> > the problem. Seen from a DRBD/SCST point of view, it's as if the
> > backend hardware were failing. However, the physical disks/arrays
> > are fine. The problem is clearly within VMware.
> >
> > One of VMware's recommendations is to create the large VMDKs as
> > 'eagerZeroedThick', which basically zeroes everything before use.
> > This helps, but doesn't solve the problem completely.
> >
> > I then tried a third option: format /dev/drbd0 with XFS, create one
> > BIG file (using dd) on that filesystem, and export this file via
> > iSCSI/SCST (instead of exporting the /dev/drbd0 block device
> > directly). I couldn't crash this setup, but I don't like the idea
> > of having a single 200G file on a 99% full filesystem.
> >
> > This brought me to option 4: I directly export /dev/drbd0 via SCST
> > (same as options 1 and 2), but before using it, I issue a:
> >
> > dd if=/dev/zero of=/dev/drbd0 bs=4096
> >
> > I have now been running this setup for two weeks, trying to put as
> > much load as I can on it (mainly using dd, bonnie++, DiskTT and
> > running VMware Storage vMotion). The only issue I have faced is
> > that sometimes the pacemaker 'monitor' action takes more than 20
> > seconds to run on DRBD, so I have increased this timeout to 60s.
> > Since then, no problem at all!
> >
> > As you can imagine, I'm pretty happy with the setup, but I still
> > don't fully understand why it now works. I hate these situations...
> >
> > Can zeroing make such a big difference? Does it just make a
> > difference at the RAID/disk level, or does it also make a
> > difference at the DRBD level?
> >
> > Sorry for the long e-mail, and thanks a ton for any input.
> >
> > - Patrick -
> >
> > PS: Based on my reading, many people are trying to implement such
> > solutions. XtraVirt had a VM at some point, but not anymore. People
> > are trying to do it with OpenFiler, but IET and VMware don't like
> > each other. My setup is not documented the way it should be, but
> > I'm ready to share if anyone wants to play with it.
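
As a rough illustration of the primary/secondary layout described in
the message, a minimal DRBD 8.3-style resource file could look like the
sketch below. The resource name r0, the node names san1/san2 and the
replication addresses are made up for the example; only /dev/drbd0 and
the backing disk /dev/sdb appear in the message, and the poster's
actual (attached) config may well differ.

    # /etc/drbd.d/r0.res -- minimal sketch, not the poster's config
    resource r0 {
        protocol C;                       # synchronous replication
        on san1 {
            device    /dev/drbd0;
            disk      /dev/sdb;           # backing disk inside the VM
            address   192.168.10.1:7788;  # assumed replication IP:port
            meta-disk internal;
        }
        on san2 {
            device    /dev/drbd0;
            disk      /dev/sdb;
            address   192.168.10.2:7788;  # assumed replication IP:port
            meta-disk internal;
        }
    }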
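
The 'eagerZeroedThick' and zeroing steps discussed above can both be
done from the command line. The datastore path, VMDK name and 200g
size below are placeholders; only the dd invocation and the
eagerZeroedThick format come from the thread.

    # On the ESX(i) host: create the backing VMDK pre-zeroed
    # (datastore path, file name and size are placeholders)
    vmkfstools -c 200g -d eagerzeroedthick \
        /vmfs/volumes/datastore1/san1/drbd-backing.vmdk

    # Inside the Fedora VM, on the node that is currently Primary:
    # zero the whole DRBD device before exporting it via SCST
    dd if=/dev/zero of=/dev/drbd0 bs=4096

    # Optional: ask the running dd for a progress report (GNU dd
    # prints I/O statistics when it receives SIGUSR1)
    kill -USR1 "$(pidof dd)"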
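
The 60-second monitor timeout mentioned near the end would look roughly
like this in crm shell syntax for pacemaker 1.0. The primitive/ms
names, the drbd_resource value and the monitor intervals are
assumptions; only the ocf:linbit:drbd agent and the 60s timeout come
from the message (the poster's attached CIB is not reproduced here).

    # crm configure fragment -- sketch only
    primitive p_drbd_r0 ocf:linbit:drbd \
            params drbd_resource="r0" \
            op monitor interval="20s" role="Master" timeout="60s" \
            op monitor interval="30s" role="Slave" timeout="60s"
    ms ms_drbd_r0 p_drbd_r0 \
            meta master-max="1" master-node-max="1" \
                 clone-max="2" clone-node-max="1" notify="true"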