[DRBD-user] Still experiencing resource spikes

Mon Dec 17 20:27:28 CET 2012

I have a really sticky problem and would appreciate any insight.

We currently have the following physical configuration:

2x Dell PowerEdge R710 (dual 6-core with hyperthreading enabled)
     120 Gbytes of memory each
     4x 10GbE NICs (2 bonded for NFS and 2 bonded for replication), Broadcom 57711.
     4x 8Gbit FC HBAs, which are tied to a dual-controller NEXSAN E60 array (controllers in non-redundant mode); each controller has 4 Gbytes of memory.
     Raw throughput is around 340 Mbytes/sec per volume.

Each system is running RHEL 6.3 with heartbeat and now with DRBD 8.4.2; the problem was also there with 8.4.0 (I could not get 8.4.1 to work).
The systems are configured as an active/passive pair with EXT4 as the filesystem, barriers off. Filesystems are exported via NFS to vSphere 4.1 clients.

The main problem is that everything works most of the time, but every now and then a resource stall (high load average and no I/O) occurs, which
is not good for running VMs. Has anyone seen this? No errors are recorded, just no I/O and high load for a few minutes (3-4). This has been driving
me crazy. One more thing: these events do not occur with replication disabled, i.e. after "drbdadm down all" on the peer member. I have adjusted
many sysctl parameters (increased memory buffers, etc.), changed I/O schedulers, and turned hyperthreading on and off, and still have the issue.
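
To make the symptom easier to catch, here is a minimal Python sketch of a watcher that flags intervals with high load average and zero sector movement on the replicated device. It only reads /proc/loadavg, /proc/diskstats and /proc/drbd. The device name "drbd0", the load threshold of 8.0 and the 5-second poll interval are assumptions chosen for illustration, not values from this setup.

```python
import time


def parse_loadavg(text):
    """Return the 1-minute load average from /proc/loadavg contents."""
    return float(text.split()[0])


def parse_diskstats(text):
    """Map device name -> sectors read + sectors written (/proc/diskstats)."""
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated lines
        # fields[2] = device name, fields[5] = sectors read,
        # fields[9] = sectors written
        totals[fields[2]] = int(fields[5]) + int(fields[9])
    return totals


def is_stall(load, prev, curr, device, load_threshold=8.0):
    """True when load is high and the device moved no sectors between samples."""
    if device not in prev or device not in curr:
        return False
    return load >= load_threshold and curr[device] == prev[device]


def read_file(path):
    with open(path) as handle:
        return handle.read()


def monitor(device="drbd0", interval=5.0):
    """Poll forever; print a timestamp and /proc/drbd whenever a stall is seen.

    "drbd0" and the interval are hypothetical defaults; adjust to the system.
    """
    prev = parse_diskstats(read_file("/proc/diskstats"))
    while True:
        time.sleep(interval)
        curr = parse_diskstats(read_file("/proc/diskstats"))
        load = parse_loadavg(read_file("/proc/loadavg"))
        if is_stall(load, prev, curr, device):
            print(time.strftime("%Y-%m-%d %H:%M:%S"), "stall, load", load)
            print(read_file("/proc/drbd"))
        prev = curr
```

Calling monitor() on the active node and leaving it running would timestamp each stall and capture the DRBD connection and disk state at that moment, which can then be lined up against syslog on both peers.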

Thanks in advance.

James
