<br><font size=2 face="sans-serif">I am experiencing similiar problems
though not so severe.</font>
<br><font size=2 face="sans-serif">A Num,ber of Systems running SLES 11
HA / SLES11 HA SP1 on both IBM and Dell Hardware, Interlinks Intel IGB,
Boradcom or E1000 all showing the same behaviour. From Time to time the
verification fails, the device disconnects, reconnects, is fine again.
we are not using Jumbo frames, most systems are directly connected, some
going over switches, bonding mode 4 or 0, dependinig on tha system. However,
we are able to do a full sync without problems, and there are no OOS-blocks
found when doing a verify. The only impact is on performance, if the problem
occurrs more frequently. Reconnect ans resync are done within 1-2 seconds.
Quality of the intterlink has a huge impact, however even wit new cables
and new NICS and short cables this shows up, just less frequently. I am
not sure if it is tied to bonding, because AFAIR i have also seen this
with a single link. DRBD-versions tested range from 8.3.4 to 8.3.8. Kernel
2.6.27 (Sles11) and 2.6.32 (Sles11SP1) at the newest available patchlevel,
DRBD also patched to the newest levels available from Novell for the respective
SLES releases. </font>
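<br><font size=2 face="sans-serif">Schematically, the verify setup on
these systems looks like this (a sketch only; the resource name r0 and
the algorithm are placeholders, not our exact configuration):</font>
<pre>
# /etc/drbd.conf (DRBD 8.3.x): the checksum for online verify is
# configured in the syncer section
syncer {
    verify-alg sha1;
}

# run an online verify and check the out-of-sync (oos) counter
drbdadm verify r0
grep oos: /proc/drbd
</pre>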
<br><font size=2 face="sans-serif">The ammount of trouble seems tied to
the IO-load on the system. Most notable not all devices are affected at
the same time, so there can be two DRBD devices under load, with onel having
issues all the time, while another running over the same path on the same
systems under simmilar load at the same time has none. disabling traffic
offloading does not always help, but gives better stability on some systems.
firmware- and driverupdates did not help so far.</font>
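<br><font size=2 face="sans-serif">For completeness, disabling the
offloads is done with ethtool along these lines (a sketch; eth0 stands
for the actual interlink interface, and not every offload exists on
every kernel/driver combination):</font>
<pre>
# turn off checksum, scatter-gather and segmentation offloads
ethtool -K eth0 rx off tx off sg off tso off
# GSO/GRO are only available on newer kernels (GRO needs >= 2.6.29)
ethtool -K eth0 gso off gro off
</pre>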
<br><font size=2 face="sans-serif"><br>
</font><font size=2 color=#5f5f5f face="sans-serif">Mit freundlichen Grüßen
/ Best Regards<b><br>
</b></font>
<br><font size=2 color=#5f5f5f face="sans-serif">Robert Köppl<br>
<br>
System Administration<br>
<b><br>
KNAPP Systemintegration GmbH</b><br>
Waltenbachstraße 9<br>
8700 Leoben, Austria <br>
Phone: +43 3842 805-910<br>
Fax: +43 3842 82930-500<br>
robert.koeppl@knapp.com <br>
www.KNAPP.com <br>
<br>
Commercial register number: FN 138870x<br>
Commercial register court: Leoben<br>
</font>
<br>
<br>
<br>
<table width=100%>
<tr valign=top>
<td width=40%><font size=1 face="sans-serif"><b>Steve Thompson <smt@vgersoft.com></b>
</font>
<br><font size=1 face="sans-serif">Gesendet von: drbd-user-bounces@lists.linbit.com</font>
<p><font size=1 face="sans-serif">13.01.2011 22:28</font>
<td width=59%>
<table width=100%>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">An</font></div>
<td><font size=1 face="sans-serif">drbd-user@lists.linbit.com</font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">Kopie</font></div>
<td>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">Thema</font></div>
<td><font size=1 face="sans-serif">[DRBD-user] Bonding [WAS repeated resync/fail/resync]</font></table>
<br>
<br></table>
<br>
<br>
<br><tt><font size=2>On Sat, 8 Jan 2011, Steve Thompson wrote:<br>
<br>
CentOS 5.5, x86_64, drbd 8.3.8, Dell PE2900 servers w/16GB memory. The
<br>
replication link is a dual GbE bonded pair (point-to-point, no switches)
<br>
in balance-rr mode with MTU=9000. Using tcp_reordering=127.<br>
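<br>
For concreteness, the link setup is roughly the following (a sketch;<br>
interface names and addresses are placeholders, not my exact ones):<br>
<pre>
# bond two GbE ports in balance-rr (mode 0) for the replication link
modprobe bonding mode=balance-rr miimon=100
ifconfig bond0 10.0.0.1 netmask 255.255.255.0 mtu 9000 up
ifenslave bond0 eth2 eth3

# balance-rr stripes packets across both ports, so let TCP tolerate
# more reordering before it treats it as loss
sysctl -w net.ipv4.tcp_reordering=127
</pre>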
<br>
I reported that a resync failed and restarted every minute or so for a
<br>
couple of weeks. I have found the cause, but am not sure of the solution.<br>
I'll try to keep it short.<br>
<br>
First, I swapped cables, cards, systems, etc. in order to be sure of the
<br>
integrity of the hardware. All hardware checks out OK.<br>
<br>
Secondly, I was using a data-integrity-alg of sha1 or crc32c (tried both).
<br>
Only when this was removed from the configuration was I able to get a full
<br>
resync to complete. There is an ext3 file system on the drbd volume, but
<br>
it is quiet; this is a non-production test system.<br>
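<br>
The offending setting, schematically (the resource name is a placeholder):<br>
<pre>
# /etc/drbd.conf: data-integrity-alg lives in the net section (DRBD 8.3);
# only after removing this line did a full resync complete
resource r0 {
    net {
        data-integrity-alg sha1;    # crc32c showed the same behaviour
    }
}
</pre>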
<br>
After this, a verify pass showed several out-of-sync blocks. I<br>
disconnected and reconnected, and re-ran the verify pass: now there were<br>
more out-of-sync blocks, but in a different place. Rinse and repeat;<br>
verify was never clean, and the out-of-sync blocks were never in the<br>
same place twice.<br>
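<br>
The cycle I kept repeating, schematically (r0 is a placeholder):<br>
<pre>
drbdadm verify r0        # online verify, marks out-of-sync blocks
grep oos: /proc/drbd     # non-zero oos count after the verify
drbdadm disconnect r0    # disconnect ...
drbdadm connect r0       # ... and reconnect, which resyncs the oos blocks
drbdadm verify r0        # next verify: new oos blocks, different place
</pre>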
<br>
I changed MTU to 1500. No difference; still can't get a clean verify.<br>
<br>
I changed tcp_reordering to 3. No difference (no difference in <br>
performance, either).<br>
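<br>
Those two changes were simply:<br>
<pre>
ifconfig bond0 mtu 1500                  # back to the standard MTU
sysctl -w net.ipv4.tcp_reordering=3      # kernel default
</pre>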
<br>
Finally, I shut down half of the bonded pair on each system, so I'm
<br>
effectively using a single GbE link with MTU=9000 and tcp_reordering=127. Wow,
<br>
now everything is working fine; syncs are clean, verifies are clean, <br>
violins are playing.<br>
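<br>
For that last test I just detached one slave from the bond on each side<br>
(eth3 stands for the second port):<br>
<pre>
ifenslave -d bond0 eth3    # drop one slave, leaving a single GbE link
</pre>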
<br>
My question is: WTF? I'd really like to get the bonding pair working <br>
again, for redundancy and performance, but it very quickly falls apart
in <br>
this case. I'd appreciate any insight into this that anyone can give.<br>
<br>
Steve<br>
_______________________________________________<br>
drbd-user mailing list<br>
drbd-user@lists.linbit.com<br>
http://lists.linbit.com/mailman/listinfo/drbd-user<br>
</font></tt>
<br>