Marco Barbero wrote:
>> I'm experiencing a nasty kernel soft lockup on one cluster. Have to
>> say I have tons of clusters using same config and all is working fine

i recently experienced cluster lockup issues because my ethernet adapters
(4 / server), which were bonded to "bond0", inadvertently had some options
set which "appear to be incompatible" with intra-cluster communications, i.e.:

  - tx-checksumming was on; and
  - scatter-gather was on; and
  - TSO was on; and
  - generic segmentation offload was on.

i noted that these options can cause problems on both bonded and
non-bonded intf's.

i verified the error using wireshark: i saw tcp and udp checksum errors
coming from one server with these options set.

if you do not bond, you can reset these with:

    ethtool -K eth"x" tx off sg off tso off gso off

if you bond, you'll probably be unable to reset these on the bonded intf
with ethtool; instead, reset the eth"x" intf's themselves (even tho they
are slaves) and the bonded intf will reset automagically.

i verified the fix using wireshark: the server with the checksum errors
is now behaving nicely.

hth
yvette hirth
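
for the bonded case, the per-slave reset described above could be scripted roughly like this. it's only a sketch under assumptions that are not in the message: the bond is named bond0, the kernel lists its slaves in /sys/class/net/<bond>/bonding/slaves, and BOND and DRY_RUN are hypothetical variables of this script. by default it only prints the ethtool commands, so nothing is changed until DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: turn off the four offload options on every slave of a bond.
# Assumptions (not from the original message): the bond is bond0 unless
# BOND is set, its slaves are listed in /sys/class/net/<bond>/bonding/slaves,
# and DRY_RUN (hypothetical, default 1) prints commands instead of running them.

# Print or run the ethtool reset command for one interface.
disable_offloads() {
    iface=$1
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "ethtool -K $iface tx off sg off tso off gso off"
    else
        ethtool -K "$iface" tx off sg off tso off gso off
    fi
}

# Apply disable_offloads to each interface name given as an argument.
disable_offloads_all() {
    for iface in "$@"; do
        disable_offloads "$iface"
    done
}

# List the slaves of the given bond (prints nothing if the bond is absent).
bond_slaves() {
    cat "/sys/class/net/$1/bonding/slaves" 2>/dev/null
}

# Word splitting of the slave list is intentional here.
disable_offloads_all $(bond_slaves "${BOND:-bond0}")
```

run it once as-is to see which commands it would issue, then as root with DRY_RUN=0 to apply them. "ethtool -k eth0" (lowercase k) lists the current offload settings of an interface, which is a quick check before reaching for wireshark.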