Hi there,

I randomly run into system hangs when copying relatively large amounts of data into a gfs area over drbd 0.8pre4. The system is a two-node cluster with two primaries, running kernel 2.6.15.7 with the Ubuntu cluster patchset (including gfs/dlm etc.) on Debian Sarge.

The data consists mostly of small files, somewhat similar to the kernel source. The hang usually happens after copying about 2-2.5 GB of data. Then (not always) one of the nodes hangs and comes back to life only after a hardware reset. On the other node, these messages appear in the kernel log:

Aug 18 09:48:26 imind-front2 kernel: drbd0: [drbd0_worker/3920] sock_sendmsg time expired, ko = 4294967295
Aug 18 09:48:29 imind-front2 kernel: drbd0: [drbd0_worker/3920] sock_sendmsg time expired, ko = 4294967294
Aug 18 09:48:29 imind-front2 kernel: drbd0: PingAck did not arrive in time.
Aug 18 09:48:29 imind-front2 kernel: drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Aug 18 09:48:29 imind-front2 kernel: drbd0: Creating new current UUID
Aug 18 09:48:29 imind-front2 kernel: drbd0: asender terminated
Aug 18 09:48:29 imind-front2 kernel: drbd0: conn( NetworkFailure -> BrokenPipe )

Please help me work around this system hang issue, if possible.

Thanks,
Balint