Has anybody seen this before, got any insight?

James

James Masson wrote:
> Hi list,
>
> I'm using DRBD and NFS to provide HA to Virtual Machine images between pairs of storage servers.
>
> Systems are RHEL5.4 2.6.18-164.el5 + drbd8.3 from Centos Extras
>
> We've been having issues where disk I/O problems on the DRBD Secondary stops all IO to the Primary
> too. DRBD doesn't seem to recognise these disk I/O problems, the Secondary isn't disconnected
> automatically. Everything just hangs.
>
> During this state:
> If I try a "drbdadm disconnect all" on the Primary, the command hangs.
> If I try this on the Secondary, the command eventually completes, and NFS I/O returns to normal
> operation on the Primary.
>
> I've tried the following things to fix this:
>
> 1) Putting in a custom local-io-error handler to hard reset the problem node.
>
> This never triggers. Just like the default "detach", never triggers.
>
> 2) Changing the net connection parameters to:
>
> net {
>     ko-count 2;
>     timeout 20;
> }
>
> Again, this never triggers.
>
> 3) Changing the protocol used from C to B
>
> Doesn't have any effect on the issue - I'd prefer to use C anyway.
>
> Any further ideas on how to track this issue down and fix it?
>
> thanks
>
> James Masson
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
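
For reference, here is a rough reading of the net settings quoted above. It is only a sketch, assuming the semantics described in the DRBD 8.3 drbd.conf man page: `timeout` is given in tenths of a second, and the Primary expels the peer once a single write request has stayed incomplete for `ko-count` times the timeout, with `ko-count 0` disabling the feature. The function name `expel_after_seconds` is made up for this illustration and is not part of DRBD.

```python
from typing import Optional


def expel_after_seconds(timeout_tenths: int, ko_count: int) -> Optional[float]:
    """Estimate how long a stuck write may block before the peer is expelled.

    Assumes DRBD 8.3 semantics: `timeout` is in units of 0.1 s and the peer
    is dropped after ko_count * timeout. Returns None when ko-count is 0,
    which (by assumption) means the feature is disabled.
    """
    if timeout_tenths <= 0 or ko_count < 0:
        raise ValueError("timeout must be positive and ko-count non-negative")
    if ko_count == 0:
        return None
    return ko_count * timeout_tenths / 10.0


if __name__ == "__main__":
    # Values from the quoted net section: ko-count 2, timeout 20
    print(expel_after_seconds(timeout_tenths=20, ko_count=2))  # 4.0
```

Under those assumptions the Secondary should have been expelled after roughly four seconds of a stalled write. The message reports that this never happens, which is the behaviour being asked about.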