Note: "permalinks" may not be as permanent as we would like;
direct links from old sources may well be a few messages off.
/ 2006-03-16 14:04:41 -0500
\ Monty Taylor:
> Lars Ellenberg wrote:
>
> >> I just set up a drbd replication pair and shipped it to the colo.
> >> It was working great.
> > so you did local tests first, and all was working as expected?
>
> Yes.
>
> >now. your report is somewhat unspecific.
> >anyways, when I read your last sentence "machine is hung",
> >this might point to a deadlock that could occur when stressing the box.
>
> Fair enough. I think the hang was that block device was busy and
> couldn't be unmounted, and so when I tried to shutdown the machine it
> blocked waiting for the fs to unmount.
>
> >please report your findings.
>
> I will when I get another test machine and 2.6.15 again, which should
> be next week. If there's a possible known deadlock, I bet that's what
> I ran into. On the other hand, is the default value for on-disconnect
> reconnect or freeze_io? Because if it's freeze_io I would maybe see
> that being what happened, too.

freeze_io is mentioned, but not yet configurable.
the kernel side implementation is missing from drbd 0.7 ...
default would be reconnect.

> I'll keep trying to isolate for you. I think we're going to be using
> drbd a bit more as part of our Professional Services offerings to some
> clients, so it'll be nice to know where the problem actually sits.

sounds interesting!

cheers,

--
: Lars Ellenberg Tel +43-1-8178292-0 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe http://www.linbit.com :
__
please use the "List-Reply" function of your email client.