Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Lars Ellenberg wrote:
>
> / 2004-04-13 07:15:56 -0500
> \ Todd Denniston:
> > all of my drbd device net sections contain (and did at the time of
> > the lockup too):
<SNIP>
> >   timeout  = 60   # unit: 0.1 seconds
<SNIP>
> >   ko-count = 10   # if some block send times out this many times,
<SNIP>
> > Which I thought meant that in ~60 seconds[1] I would get a failover.
>
> Ah. No.
> This only detects whether I was able to send something to the Secondary.
> If not (for that ~60 seconds), I disconnect and ignore my peer.
>
> This does NOT detect local IO failure, since that normally is caught by
> the "do-panic" option: when I get a local IO failure, I will just panic
> the box. Which should always trigger a failover ...

Unfortunately the lower level has not yet (at that time) declared an IO
failure; there might be a buglet there (Adaptec SCSI layer). :{

> > What I was suspecting is that the two drbds can still talk to one
> > another, and the failing node's drbd is blocking (but not failing) on
> > the write to the SCSI layer, because the SCSI layer is in a loop
> > retrying the reset on the Promise box "hard drive".
> <SNIP>
> Hm.
> If the Secondary manages to get a "throughput" of more than one block
> per (ko-count * ping-interval), we do not disconnect. If it "even"
> manages to get >= 4k per ping interval, ko-count won't trigger at all.

Drat, so I am getting 'some' data through to the disk more often than
once every 6 seconds (the timeout: 60 * 0.1 s)... I might lower that
timeout and see if I get some ko markers.

> So a "very slow" but "not slow enough" write throughput on the
> Secondary will throttle the Primary to the same slowness.
>
> On the Primary, if it still is responsive, try to watch the "ns:".
> If it still increases, this is what happens.

Good point, I'll look at it when it latches up today or tomorrow (it
seems to happen at ~14?? local time, for two working days in a row now).

> If you have some "creative" idea how to cope with this, tell us.

Unfortunately I was expecting that the throughput to the disk was 0, but
you are suggesting it is slightly higher, and that you have already
covered the 0 case.

Perhaps a minimum pending throughput (min-pend-speed)? And a
pend-grace-period warning.

Really nasty pseudo code:

    if (new data received from primary
        && time some block has been pending its sync to disk > pend-grace-period
        && avg_speed_to_disk_since_we_started_pending < min-pend-speed) {
            call ko-count type routines
    }

This requires receipt/send time markers on the data packets, and
something tracking the average data rate... It looks easy in English;
it could be a pain to code. :}
<SNIP>
--
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter
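
For concreteness, here is a minimal C sketch of the min-pend-speed /
pend-grace-period idea from the pseudo code above. With the settings
quoted earlier (timeout = 60 in 0.1 s units, i.e. 6 seconds, and
ko-count = 10), ko-count only fires after roughly ten silent 6-second
windows, ~60 seconds with nothing getting through; this sketch instead
triggers when blocks do trickle through, but too slowly. All names
(struct pend_state, pend_grace_period, min_pend_speed, pend_speed_ko)
are hypothetical, not actual DRBD identifiers, and the tunable values
are placeholders.

    /* Hypothetical sketch -- none of these names exist in DRBD. */
    #include <stdbool.h>
    #include <stdint.h>

    struct pend_state {
            uint64_t pend_start;    /* seconds: when the oldest block started pending */
            uint64_t bytes_synced;  /* bytes written to disk since pend_start */
            uint64_t bytes_pending; /* bytes still waiting on local disk IO */
    };

    /* Placeholder tunables, analogous to timeout/ko-count in the net section. */
    static const uint64_t pend_grace_period = 10;   /* seconds a block may pend */
    static const uint64_t min_pend_speed    = 4096; /* minimum acceptable bytes/second */

    /* Called when new data arrives from the Primary.  Returns true if the
     * node should run the ko-count style disconnect routines. */
    static bool pend_speed_ko(const struct pend_state *s, uint64_t now)
    {
            uint64_t elapsed = now - s->pend_start;

            /* Nothing stuck, or still inside the grace period: no trigger. */
            if (s->bytes_pending == 0 || elapsed <= pend_grace_period)
                    return false;

            /* Average rate to disk since the oldest block started pending.
             * elapsed > pend_grace_period here, so no division by zero. */
            uint64_t avg_speed = s->bytes_synced / elapsed;

            return avg_speed < min_pend_speed;
    }

A real implementation would take "now" from jiffies and reset the
counters whenever the pending queue drains; the sketch only shows where
a ko-count style trigger could hook in.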