Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Tue, Jan 08, 2008 at 09:06:44AM +0000, Ben Clewett wrote:
>
>
> Dear Lars,
>
> I found a server at lock when I got to my desk this morning. Not wanting to
> waste any time, these are the numbers you asked for.
>
> Lock on 'hp-tm-02', twin with 'hp-tm-04' which is partially locked.
>
> I use the term 'lock' to describe a server with high load and greatly
> reduced throughput.
as long as there is still throughput,
it is not likely a problem in drbd.
but see below.
> hp-tm-02: (lock)
>
> version: 8.2.1 (api:86/proto:86-87)
> GIT-hash: 318925802fc2638479ad090b73d7af45503dd184 build by root at hp-tm-02,
> 2007-12-19 22:25:46
> 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate B r---
> ns:0 nr:91896348 dw:91896340 dr:0 al:0 bm:0 lo:2 pe:0 ua:1 ap:0
(the lo:2 and ua:1 above mean:
two requests pending against the local disk,
one answer still to be sent to the peer,
which will happen once the local requests complete.)
on a Secondary,
if ua stays != zero, and ns,nr,dw,dr do not increase during that time,
drbd has a problem. if those ns,nr,dw,dr still increase, or ua is zero,
all is fine.
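a minimal way to check that over time (untested sketch; assumes the
resource is drbd minor 0, adjust the pattern as needed):

    # sample /proc/drbd once per second; if ua stays non-zero while
    # ns/nr/dw/dr do not move at all, that would point at a drbd problem
    watch -n1 'grep -A1 "^ 0:" /proc/drbd'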
> 1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate B r---
> ns:125994544 nr:0 dw:125994540 dr:57151581 al:477198 bm:0 lo:2 pe:0
> ua:0 ap:2
on a Primary,
if ap or pe stays != zero, and ns,nr,dw,dr do not increase,
drbd has a problem. if those ns,nr,dw,dr do still increase,
or pe is zero, all is fine.
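if in doubt, comparing two samples is enough (again an untested sketch,
here for drbd minor 1):

    # take two samples ten seconds apart; any increase in the
    # ns/nr/dw/dr counters means io is still flowing through drbd
    a=$(grep -A1 '^ 1:' /proc/drbd | tail -n1)
    sleep 10
    b=$(grep -A1 '^ 1:' /proc/drbd | tail -n1)
    echo "before: $a"
    echo "after:  $b"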
> > the "lo:NNN", on the Secondary, does that change?
> > I mean, does this gauge change,
> > and eventually decrease to zero, during "lock"?
>
> When found, it was at zero.
see above.
> > do both drbd live on the same physical device?
> > (physical meaning the same io-queue in linux, i.e.
> > the same /dev/sdX eventually,
> > when propagating down all the lvm layers, if any)
>
> Both DRBD resource partitions and both DRBD bitmaps live on the same device:
> /dev/cciss, split into four partitions. This is a hardware RAID 5
> device from seven + one physical SAS disks. It has a 256MB
> battery-backed write cache and a tested write rate (bonnie++) of
> about 250MB/sec.
>
> I do not use the LVM system, if you mean the IBM piece of dynamic
> partitioning software ported onto Linux.
>
> > how many cpus?
>
> Four physical = eight logical cores, on twin dual-core Xeons.
>
> > how many pdflush threads (ps ax | grep pdflush)
> > during "lock", are one or more of those in "D" state?
> > if so, does it stay in "D" state?
>
> On hp-tm-02 (locked)
>
> # ps axl | grep pdflush
> 1 0 196 15 15 0 0 0 pdflus S ? 0:15 [pdflush]
> 1 0 197 15 15 0 0 0 sync_b D ? 1:34 [pdflush]
>
> Pid 197 seems to move between S and D, mostly S.
as long as it is mostly S, that's good.
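to confirm it never gets stuck there, sampling the state a few times is
enough (untested sketch):

    # occasional D (uninterruptible sleep) is normal for pdflush;
    # what would be suspicious is a pdflush that stays in D permanently
    for i in 1 2 3 4 5; do ps -o pid,state,wchan:30 -C pdflush; sleep 2; done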
> > during lock, do (on both nodes)
> > ps -eo pid,state,wchan:40,comm | grep -Ee " D |drbd"
> > that should give some more information.
>
> hp-tm-02: (locked)
>
> 3788 S - drbd0_worker
> 3796 D drbd_md_sync_page_io drbd1_worker
> 3817 S - drbd0_receiver
> 3825 S - drbd1_receiver
> 4394 S - drbd0_asender
> 4395 S - drbd1_asender
> 2996 D sync_buffer find
>
> And:
>
> 197 D sync_buffer pdflush
> 959 D - reiserfs/3
> 3788 S - drbd0_worker
> 3796 S - drbd1_worker
> 3817 D drbd_wait_ee_list_empty drbd0_receiver
> 3825 S - drbd1_receiver
> 4394 S - drbd0_asender
> 4395 S - drbd1_asender
>
> But mostly:
>
> 3788 S - drbd0_worker
> 3796 S - drbd1_worker
> 3817 S - drbd0_receiver
> 3825 S - drbd1_receiver
> 4394 S - drbd0_asender
> 4395 S - drbd1_asender
that is fine, no indication of misbehaviour.
> hp-tm-04: (partially locked)
>
> 14188 S - drbd0_worker
> 14194 S - drbd1_worker
> 14214 S - drbd0_receiver
> 14216 S - drbd0_asender
> 14223 S - drbd1_receiver
> 14225 S - drbd1_asender
just fine.
> > during lock, does it help if you
> > drbdadm disconnect $resource ; sleep 3; drbdadm adjust $resource
> > (on one or the other node)
>
> I am sorry I can't disconnect these resources.
>
> > how frequently do you run into these locks?
>
> It depends on load. I don't have much quantitative data. It seems
> to hit after about 4 days of runtime, at least the last few times. Once
> the locking has started it will continue until some time after the load
> drops, say 30 minutes. But once hit, it will return frequently at lower
> load, and will continue on and off until I either (i) restart DRBD or
> (ii) restart the server. I am not sure at this stage which.
>
>
> > during lock,
> > what does "netstat -tnp" say (always on both nodes)?
> > (preferably grep for the drbd connection,
> > so something like
> > netstat -tnp | grep ':778[89] '
> > if your drbd ports are configured to be 7788 and 7789.)
>
> hp-tm-02:
>
> tcp 0 0 192.168.95.5:7788 192.168.95.6:45579 ESTABLISHED -
> tcp 0 0 192.168.95.5:7789 192.168.95.6:50365 ESTABLISHED -
> tcp 0 0 192.168.95.5:51501 192.168.95.6:7789 ESTABLISHED -
> tcp 0 0 192.168.95.5:54029 192.168.95.6:7788 ESTABLISHED -
>
> hp-tm-04:
>
> tcp 0 0 192.168.95.6:7788 192.168.95.5:54029 ESTABLISHED -
> tcp 0 0 192.168.95.6:7789 192.168.95.5:51501 ESTABLISHED -
> tcp 0 0 192.168.95.6:50365 192.168.95.5:7789 ESTABLISHED -
> tcp 0 0 192.168.95.6:45579 192.168.95.5:7788 ESTABLISHED -
> tcp 0 0 192.168.95.6:45579 192.168.95.5:7788 ESTABLISHED -
last line is duplicate. bug in netstat, probably.
all as it should be, no queuing in the tcp buffers.
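if you want to double check during a lock, watch the Recv-Q and Send-Q
columns (assuming the same ports 7788/7789 as above, untested):

    # Recv-Q and Send-Q are columns 2 and 3 of the netstat -tn output;
    # persistently large values there would mean the tcp buffers are filling up
    netstat -tn | awk '$4 ~ /:778[89]$/ || $5 ~ /:778[89]$/'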
> Now that's odd, why should there be five? But a repeat of the test shows
> just four entries.
drbd appears to be just healthy and happy.
another thought: what file system, and mount options?
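e.g. (assuming the devices are /dev/drbd0 and /dev/drbd1):

    # /proc/mounts shows device, mount point, file system type and options
    grep drbd /proc/mounts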
--
: Lars Ellenberg http://www.linbit.com :
: DRBD/HA support and consulting sales at linbit.com :
: LINBIT Information Technologies GmbH Tel +43-1-8178292-0 :
: Vivenotgasse 48, A-1120 Vienna/Europe Fax +43-1-8178292-82 :
__
please use the "List-Reply" function of your email client.