Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Tue, Jan 08, 2008 at 09:06:44AM +0000, Ben Clewett wrote:
> 
> Dear Lars,
> 
> I found a server at lock when I got to my desk this morning. Not wanting to
> waste any time, these are the numbers you asked for.
> 
> Lock on 'hp-tm-02', twin with 'hp-tm-04' which is partially locked.
> 
> I use the term 'lock' to describe a server with high load and very much
> reduced throughput.

as long as there is still throughput, it is likely not a problem in drbd.
but see below.

> hp-tm-02: (lock)
> 
> version: 8.2.1 (api:86/proto:86-87)
> GIT-hash: 318925802fc2638479ad090b73d7af45503dd184 build by root at hp-tm-02,
> 2007-12-19 22:25:46
>  0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate B r---
>     ns:0 nr:91896348 dw:91896340 dr:0 al:0 bm:0 lo:2 pe:0 ua:1 ap:0
                                                  ^         ^
two requests pending against local disk,
one answer still to be sent to the peer
(which will happen once the local requests complete).

on a Secondary, if ua stays != zero,
and ns,nr,dw,dr do not increase during that time,
drbd has a problem.
if those ns,nr,dw,dr still increase, or ua is zero,
all is fine.

>  1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate B r---
>     ns:125994544 nr:0 dw:125994540 dr:57151581 al:477198 bm:0 lo:2 pe:0
>     ua:0 ap:2

on a Primary, if ap or pe stays != zero,
and the ns,nr,dw,dr do not increase,
drbd has a problem.
if those ns,nr,dw,dr do still increase, or pe is zero,
all is fine.

> > the "lo:NNN", on the Secondary, does that change?
> > I mean, does this gauge change,
> > and eventually decrease to zero, during "lock"?
> 
> When found was at zero.

see above.

> > do both drbd live on the same physical device?
> > (physical meaning the same io-queue in linux, i.e.
> > the same /dev/sdX eventually,
> > when propagating down all the lvm layers, if any)
> 
> Both DRBD resource partitions and both DRBD bitmaps live on the same device:
> /dev/cciss, split into four partitions. This is a hardware RAID 5
> device from seven + one physical SAS disks. This has 256MB
> write cache (with battery) and a tested write rate (bonnie++) of
> about 250MB/sec.
> 
> I do not use the LVM system, if you mean the IBM piece of dynamic
> partitioning software ported onto Linux.
> 
> > how many cpus?
> 
> Four physical = eight logical, on twin dual-core Xeons.
> 
> > how many pdflush threads (ps ax | grep pdflush)
> > during "lock", are one or more of those in "D" state?
> > if so, does it stay in "D" state?
> 
> On hp-tm-02 (locked)
> 
> # ps axl | grep pdflush
> 1     0   196    15  15   0     0     0 pdflus S    ?          0:15 [pdflush]
> 1     0   197    15  15   0     0     0 sync_b D    ?          1:34 [pdflush]
> 
> Seems to move between S and D for the 197 pid, mostly S.

as long as it is mostly S, that's good.

> > during lock, do (on both nodes)
> > ps -eo pid,state,wchan:40,comm | grep -Ee " D |drbd"
> > that should give some more information.
> 
> hp-tm-02: (locked)
> 
>  3788 S -                        drbd0_worker
>  3796 D drbd_md_sync_page_io     drbd1_worker
>  3817 S -                        drbd0_receiver
>  3825 S -                        drbd1_receiver
>  4394 S -                        drbd0_asender
>  4395 S -                        drbd1_asender
>  2996 D sync_buffer              find
> 
> And:
> 
>   197 D sync_buffer              pdflush
>   959 D -                        reiserfs/3
>  3788 S -                        drbd0_worker
>  3796 S -                        drbd1_worker
>  3817 D drbd_wait_ee_list_empty  drbd0_receiver
>  3825 S -                        drbd1_receiver
>  4394 S -                        drbd0_asender
>  4395 S -                        drbd1_asender
> 
> But mostly:
> 
>  3788 S -                        drbd0_worker
>  3796 S -                        drbd1_worker
>  3817 S -                        drbd0_receiver
>  3825 S -                        drbd1_receiver
>  4394 S -                        drbd0_asender
>  4395 S -                        drbd1_asender

that is fine, no indication of misbehaviour.
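(a minimal sketch, not from the original exchange, of how the check above
could be scripted: it samples only the ns/nr/dw/dr counters of one device
twice and compares them. the device number "0" and the 10 second interval
are just examples, and it assumes the /proc/drbd layout of drbd 8.x as
quoted above.)

  # two samples of the ns/nr/dw/dr counters of device 0, 10 seconds apart
  a=$(grep -A1 '^ 0:' /proc/drbd | grep -Eo '(ns|nr|dw|dr):[0-9]+')
  sleep 10
  b=$(grep -A1 '^ 0:' /proc/drbd | grep -Eo '(ns|nr|dw|dr):[0-9]+')

  if [ "$a" = "$b" ]; then
      # nothing moved: now it matters whether ua (on a Secondary)
      # or ap/pe (on a Primary) are stuck at a value != 0
      echo "ns/nr/dw/dr unchanged -- check ua/pe/ap in /proc/drbd"
  else
      echo "counters still increasing -- drbd is making progress"
  fi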
> hp-tm-04: (partially locked)
> 
> 14188 S -                        drbd0_worker
> 14194 S -                        drbd1_worker
> 14214 S -                        drbd0_receiver
> 14216 S -                        drbd0_asender
> 14223 S -                        drbd1_receiver
> 14225 S -                        drbd1_asender

just fine.

> > during lock, does it help if you
> > drbdadm disconnect $resource ; sleep 3; drbdadm adjust $resource
> > (on one or the other node)
> 
> I am sorry I can't disconnect these resources.
> 
> > how frequently do you run into these locks?
> 
> Depending on loading. I don't have much quantitative data. It seems
> to hit after about 4 days runtime, at least the last few times. Once
> the locking has started it will continue until some time after loading,
> say 30 minutes. But once hit, it will return frequently at lower load.
> Will continue on and off until (i) restart DRBD (ii) restart server.
> I am not sure at this stage which.
> 
> > during lock,
> > what does "netstat -tnp" say (always on both nodes)?
> > (preferably grep for the drbd connection,
> > so something like
> > netstat -tnp | grep ':778[89] '
> > if your drbd ports are configured to be 7788 and 7789.)
> 
> hp-tm-02:
> 
> tcp   0   0 192.168.95.5:7788   192.168.95.6:45579  ESTABLISHED  -
> tcp   0   0 192.168.95.5:7789   192.168.95.6:50365  ESTABLISHED  -
> tcp   0   0 192.168.95.5:51501  192.168.95.6:7789   ESTABLISHED  -
> tcp   0   0 192.168.95.5:54029  192.168.95.6:7788   ESTABLISHED  -
> 
> hp-tm-04:
> 
> tcp   0   0 192.168.95.6:7788   192.168.95.5:54029  ESTABLISHED  -
> tcp   0   0 192.168.95.6:7789   192.168.95.5:51501  ESTABLISHED  -
> tcp   0   0 192.168.95.6:50365  192.168.95.5:7789   ESTABLISHED  -
> tcp   0   0 192.168.95.6:45579  192.168.95.5:7788   ESTABLISHED  -
> tcp   0   0 192.168.95.6:45579  192.168.95.5:7788   ESTABLISHED  -

last line is a duplicate. bug in netstat, probably.
all as it should be, no queuing in the tcp buffers.

> Now that's odd, why should there be five? But repeat of test shows
> just four entries.

drbd appears to be just healthy and happy.

another thought: what file system, and mount options?

-- 
: Lars Ellenberg                            http://www.linbit.com :
: DRBD/HA support and consulting             sales at linbit.com  :
: LINBIT Information Technologies GmbH      Tel +43-1-8178292-0   :
: Vivenotgasse 48, A-1120 Vienna/Europe     Fax +43-1-8178292-82  :
__
please use the "List-Reply" function of your email client.
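(again a minimal sketch, not from the original mail, for the two remaining
checks discussed above: queueing on the drbd tcp connections, and the
file system / mount options question. it assumes the ports 7788/7789 used
in this thread, and that the drbd devices show up as /dev/drbd* in
/proc/mounts.)

  # Recv-Q / Send-Q of the drbd connections; values that stay non-zero
  # in columns 2 and 3 would mean data sitting in the tcp buffers
  netstat -tn | awk '$4 ~ /:778[89]$/ || $5 ~ /:778[89]$/'

  # file system type and mount options of the mounted drbd devices
  grep /dev/drbd /proc/mounts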