[DRBD-user] Possible issue with DRBD and KNFSD

John Frisk john_a_frisk at yahoo.com
Fri May 25 18:26:20 CEST 2007


--- Lars Ellenberg <lars.ellenberg at linbit.com> wrote:

> On Fri, May 25, 2007 at 12:41:15AM -0700, John Frisk wrote:
> > Team,
> > I have been working with the NFS folks for a bit now
> > and am experiencing something strange which causes NFS
> > clients to hang when attempting to use
> > heartbeat/drbd/knfsd as a HA NFS server.  I don't
> > believe this is a NFS issue any longer.
> >
> > Setup:
> > Machine A: primary drbd and HA machine (SMP machine)
> > Machine B: secondary drbd and HA machine (UP machine)
> > Machine C: NFS client using bonnie++ as a test for I/O
> > activity ( -f -s 100 -n 1 -r 0 are the parameters )
> > All machines are Debian 4.0 etch with vanilla
> > 2.6.22-rc2 kernel + known NFS issues patched
>
> > So I have been asking myself "What is the difference
> > between machine A and B that would cause the issue
> > only on machine A".  The network adapters on both A &
> > B are realtek r8169 style adapters.  The biggest
> > difference I can think of is machine A is an SMP
> > machine Athlon 64 X2 (running on a i386 kernel due to
> > support issues with the 64bit kernel) while machine B
> > is an older Athlon slot A processor.
> 
> lower level io-subsystem differs?

True, machine A is SATA and machine B is IDE, but
should that matter?

> available RAM differs?

Both machines have the same amount of physical RAM,
1 GB.  Usage is about the same since they are running
the same programs.  They were recently built to be as
close to identical as possible.


> > I have attached a triggered sysrq from machine
> 
> not useful here.

OK, the NFS folks tend to use kernel dumps
extensively.  I didn't know what was preferred, but I
have a summary from Trond Myklebust from the NFS team:

"To be more precise, it looks from the sysrq there as
if the NFS hang is
due to rpc.mountd getting stuck in
drbd_make_request..."

> look at /proc/drbd,
> and tell us what is there, when your nfs-client hangs.

Here's the output from Machine A during a hang:
version: 8.0.3 (api:86/proto:86)
SVN Revision: 2881 build by frisk at inferno, 2007-05-24 23:27:32
 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:3581188 nr:78482644 dw:82063832 dr:2911 al:250 bm:4813 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:4879340 misses:4770 starving:0 dirty:0 changed:4770
        act_log: used:0/257 hits:895047 misses:398 starving:0 dirty:148 changed:250

Here's the output from Machine B during a hang:
version: 8.0.3 (api:86/proto:86)
SVN Revision: 2881 build by frisk at saint, 2007-05-24 23:30:12
 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
    ns:78482644 nr:3581200 dw:3918108 dr:78147366 al:73 bm:4807 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:4879340 misses:4770 starving:0 dirty:0 changed:4770
        act_log: used:0/257 hits:84154 misses:73 starving:0 dirty:0 changed:73
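
For anyone who wants to watch these counters while a hang is in
progress, here is a minimal Python sketch that pulls the per-device
fields out of /proc/drbd text.  The function names
(parse_drbd_status, outstanding) are made up for illustration and are
not part of DRBD.  The reading of lo/pe/ua/ap as "requests still in
flight" is my understanding of the DRBD documentation, so treat it as
an assumption.  The sample text is Machine A's output from above.

```python
import re

# Sample copied from Machine A's output above. On a live node the text
# would instead come from open("/proc/drbd").read().
SAMPLE = """\
version: 8.0.3 (api:86/proto:86)
 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:3581188 nr:78482644 dw:82063832 dr:2911 al:250 bm:4813 lo:0 pe:0 ua:0 ap:0
"""

# Fields kept as strings (connection state, roles, disk states).
STATE_FIELDS = {"cs", "st", "ds"}
# Fields kept as integers (traffic, disk, and in-flight counters).
COUNTER_FIELDS = {"ns", "nr", "dw", "dr", "al", "bm", "lo", "pe", "ua", "ap"}
# Assumption: these four count requests that have not completed yet
# (local I/O, pending, unacknowledged, application pending).
IN_FLIGHT = ("lo", "pe", "ua", "ap")


def parse_drbd_status(text):
    """Return {minor_number: {field: value}} for each device in the text."""
    devices = {}
    current = None
    for line in text.splitlines():
        # A device block starts with a line like " 0: cs:Connected ..."
        header = re.match(r"\s*(\d+):\s+cs:", line)
        if header:
            current = devices.setdefault(int(header.group(1)), {})
        if current is None:
            continue  # still in the version/build header
        for key, value in re.findall(r"(\w+):(\S+)", line):
            if key in STATE_FIELDS:
                current[key] = value
            elif key in COUNTER_FIELDS:
                current[key] = int(value)
    return devices


def outstanding(device):
    """Return only the in-flight counters that are non-zero."""
    return {k: device[k] for k in IN_FLIGHT if device.get(k, 0) != 0}


if __name__ == "__main__":
    for minor, dev in parse_drbd_status(SAMPLE).items():
        print(minor, dev["cs"], dev["st"], outstanding(dev) or "nothing in flight")
```

Run in a loop during a hang, a non-empty result from outstanding()
would point at requests sitting inside DRBD.  In both snapshots above
all four counters are 0.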

> does generating local io
> (or "sync" or "emergency sync" via sysrq)
> on the Primary help?
> on the Secondary?

Test 1)
Running a sync does nothing on either the primary or
secondary.

Test 2)
Running a sysrq sync also does not do anything.

> maybe you can also watch with wireshark and find some
> "interesting" behaviour (please, I'm not interested
> in any unprocessed dump files; this is just a suggestion
> for a tool you might use to further nail down what
> happens).

I would be happy to install wireshark and capture
packets.  Any hints on what I should be looking for?
Should I only look at drbd traffic on port 7788, or at
NFS traffic too?

Thank you for your help.


       


