From: Jeff Layton <jlayton@redhat.com>
To: "david m. richter" <richterd@citi.umich.edu>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
Oleg Drokin <Oleg.Drokin@Sun.COM>,
Marc Eshel <eshel@almaden.ibm.com>,
linux-fsdevel@vger.kernel.org, Manoj Naik <manoj@almaden.ibm.com>
Subject: Re: NFS client hang on attempt to do async blocking posix lock enqueue
Date: Fri, 8 Feb 2008 15:54:14 -0500
Message-ID: <20080208155414.269f44d9@tleilax.poochiereds.net>
In-Reply-To: <Pine.BSO.4.64.0802081302130.6952@citi.umich.edu>
On Fri, 8 Feb 2008 13:49:01 -0500 (EST)
"david m. richter" <richterd@citi.umich.edu> wrote:
> On Fri, 8 Feb 2008, J. Bruce Fields wrote:
>
> > On Fri, Feb 08, 2008 at 07:15:02AM -0500, Jeff Layton wrote:
> > > On Thu, 7 Feb 2008 18:26:18 -0500
> > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > >
> > > > On Sun, Jan 20, 2008 at 09:58:59AM -0500, Oleg Drokin wrote:
> > > > > Hello!
> > > > >
> > > > > On Jan 18, 2008, at 6:07 PM, J. Bruce Fields wrote:
> > > > >
> > > > >> On Thu, Nov 29, 2007 at 02:41:57PM -0800, Marc Eshel wrote:
> > > > > >>> The problem seems to be with the fact that the client and server are
> > > > > >>> on the same machine. This test works fine with or without an
> > > > > >>> underlying fs that supports locking when the client and the server
> > > > > >>> are on different machines. Like you said, the server is trying to
> > > > > >>> send the grant message to the client, but for some reason it fails
> > > > > >>> when the client is on the same machine.
> > > > >> That *shouldn't* make a difference, so we need to take another look at
> > > > >> this--Oleg, this problem is still unfixed, right?
> > > > >
> > > > > Yes, I just pulled your latest nfs tree and I still can reproduce the
> > > > > problem.
> > > >
> > > > OK, we have finally reproduced this problem here, and David's working on
> > > > debugging. It does indeed seem to only be reproducible with client and
> > > > server on the same machine. Thanks for the report....
> > > >
> > > > --b.
> > >
> > > It might be worth testing this both with and without the patchset I
> > > posted to linux-nfs recently to take care of the lockd hang. If
> > > lockd is stuck trying to rpc_ping itself then it probably would hang
> > > like this, wouldn't it?
> >
> > Of course! Yes, that fits.
> >
> > --b.
>
> right on, jeff, good catch and thanks for directing my attention
> to your patches.
>
Excellent! Glad that took care of it...
> i applied them on top of 2.6.23.1 and tested them on a cluster
> exporting GFS2 over NFS, using oleg's reproducer code. your patches fix
> that lockd hang.
>
> in a bit more detail, oleg's reproducer basically gets a
> whole-file read lock, tests the lock, upgrades to a whole-file exclusive
> lock, tests the lock, then unlocks. the problem was that when getting
> that exclusive lock things would hang. this only happened when the client
> and server were on the same machine, and i could reproduce it with NFS
> exporting GFS2 but not NFS exporting EXT3.
>
>
Interesting. It's not clear to me why the underlying filesystem would make
any difference there. Though now that I look, it looks like fl_grant
really only gets called from dlm code, and that queues up the block for
an immediate grant callback attempt. So perhaps that's the reason.
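
For anyone following along, the lock sequence david describes above
(whole-file read lock, test, upgrade to whole-file exclusive lock, test,
unlock) can be sketched roughly as below. This is not Oleg's actual
reproducer, which isn't shown in this thread. The names probe() and
run_sequence() are made up for illustration, and since the thread doesn't
say how the reproducer tests the lock, this sketch assumes a child process
probing with a non-blocking lock attempt.

```python
import fcntl
import os
import tempfile


def probe(path, mode):
    """Return True if a *different* process could take `mode` on the whole file.

    Hypothetical helper: POSIX locks are per-process, so a forked child that
    opens the file itself will conflict with whatever the parent holds.
    """
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            # LOCK_NB: fail immediately instead of queueing a blocking lock.
            fcntl.lockf(fd, mode | fcntl.LOCK_NB)
            status = 0
        except OSError:
            status = 1
        finally:
            # Exiting drops any lock the child acquired.
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0


def run_sequence(path):
    """Read lock, test, upgrade to exclusive, test, unlock, test."""
    results = []
    fd = os.open(path, os.O_RDWR)
    try:
        # Whole-file shared (read) lock; length 0 means "to end of file".
        fcntl.lockf(fd, fcntl.LOCK_SH)
        results.append(probe(path, fcntl.LOCK_EX))

        # Blocking upgrade to a whole-file exclusive lock. This is the step
        # that was reported to hang with client and server on the same box.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        results.append(probe(path, fcntl.LOCK_SH))

        fcntl.lockf(fd, fcntl.LOCK_UN)
        results.append(probe(path, fcntl.LOCK_EX))
    finally:
        os.close(fd)
    return results


if __name__ == "__main__":
    # A local temp file only demonstrates the sequence; it will not hang.
    with tempfile.NamedTemporaryFile() as tmp:
        print(run_sequence(tmp.name))
```

On a local file this just runs through. To exercise the case discussed
here, the path would have to sit on an NFS mount served by lockd on the
same machine (with GFS2 underneath, per david's report).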
--
Jeff Layton <jlayton@redhat.com>