linux-fsdevel.vger.kernel.org archive mirror
From: Jeff Layton <jlayton@redhat.com>
To: "david m. richter" <richterd@citi.umich.edu>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
	Oleg Drokin <Oleg.Drokin@Sun.COM>,
	Marc Eshel <eshel@almaden.ibm.com>,
	linux-fsdevel@vger.kernel.org, Manoj Naik <manoj@almaden.ibm.com>
Subject: Re: NFS client hang on attempt to do async blocking posix lock enqueue
Date: Fri, 8 Feb 2008 15:54:14 -0500
Message-ID: <20080208155414.269f44d9@tleilax.poochiereds.net>
In-Reply-To: <Pine.BSO.4.64.0802081302130.6952@citi.umich.edu>

On Fri, 8 Feb 2008 13:49:01 -0500 (EST)
"david m. richter" <richterd@citi.umich.edu> wrote:

> On Fri, 8 Feb 2008, J. Bruce Fields wrote:
> 
> > On Fri, Feb 08, 2008 at 07:15:02AM -0500, Jeff Layton wrote:
> > > On Thu, 7 Feb 2008 18:26:18 -0500
> > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > > 
> > > > On Sun, Jan 20, 2008 at 09:58:59AM -0500, Oleg Drokin wrote:
> > > > > Hello!
> > > > >
> > > > > On Jan 18, 2008, at 6:07 PM, J. Bruce Fields wrote:
> > > > >
> > > > >> On Thu, Nov 29, 2007 at 02:41:57PM -0800, Marc Eshel wrote:
> > > > > >>> The problem seems to be that the client and server are on
> > > > > >>> the same machine. This test works fine, with or without an
> > > > > >>> underlying fs that supports locking, when the client and the
> > > > > >>> server are on different machines. Like you said, the server is
> > > > > >>> trying to send the grant message to the client, but for some
> > > > > >>> reason it fails when the client is on the same machine.
> > > > >> That *shouldn't* make a difference, so we need to take another look at
> > > > >> this--Oleg, this problem is still unfixed, right?
> > > > >
> > > > > > Yes, I just pulled your latest nfs tree and I can still reproduce
> > > > > > the problem.
> > > > 
> > > > OK, we have finally reproduced this problem here, and David's working on
> > > > debugging.  It does indeed seem to only be reproducible with client and
> > > > server on the same machine.  Thanks for the report....
> > > > 
> > > > --b.
> > > 
> > > It might be worth testing this both with and without the patchset I
> > > posted to linux-nfs recently to take care of the lockd hang. If
> > > lockd is stuck trying to rpc_ping itself then it probably would hang
> > > like this, wouldn't it?
> > 
> > Of course!  Yes, that fits.
> > 
> > --b.
> 
> 	right on, jeff, good catch and thanks for directing my attention 
> to your patches.
> 

Excellent! Glad that took care of it...

> 	i applied them on top of 2.6.23.1 and tested them on a cluster 
> exporting GFS2 over NFS, using oleg's reproducer code.  your patches fix 
> that lockd hang.
> 
> 	in a bit more detail, oleg's reproducer basically gets a 
> whole-file read lock, tests the lock, upgrades to a whole-file exclusive 
> lock, tests the lock, then unlocks.  the problem was that when getting 
> that exclusive lock things would hang.  this only happened when the client 
> and server were on the same machine, and i could reproduce it with NFS 
> exporting GFS2 but not NFS exporting EXT3.
> 
> 

Interesting. It's not clear to me why the underlying filesystem would make
any difference there. Though now that I look, fl_grant seems to be called
only from the dlm code, and that queues up the block for an immediate
grant callback attempt. So perhaps that's the reason.
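
For reference, the lock sequence described in the quoted message (whole-file
read lock, test, upgrade to whole-file exclusive lock, test, unlock) might
look roughly like the sketch below. This is not Oleg's actual reproducer.
The names run_lock_sequence, conflicting_lock_type, FLOCK_FMT and demo are
made up for illustration, the use of F_GETLK for the "test" step is an
assumption, and the struct flock layout assumes 64-bit Linux.

```python
import fcntl
import os
import struct
import tempfile

# ASSUMPTION: struct flock layout on 64-bit Linux:
# short l_type, short l_whence, off_t l_start, off_t l_len, pid_t l_pid, padding.
FLOCK_FMT = "hhqqi4x"


def conflicting_lock_type(fd, wanted):
    """Ask the kernel (F_GETLK) whether a whole-file lock of type `wanted`
    would be blocked. Returns F_UNLCK when nothing conflicts. A process's
    own locks never count as conflicts."""
    request = struct.pack(FLOCK_FMT, wanted, os.SEEK_SET, 0, 0, 0)
    reply = fcntl.fcntl(fd, fcntl.F_GETLK, request)
    return struct.unpack(FLOCK_FMT, reply)[0]


def run_lock_sequence(path):
    """Hypothetical helper: read lock, test, upgrade, test, unlock.
    Returns a log of the steps that completed."""
    log = []
    fd = os.open(path, os.O_RDWR)
    try:
        # Blocking whole-file read lock (F_SETLKW with F_RDLCK underneath).
        fcntl.lockf(fd, fcntl.LOCK_SH)
        log.append("read lock")

        if conflicting_lock_type(fd, fcntl.F_WRLCK) == fcntl.F_UNLCK:
            log.append("test: no conflict")

        # Blocking upgrade to a whole-file exclusive lock. Per the report,
        # this is the call that hung on a same-machine NFS mount.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        log.append("exclusive lock")

        if conflicting_lock_type(fd, fcntl.F_WRLCK) == fcntl.F_UNLCK:
            log.append("test: no conflict")

        fcntl.lockf(fd, fcntl.LOCK_UN)
        log.append("unlock")
    finally:
        os.close(fd)
    return log


def demo():
    """Run the sequence against a local temporary file."""
    with tempfile.NamedTemporaryFile() as tmp:
        return run_lock_sequence(tmp.name)
```

On a local filesystem demo() should run straight through. To exercise the
case discussed here, run_lock_sequence would need a path on an NFS mount
whose server is the same machine.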

-- 
Jeff Layton <jlayton@redhat.com>

Thread overview: 12+ messages
2007-11-29 19:15 NFS client hang on attempt to do async blocking posix lock enqueue J. Bruce Fields
2007-11-29 22:41 ` Marc Eshel
2008-01-18 23:07   ` J. Bruce Fields
2008-01-20 14:58     ` Oleg Drokin
2008-02-07 23:26       ` J. Bruce Fields
2008-02-08 12:15         ` Jeff Layton
2008-02-08 14:33           ` J. Bruce Fields
2008-02-08 18:49             ` david m. richter
2008-02-08 20:54               ` Jeff Layton [this message]
2008-02-08 21:12                 ` J. Bruce Fields
2008-02-08 21:27                   ` Jeff Layton
  -- strict thread matches above, loose matches on Subject: below --
2007-11-29 19:04 Oleg Drokin
