linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: trond.myklebust@fys.uio.no, eshel@almaden.ibm.com, neilb@suse.de,
	akpm@linux-foundation.org, linux-nfs@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: nfs: infinite loop in fcntl(F_SETLKW)
Date: Thu, 17 Apr 2008 18:26:20 -0400	[thread overview]
Message-ID: <20080417222620.GL9912@fieldses.org> (raw)
In-Reply-To: <E1JmAUP-0002VA-9g@pomaz-ex.szeredi.hu>

On Wed, Apr 16, 2008 at 06:28:05PM +0200, Miklos Szeredi wrote:
> > > > The behavior of lockd will still depend to some degree on the exact
> > > > error returned from the filesystem--e.g. if you return -EAGAIN from
> > > > ->lock() without later calling ->fl_grant() it will cause some
> > > > unexpected delays.  (Though again clients will eventually give up and
> > > > poll for the lock.)
> > > 
> > > OK, so the semantics of vfs_lock_file() are now:
> > 
> > Not quite; the original idea was that we didn't care about the blocking
> > lock case, since there's already a lock manager callback for that.
> > (It's arguably a minor abuse for us to return NLM_LCK_BLOCKED and then
> > do a grant later even in the case where there's no conflict, but it
> > works.)  So we only changed the nonblocking case.
> > 
> > Which may be sacrificing elegance for expediency, and I'm not opposed to
> > fixing that in whatever way seems best.  As a start, we could document
> > this better.  So:
> > 
> > > 
> > > 1) if !FL_SLEEP, then return 0 if granted, -EAGAIN on contention
> > > 2) if FL_SLEEP and fl_grant == NULL, then return 0 if granted, block on
> > >    contention
> > > 3) if FL_SLEEP and fl_grant != NULL, then return 0 if granted, on
> > >    contention:
> > >    a) either return -EINPROGRESS and call fl_grant when granted
> > >    b) or return -EAGAIN  and call fl_notify when the lock needs retrying
> > 
> > I'd put it this way (after a quick check of the code to convince myself
> > I'm remembering this right...):
> > 
> >   1) If FL_SLEEP, then return 0 if granted, and on contention either:
> >      a) block, or
> >      b) return -EAGAIN, and call fl_notify when the lock should be
> >         retried.
> 
> Gfs2 seems to return -EINPROGRESS regardless of the FL_SLEEP flag:

Oops, you're right; in FL_SLEEP case fs/lockd/svclock.c:nlmsvc_lock() 
returns NLM_LCK_BLOCKED.  I believe it'll get an fl_grant() callback
after that and do a grant call back to the client, but I haven't
checked....

Note, as has been pointed out by Mark Snitzer
(http://article.gmane.org/gmane.linux.nfs/17801/), this limits the kind
of error reporting the filesystem can do--if it needs to block on the
initial lock call, it has to commit to queueing up, and eventually
granting, the lock.

> > OK, but I haven't read your patch yet, apologies....
> 
> No problem.  Here it is again with some cosmetic fixes and testing.

Thanks!  Ping me in a couple days if I haven't made any comments.  From
a quick skim the GFS2 change and the error return change both seem
reasonable.

I need to a real GFS2 testing setup....  (Did you test GFS2 locking
specifically?)

--b.

> 
> Miklos
> --
> 
> 
> Use a special error value FILE_LOCK_DEFERRED to mean that a locking
> operation returned asynchronously.  This is returned by
> 
>   posix_lock_file() for sleeping locks to mean that the lock has been
>   queued on the block list, and will be woken up when it might become
>   available and needs to be retried (either fl_lmops->fl_notify() is
>   called or fl_wait is woken up).
> 
>   f_op->lock() to mean either the above, or that the filesystem will
>   call back with fl_lmops->fl_grant() when the result of the locking
>   operation is known.  The filesystem can do this for sleeping as well
>   as non-sleeping locks.
> 
> This is to make sure, that return values of -EAGAIN and -EINPROGRESS
> by filesystems are not mistaken to mean an asynchronous locking.
> 
> This also makes error handling in fs/locks.c and lockd/svclock.c
> slightly cleaner.
> 
> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> ---
>  fs/gfs2/locking/dlm/plock.c |    2 +-
>  fs/lockd/svclock.c          |   13 ++++---------
>  fs/locks.c                  |   28 ++++++++++++++--------------
>  include/linux/fs.h          |    6 ++++++
>  4 files changed, 25 insertions(+), 24 deletions(-)
> 
> Index: linux-2.6/fs/locks.c
> ===================================================================
> --- linux-2.6.orig/fs/locks.c	2008-04-16 17:33:49.000000000 +0200
> +++ linux-2.6/fs/locks.c	2008-04-16 17:38:01.000000000 +0200
> @@ -784,8 +784,10 @@ find_conflict:
>  		if (!flock_locks_conflict(request, fl))
>  			continue;
>  		error = -EAGAIN;
> -		if (request->fl_flags & FL_SLEEP)
> -			locks_insert_block(fl, request);
> +		if (!(request->fl_flags & FL_SLEEP))
> +			goto out;
> +		error = FILE_LOCK_DEFERRED;
> +		locks_insert_block(fl, request);
>  		goto out;
>  	}
>  	if (request->fl_flags & FL_ACCESS)
> @@ -841,7 +843,7 @@ static int __posix_lock_file(struct inod
>  			error = -EDEADLK;
>  			if (posix_locks_deadlock(request, fl))
>  				goto out;
> -			error = -EAGAIN;
> +			error = FILE_LOCK_DEFERRED;
>  			locks_insert_block(fl, request);
>  			goto out;
>    		}
> @@ -1040,7 +1042,7 @@ int posix_lock_file_wait(struct file *fi
>  	might_sleep ();
>  	for (;;) {
>  		error = posix_lock_file(filp, fl, NULL);
> -		if ((error != -EAGAIN) || !(fl->fl_flags & FL_SLEEP))
> +		if (error != FILE_LOCK_DEFERRED)
>  			break;
>  		error = wait_event_interruptible(fl->fl_wait, !fl->fl_next);
>  		if (!error)
> @@ -1112,9 +1114,7 @@ int locks_mandatory_area(int read_write,
>  
>  	for (;;) {
>  		error = __posix_lock_file(inode, &fl, NULL);
> -		if (error != -EAGAIN)
> -			break;
> -		if (!(fl.fl_flags & FL_SLEEP))
> +		if (error != FILE_LOCK_DEFERRED)
>  			break;
>  		error = wait_event_interruptible(fl.fl_wait, !fl.fl_next);
>  		if (!error) {
> @@ -1534,7 +1534,7 @@ int flock_lock_file_wait(struct file *fi
>  	might_sleep();
>  	for (;;) {
>  		error = flock_lock_file(filp, fl);
> -		if ((error != -EAGAIN) || !(fl->fl_flags & FL_SLEEP))
> +		if (error != FILE_LOCK_DEFERRED)
>  			break;
>  		error = wait_event_interruptible(fl->fl_wait, !fl->fl_next);
>  		if (!error)
> @@ -1719,17 +1719,17 @@ out:
>   * fl_grant is set. Callers expecting ->lock() to return asynchronously
>   * will only use F_SETLK, not F_SETLKW; they will set FL_SLEEP if (and only if)
>   * the request is for a blocking lock. When ->lock() does return asynchronously,
> - * it must return -EINPROGRESS, and call ->fl_grant() when the lock
> + * it must return FILE_LOCK_DEFERRED, and call ->fl_grant() when the lock
>   * request completes.
>   * If the request is for non-blocking lock the file system should return
> - * -EINPROGRESS then try to get the lock and call the callback routine with
> - * the result. If the request timed out the callback routine will return a
> + * FILE_LOCK_DEFERRED then try to get the lock and call the callback routine
> + * with the result. If the request timed out the callback routine will return a
>   * nonzero return code and the file system should release the lock. The file
>   * system is also responsible to keep a corresponding posix lock when it
>   * grants a lock so the VFS can find out which locks are locally held and do
>   * the correct lock cleanup when required.
>   * The underlying filesystem must not drop the kernel lock or call
> - * ->fl_grant() before returning to the caller with a -EINPROGRESS
> + * ->fl_grant() before returning to the caller with a FILE_LOCK_DEFERRED
>   * return code.
>   */
>  int vfs_lock_file(struct file *filp, unsigned int cmd, struct file_lock *fl, struct file_lock *conf)
> @@ -1806,7 +1806,7 @@ again:
>  	else {
>  		for (;;) {
>  			error = posix_lock_file(filp, file_lock, NULL);
> -			if (error != -EAGAIN || cmd == F_SETLK)
> +			if (error != FILE_LOCK_DEFERRED)
>  				break;
>  			error = wait_event_interruptible(file_lock->fl_wait,
>  					!file_lock->fl_next);
> @@ -1934,7 +1934,7 @@ again:
>  	else {
>  		for (;;) {
>  			error = posix_lock_file(filp, file_lock, NULL);
> -			if (error != -EAGAIN || cmd == F_SETLK64)
> +			if (error != FILE_LOCK_DEFERRED)
>  				break;
>  			error = wait_event_interruptible(file_lock->fl_wait,
>  					!file_lock->fl_next);
> Index: linux-2.6/fs/gfs2/locking/dlm/plock.c
> ===================================================================
> --- linux-2.6.orig/fs/gfs2/locking/dlm/plock.c	2008-04-16 17:33:49.000000000 +0200
> +++ linux-2.6/fs/gfs2/locking/dlm/plock.c	2008-04-16 17:35:46.000000000 +0200
> @@ -108,7 +108,7 @@ int gdlm_plock(void *lockspace, struct l
>  	if (xop->callback == NULL)
>  		wait_event(recv_wq, (op->done != 0));
>  	else
> -		return -EINPROGRESS;
> +		return FILE_LOCK_DEFERRED;
>  
>  	spin_lock(&ops_lock);
>  	if (!list_empty(&op->list)) {
> Index: linux-2.6/fs/lockd/svclock.c
> ===================================================================
> --- linux-2.6.orig/fs/lockd/svclock.c	2008-04-16 17:33:49.000000000 +0200
> +++ linux-2.6/fs/lockd/svclock.c	2008-04-16 17:35:46.000000000 +0200
> @@ -423,8 +423,8 @@ nlmsvc_lock(struct svc_rqst *rqstp, stru
>  			goto out;
>  		case -EAGAIN:
>  			ret = nlm_lck_denied;
> -			break;
> -		case -EINPROGRESS:
> +			goto out;
> +		case FILE_LOCK_DEFERRED:
>  			if (wait)
>  				break;
>  			/* Filesystem lock operation is in progress
> @@ -439,10 +439,6 @@ nlmsvc_lock(struct svc_rqst *rqstp, stru
>  			goto out;
>  	}
>  
> -	ret = nlm_lck_denied;
> -	if (!wait)
> -		goto out;
> -
>  	ret = nlm_lck_blocked;
>  
>  	/* Append to list of blocked */
> @@ -520,7 +516,7 @@ nlmsvc_testlock(struct svc_rqst *rqstp, 
>  	}
>  
>  	error = vfs_test_lock(file->f_file, &lock->fl);
> -	if (error == -EINPROGRESS) {
> +	if (error == FILE_LOCK_DEFERRED) {
>  		ret = nlmsvc_defer_lock_rqst(rqstp, block);
>  		goto out;
>  	}
> @@ -744,8 +740,7 @@ nlmsvc_grant_blocked(struct nlm_block *b
>  	switch (error) {
>  	case 0:
>  		break;
> -	case -EAGAIN:
> -	case -EINPROGRESS:
> +	case FILE_LOCK_DEFERRED:
>  		dprintk("lockd: lock still blocked error %d\n", error);
>  		nlmsvc_insert_block(block, NLM_NEVER);
>  		nlmsvc_release_block(block);
> Index: linux-2.6/include/linux/fs.h
> ===================================================================
> --- linux-2.6.orig/include/linux/fs.h	2008-04-16 17:33:49.000000000 +0200
> +++ linux-2.6/include/linux/fs.h	2008-04-16 18:17:12.000000000 +0200
> @@ -837,6 +837,12 @@ extern spinlock_t files_lock;
>  #define FL_SLEEP	128	/* A blocking lock */
>  
>  /*
> + * Special return value from posix_lock_file() and vfs_lock_file() for
> + * asynchronous locking.
> + */
> +#define FILE_LOCK_DEFERRED 1
> +
> +/*
>   * The POSIX file lock owner is determined by
>   * the "struct files_struct" in the thread group
>   * (or NULL for no owner - BSD locks).
> 
> 
> 

  reply	other threads:[~2008-04-17 22:26 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-09 15:57 [patch] fix infinite loop in generic_file_splice_read() Miklos Szeredi
2008-04-09 17:05 ` Oliver Pinter
2008-04-09 18:57 ` Andrew Morton
2008-04-09 19:25   ` Miklos Szeredi
2008-04-09 19:52   ` Jens Axboe
2008-04-10  6:29   ` Allard Hoeve
2008-04-10 19:51 ` nfs: infinite loop in fcntl(F_SETLKW) Miklos Szeredi
2008-04-10 21:02   ` Trond Myklebust
2008-04-10 21:07     ` Trond Myklebust
     [not found]       ` <1207861661.8180.18.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-04-10 21:20         ` Trond Myklebust
2008-04-10 21:54           ` J. Bruce Fields
2008-04-11 19:12             ` Miklos Szeredi
2008-04-11 19:19               ` J. Bruce Fields
     [not found]                 ` <20080411191910.GB16965-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2008-04-11 19:22                   ` Miklos Szeredi
2008-04-13  0:08               ` J. Bruce Fields
     [not found]                 ` <20080413000830.GF31789-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2008-04-13  8:13                   ` Miklos Szeredi
2008-04-14 17:07                     ` J. Bruce Fields
     [not found]                     ` <E1JkxKz-0003A8-9V-8f8m9JG5TPIdUIPVzhDTVZP2KDSNp7ea@public.gmane.org>
2008-04-14 19:03                       ` [PATCH] locks: fix possible infinite loop in fcntl(F_SETLKW) over nfs J. Bruce Fields
     [not found]             ` <20080410215410.GF22324-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2008-04-13  8:28               ` nfs: infinite loop in fcntl(F_SETLKW) Miklos Szeredi
2008-04-14 17:19                 ` J. Bruce Fields
2008-04-14 21:15                   ` Miklos Szeredi
2008-04-15 18:58                     ` J. Bruce Fields
2008-04-16 16:28                       ` Miklos Szeredi
2008-04-17 22:26                         ` J. Bruce Fields [this message]
     [not found]                           ` <20080417222620.GL9912-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2008-04-18 12:47                             ` Miklos Szeredi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080417222620.GL9912@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=akpm@linux-foundation.org \
    --cc=eshel@almaden.ibm.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=neilb@suse.de \
    --cc=trond.myklebust@fys.uio.no \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).