public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Bruce Fields <bfields@fieldses.org>
To: Jeff Layton <jlayton@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Matthew Wilcox <matthew@wil.cx>,
	David Howells <dhowells@redhat.com>
Subject: Re: [PATCH v3] locks: close potential race between setlease and open
Date: Thu, 15 Aug 2013 15:32:03 -0400	[thread overview]
Message-ID: <20130815193203.GT17781@fieldses.org> (raw)
In-Reply-To: <1376482310-7348-1-git-send-email-jlayton@redhat.com>

On Wed, Aug 14, 2013 at 08:11:50AM -0400, Jeff Layton wrote:
> v2:
> - fix potential double-free of lease if second check finds conflict
> - add smp_mb's to ensure that other CPUs see i_flock changes
> 
> v3:
> - remove smp_mb calls. Partial ordering is unlikely to help here.

Forgive me here, I still don't understand.  So to simplify massively,
the situation looks like:

	setlease		open
	------------		------

	atomic_read		atomic_inc
	write i_flock		read i_flock
	atomic_read

And we want to be sure that either the setlease caller sees the result
of the atomic_inc, or the opener sees the result of the write to
i_flock.

As an example, suppose the above steps happen in the order:

	atomic_read
	write i_flock
	atomic_read
				atomic_inc
				read i_flock

How do we know that the read of i_flock at the last step reflects the
latest value of i_flock?  For example, couldn't the write still be stuck
in first CPU's cache?

--b.

> 
> As Al Viro points out, there is an unlikely, but possible race between
> opening a file and setting a lease on it. generic_add_lease is done with
> the i_lock held, but the inode->i_flock check in break_lease is
> lockless. It's possible for another task doing an open to do the entire
> pathwalk and call break_lease between the point where generic_add_lease
> checks for a conflicting open and adds the lease to the list. If this
> occurs, we can end up with a lease set on the file with a conflicting
> open.
> 
> To guard against that, check again for a conflicting open after adding
> the lease to the i_flock list. If the above race occurs, then we can
> simply unwind the lease setting and return -EAGAIN.
> 
> Because we take dentry references and acquire write access on the file
> before calling break_lease, we know that if the i_flock list is empty
> when the open caller goes to check it then the necessary refcounts have
> already been incremented. Thus the additional check for a conflicting
> open will see that there is one and the setlease call will fail.
> 
> Cc: Bruce Fields <bfields@fieldses.org>
> Cc: David Howells <dhowells@redhat.com>
> Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  fs/locks.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++------------
>  1 file changed, 58 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index b27a300..a99adec 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -652,15 +652,18 @@ static void locks_insert_lock(struct file_lock **pos, struct file_lock *fl)
>  	locks_insert_global_locks(fl);
>  }
>  
> -/*
> - * Delete a lock and then free it.
> - * Wake up processes that are blocked waiting for this lock,
> - * notify the FS that the lock has been cleared and
> - * finally free the lock.
> +/**
> + * locks_delete_lock - Delete a lock and then free it.
> + * @thisfl_p: pointer that points to the fl_next field of the previous
> + * 	      inode->i_flock list entry
> + *
> + * Unlink a lock from all lists and free the namespace reference, but don't
> + * free it yet. Wake up processes that are blocked waiting for this lock and
> + * notify the FS that the lock has been cleared.
>   *
>   * Must be called with the i_lock held!
>   */
> -static void locks_delete_lock(struct file_lock **thisfl_p)
> +static void locks_unlink_lock(struct file_lock **thisfl_p)
>  {
>  	struct file_lock *fl = *thisfl_p;
>  
> @@ -675,6 +678,18 @@ static void locks_delete_lock(struct file_lock **thisfl_p)
>  	}
>  
>  	locks_wake_up_blocks(fl);
> +}
> +
> +/*
> + * Unlink a lock from all lists and free it.
> + *
> + * Must be called with i_lock held!
> + */
> +static void locks_delete_lock(struct file_lock **thisfl_p)
> +{
> +	struct file_lock *fl = *thisfl_p;
> +
> +	locks_unlink_lock(thisfl_p);
>  	locks_free_lock(fl);
>  }
>  
> @@ -1455,6 +1470,32 @@ int fcntl_getlease(struct file *filp)
>  	return type;
>  }
>  
> +/**
> + * check_conflicting_open - see if the given dentry points to a file that has
> + * 	an existing open that would conflict with the desired lease.
> + *
> + * @dentry:	dentry to check
> + * @arg:	type of lease that we're trying to acquire
> + *
> + * Check to see if there's an existing open fd on this file that would
> + * conflict with the lease we're trying to set.
> + */
> +static int
> +check_conflicting_open(const struct dentry *dentry, const long arg)
> +{
> +	int ret = 0;
> +	struct inode *inode = dentry->d_inode;
> +
> +	if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
> +		return -EAGAIN;
> +
> +	if ((arg == F_WRLCK) && ((d_count(dentry) > 1) ||
> +	    (atomic_read(&inode->i_count) > 1)))
> +		ret = -EAGAIN;
> +
> +	return ret;
> +}
> +
>  static int generic_add_lease(struct file *filp, long arg, struct file_lock **flp)
>  {
>  	struct file_lock *fl, **before, **my_before = NULL, *lease;
> @@ -1464,12 +1505,8 @@ static int generic_add_lease(struct file *filp, long arg, struct file_lock **flp
>  
>  	lease = *flp;
>  
> -	error = -EAGAIN;
> -	if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
> -		goto out;
> -	if ((arg == F_WRLCK)
> -	    && ((d_count(dentry) > 1)
> -		|| (atomic_read(&inode->i_count) > 1)))
> +	error = check_conflicting_open(dentry, arg);
> +	if (error)
>  		goto out;
>  
>  	/*
> @@ -1514,8 +1551,16 @@ static int generic_add_lease(struct file *filp, long arg, struct file_lock **flp
>  		goto out;
>  
>  	locks_insert_lock(before, lease);
> -	return 0;
>  
> +	/*
> +	 * The check in break_lease() is lockless. It's possible for another
> +	 * open to race in after we did the earlier check for a conflicting
> +	 * open but before the lease was inserted. Check again for a
> +	 * conflicting open and cancel the lease if there is one.
> +	 */
> +	error = check_conflicting_open(dentry, arg);
> +	if (error)
> +		locks_unlink_lock(flp);
>  out:
>  	return error;
>  }
> -- 
> 1.8.3.1
> 

  reply	other threads:[~2013-08-15 19:32 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-08 19:07 [PATCH] locks: close potential race between setlease and open Jeff Layton
2013-08-14 12:11 ` [PATCH v3] " Jeff Layton
2013-08-15 19:32   ` Bruce Fields [this message]
2013-08-15 19:43     ` Jeff Layton
2013-08-15 20:30       ` Bruce Fields
2013-08-15 20:44     ` memory barriers in flock (Re: [PATCH v3] locks: close potential race between setlease and open) David Howells
2013-08-15 21:31       ` Paul E. McKenney
2013-08-16 12:09         ` Jeff Layton
2013-08-19 13:31         ` Bruce Fields
2013-08-16 14:49 ` [PATCH v4] locks: close potential race between setlease and open Jeff Layton
2013-08-19 13:34   ` Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130815193203.GT17781@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=dhowells@redhat.com \
    --cc=jlayton@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matthew@wil.cx \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox