linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Chuck Lever <chuck.lever@oracle.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] locks: fix TOCTOU race when granting write lease
Date: Mon, 15 Aug 2022 07:21:40 -0400	[thread overview]
Message-ID: <d910e1ef7c8fcf65fbdb0bc438ebba2d7a1d6f83.camel@kernel.org> (raw)
In-Reply-To: <20220814152322.569296-1-amir73il@gmail.com>

On Sun, 2022-08-14 at 18:23 +0300, Amir Goldstein wrote:
> Thread A trying to acquire a write lease checks the value of i_readcount
> and i_writecount in check_conflicting_open() to verify that its own fd
> is the only fd referencing the file.
> 
> Thread B trying to open the file for read will call break_lease() in
> do_dentry_open() before incrementing i_readcount, which leaves a small
> window where thread A can acquire the write lease and then thread B
> completes the open of the file for read without breaking the write lease
> that was acquired by thread A.
> 
> Fix this race by incrementing i_readcount before checking for existing
> leases, same as the case with i_writecount.
> 

Nice catch.

> Use a helper put_file_access() to decrement i_readcount or i_writecount
> in do_dentry_open() and __fput().
> 
> Fixes: 387e3746d01c ("locks: eliminate false positive conflicts for write lease")
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
> 
> Hi Jeff,
> 
> This fixes a race I found during code audit - I do not have a reproducer
> for it.
> 
> I ran the fstests I found for locks and leases:
> generic/131 generic/478 generic/504 generic/571
> and the LTP fcntl tests.
> 
> Encountered this warning with generic/131, but I also see it on
> current master:
> 
>  =============================
>  WARNING: suspicious RCU usage
>  5.19.0-xfstests-14277-gbd6ab3ef4e93 #966 Not tainted
>  -----------------------------
>  include/net/sock.h:592 suspicious rcu_dereference_check() usage!
> 
>  other info that might help us debug this:
> 
> 
>  rcu_scheduler_active = 2, debug_locks = 1
>  5 locks held by locktest/3996:
>   #0: ffff88800be1d7a0 (&sb->s_type->i_mutex_key#8){+.+.}-{3:3}, at: __sock_release+0x25/0x97
>   #1: ffff88800909ce00 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_close+0x14/0x60
>   #2: ffff888006847cc8 (&h->lhash2[i].lock){+.+.}-{2:2}, at: inet_unhash+0x3a/0xcf
>   #3: ffffffff82a8ac18 (reuseport_lock){+...}-{2:2}, at: reuseport_detach_sock+0x17/0xb8
>   #4: ffff88800909d0b0 (clock-AF_INET){++..}-{2:2}, at: bpf_sk_reuseport_detach+0x1b/0x85
> 
>  stack backtrace:
>  CPU: 1 PID: 3996 Comm: locktest Not tainted 5.19.0-xfstests-14277-gbd6ab3ef4e93 #966
>  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
>  Call Trace:
>   <TASK>
>   dump_stack_lvl+0x45/0x5d
>   bpf_sk_reuseport_detach+0x5c/0x85
>   reuseport_detach_sock+0x65/0xb8
>   inet_unhash+0x55/0xcf
>   tcp_set_state+0xb3/0x10d
>   ? mark_lock.part.0+0x30/0x101
>   __tcp_close+0x26/0x32d
>   tcp_close+0x20/0x60
>   inet_release+0x50/0x64
>   __sock_release+0x32/0x97
>   sock_close+0x14/0x1b
>   __fput+0x118/0x1eb
> 
> 
> Let me know what you think.
> 
> Thanks,
> Amir.
> 
>  fs/file_table.c    |  7 +------
>  fs/open.c          | 11 ++++-------
>  include/linux/fs.h | 10 ++++++++++
>  3 files changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/file_table.c b/fs/file_table.c
> index 99c6796c9f28..dd88701e54a9 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -324,12 +324,7 @@ static void __fput(struct file *file)
>  	}
>  	fops_put(file->f_op);
>  	put_pid(file->f_owner.pid);
> -	if ((mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
> -		i_readcount_dec(inode);
> -	if (mode & FMODE_WRITER) {
> -		put_write_access(inode);
> -		__mnt_drop_write(mnt);
> -	}
> +	put_file_access(file);
>  	dput(dentry);
>  	if (unlikely(mode & FMODE_NEED_UNMOUNT))
>  		dissolve_on_fput(mnt);
> diff --git a/fs/open.c b/fs/open.c
> index 8a813fa5ca56..a98572585815 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -840,7 +840,9 @@ static int do_dentry_open(struct file *f,
>  		return 0;
>  	}
>  
> -	if (f->f_mode & FMODE_WRITE && !special_file(inode->i_mode)) {
> +	if ((f->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ) {
> +		i_readcount_inc(inode);
> +	} else if (f->f_mode & FMODE_WRITE && !special_file(inode->i_mode)) {
>  		error = get_write_access(inode);
>  		if (unlikely(error))
>  			goto cleanup_file;
> @@ -880,8 +882,6 @@ static int do_dentry_open(struct file *f,
>  			goto cleanup_all;
>  	}
>  	f->f_mode |= FMODE_OPENED;
> -	if ((f->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
> -		i_readcount_inc(inode);
>  	if ((f->f_mode & FMODE_READ) &&
>  	     likely(f->f_op->read || f->f_op->read_iter))
>  		f->f_mode |= FMODE_CAN_READ;
> @@ -935,10 +935,7 @@ static int do_dentry_open(struct file *f,
>  	if (WARN_ON_ONCE(error > 0))
>  		error = -EINVAL;
>  	fops_put(f->f_op);
> -	if (f->f_mode & FMODE_WRITER) {
> -		put_write_access(inode);
> -		__mnt_drop_write(f->f_path.mnt);
> -	}
> +	put_file_access(f);
>  cleanup_file:
>  	path_put(&f->f_path);
>  	f->f_path.mnt = NULL;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 9eced4cc286e..8bc04852c3da 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3000,6 +3000,16 @@ static inline void i_readcount_inc(struct inode *inode)
>  	return;
>  }
>  #endif
> +static inline void put_file_access(struct file *file)
> +{
> +	if ((file->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ) {
> +		i_readcount_dec(file->f_inode);
> +	} else if (file->f_mode & FMODE_WRITER) {
> +		put_write_access(file->f_inode);
> +		__mnt_drop_write(file->f_path.mnt);
> +	}
> +}
> +
>  extern int do_pipe_flags(int *, int);
>  
>  extern ssize_t kernel_read(struct file *, void *, size_t, loff_t *);

Looks good to me. I like the new helper.

In addition to Al's comment about which header this should go in, it
might also be good to put a kerneldoc comment over it.

Al, did you want to take this via your tree or do you want me to take it
via the filelocks tree?

Thanks,
-- 
Jeff Layton <jlayton@kernel.org>

  parent reply	other threads:[~2022-08-15 11:21 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-14 15:23 [PATCH] locks: fix TOCTOU race when granting write lease Amir Goldstein
2022-08-14 17:57 ` Al Viro
2022-08-15  7:18   ` Amir Goldstein
2022-08-16  3:18     ` Al Viro
2022-08-15 11:21 ` Jeff Layton [this message]
2022-08-16 14:03   ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d910e1ef7c8fcf65fbdb0bc438ebba2d7a1d6f83.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=amir73il@gmail.com \
    --cc=chuck.lever@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).