From: Jeff Layton <jlayton@kernel.org>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Chuck Lever <chuck.lever@oracle.com>,
Al Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] locks: fix TOCTOU race when granting write lease
Date: Mon, 15 Aug 2022 07:21:40 -0400 [thread overview]
Message-ID: <d910e1ef7c8fcf65fbdb0bc438ebba2d7a1d6f83.camel@kernel.org> (raw)
In-Reply-To: <20220814152322.569296-1-amir73il@gmail.com>
On Sun, 2022-08-14 at 18:23 +0300, Amir Goldstein wrote:
> Thread A trying to acquire a write lease checks the value of i_readcount
> and i_writecount in check_conflicting_open() to verify that its own fd
> is the only fd referencing the file.
>
> Thread B trying to open the file for read will call break_lease() in
> do_dentry_open() before incrementing i_readcount, which leaves a small
> window where thread A can acquire the write lease and then thread B
> completes the open of the file for read without breaking the write lease
> that was acquired by thread A.
>
> Fix this race by incrementing i_readcount before checking for existing
> leases, same as the case with i_writecount.
>
Nice catch.
> Use a helper put_file_access() to decrement i_readcount or i_writecount
> in do_dentry_open() and __fput().
>
> Fixes: 387e3746d01c ("locks: eliminate false positive conflicts for write lease")
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>
> Hi Jeff,
>
> This fixes a race I found during code audit - I do not have a reproducer
> for it.
>
> I ran the fstests I found for locks and leases:
> generic/131 generic/478 generic/504 generic/571
> and the LTP fcntl tests.
>
> Encountered this warning with generic/131, but I also see it on
> current master:
>
> =============================
> WARNING: suspicious RCU usage
> 5.19.0-xfstests-14277-gbd6ab3ef4e93 #966 Not tainted
> -----------------------------
> include/net/sock.h:592 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
>
> rcu_scheduler_active = 2, debug_locks = 1
> 5 locks held by locktest/3996:
> #0: ffff88800be1d7a0 (&sb->s_type->i_mutex_key#8){+.+.}-{3:3}, at: __sock_release+0x25/0x97
> #1: ffff88800909ce00 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_close+0x14/0x60
> #2: ffff888006847cc8 (&h->lhash2[i].lock){+.+.}-{2:2}, at: inet_unhash+0x3a/0xcf
> #3: ffffffff82a8ac18 (reuseport_lock){+...}-{2:2}, at: reuseport_detach_sock+0x17/0xb8
> #4: ffff88800909d0b0 (clock-AF_INET){++..}-{2:2}, at: bpf_sk_reuseport_detach+0x1b/0x85
>
> stack backtrace:
> CPU: 1 PID: 3996 Comm: locktest Not tainted 5.19.0-xfstests-14277-gbd6ab3ef4e93 #966
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl+0x45/0x5d
> bpf_sk_reuseport_detach+0x5c/0x85
> reuseport_detach_sock+0x65/0xb8
> inet_unhash+0x55/0xcf
> tcp_set_state+0xb3/0x10d
> ? mark_lock.part.0+0x30/0x101
> __tcp_close+0x26/0x32d
> tcp_close+0x20/0x60
> inet_release+0x50/0x64
> __sock_release+0x32/0x97
> sock_close+0x14/0x1b
> __fput+0x118/0x1eb
>
>
> Let me know what you think.
>
> Thanks,
> Amir.
>
> fs/file_table.c | 7 +------
> fs/open.c | 11 ++++-------
> include/linux/fs.h | 10 ++++++++++
> 3 files changed, 15 insertions(+), 13 deletions(-)
>
> diff --git a/fs/file_table.c b/fs/file_table.c
> index 99c6796c9f28..dd88701e54a9 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -324,12 +324,7 @@ static void __fput(struct file *file)
> }
> fops_put(file->f_op);
> put_pid(file->f_owner.pid);
> - if ((mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
> - i_readcount_dec(inode);
> - if (mode & FMODE_WRITER) {
> - put_write_access(inode);
> - __mnt_drop_write(mnt);
> - }
> + put_file_access(file);
> dput(dentry);
> if (unlikely(mode & FMODE_NEED_UNMOUNT))
> dissolve_on_fput(mnt);
> diff --git a/fs/open.c b/fs/open.c
> index 8a813fa5ca56..a98572585815 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -840,7 +840,9 @@ static int do_dentry_open(struct file *f,
> return 0;
> }
>
> - if (f->f_mode & FMODE_WRITE && !special_file(inode->i_mode)) {
> + if ((f->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ) {
> + i_readcount_inc(inode);
> + } else if (f->f_mode & FMODE_WRITE && !special_file(inode->i_mode)) {
> error = get_write_access(inode);
> if (unlikely(error))
> goto cleanup_file;
> @@ -880,8 +882,6 @@ static int do_dentry_open(struct file *f,
> goto cleanup_all;
> }
> f->f_mode |= FMODE_OPENED;
> - if ((f->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
> - i_readcount_inc(inode);
> if ((f->f_mode & FMODE_READ) &&
> likely(f->f_op->read || f->f_op->read_iter))
> f->f_mode |= FMODE_CAN_READ;
> @@ -935,10 +935,7 @@ static int do_dentry_open(struct file *f,
> if (WARN_ON_ONCE(error > 0))
> error = -EINVAL;
> fops_put(f->f_op);
> - if (f->f_mode & FMODE_WRITER) {
> - put_write_access(inode);
> - __mnt_drop_write(f->f_path.mnt);
> - }
> + put_file_access(f);
> cleanup_file:
> path_put(&f->f_path);
> f->f_path.mnt = NULL;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 9eced4cc286e..8bc04852c3da 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3000,6 +3000,16 @@ static inline void i_readcount_inc(struct inode *inode)
> return;
> }
> #endif
> +static inline void put_file_access(struct file *file)
> +{
> + if ((file->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ) {
> + i_readcount_dec(file->f_inode);
> + } else if (file->f_mode & FMODE_WRITER) {
> + put_write_access(file->f_inode);
> + __mnt_drop_write(file->f_path.mnt);
> + }
> +}
> +
> extern int do_pipe_flags(int *, int);
>
> extern ssize_t kernel_read(struct file *, void *, size_t, loff_t *);
Looks good to me. I like the new helper.
In addition to Al's comment about which header this should go in, it
might also be good to put a kerneldoc comment over it.
Al, did you want to take this via your tree or do you want me to take it
via the filelocks tree?
Thanks,
--
Jeff Layton <jlayton@kernel.org>
next prev parent reply other threads:[~2022-08-15 11:21 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-14 15:23 [PATCH] locks: fix TOCTOU race when granting write lease Amir Goldstein
2022-08-14 17:57 ` Al Viro
2022-08-15 7:18 ` Amir Goldstein
2022-08-16 3:18 ` Al Viro
2022-08-15 11:21 ` Jeff Layton [this message]
2022-08-16 14:03 ` Amir Goldstein
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d910e1ef7c8fcf65fbdb0bc438ebba2d7a1d6f83.camel@kernel.org \
--to=jlayton@kernel.org \
--cc=amir73il@gmail.com \
--cc=chuck.lever@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.