From: Josef Bacik <josef@toxicpanda.com>
To: fdmanana@kernel.org
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: fix race setting file private on concurrent lseek using same fd
Date: Tue, 10 Sep 2024 11:35:59 -0400 [thread overview]
Message-ID: <20240910153559.GA3943310@perftesting> (raw)
In-Reply-To: <a351fd3b0397b6fe8d94f11a6744e1655349d687.1725356877.git.fdmanana@suse.com>
On Tue, Sep 03, 2024 at 10:55:36AM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> When doing concurrent lseek(2) system calls against the same file
> descriptor, using multiple threads belonging to the same process, we have
> a short time window where a race happens and can result in a memory leak.
>
> The race happens like this:
>
> 1) A program opens a file descriptor for a file and then spawns two
> threads (with the pthreads library for example), lets call them
> task A and task B;
>
> 2) Task A calls lseek with SEEK_DATA or SEEK_HOLE and ends up at
> file.c:find_desired_extent() while holding a read lock on the inode;
>
> 3) At the start of find_desired_extent(), it extracts the file's
> private_data pointer into a local variable named 'private', which has
> a value of NULL;
>
> 4) Task B also calls lseek with SEEK_DATA or SEEK_HOLE, locks the inode
> in shared mode and enters file.c:find_desired_extent(), where it also
> extracts file->private_data into its local variable 'private', which
> has a NULL value;
>
> 5) Because it saw a NULL file private, task A allocates a private
> structure and assigns to the file structure;
>
> 6) Task B also saw a NULL file private so it also allocates its own file
> private and then assigns it to the same file structure, since both
> tasks are using the same file descriptor.
>
> At this point we leak the private structure allocated by task A.
>
> Besides the memory leak, there's also the detail that both tasks end up
> using the same cached state record in the private structure (struct
> btrfs_file_private::llseek_cached_state), which can result in a
> use-after-free problem since one task can free it while the other is
> still using it (only one task took a reference count on it). Also, sharing
> the cached state is not a good idea since it could result in incorrect
> results in the future - right now it should not be a problem because it
> end ups being used only in extent-io-tree.c:count_range_bits() where we do
> range validation before using the cached state.
>
> Fix this by protecting the private assignment and check of a file while
> holding the inode's spinlock and keep track of the task that allocated
> the private, so that it's used only by that task in order to prevent
> user-after-free issues with the cached state record as well as potentially
> using it incorrectly in the future.
>
> Fixes: 3c32c7212f16 ("btrfs: use cached state when looking for delalloc ranges with lseek")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
We should maybe re-look at our useage of btrfs_file_private and see if we can
come up with a better solution for this style of problem. I think the directory
readdir stuff does the same sort of thing and we might end up with a similar
issue if we have two tasks using the same fd to do readdir.
But these are just random other thoughts for future work, for this you can add
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Thanks,
Josef
next prev parent reply other threads:[~2024-09-10 15:36 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-03 9:55 [PATCH] btrfs: fix race setting file private on concurrent lseek using same fd fdmanana
2024-09-10 15:35 ` Josef Bacik [this message]
2024-09-10 15:38 ` Filipe Manana
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240910153559.GA3943310@perftesting \
--to=josef@toxicpanda.com \
--cc=fdmanana@kernel.org \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox