From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 14EA817CA12; Mon, 10 Mar 2025 17:52:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741629173; cv=none; b=hmB3BgmAV4at2FrS+Mz/mReqWz44iuGXAfqTXe1yK5gw2WvbJDCzSUIcmupeHGoEXmiweKLlzOlcrCRNmlnpTq/NMYS6IxpB+OIFqowBrcqx0gMG/vv9W0vh3ubNmYosMEtBfenxS5KoklQpkfWrO/kZLF6pZ5OPP8m06yPBa/4= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741629173; c=relaxed/simple; bh=EVaZJ0BH1Z3UV46J4oI+pTyE9MMsSkVurjn3gFOoZnM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=nsLQTabtEQnMJHVDZJuL1cKJTMkyFw6Xl4ULnqVEvKteCk26dufjuGf10V5dcQs4slq1/4PqoLgwGQGmPhdDjcO+L2QJFMRzUtzfVnBelkdW1YuYX/1NdZTmYsZpDKbecBMUAWYeII8NzEQPnR3UE+2kPF4ilRK3M4fte5dpCPA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=Zjt6k2Vd; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="Zjt6k2Vd" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2DB42C4CEE5; Mon, 10 Mar 2025 17:52:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1741629172; bh=EVaZJ0BH1Z3UV46J4oI+pTyE9MMsSkVurjn3gFOoZnM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Zjt6k2VdUsUk3v2StuOTiol2gtKksRqfTtIFGkV05t4xrk1UIVvBEOn3nDgoOQ+NE EFNtZX/cdMGePPIkwczvA/LdPpa8p1UoVGZeEZgHm5RP4b6v3iUhgVHhPh7yZUfVH7 9nhyoDUNVlKb6zHzpbVfFXsJbf6zYndsAoPjoS0Q= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Filipe Manana , Hao-ran Zheng , David Sterba , Sasha Levin Subject: [PATCH 5.15 202/620] btrfs: fix data race when accessing the inodes disk_i_size at btrfs_drop_extents() Date: Mon, 10 Mar 2025 18:00:48 +0100 Message-ID: <20250310170553.603019161@linuxfoundation.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250310170545.553361750@linuxfoundation.org> References: <20250310170545.553361750@linuxfoundation.org> User-Agent: quilt/0.68 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 5.15-stable review patch. If anyone has any objections, please let me know. ------------------ From: Hao-ran Zheng [ Upstream commit 5324c4e10e9c2ce307a037e904c0d9671d7137d9 ] A data race occurs when the function `insert_ordered_extent_file_extent()` and the function `btrfs_inode_safe_disk_i_size_write()` are executed concurrently. The function `insert_ordered_extent_file_extent()` is not locked when reading inode->disk_i_size, causing `btrfs_inode_safe_disk_i_size_write()` to cause data competition when writing inode->disk_i_size, thus affecting the value of `modify_tree`. The specific call stack that appears during testing is as follows: ============DATA_RACE============ btrfs_drop_extents+0x89a/0xa060 [btrfs] insert_reserved_file_extent+0xb54/0x2960 [btrfs] insert_ordered_extent_file_extent+0xff5/0x1760 [btrfs] btrfs_finish_one_ordered+0x1b85/0x36a0 [btrfs] btrfs_finish_ordered_io+0x37/0x60 [btrfs] finish_ordered_fn+0x3e/0x50 [btrfs] btrfs_work_helper+0x9c9/0x27a0 [btrfs] process_scheduled_works+0x716/0xf10 worker_thread+0xb6a/0x1190 kthread+0x292/0x330 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30 ============OTHER_INFO============ btrfs_inode_safe_disk_i_size_write+0x4ec/0x600 [btrfs] btrfs_finish_one_ordered+0x24c7/0x36a0 [btrfs] btrfs_finish_ordered_io+0x37/0x60 [btrfs] finish_ordered_fn+0x3e/0x50 [btrfs] btrfs_work_helper+0x9c9/0x27a0 [btrfs] process_scheduled_works+0x716/0xf10 worker_thread+0xb6a/0x1190 kthread+0x292/0x330 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30 ================================= The main purpose of the check of the inode's disk_i_size is to avoid taking write locks on a btree path when we have a write at or beyond EOF, since in these cases we don't expect to find extent items in the root to drop. However if we end up taking write locks due to a data race on disk_i_size, everything is still correct, we only add extra lock contention on the tree in case there's concurrency from other tasks. If the race causes us to not take write locks when we actually need them, then everything is functionally correct as well, since if we find out we have extent items to drop and we took read locks (modify_tree set to 0), we release the path and retry again with write locks. Since this data race does not affect the correctness of the function, it is a harmless data race, use data_race() to check inode->disk_i_size. Reviewed-by: Filipe Manana Signed-off-by: Hao-ran Zheng Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin --- fs/btrfs/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 44160d4ad53e0..31b25cb2f5cc3 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -731,7 +731,7 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans, if (args->drop_cache) btrfs_drop_extent_cache(inode, args->start, args->end - 1, 0); - if (args->start >= inode->disk_i_size && !args->replace_extent) + if (data_race(args->start >= inode->disk_i_size) && !args->replace_extent) modify_tree = 0; update_refs = (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID); -- 2.39.5