From: Johannes Thumshirn <jth@kernel.org>
To: linux-btrfs@vger.kernel.org
Cc: Filipe Manana <fdmanana@kernel.org>,
Johannes Thumshirn <johannes.thumshirn@wdc.com>
Subject: [PATCH v2 1/2] btrfs: don't hold dev_replace rwsem over whole of btrfs_map_block
Date: Wed, 8 May 2024 13:40:15 +0200 [thread overview]
Message-ID: <20240508114016.18119-2-jth@kernel.org> (raw)
In-Reply-To: <20240508114016.18119-1-jth@kernel.org>
From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Don't hold the dev_replace rwsem for the entirety of btrfs_map_block().
It is only needed to protect
a) calls to find_live_mirror() and
b) calling into handle_ops_on_dev_replace().
But there is no need to hold the rwsem for any kind of set_io_stripe()
calls.
So relax taking the dev_replace rwsem to only protect both cases and check
if the device replace status has changed in the meantime, for which we have
to re-do the find_live_mirror() calls.
This fixes a deadlock on raid-stripe-tree where device replace performs a
scrub operation, which in turn calls into btrfs_map_block() to find the
physical location of the block.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/volumes.c | 33 ++++++++++++++++++++-------------
1 file changed, 20 insertions(+), 13 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b6a701011fb0..5eb41f50ee0c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6650,14 +6650,9 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
max_len = btrfs_max_io_len(map, map_offset, &io_geom);
*length = min_t(u64, map->chunk_len - map_offset, max_len);
+again:
down_read(&dev_replace->rwsem);
dev_replace_is_ongoing = btrfs_dev_replace_is_ongoing(dev_replace);
- /*
- * Hold the semaphore for read during the whole operation, write is
- * requested at commit time but must wait.
- */
- if (!dev_replace_is_ongoing)
- up_read(&dev_replace->rwsem);
switch (map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
case BTRFS_BLOCK_GROUP_RAID0:
@@ -6690,6 +6685,9 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
map_blocks_single(map, &io_geom);
break;
}
+
+ up_read(&dev_replace->rwsem);
+
if (io_geom.stripe_index >= map->num_stripes) {
btrfs_crit(fs_info,
"stripe index math went horribly wrong, got stripe_index=%u, num_stripes=%u",
@@ -6785,11 +6783,25 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
if (op != BTRFS_MAP_READ)
io_geom.max_errors = btrfs_chunk_max_errors(map);
+ /*
+ * Check if something changed the dev_replace state since
+ * we've checked it for the last time and if redo the whole
+ * mapping operation.
+ */
+ down_read(&dev_replace->rwsem);
+ if (dev_replace_is_ongoing !=
+ btrfs_dev_replace_is_ongoing(dev_replace)) {
+ up_read(&dev_replace->rwsem);
+ goto again;
+ }
+
if (dev_replace_is_ongoing && dev_replace->tgtdev != NULL &&
- op != BTRFS_MAP_READ) {
+ op != BTRFS_MAP_READ)
handle_ops_on_dev_replace(op, bioc, dev_replace, logical,
&io_geom.num_stripes, &io_geom.max_errors);
- }
+
+ up_read(&dev_replace->rwsem);
+
*bioc_ret = bioc;
bioc->num_stripes = io_geom.num_stripes;
@@ -6797,11 +6809,6 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
bioc->mirror_num = io_geom.mirror_num;
out:
- if (dev_replace_is_ongoing) {
- lockdep_assert_held(&dev_replace->rwsem);
- /* Unlock and let waiting writers proceed */
- up_read(&dev_replace->rwsem);
- }
btrfs_free_chunk_map(map);
return ret;
}
--
2.35.3
next prev parent reply other threads:[~2024-05-08 11:40 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-08 11:40 [PATCH v2 0/2] don't hold dev_replace rwsem over whole of btrfs_map_block Johannes Thumshirn
2024-05-08 11:40 ` Johannes Thumshirn [this message]
2024-05-08 11:40 ` [PATCH v2 2/2] btrfs: pass 'struct btrfs_io_geometry' into handle_ops_on_dev_replace Johannes Thumshirn
2024-05-08 12:38 ` [PATCH v2 0/2] don't hold dev_replace rwsem over whole of btrfs_map_block Filipe Manana
2024-05-09 17:39 ` David Sterba
2024-05-10 9:09 ` Johannes Thumshirn
2024-05-09 17:43 ` David Sterba
2024-05-10 9:11 ` Johannes Thumshirn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240508114016.18119-2-jth@kernel.org \
--to=jth@kernel.org \
--cc=fdmanana@kernel.org \
--cc=johannes.thumshirn@wdc.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.