From: Josef Bacik <josef@toxicpanda.com>
To: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [RFC v3 05/11] btrfs: lookup physical address from stripe extent
Date: Thu, 20 Oct 2022 11:34:14 -0400
Message-ID: <Y1FqdvFiC2V3FwCa@localhost.localdomain>
In-Reply-To: <85853887c5f50188e32f879be823c690c33af9d3.1666007330.git.johannes.thumshirn@wdc.com>

On Mon, Oct 17, 2022 at 04:55:23AM -0700, Johannes Thumshirn wrote:
> Lookup the physical address from the raid stripe tree when a read on an
> RAID volume formatted with the raid stripe tree was attempted.
>
> If the requested logical address was not found in the stripe tree, it may
> still be in the in-memory ordered stripe tree, so fallback to searching
> the ordered stripe tree in this case.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
> fs/btrfs/raid-stripe-tree.c | 142 ++++++++++++++++++++++++++++++++++++
> fs/btrfs/raid-stripe-tree.h | 3 +
> fs/btrfs/volumes.c | 30 ++++++--
> 3 files changed, 168 insertions(+), 7 deletions(-)
>
> diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c
> index 5750857c2a75..91e67600e01a 100644
> --- a/fs/btrfs/raid-stripe-tree.c
> +++ b/fs/btrfs/raid-stripe-tree.c
> @@ -218,3 +218,145 @@ int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans,
>
> return ret;
> }
> +
> +static bool btrfs_physical_from_ordered_stripe(struct btrfs_fs_info *fs_info,
> + u64 logical, u64 *length,
> + int num_stripes,
> + struct btrfs_io_stripe *stripe)
> +{
> + struct btrfs_ordered_stripe *os;
> + u64 offset;
> + u64 found_end;
> + u64 end;
> + int i;
> +
> + os = btrfs_lookup_ordered_stripe(fs_info, logical);
> + if (!os)
> + return false;
> +
> + end = logical + *length;
> + found_end = os->logical + os->num_bytes;
> + if (end > found_end)
> + *length -= end - found_end;
> +
> + for (i = 0; i < num_stripes; i++) {
> + if (os->stripes[i].dev != stripe->dev)
> + continue;
> +
> + offset = logical - os->logical;
> + ASSERT(offset >= 0);
> + stripe->physical = os->stripes[i].physical + offset;
> + btrfs_put_ordered_stripe(fs_info, os);
> + break;
> + }
> +
> + return true;
> +}
> +
> +int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info,
> + u64 logical, u64 *length, u64 map_type,
> + struct btrfs_io_stripe *stripe)
> +{
> + struct btrfs_root *stripe_root = fs_info->stripe_root;
> + int num_stripes = btrfs_bg_type_to_factor(map_type);
> + struct btrfs_dp_stripe *raid_stripe;
> + struct btrfs_key stripe_key;
> + struct btrfs_key found_key;
> + struct btrfs_path *path;
> + struct extent_buffer *leaf;
> + u64 offset;
> + u64 found_logical;
> + u64 found_length;
> + u64 end;
> + u64 found_end;
> + int slot;
> + int ret;
> + int i;
> +
> + /*
> + * If we still have the stripe in the ordered stripe tree get it from
> + * there
> + */
> + if (btrfs_physical_from_ordered_stripe(fs_info, logical, length,
> + num_stripes, stripe))
> + return 0;
> +
> + stripe_key.objectid = logical;
> + stripe_key.type = BTRFS_RAID_STRIPE_KEY;
> + stripe_key.offset = 0;
> +
> + path = btrfs_alloc_path();
> + if (!path)
> + return -ENOMEM;
> +
> + ret = btrfs_search_slot(NULL, stripe_root, &stripe_key, path, 0, 0);
> + if (ret < 0)
> + goto out;
> + if (ret) {
> + if (path->slots[0] != 0)
> + path->slots[0]--;
> + }
> +
> + end = logical + *length;
> +
> + while (1) {
> + leaf = path->nodes[0];
> + slot = path->slots[0];
> +
> + btrfs_item_key_to_cpu(leaf, &found_key, slot);
> + found_logical = found_key.objectid;
> + found_length = found_key.offset;
> +

Don't we have fancy new iterators for walking through the btree? Can they be
used here instead of this old-style walk?

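To illustrate the suggestion, here is a minimal userspace sketch of the same
lookup written as an iterator-style walk. It is not kernel code: struct
stripe_item, for_each_stripe_item() and lookup_physical() are hypothetical
names made up for this example, and a sorted array stands in for the leaf. The
in-kernel equivalent would presumably be something like btrfs_for_each_slot(),
but that is an assumption about which iterators are meant.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one stripe extent item, keyed by (logical, length). */
struct stripe_item {
	uint64_t logical;
	uint64_t length;
	uint64_t physical;
};

/* Hypothetical for-each helper that hides the "advance to next item" step. */
#define for_each_stripe_item(items, count, it) \
	for ((it) = (items); (it) < (items) + (count); (it)++)

/*
 * Find the item covering @logical in a sorted array, trim *@length so the
 * request does not cross the end of that item, and return the matching
 * physical address through @physical.
 */
static int lookup_physical(const struct stripe_item *items, size_t count,
			   uint64_t logical, uint64_t *length,
			   uint64_t *physical)
{
	const struct stripe_item *it;
	uint64_t end = logical + *length;

	for_each_stripe_item(items, count, it) {
		uint64_t found_end = it->logical + it->length;

		/* Items are sorted, so nothing further can match. */
		if (it->logical > end)
			break;
		/* @logical is outside of this item, try the next one. */
		if (logical < it->logical || logical >= found_end)
			continue;
		/* The request crosses the end of the item, shorten it. */
		if (end > found_end)
			*length -= end - found_end;
		*physical = it->physical + (logical - it->logical);
		return 0;
	}
	return -ENOENT;
}
```

With items { 0, 4096, 1000000 } and { 4096, 8192, 5000000 }, a 16384-byte
read at logical 6144 is trimmed to 6144 bytes and maps to physical 5002048.
The point of the iterator form is that the loop body no longer needs the
"next:" label or an explicit call to advance the path.
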
> + if (found_logical > end)
> + break;
> +
> + if (!in_range(logical, found_logical, found_length))
> + goto next;
> +
> + offset = logical - found_logical;
> + found_end = found_logical + found_length;
> +
> + /*
> + * If we have a logically contiguous, but physically
> + * noncontinuous range, we need to split the bio. Record the
> + * length after which we must split the bio.
> + */
> + if (end > found_end)
> + *length -= end - found_end;
> +
> + raid_stripe = btrfs_item_ptr(leaf, slot, struct btrfs_dp_stripe);
> + for (i = 0; i < num_stripes; i++) {
> + if (btrfs_stripe_extent_devid_nr(leaf, raid_stripe, i) !=
> + stripe->dev->devid)
> + continue;
> + stripe->physical = btrfs_stripe_extent_physical_nr(leaf,
> + raid_stripe, i) + offset;
> + ret = 0;
> + goto out;
> + }
> +
> + /*
> + * If we're here, we haven't found the requested devid in the
> + * stripe.
> + */
> + ret = -ENOENT;
> + goto out;
> +next:
> + ret = btrfs_next_item(stripe_root, path);
> + if (ret)
> + break;
> + }
> +
> +out:
> + if (ret > 0)
> + ret = -ENOENT;
> + if (ret) {

Maybe instead

	if (ret && ret != -EIO)

I have a lot of boxes, and a given percentage of them have bad disks, which
results in a lot of btrfs_print_tree() output that I don't need.

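As a small illustration of the proposed condition, the sketch below separates
the two decisions made at the out: label. Both helper names are hypothetical
and exist only for this example; the logic mirrors the quoted hunk plus the
suggested -EIO exception.

```c
#include <errno.h>
#include <stdbool.h>

/* A positive return means "not found", which the out: label maps to -ENOENT. */
static int normalize_lookup_ret(int ret)
{
	return ret > 0 ? -ENOENT : ret;
}

/*
 * Hypothetical helper: report and dump the leaf for lookup failures, but
 * stay quiet on -EIO, where a failing disk already explains the error.
 */
static bool should_report_lookup_failure(int ret)
{
	return ret && ret != -EIO;
}
```

With this filter, a missing stripe extent (-ENOENT) still produces the error
message and the leaf dump, while a read error from a bad disk does not.
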
> + btrfs_err(fs_info,
> + "cannot find raid-stripe for logical [%llu, %llu]",
> + logical, logical + *length);
> + btrfs_print_tree(leaf, 1);
> + }
> + btrfs_free_path(path);
> +
> + return ret;
> +}
> diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h
> index 3456251d0739..083e754f5239 100644
> --- a/fs/btrfs/raid-stripe-tree.h
> +++ b/fs/btrfs/raid-stripe-tree.h
> @@ -16,6 +16,9 @@ struct btrfs_ordered_stripe {
> refcount_t ref;
> };
>
> +int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info,
> + u64 logical, u64 *length, u64 map_type,
> + struct btrfs_io_stripe *stripe);
> int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start,
> u64 length);
> int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans,
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 261bf6dd17bc..c67d76d93982 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -6313,12 +6313,21 @@ static u64 btrfs_max_io_len(struct map_lookup *map, enum btrfs_map_op op,
> return U64_MAX;
> }
>
> -static void set_io_stripe(struct btrfs_io_stripe *dst, const struct map_lookup *map,
> - u32 stripe_index, u64 stripe_offset, u64 stripe_nr)
> +static int set_io_stripe(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
> + u64 logical, u64 *length, struct btrfs_io_stripe *dst,
> + struct map_lookup *map, u32 stripe_index,
> + u64 stripe_offset, u64 stripe_nr)
> {
> dst->dev = map->stripes[stripe_index].dev;
> +
> + if (fs_info->stripe_root && op == BTRFS_MAP_READ &&
> + btrfs_need_stripe_tree_update(fs_info, map->type))

We already check fs_info->stripe_root in btrfs_need_stripe_tree_update(), so
this can be simplified to

	if (op == BTRFS_MAP_READ &&
	    btrfs_need_stripe_tree_update(fs_info, map->type))
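
The following userspace sketch shows why the extra check is redundant. It
assumes that the helper returns false when there is no stripe root; the
types, the enum and the placeholder profile check are hypothetical stand-ins
and not the real btrfs definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types used in the quoted hunk. */
struct fs_info {
	void *stripe_root;
};

enum map_op { MAP_READ, MAP_WRITE };

/* Assumed shape of the helper: it bails out early without a stripe root. */
static bool need_stripe_tree_update(const struct fs_info *fs_info,
				    uint64_t map_type)
{
	if (!fs_info->stripe_root)
		return false;
	/* Placeholder for the real block group profile check. */
	return map_type != 0;
}

/* Condition as written in the patch. */
static bool cond_original(const struct fs_info *fs_info, enum map_op op,
			  uint64_t map_type)
{
	return fs_info->stripe_root && op == MAP_READ &&
	       need_stripe_tree_update(fs_info, map_type);
}

/* Condition with the redundant stripe_root test dropped. */
static bool cond_simplified(const struct fs_info *fs_info, enum map_op op,
			    uint64_t map_type)
{
	return op == MAP_READ && need_stripe_tree_update(fs_info, map_type);
}
```

Under that assumption both conditions evaluate to the same value for every
combination of stripe root, operation and map type.
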

Thanks,

Josef