* [PATCH v4 0/2] implement truncation for RAID stripe-extents @ 2024-10-17 9:04 Johannes Thumshirn 2024-10-17 9:04 ` [PATCH v4 1/2] btrfs: implement partial deletion of RAID stripe extents Johannes Thumshirn 2024-10-17 9:04 ` [PATCH v4 2/2] btrfs: implement self-tests for partial RAID srtipe-tree delete Johannes Thumshirn 0 siblings, 2 replies; 7+ messages in thread From: Johannes Thumshirn @ 2024-10-17 9:04 UTC (permalink / raw) To: linux-btrfs Cc: Josef Bacik, David Sterba, Filipe Manana, Naohiro Aota, Johannes Thumshirn From: Johannes Thumshirn <johannes.thumshirn@wdc.com> Implement truncation of RAID stripe-tree entries and add selftests for these two edge cases of deletion to the RAID stripe-tree selftests. Changes to v3: - Use btrfs_duplicate_item() (Filipe) - User key.offset = 0 in btrfs_search_slot() so we can find "front" delete Link to v3: https://lore.kernel.org/linux-btrfs/20241009153032.23336-1-jth@kernel.org Changes to v2: - Correctly adjust the btree keys and insert new items instead (Filipe) - Add selftests Link to v2: https://lore.kernel.org/linux-btrfs/20240911095206.31060-1-jth@kernel.org Johannes Thumshirn (2): btrfs: implement partial deletion of RAID stripe extents btrfs: implement self-tests for partial RAID srtipe-tree delete fs/btrfs/ctree.c | 1 + fs/btrfs/raid-stripe-tree.c | 78 ++++++++- fs/btrfs/tests/raid-stripe-tree-tests.c | 214 ++++++++++++++++++++++++ 3 files changed, 292 insertions(+), 1 deletion(-) -- 2.43.0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v4 1/2] btrfs: implement partial deletion of RAID stripe extents 2024-10-17 9:04 [PATCH v4 0/2] implement truncation for RAID stripe-extents Johannes Thumshirn @ 2024-10-17 9:04 ` Johannes Thumshirn 2024-10-22 13:52 ` Filipe Manana 2024-10-17 9:04 ` [PATCH v4 2/2] btrfs: implement self-tests for partial RAID srtipe-tree delete Johannes Thumshirn 1 sibling, 1 reply; 7+ messages in thread From: Johannes Thumshirn @ 2024-10-17 9:04 UTC (permalink / raw) To: linux-btrfs Cc: Josef Bacik, David Sterba, Filipe Manana, Naohiro Aota, Johannes Thumshirn From: Johannes Thumshirn <johannes.thumshirn@wdc.com> In our CI system, the RAID stripe tree configuration sometimes fails with the following ASSERT(): assertion failed: found_start >= start && found_end <= end, in fs/btrfs/raid-stripe-tree.c:64 This ASSERT()ion triggers, because for the initial design of RAID stripe-tree, I had the "one ordered-extent equals one bio" rule of zoned btrfs in mind. But for a RAID stripe-tree based system, that is not hosted on a zoned storage device, but on a regular device this rule doesn't apply. So in case the range we want to delete starts in the middle of the previous item, grab the item and "truncate" it's length. That is, clone the item, subtract the deleted portion from the key's offset, delete the old item and insert the new one. In case the range to delete ends in the middle of an item, we have to adjust both the item's key as well as the stripe extents and then re-insert the modified clone into the tree after deleting the old stripe extent. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> --- fs/btrfs/ctree.c | 1 + fs/btrfs/raid-stripe-tree.c | 94 ++++++++++++++++++++++++++++++++++--- 2 files changed, 88 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index b11ec86102e3..3f320f6e7767 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -3863,6 +3863,7 @@ static noinline int setup_leaf_for_split(struct btrfs_trans_handle *trans, btrfs_item_key_to_cpu(leaf, &key, path->slots[0]); BUG_ON(key.type != BTRFS_EXTENT_DATA_KEY && + key.type != BTRFS_RAID_STRIPE_KEY && key.type != BTRFS_EXTENT_CSUM_KEY); if (btrfs_leaf_free_space(leaf) >= ins_len) diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 41970bbdb05f..569273e42d85 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -13,6 +13,50 @@ #include "volumes.h" #include "print-tree.h" +static int btrfs_partially_delete_raid_extent(struct btrfs_trans_handle *trans, + struct btrfs_path *path, + struct btrfs_key *oldkey, + u64 newlen, u64 frontpad) +{ + struct btrfs_root *stripe_root = trans->fs_info->stripe_root; + struct btrfs_stripe_extent *extent; + struct extent_buffer *leaf; + int slot; + size_t item_size; + int ret; + struct btrfs_key newkey = { + .objectid = oldkey->objectid + frontpad, + .type = BTRFS_RAID_STRIPE_KEY, + .offset = newlen, + }; + + ASSERT(oldkey->type == BTRFS_RAID_STRIPE_KEY); + ret = btrfs_duplicate_item(trans, stripe_root, path, &newkey); + if (ret) + return ret; + + leaf = path->nodes[0]; + slot = path->slots[0]; + item_size = btrfs_item_size(leaf, slot); + extent = btrfs_item_ptr(leaf, slot, struct btrfs_stripe_extent); + + for (int i = 0; i < btrfs_num_raid_stripes(item_size); i++) { + struct btrfs_raid_stride *stride = &extent->strides[i]; + u64 phys; + + phys = btrfs_raid_stride_physical(leaf, stride); + btrfs_set_raid_stride_physical(leaf, stride, phys + frontpad); + } + + btrfs_mark_buffer_dirty(trans, leaf); + + /* delete the old item, after we've inserted a new one. */ + path->slots[0]--; + ret = btrfs_del_item(trans, stripe_root, path); + + return ret; +} + int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 length) { struct btrfs_fs_info *fs_info = trans->fs_info; @@ -36,23 +80,24 @@ int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 le while (1) { key.objectid = start; key.type = BTRFS_RAID_STRIPE_KEY; - key.offset = length; + key.offset = 0; ret = btrfs_search_slot(trans, stripe_root, &key, path, -1, 1); if (ret < 0) break; - if (ret > 0) { - ret = 0; - if (path->slots[0] == 0) - break; + + if (path->slots[0] == btrfs_header_nritems(path->nodes[0])) path->slots[0]--; - } leaf = path->nodes[0]; slot = path->slots[0]; btrfs_item_key_to_cpu(leaf, &key, slot); found_start = key.objectid; found_end = found_start + key.offset; + ret = 0; + + if (key.type != BTRFS_RAID_STRIPE_KEY) + break; /* That stripe ends before we start, we're done. */ if (found_end <= start) @@ -61,7 +106,42 @@ int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 le trace_btrfs_raid_extent_delete(fs_info, start, end, found_start, found_end); - ASSERT(found_start >= start && found_end <= end); + /* + * The stripe extent starts before the range we want to delete: + * + * |--- RAID Stripe Extent ---| + * |--- keep ---|--- drop ---| + * + * This means we have to duplicate the tree item, truncate the + * length to the new size and then re-insert the item. + */ + if (found_start < start) { + u64 diff = start - found_start; + + ret = btrfs_partially_delete_raid_extent(trans, path, + &key, + diff, 0); + break; + } + + /* + * The stripe extent ends after the range we want to delete: + * + * |--- RAID Stripe Extent ---| + * |--- drop ---|--- keep ---| + * + * This means we have to duplicate the tree item, truncate the + * length to the new size and then re-insert the item. + */ + if (found_end > end) { + u64 diff = found_end - end; + + ret = btrfs_partially_delete_raid_extent(trans, path, + &key, + diff, diff); + break; + } + ret = btrfs_del_item(trans, stripe_root, path); if (ret) break; -- 2.43.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v4 1/2] btrfs: implement partial deletion of RAID stripe extents 2024-10-17 9:04 ` [PATCH v4 1/2] btrfs: implement partial deletion of RAID stripe extents Johannes Thumshirn @ 2024-10-22 13:52 ` Filipe Manana 2024-10-22 15:37 ` Johannes Thumshirn 0 siblings, 1 reply; 7+ messages in thread From: Filipe Manana @ 2024-10-22 13:52 UTC (permalink / raw) To: Johannes Thumshirn Cc: linux-btrfs, Josef Bacik, David Sterba, Filipe Manana, Naohiro Aota, Johannes Thumshirn On Thu, Oct 17, 2024 at 10:04 AM Johannes Thumshirn <jth@kernel.org> wrote: > > From: Johannes Thumshirn <johannes.thumshirn@wdc.com> > > In our CI system, the RAID stripe tree configuration sometimes fails with > the following ASSERT(): > > assertion failed: found_start >= start && found_end <= end, in fs/btrfs/raid-stripe-tree.c:64 > > This ASSERT()ion triggers, because for the initial design of RAID > stripe-tree, I had the "one ordered-extent equals one bio" rule of zoned > btrfs in mind. > > But for a RAID stripe-tree based system, that is not hosted on a zoned > storage device, but on a regular device this rule doesn't apply. > > So in case the range we want to delete starts in the middle of the > previous item, grab the item and "truncate" it's length. That is, clone > the item, subtract the deleted portion from the key's offset, delete the > old item and insert the new one. > > In case the range to delete ends in the middle of an item, we have to > adjust both the item's key as well as the stripe extents and then > re-insert the modified clone into the tree after deleting the old stripe > extent. > > Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> > --- > fs/btrfs/ctree.c | 1 + > fs/btrfs/raid-stripe-tree.c | 94 ++++++++++++++++++++++++++++++++++--- > 2 files changed, 88 insertions(+), 7 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index b11ec86102e3..3f320f6e7767 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -3863,6 +3863,7 @@ static noinline int setup_leaf_for_split(struct btrfs_trans_handle *trans, > btrfs_item_key_to_cpu(leaf, &key, path->slots[0]); > > BUG_ON(key.type != BTRFS_EXTENT_DATA_KEY && > + key.type != BTRFS_RAID_STRIPE_KEY && > key.type != BTRFS_EXTENT_CSUM_KEY); > > if (btrfs_leaf_free_space(leaf) >= ins_len) > diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c > index 41970bbdb05f..569273e42d85 100644 > --- a/fs/btrfs/raid-stripe-tree.c > +++ b/fs/btrfs/raid-stripe-tree.c > @@ -13,6 +13,50 @@ > #include "volumes.h" > #include "print-tree.h" > > +static int btrfs_partially_delete_raid_extent(struct btrfs_trans_handle *trans, > + struct btrfs_path *path, > + struct btrfs_key *oldkey, oldkey can be made const. > + u64 newlen, u64 frontpad) > +{ > + struct btrfs_root *stripe_root = trans->fs_info->stripe_root; > + struct btrfs_stripe_extent *extent; > + struct extent_buffer *leaf; > + int slot; > + size_t item_size; > + int ret; > + struct btrfs_key newkey = { > + .objectid = oldkey->objectid + frontpad, > + .type = BTRFS_RAID_STRIPE_KEY, > + .offset = newlen, > + }; > + > + ASSERT(oldkey->type == BTRFS_RAID_STRIPE_KEY); > + ret = btrfs_duplicate_item(trans, stripe_root, path, &newkey); > + if (ret) > + return ret; > + > + leaf = path->nodes[0]; > + slot = path->slots[0]; > + item_size = btrfs_item_size(leaf, slot); > + extent = btrfs_item_ptr(leaf, slot, struct btrfs_stripe_extent); > + > + for (int i = 0; i < btrfs_num_raid_stripes(item_size); i++) { > + struct btrfs_raid_stride *stride = &extent->strides[i]; > + u64 phys; > + > + phys = btrfs_raid_stride_physical(leaf, stride); > + btrfs_set_raid_stride_physical(leaf, stride, phys + frontpad); > + } > + > + btrfs_mark_buffer_dirty(trans, leaf); This is redundant, it was already done by btrfs_duplicate_item(), by the btrfs_search_slot() call in the caller and done by btrfs_del_item() below as well. > + > + /* delete the old item, after we've inserted a new one. */ > + path->slots[0]--; > + ret = btrfs_del_item(trans, stripe_root, path); So actually looking at this, we don't need btrfs_duplicate_item() plus btrfs_del_item(), this can be more lightweight and simpler by doing just: 1) Do the for loop as it is. 2) Then after, or before the for loop, the order doesn't really matter, just do: btrfs_set_item_key_safe(trans, path, &newkey). Less code and it avoids adding a new item and deleting another one, with the shiftings of data in the leaf, etc. Thanks. > + > + return ret; > +} > + > int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 length) > { > struct btrfs_fs_info *fs_info = trans->fs_info; > @@ -36,23 +80,24 @@ int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 le > while (1) { > key.objectid = start; > key.type = BTRFS_RAID_STRIPE_KEY; > - key.offset = length; > + key.offset = 0; > > ret = btrfs_search_slot(trans, stripe_root, &key, path, -1, 1); > if (ret < 0) > break; > - if (ret > 0) { > - ret = 0; > - if (path->slots[0] == 0) > - break; > + > + if (path->slots[0] == btrfs_header_nritems(path->nodes[0])) > path->slots[0]--; > - } > > leaf = path->nodes[0]; > slot = path->slots[0]; > btrfs_item_key_to_cpu(leaf, &key, slot); > found_start = key.objectid; > found_end = found_start + key.offset; > + ret = 0; > + > + if (key.type != BTRFS_RAID_STRIPE_KEY) > + break; > > /* That stripe ends before we start, we're done. */ > if (found_end <= start) > @@ -61,7 +106,42 @@ int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 le > trace_btrfs_raid_extent_delete(fs_info, start, end, > found_start, found_end); > > - ASSERT(found_start >= start && found_end <= end); > + /* > + * The stripe extent starts before the range we want to delete: > + * > + * |--- RAID Stripe Extent ---| > + * |--- keep ---|--- drop ---| > + * > + * This means we have to duplicate the tree item, truncate the > + * length to the new size and then re-insert the item. > + */ > + if (found_start < start) { > + u64 diff = start - found_start; > + > + ret = btrfs_partially_delete_raid_extent(trans, path, > + &key, > + diff, 0); > + break; > + } > + > + /* > + * The stripe extent ends after the range we want to delete: > + * > + * |--- RAID Stripe Extent ---| > + * |--- drop ---|--- keep ---| > + * > + * This means we have to duplicate the tree item, truncate the > + * length to the new size and then re-insert the item. > + */ > + if (found_end > end) { > + u64 diff = found_end - end; > + > + ret = btrfs_partially_delete_raid_extent(trans, path, > + &key, > + diff, diff); > + break; > + } > + > ret = btrfs_del_item(trans, stripe_root, path); > if (ret) > break; > -- > 2.43.0 > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v4 1/2] btrfs: implement partial deletion of RAID stripe extents 2024-10-22 13:52 ` Filipe Manana @ 2024-10-22 15:37 ` Johannes Thumshirn 2024-10-22 15:41 ` Filipe Manana 0 siblings, 1 reply; 7+ messages in thread From: Johannes Thumshirn @ 2024-10-22 15:37 UTC (permalink / raw) To: Filipe Manana, Johannes Thumshirn Cc: linux-btrfs@vger.kernel.org, Josef Bacik, David Sterba, Filipe Manana, Naohiro Aota On 22.10.24 15:53, Filipe Manana wrote: > On Thu, Oct 17, 2024 at 10:04 AM Johannes Thumshirn <jth@kernel.org> wrote: >> >> From: Johannes Thumshirn <johannes.thumshirn@wdc.com> >> >> In our CI system, the RAID stripe tree configuration sometimes fails with >> the following ASSERT(): >> >> assertion failed: found_start >= start && found_end <= end, in fs/btrfs/raid-stripe-tree.c:64 >> >> This ASSERT()ion triggers, because for the initial design of RAID >> stripe-tree, I had the "one ordered-extent equals one bio" rule of zoned >> btrfs in mind. >> >> But for a RAID stripe-tree based system, that is not hosted on a zoned >> storage device, but on a regular device this rule doesn't apply. >> >> So in case the range we want to delete starts in the middle of the >> previous item, grab the item and "truncate" it's length. That is, clone >> the item, subtract the deleted portion from the key's offset, delete the >> old item and insert the new one. >> >> In case the range to delete ends in the middle of an item, we have to >> adjust both the item's key as well as the stripe extents and then >> re-insert the modified clone into the tree after deleting the old stripe >> extent. >> >> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> >> --- >> fs/btrfs/ctree.c | 1 + >> fs/btrfs/raid-stripe-tree.c | 94 ++++++++++++++++++++++++++++++++++--- >> 2 files changed, 88 insertions(+), 7 deletions(-) >> >> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c >> index b11ec86102e3..3f320f6e7767 100644 >> --- a/fs/btrfs/ctree.c >> +++ b/fs/btrfs/ctree.c >> @@ -3863,6 +3863,7 @@ static noinline int setup_leaf_for_split(struct btrfs_trans_handle *trans, >> btrfs_item_key_to_cpu(leaf, &key, path->slots[0]); >> >> BUG_ON(key.type != BTRFS_EXTENT_DATA_KEY && >> + key.type != BTRFS_RAID_STRIPE_KEY && >> key.type != BTRFS_EXTENT_CSUM_KEY); >> >> if (btrfs_leaf_free_space(leaf) >= ins_len) >> diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c >> index 41970bbdb05f..569273e42d85 100644 >> --- a/fs/btrfs/raid-stripe-tree.c >> +++ b/fs/btrfs/raid-stripe-tree.c >> @@ -13,6 +13,50 @@ >> #include "volumes.h" >> #include "print-tree.h" >> >> +static int btrfs_partially_delete_raid_extent(struct btrfs_trans_handle *trans, >> + struct btrfs_path *path, >> + struct btrfs_key *oldkey, > > oldkey can be made const. > >> + u64 newlen, u64 frontpad) >> +{ >> + struct btrfs_root *stripe_root = trans->fs_info->stripe_root; >> + struct btrfs_stripe_extent *extent; >> + struct extent_buffer *leaf; >> + int slot; >> + size_t item_size; >> + int ret; >> + struct btrfs_key newkey = { >> + .objectid = oldkey->objectid + frontpad, >> + .type = BTRFS_RAID_STRIPE_KEY, >> + .offset = newlen, >> + }; >> + >> + ASSERT(oldkey->type == BTRFS_RAID_STRIPE_KEY); >> + ret = btrfs_duplicate_item(trans, stripe_root, path, &newkey); >> + if (ret) >> + return ret; >> + >> + leaf = path->nodes[0]; >> + slot = path->slots[0]; >> + item_size = btrfs_item_size(leaf, slot); >> + extent = btrfs_item_ptr(leaf, slot, struct btrfs_stripe_extent); >> + >> + for (int i = 0; i < btrfs_num_raid_stripes(item_size); i++) { >> + struct btrfs_raid_stride *stride = &extent->strides[i]; >> + u64 phys; >> + >> + phys = btrfs_raid_stride_physical(leaf, stride); >> + btrfs_set_raid_stride_physical(leaf, stride, phys + frontpad); >> + } >> + >> + btrfs_mark_buffer_dirty(trans, leaf); > > This is redundant, it was already done by btrfs_duplicate_item(), by > the btrfs_search_slot() call in the caller and done by > btrfs_del_item() below as well. > > >> + >> + /* delete the old item, after we've inserted a new one. */ >> + path->slots[0]--; >> + ret = btrfs_del_item(trans, stripe_root, path); > > So actually looking at this, we don't need btrfs_duplicate_item() > plus btrfs_del_item(), this can be more lightweight and simpler by > doing just: > > 1) Do the for loop as it is. > > 2) Then after, or before the for loop, the order doesn't really > matter, just do: btrfs_set_item_key_safe(trans, path, &newkey). > > Less code and it avoids adding a new item and deleting another one, > with the shiftings of data in the leaf, etc. Oh I didn't know about btrfs_set_item_key_safe(), that sounds like a good plan thanks :) Can I still get rid of btrfs_mark_buffer_dirty then? ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v4 1/2] btrfs: implement partial deletion of RAID stripe extents 2024-10-22 15:37 ` Johannes Thumshirn @ 2024-10-22 15:41 ` Filipe Manana 0 siblings, 0 replies; 7+ messages in thread From: Filipe Manana @ 2024-10-22 15:41 UTC (permalink / raw) To: Johannes Thumshirn Cc: Johannes Thumshirn, linux-btrfs@vger.kernel.org, Josef Bacik, David Sterba, Filipe Manana, Naohiro Aota On Tue, Oct 22, 2024 at 4:37 PM Johannes Thumshirn <Johannes.Thumshirn@wdc.com> wrote: > > On 22.10.24 15:53, Filipe Manana wrote: > > On Thu, Oct 17, 2024 at 10:04 AM Johannes Thumshirn <jth@kernel.org> wrote: > >> > >> From: Johannes Thumshirn <johannes.thumshirn@wdc.com> > >> > >> In our CI system, the RAID stripe tree configuration sometimes fails with > >> the following ASSERT(): > >> > >> assertion failed: found_start >= start && found_end <= end, in fs/btrfs/raid-stripe-tree.c:64 > >> > >> This ASSERT()ion triggers, because for the initial design of RAID > >> stripe-tree, I had the "one ordered-extent equals one bio" rule of zoned > >> btrfs in mind. > >> > >> But for a RAID stripe-tree based system, that is not hosted on a zoned > >> storage device, but on a regular device this rule doesn't apply. > >> > >> So in case the range we want to delete starts in the middle of the > >> previous item, grab the item and "truncate" it's length. That is, clone > >> the item, subtract the deleted portion from the key's offset, delete the > >> old item and insert the new one. > >> > >> In case the range to delete ends in the middle of an item, we have to > >> adjust both the item's key as well as the stripe extents and then > >> re-insert the modified clone into the tree after deleting the old stripe > >> extent. > >> > >> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> > >> --- > >> fs/btrfs/ctree.c | 1 + > >> fs/btrfs/raid-stripe-tree.c | 94 ++++++++++++++++++++++++++++++++++--- > >> 2 files changed, 88 insertions(+), 7 deletions(-) > >> > >> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > >> index b11ec86102e3..3f320f6e7767 100644 > >> --- a/fs/btrfs/ctree.c > >> +++ b/fs/btrfs/ctree.c > >> @@ -3863,6 +3863,7 @@ static noinline int setup_leaf_for_split(struct btrfs_trans_handle *trans, > >> btrfs_item_key_to_cpu(leaf, &key, path->slots[0]); > >> > >> BUG_ON(key.type != BTRFS_EXTENT_DATA_KEY && > >> + key.type != BTRFS_RAID_STRIPE_KEY && > >> key.type != BTRFS_EXTENT_CSUM_KEY); > >> > >> if (btrfs_leaf_free_space(leaf) >= ins_len) > >> diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c > >> index 41970bbdb05f..569273e42d85 100644 > >> --- a/fs/btrfs/raid-stripe-tree.c > >> +++ b/fs/btrfs/raid-stripe-tree.c > >> @@ -13,6 +13,50 @@ > >> #include "volumes.h" > >> #include "print-tree.h" > >> > >> +static int btrfs_partially_delete_raid_extent(struct btrfs_trans_handle *trans, > >> + struct btrfs_path *path, > >> + struct btrfs_key *oldkey, > > > > oldkey can be made const. > > > >> + u64 newlen, u64 frontpad) > >> +{ > >> + struct btrfs_root *stripe_root = trans->fs_info->stripe_root; > >> + struct btrfs_stripe_extent *extent; > >> + struct extent_buffer *leaf; > >> + int slot; > >> + size_t item_size; > >> + int ret; > >> + struct btrfs_key newkey = { > >> + .objectid = oldkey->objectid + frontpad, > >> + .type = BTRFS_RAID_STRIPE_KEY, > >> + .offset = newlen, > >> + }; > >> + > >> + ASSERT(oldkey->type == BTRFS_RAID_STRIPE_KEY); > >> + ret = btrfs_duplicate_item(trans, stripe_root, path, &newkey); > >> + if (ret) > >> + return ret; > >> + > >> + leaf = path->nodes[0]; > >> + slot = path->slots[0]; > >> + item_size = btrfs_item_size(leaf, slot); > >> + extent = btrfs_item_ptr(leaf, slot, struct btrfs_stripe_extent); > >> + > >> + for (int i = 0; i < btrfs_num_raid_stripes(item_size); i++) { > >> + struct btrfs_raid_stride *stride = &extent->strides[i]; > >> + u64 phys; > >> + > >> + phys = btrfs_raid_stride_physical(leaf, stride); > >> + btrfs_set_raid_stride_physical(leaf, stride, phys + frontpad); > >> + } > >> + > >> + btrfs_mark_buffer_dirty(trans, leaf); > > > > This is redundant, it was already done by btrfs_duplicate_item(), by > > the btrfs_search_slot() call in the caller and done by > > btrfs_del_item() below as well. > > > > > >> + > >> + /* delete the old item, after we've inserted a new one. */ > >> + path->slots[0]--; > >> + ret = btrfs_del_item(trans, stripe_root, path); > > > > So actually looking at this, we don't need btrfs_duplicate_item() > > plus btrfs_del_item(), this can be more lightweight and simpler by > > doing just: > > > > 1) Do the for loop as it is. > > > > 2) Then after, or before the for loop, the order doesn't really > > matter, just do: btrfs_set_item_key_safe(trans, path, &newkey). > > > > Less code and it avoids adding a new item and deleting another one, > > with the shiftings of data in the leaf, etc. > > Oh I didn't know about btrfs_set_item_key_safe(), that sounds like a > good plan thanks :) > Can I still get rid of btrfs_mark_buffer_dirty then? Yes, even because btrfs_set_item_key_safe() already does it. ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v4 2/2] btrfs: implement self-tests for partial RAID srtipe-tree delete 2024-10-17 9:04 [PATCH v4 0/2] implement truncation for RAID stripe-extents Johannes Thumshirn 2024-10-17 9:04 ` [PATCH v4 1/2] btrfs: implement partial deletion of RAID stripe extents Johannes Thumshirn @ 2024-10-17 9:04 ` Johannes Thumshirn 2024-10-22 13:57 ` Filipe Manana 1 sibling, 1 reply; 7+ messages in thread From: Johannes Thumshirn @ 2024-10-17 9:04 UTC (permalink / raw) To: linux-btrfs Cc: Josef Bacik, David Sterba, Filipe Manana, Naohiro Aota, Johannes Thumshirn From: Johannes Thumshirn <johannes.thumshirn@wdc.com> Implement self-tests for partial deletion of RAID stripe-tree entries. These two new tests cover both the deletion of the front of a RAID stripe-tree stripe extent as well as truncation of an item to make it smaller. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> --- fs/btrfs/tests/raid-stripe-tree-tests.c | 223 ++++++++++++++++++++++++ 1 file changed, 223 insertions(+) diff --git a/fs/btrfs/tests/raid-stripe-tree-tests.c b/fs/btrfs/tests/raid-stripe-tree-tests.c index b8013ab13c43..5f1f1342b291 100644 --- a/fs/btrfs/tests/raid-stripe-tree-tests.c +++ b/fs/btrfs/tests/raid-stripe-tree-tests.c @@ -29,6 +29,227 @@ static struct btrfs_device *btrfs_device_by_devid(struct btrfs_fs_devices *fs_de return NULL; } +/* + * Test a 64K RST write on a 2 disk RAID1 at a logical address of 1M and then + * delete the 1st 32K, making the new start address 1M+32K. + */ +static int test_front_delete(struct btrfs_trans_handle *trans) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + struct btrfs_io_context *bioc; + struct btrfs_io_stripe io_stripe = { 0 }; + u64 map_type = RST_TEST_RAID1_TYPE; + u64 logical = SZ_1M; + u64 len = SZ_64K; + int ret; + + bioc = alloc_btrfs_io_context(fs_info, logical, RST_TEST_NUM_DEVICES); + if (!bioc) { + test_std_err(TEST_ALLOC_IO_CONTEXT); + ret = -ENOMEM; + goto out; + } + + io_stripe.dev = btrfs_device_by_devid(fs_info->fs_devices, 0); + bioc->map_type = map_type; + bioc->size = len; + + for (int i = 0; i < RST_TEST_NUM_DEVICES; i++) { + struct btrfs_io_stripe *stripe = &bioc->stripes[i]; + + stripe->dev = btrfs_device_by_devid(fs_info->fs_devices, i); + if (!stripe->dev) { + test_err("cannot find device with devid %d", i); + ret = -EINVAL; + goto out; + } + + stripe->physical = logical + i * SZ_1G; + } + + ret = btrfs_insert_one_raid_extent(trans, bioc); + if (ret) { + test_err("inserting RAID extent failed: %d", ret); + goto out; + } + + ret = btrfs_get_raid_extent_offset(fs_info, logical, &len, map_type, 0, + &io_stripe); + if (ret) { + test_err("lookup of RAID extent [%llu, %llu] failed", logical, + logical + len); + goto out; + } + + if (io_stripe.physical != logical) { + test_err("invalid physical address, expected %llu got %llu", + logical, io_stripe.physical); + ret = -EINVAL; + goto out; + } + + if (len != SZ_64K) { + test_err("invalid stripe length, expected %llu got %llu", + (u64)SZ_64K, len); + ret = -EINVAL; + goto out; + } + + ret = btrfs_delete_raid_extent(trans, logical, SZ_32K); + if (ret) { + test_err("deleting RAID extent [%llu, %llu] failed", logical, + logical + SZ_32K); + goto out; + } + + len = SZ_32K; + ret = btrfs_get_raid_extent_offset(fs_info, logical + SZ_32K, &len, + map_type, 0, &io_stripe); + if (ret) { + test_err("lookup of RAID extent [%llu, %llu] failed", + logical + SZ_32K, logical + SZ_32K + len); + goto out; + } + + if (io_stripe.physical != logical + SZ_32K) { + test_err("invalid physical address, expected %llu, got %llu", + logical + SZ_32K, io_stripe.physical); + ret = -EINVAL; + goto out; + } + + if (len != SZ_32K) { + test_err("invalid stripe length, expected %llu, got %llu", + (u64)SZ_32K, len); + ret = -EINVAL; + goto out; + } + + ret = btrfs_get_raid_extent_offset(fs_info, logical, &len, map_type, 0, + &io_stripe); + if (!ret) { + ret = -EINVAL; + test_err("lookup of RAID extent [%llu, %llu] succeeded, should fail", + logical, logical + SZ_32K); + goto out; + } + + ret = btrfs_delete_raid_extent(trans, logical + SZ_32K, SZ_32K); + out: + return ret; +} + +/* + * Test a 64K RST write on a 2 disk RAID1 at a logical address of 1M and then + * truncate the stripe extent down to 32K. + */ +static int test_tail_delete(struct btrfs_trans_handle *trans) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + struct btrfs_io_context *bioc; + struct btrfs_io_stripe io_stripe = { 0 }; + u64 map_type = RST_TEST_RAID1_TYPE; + u64 logical = SZ_1M; + u64 len = SZ_64K; + int ret; + + bioc = alloc_btrfs_io_context(fs_info, logical, RST_TEST_NUM_DEVICES); + if (!bioc) { + test_std_err(TEST_ALLOC_IO_CONTEXT); + ret = -ENOMEM; + goto out; + } + + io_stripe.dev = btrfs_device_by_devid(fs_info->fs_devices, 0); + bioc->map_type = map_type; + bioc->size = len; + + for (int i = 0; i < RST_TEST_NUM_DEVICES; i++) { + struct btrfs_io_stripe *stripe = &bioc->stripes[i]; + + stripe->dev = btrfs_device_by_devid(fs_info->fs_devices, i); + if (!stripe->dev) { + test_err("cannot find device with devid %d", i); + ret = -EINVAL; + goto out; + } + + stripe->physical = logical + i * SZ_1G; + } + + ret = btrfs_insert_one_raid_extent(trans, bioc); + if (ret) { + test_err("inserting RAID extent failed: %d", ret); + goto out; + } + + io_stripe.dev = btrfs_device_by_devid(fs_info->fs_devices, 0); + if (!io_stripe.dev) { + ret = -EINVAL; + goto out; + } + + ret = btrfs_get_raid_extent_offset(fs_info, logical, &len, map_type, 0, + &io_stripe); + if (ret) { + test_err("lookup of RAID extent [%llu, %llu] failed", logical, + logical + len); + goto out; + } + + if (io_stripe.physical != logical) { + test_err("invalid physical address, expected %llu got %llu", + logical, io_stripe.physical); + ret = -EINVAL; + goto out; + } + + if (len != SZ_64K) { + test_err("invalid stripe length, expected %llu got %llu", + (u64)SZ_64K, len); + ret = -EINVAL; + goto out; + } + + ret = btrfs_delete_raid_extent(trans, logical + SZ_32K, SZ_32K); + if (ret) { + test_err("deleting RAID extent [%llu, %llu] failed", + logical + SZ_32K, logical + SZ_64K); + goto out; + } + + len = SZ_32K; + ret = btrfs_get_raid_extent_offset(fs_info, logical, &len, map_type, 0, &io_stripe); + if (ret) { + test_err("lookup of RAID extent [%llu, %llu] failed", logical, + logical + len); + goto out; + } + + if (io_stripe.physical != logical) { + test_err("invalid physical address, expected %llu, got %llu", + logical, io_stripe.physical); + ret = -EINVAL; + goto out; + } + + if (len != SZ_32K) { + test_err("invalid stripe length, expected %llu, got %llu", + (u64)SZ_32K, len); + ret = -EINVAL; + goto out; + } + + ret = btrfs_delete_raid_extent(trans, logical, len); + if (ret) + test_err("deleting RAID extent [%llu, %llu] failed", logical, + logical + len); + +out: + btrfs_put_bioc(bioc); + return ret; +} + /* * Test a 64K RST write on a 2 disk RAID1 at a logical address of 1M and then * overwrite the whole range giving it new physical address at an offset of 1G. @@ -235,6 +456,8 @@ static int test_simple_create_delete(struct btrfs_trans_handle *trans) static const test_func_t tests[] = { test_simple_create_delete, test_create_update_delete, + test_tail_delete, + test_front_delete, }; static int run_test(test_func_t test, u32 sectorsize, u32 nodesize) -- 2.43.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v4 2/2] btrfs: implement self-tests for partial RAID srtipe-tree delete 2024-10-17 9:04 ` [PATCH v4 2/2] btrfs: implement self-tests for partial RAID srtipe-tree delete Johannes Thumshirn @ 2024-10-22 13:57 ` Filipe Manana 0 siblings, 0 replies; 7+ messages in thread From: Filipe Manana @ 2024-10-22 13:57 UTC (permalink / raw) To: Johannes Thumshirn Cc: linux-btrfs, Josef Bacik, David Sterba, Filipe Manana, Naohiro Aota, Johannes Thumshirn On Thu, Oct 17, 2024 at 10:09 AM Johannes Thumshirn <jth@kernel.org> wrote: > > From: Johannes Thumshirn <johannes.thumshirn@wdc.com> > > Implement self-tests for partial deletion of RAID stripe-tree entries. > > These two new tests cover both the deletion of the front of a RAID > stripe-tree stripe extent as well as truncation of an item to make it > smaller. > > Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> > --- > fs/btrfs/tests/raid-stripe-tree-tests.c | 223 ++++++++++++++++++++++++ > 1 file changed, 223 insertions(+) > > diff --git a/fs/btrfs/tests/raid-stripe-tree-tests.c b/fs/btrfs/tests/raid-stripe-tree-tests.c > index b8013ab13c43..5f1f1342b291 100644 > --- a/fs/btrfs/tests/raid-stripe-tree-tests.c > +++ b/fs/btrfs/tests/raid-stripe-tree-tests.c > @@ -29,6 +29,227 @@ static struct btrfs_device *btrfs_device_by_devid(struct btrfs_fs_devices *fs_de > return NULL; > } > > +/* > + * Test a 64K RST write on a 2 disk RAID1 at a logical address of 1M and then > + * delete the 1st 32K, making the new start address 1M+32K. > + */ > +static int test_front_delete(struct btrfs_trans_handle *trans) > +{ > + struct btrfs_fs_info *fs_info = trans->fs_info; > + struct btrfs_io_context *bioc; > + struct btrfs_io_stripe io_stripe = { 0 }; > + u64 map_type = RST_TEST_RAID1_TYPE; > + u64 logical = SZ_1M; > + u64 len = SZ_64K; > + int ret; > + > + bioc = alloc_btrfs_io_context(fs_info, logical, RST_TEST_NUM_DEVICES); > + if (!bioc) { > + test_std_err(TEST_ALLOC_IO_CONTEXT); > + ret = -ENOMEM; > + goto out; > + } > + > + io_stripe.dev = btrfs_device_by_devid(fs_info->fs_devices, 0); > + bioc->map_type = map_type; > + bioc->size = len; > + > + for (int i = 0; i < RST_TEST_NUM_DEVICES; i++) { > + struct btrfs_io_stripe *stripe = &bioc->stripes[i]; > + > + stripe->dev = btrfs_device_by_devid(fs_info->fs_devices, i); > + if (!stripe->dev) { > + test_err("cannot find device with devid %d", i); > + ret = -EINVAL; > + goto out; > + } > + > + stripe->physical = logical + i * SZ_1G; > + } > + > + ret = btrfs_insert_one_raid_extent(trans, bioc); > + if (ret) { > + test_err("inserting RAID extent failed: %d", ret); > + goto out; > + } > + > + ret = btrfs_get_raid_extent_offset(fs_info, logical, &len, map_type, 0, > + &io_stripe); > + if (ret) { > + test_err("lookup of RAID extent [%llu, %llu] failed", logical, > + logical + len); > + goto out; > + } > + > + if (io_stripe.physical != logical) { > + test_err("invalid physical address, expected %llu got %llu", > + logical, io_stripe.physical); > + ret = -EINVAL; > + goto out; > + } > + > + if (len != SZ_64K) { > + test_err("invalid stripe length, expected %llu got %llu", > + (u64)SZ_64K, len); > + ret = -EINVAL; > + goto out; > + } > + > + ret = btrfs_delete_raid_extent(trans, logical, SZ_32K); > + if (ret) { > + test_err("deleting RAID extent [%llu, %llu] failed", logical, > + logical + SZ_32K); > + goto out; > + } > + > + len = SZ_32K; > + ret = btrfs_get_raid_extent_offset(fs_info, logical + SZ_32K, &len, > + map_type, 0, &io_stripe); > + if (ret) { > + test_err("lookup of RAID extent [%llu, %llu] failed", > + logical + SZ_32K, logical + SZ_32K + len); > + goto out; > + } > + > + if (io_stripe.physical != logical + SZ_32K) { > + test_err("invalid physical address, expected %llu, got %llu", > + logical + SZ_32K, io_stripe.physical); > + ret = -EINVAL; > + goto out; > + } > + > + if (len != SZ_32K) { > + test_err("invalid stripe length, expected %llu, got %llu", > + (u64)SZ_32K, len); > + ret = -EINVAL; > + goto out; > + } > + > + ret = btrfs_get_raid_extent_offset(fs_info, logical, &len, map_type, 0, > + &io_stripe); > + if (!ret) { > + ret = -EINVAL; > + test_err("lookup of RAID extent [%llu, %llu] succeeded, should fail", > + logical, logical + SZ_32K); > + goto out; > + } > + > + ret = btrfs_delete_raid_extent(trans, logical + SZ_32K, SZ_32K); > + out: Missies a: btrfs_put_bioc(bioc); The rest looks good, and with that fixed: Reviewed-by: Filipe Manana <fdmanana@suse.com> Thanks. > + return ret; > +} > + > +/* > + * Test a 64K RST write on a 2 disk RAID1 at a logical address of 1M and then > + * truncate the stripe extent down to 32K. > + */ > +static int test_tail_delete(struct btrfs_trans_handle *trans) > +{ > + struct btrfs_fs_info *fs_info = trans->fs_info; > + struct btrfs_io_context *bioc; > + struct btrfs_io_stripe io_stripe = { 0 }; > + u64 map_type = RST_TEST_RAID1_TYPE; > + u64 logical = SZ_1M; > + u64 len = SZ_64K; > + int ret; > + > + bioc = alloc_btrfs_io_context(fs_info, logical, RST_TEST_NUM_DEVICES); > + if (!bioc) { > + test_std_err(TEST_ALLOC_IO_CONTEXT); > + ret = -ENOMEM; > + goto out; > + } > + > + io_stripe.dev = btrfs_device_by_devid(fs_info->fs_devices, 0); > + bioc->map_type = map_type; > + bioc->size = len; > + > + for (int i = 0; i < RST_TEST_NUM_DEVICES; i++) { > + struct btrfs_io_stripe *stripe = &bioc->stripes[i]; > + > + stripe->dev = btrfs_device_by_devid(fs_info->fs_devices, i); > + if (!stripe->dev) { > + test_err("cannot find device with devid %d", i); > + ret = -EINVAL; > + goto out; > + } > + > + stripe->physical = logical + i * SZ_1G; > + } > + > + ret = btrfs_insert_one_raid_extent(trans, bioc); > + if (ret) { > + test_err("inserting RAID extent failed: %d", ret); > + goto out; > + } > + > + io_stripe.dev = btrfs_device_by_devid(fs_info->fs_devices, 0); > + if (!io_stripe.dev) { > + ret = -EINVAL; > + goto out; > + } > + > + ret = btrfs_get_raid_extent_offset(fs_info, logical, &len, map_type, 0, > + &io_stripe); > + if (ret) { > + test_err("lookup of RAID extent [%llu, %llu] failed", logical, > + logical + len); > + goto out; > + } > + > + if (io_stripe.physical != logical) { > + test_err("invalid physical address, expected %llu got %llu", > + logical, io_stripe.physical); > + ret = -EINVAL; > + goto out; > + } > + > + if (len != SZ_64K) { > + test_err("invalid stripe length, expected %llu got %llu", > + (u64)SZ_64K, len); > + ret = -EINVAL; > + goto out; > + } > + > + ret = btrfs_delete_raid_extent(trans, logical + SZ_32K, SZ_32K); > + if (ret) { > + test_err("deleting RAID extent [%llu, %llu] failed", > + logical + SZ_32K, logical + SZ_64K); > + goto out; > + } > + > + len = SZ_32K; > + ret = btrfs_get_raid_extent_offset(fs_info, logical, &len, map_type, 0, &io_stripe); > + if (ret) { > + test_err("lookup of RAID extent [%llu, %llu] failed", logical, > + logical + len); > + goto out; > + } > + > + if (io_stripe.physical != logical) { > + test_err("invalid physical address, expected %llu, got %llu", > + logical, io_stripe.physical); > + ret = -EINVAL; > + goto out; > + } > + > + if (len != SZ_32K) { > + test_err("invalid stripe length, expected %llu, got %llu", > + (u64)SZ_32K, len); > + ret = -EINVAL; > + goto out; > + } > + > + ret = btrfs_delete_raid_extent(trans, logical, len); > + if (ret) > + test_err("deleting RAID extent [%llu, %llu] failed", logical, > + logical + len); > + > +out: > + btrfs_put_bioc(bioc); > + return ret; > +} > + > /* > * Test a 64K RST write on a 2 disk RAID1 at a logical address of 1M and then > * overwrite the whole range giving it new physical address at an offset of 1G. > @@ -235,6 +456,8 @@ static int test_simple_create_delete(struct btrfs_trans_handle *trans) > static const test_func_t tests[] = { > test_simple_create_delete, > test_create_update_delete, > + test_tail_delete, > + test_front_delete, > }; > > static int run_test(test_func_t test, u32 sectorsize, u32 nodesize) > -- > 2.43.0 > > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-10-22 15:42 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-10-17 9:04 [PATCH v4 0/2] implement truncation for RAID stripe-extents Johannes Thumshirn 2024-10-17 9:04 ` [PATCH v4 1/2] btrfs: implement partial deletion of RAID stripe extents Johannes Thumshirn 2024-10-22 13:52 ` Filipe Manana 2024-10-22 15:37 ` Johannes Thumshirn 2024-10-22 15:41 ` Filipe Manana 2024-10-17 9:04 ` [PATCH v4 2/2] btrfs: implement self-tests for partial RAID srtipe-tree delete Johannes Thumshirn 2024-10-22 13:57 ` Filipe Manana
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).