Linux Btrfs filesystem development
 help / color / mirror / Atom feed
* [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents
@ 2024-06-04 11:08 fdmanana
  2024-06-04 11:08 ` [PATCH 1/6] btrfs: reduce critical section at btrfs_wait_ordered_roots() fdmanana
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: fdmanana @ 2024-06-04 11:08 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

Details in the individual change logs.

Filipe Manana (6):
  btrfs: reduce critical section at btrfs_wait_ordered_roots()
  btrfs: reduce critical section at btrfs_wait_ordered_extents()
  btrfs: add comment about locking to btrfs_split_ordered_extent()
  btrfs: avoid removal and re-insertion of split ordered extent
  btrfs: mark ordered extent insertion failure checks as unlikely
  btrfs: update panic message when splitting ordered extent

 fs/btrfs/ordered-data.c | 51 ++++++++++++++++++++++++-----------------
 1 file changed, 30 insertions(+), 21 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/6] btrfs: reduce critical section at btrfs_wait_ordered_roots()
  2024-06-04 11:08 [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents fdmanana
@ 2024-06-04 11:08 ` fdmanana
  2024-06-04 11:08 ` [PATCH 2/6] btrfs: reduce critical section at btrfs_wait_ordered_extents() fdmanana
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: fdmanana @ 2024-06-04 11:08 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

At btrfs_wait_ordered_roots(), there's no point in decrementing the
counter after locking fs_info->ordered_root_lock as the counter is local.
So change this to decrement the counter before taking the lock.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ordered-data.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 65d0464cd646..15428a8d4886 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -829,10 +829,10 @@ void btrfs_wait_ordered_roots(struct btrfs_fs_info *fs_info, u64 nr,
 		done = btrfs_wait_ordered_extents(root, nr, bg);
 		btrfs_put_root(root);
 
-		spin_lock(&fs_info->ordered_root_lock);
-		if (nr != U64_MAX) {
+		if (nr != U64_MAX)
 			nr -= done;
-		}
+
+		spin_lock(&fs_info->ordered_root_lock);
 	}
 	list_splice_tail(&splice, &fs_info->ordered_roots);
 	spin_unlock(&fs_info->ordered_root_lock);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/6] btrfs: reduce critical section at btrfs_wait_ordered_extents()
  2024-06-04 11:08 [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents fdmanana
  2024-06-04 11:08 ` [PATCH 1/6] btrfs: reduce critical section at btrfs_wait_ordered_roots() fdmanana
@ 2024-06-04 11:08 ` fdmanana
  2024-06-04 11:08 ` [PATCH 3/6] btrfs: add comment about locking to btrfs_split_ordered_extent() fdmanana
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: fdmanana @ 2024-06-04 11:08 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

At btrfs_wait_ordered_extents(), there's no point in updating the counters
after locking the root's ordered extent lock, as the counters are local.
So change this to update the counters before taking the lock.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ordered-data.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 15428a8d4886..1cabcfa85e7c 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -783,10 +783,10 @@ u64 btrfs_wait_ordered_extents(struct btrfs_root *root, u64 nr,
 		btrfs_queue_work(fs_info->flush_workers, &ordered->flush_work);
 
 		cond_resched();
-		spin_lock(&root->ordered_extent_lock);
 		if (nr != U64_MAX)
 			nr--;
 		count++;
+		spin_lock(&root->ordered_extent_lock);
 	}
 	list_splice_tail(&skipped, &root->ordered_extents);
 	list_splice_tail(&splice, &root->ordered_extents);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/6] btrfs: add comment about locking to btrfs_split_ordered_extent()
  2024-06-04 11:08 [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents fdmanana
  2024-06-04 11:08 ` [PATCH 1/6] btrfs: reduce critical section at btrfs_wait_ordered_roots() fdmanana
  2024-06-04 11:08 ` [PATCH 2/6] btrfs: reduce critical section at btrfs_wait_ordered_extents() fdmanana
@ 2024-06-04 11:08 ` fdmanana
  2024-06-04 11:08 ` [PATCH 4/6] btrfs: avoid removal and re-insertion of split ordered extent fdmanana
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: fdmanana @ 2024-06-04 11:08 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

There are subtle details about why the root's ordered_extent_lock is held,
so add a comment mentioning them.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ordered-data.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 1cabcfa85e7c..1f7f6720b2ea 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -1247,6 +1247,23 @@ struct btrfs_ordered_extent *btrfs_split_ordered_extent(
 	/* One ref for the tree. */
 	refcount_inc(&new->refs);
 
+	/*
+	 * Take the root's ordered_extent_lock to avoid a race with
+	 * btrfs_wait_ordered_extents() when updating the disk_bytenr and
+	 * disk_num_bytes fields of the ordered extent below. And we disable
+	 * IRQs because the inode's ordered_tree_lock is used in IRQ context
+	 * elsewhere.
+	 *
+	 * There's no concern about a previous caller of
+	 * btrfs_wait_ordered_extents() getting the trimmed ordered extent
+	 * before we insert the new one, because even if it gets the ordered
+	 * extent before it's trimmed and the new one inserted, right before it
+	 * uses it or during its use, the ordered extent might have been
+	 * trimmed in the meanwhile, and it missed the new ordered extent.
+	 * There's no way around this and it's harmless for current use cases,
+	 * so we take the root's ordered_extent_lock to fix that race during
+	 * trimming and silence tools like KCSAN.
+	 */
 	spin_lock_irq(&root->ordered_extent_lock);
 	spin_lock(&inode->ordered_tree_lock);
 	/* Remove from tree once */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/6] btrfs: avoid removal and re-insertion of split ordered extent
  2024-06-04 11:08 [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents fdmanana
                   ` (2 preceding siblings ...)
  2024-06-04 11:08 ` [PATCH 3/6] btrfs: add comment about locking to btrfs_split_ordered_extent() fdmanana
@ 2024-06-04 11:08 ` fdmanana
  2024-06-04 11:08 ` [PATCH 5/6] btrfs: mark ordered extent insertion failure checks as unlikely fdmanana
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: fdmanana @ 2024-06-04 11:08 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

At btrfs_split_ordered_extent(), we are removing and re-inserting the
ordered extent that we are trimming, but we don't need to since the
trimming doesn't change its position in the red black tree because we
don't have overlapping ordered extents (that would imply double allocation
of extents) and we know the split length is smaller than the ordered
extent's num_bytes field (we checked that early in the function).

So drop the remove and re-insert code for the slit ordered extent.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ordered-data.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 1f7f6720b2ea..1d7707948833 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -1266,13 +1266,13 @@ struct btrfs_ordered_extent *btrfs_split_ordered_extent(
 	 */
 	spin_lock_irq(&root->ordered_extent_lock);
 	spin_lock(&inode->ordered_tree_lock);
-	/* Remove from tree once */
-	node = &ordered->rb_node;
-	rb_erase(node, &inode->ordered_tree);
-	RB_CLEAR_NODE(node);
-	if (inode->ordered_tree_last == node)
-		inode->ordered_tree_last = NULL;
 
+	/*
+	 * We don't have overlapping ordered extents (that would imply double
+	 * allocation of extents) and we checked above that the split length
+	 * does not cross the ordered extent's num_bytes field, so there's
+	 * no need to remove it and re-insert it in the tree.
+	 */
 	ordered->file_offset += len;
 	ordered->disk_bytenr += len;
 	ordered->num_bytes -= len;
@@ -1302,14 +1302,6 @@ struct btrfs_ordered_extent *btrfs_split_ordered_extent(
 		offset += sum->len;
 	}
 
-	/* Re-insert the node */
-	node = tree_insert(&inode->ordered_tree, ordered->file_offset,
-			   &ordered->rb_node);
-	if (node)
-		btrfs_panic(fs_info, -EEXIST,
-			"zoned: inconsistency in ordered tree at offset %llu",
-			ordered->file_offset);
-
 	node = tree_insert(&inode->ordered_tree, new->file_offset, &new->rb_node);
 	if (node)
 		btrfs_panic(fs_info, -EEXIST,
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/6] btrfs: mark ordered extent insertion failure checks as unlikely
  2024-06-04 11:08 [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents fdmanana
                   ` (3 preceding siblings ...)
  2024-06-04 11:08 ` [PATCH 4/6] btrfs: avoid removal and re-insertion of split ordered extent fdmanana
@ 2024-06-04 11:08 ` fdmanana
  2024-06-04 11:08 ` [PATCH 6/6] btrfs: update panic message when splitting ordered extent fdmanana
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: fdmanana @ 2024-06-04 11:08 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

We never expect an ordered extent insertion to fail due to already having
another ordered extent in the tree for the same file offset, since we
always wait for existing ordered extents in a range to complete before
writing into the range again. So mark the failure checks for the results
of tree_insert() as unlikely, to make it clear it's never expected (save
exceptional causes like bugs or memory corruptions) and to serve as a hint
for the compiler to possibly generate better code.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ordered-data.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 1d7707948833..c98c8fdc14a1 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -224,7 +224,7 @@ static void insert_ordered_extent(struct btrfs_ordered_extent *entry)
 	spin_lock_irq(&inode->ordered_tree_lock);
 	node = tree_insert(&inode->ordered_tree, entry->file_offset,
 			   &entry->rb_node);
-	if (node)
+	if (unlikely(node))
 		btrfs_panic(fs_info, -EEXIST,
 				"inconsistency in ordered tree at offset %llu",
 				entry->file_offset);
@@ -1303,7 +1303,7 @@ struct btrfs_ordered_extent *btrfs_split_ordered_extent(
 	}
 
 	node = tree_insert(&inode->ordered_tree, new->file_offset, &new->rb_node);
-	if (node)
+	if (unlikely(node))
 		btrfs_panic(fs_info, -EEXIST,
 			"zoned: inconsistency in ordered tree at offset %llu",
 			new->file_offset);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/6] btrfs: update panic message when splitting ordered extent
  2024-06-04 11:08 [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents fdmanana
                   ` (4 preceding siblings ...)
  2024-06-04 11:08 ` [PATCH 5/6] btrfs: mark ordered extent insertion failure checks as unlikely fdmanana
@ 2024-06-04 11:08 ` fdmanana
  2024-06-04 16:02 ` [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents Josef Bacik
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: fdmanana @ 2024-06-04 11:08 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

During ordered extent splitting if we find a duplicated ordered extent
when attempting to insert the new ordered extent we panic but with a
message that has the "zoned:" prefix. This is because the splitting used
to be exclusive for zoned filesystems, but as of commit b73a6fd1b1ef
("btrfs: split partial dio bios before submit") it can also be done for
non zoned filesystems during direct IO writes. So remove the "zoned:"
prefix from the message and mention the split to make it more specific
and different from the panic message at insert_ordered_extent().

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ordered-data.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index c98c8fdc14a1..a3343656e0a7 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -1305,7 +1305,7 @@ struct btrfs_ordered_extent *btrfs_split_ordered_extent(
 	node = tree_insert(&inode->ordered_tree, new->file_offset, &new->rb_node);
 	if (unlikely(node))
 		btrfs_panic(fs_info, -EEXIST,
-			"zoned: inconsistency in ordered tree at offset %llu",
+			"inconsistency in ordered tree at offset %llu after split",
 			new->file_offset);
 	spin_unlock(&inode->ordered_tree_lock);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents
  2024-06-04 11:08 [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents fdmanana
                   ` (5 preceding siblings ...)
  2024-06-04 11:08 ` [PATCH 6/6] btrfs: update panic message when splitting ordered extent fdmanana
@ 2024-06-04 16:02 ` Josef Bacik
  2024-06-04 22:50 ` Qu Wenruo
  2024-06-05  7:35 ` Johannes Thumshirn
  8 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2024-06-04 16:02 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Tue, Jun 04, 2024 at 12:08:46PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> Details in the individual change logs.
> 

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents
  2024-06-04 11:08 [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents fdmanana
                   ` (6 preceding siblings ...)
  2024-06-04 16:02 ` [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents Josef Bacik
@ 2024-06-04 22:50 ` Qu Wenruo
  2024-06-05  7:35 ` Johannes Thumshirn
  8 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2024-06-04 22:50 UTC (permalink / raw)
  To: fdmanana, linux-btrfs



在 2024/6/4 20:38, fdmanana@kernel.org 写道:
> From: Filipe Manana <fdmanana@suse.com>
>
> Details in the individual change logs.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
>
> Filipe Manana (6):
>    btrfs: reduce critical section at btrfs_wait_ordered_roots()
>    btrfs: reduce critical section at btrfs_wait_ordered_extents()
>    btrfs: add comment about locking to btrfs_split_ordered_extent()
>    btrfs: avoid removal and re-insertion of split ordered extent
>    btrfs: mark ordered extent insertion failure checks as unlikely
>    btrfs: update panic message when splitting ordered extent
>
>   fs/btrfs/ordered-data.c | 51 ++++++++++++++++++++++++-----------------
>   1 file changed, 30 insertions(+), 21 deletions(-)
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents
  2024-06-04 11:08 [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents fdmanana
                   ` (7 preceding siblings ...)
  2024-06-04 22:50 ` Qu Wenruo
@ 2024-06-05  7:35 ` Johannes Thumshirn
  8 siblings, 0 replies; 10+ messages in thread
From: Johannes Thumshirn @ 2024-06-05  7:35 UTC (permalink / raw)
  To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org

On 04.06.24 13:09, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> Details in the individual change logs.
> 
> Filipe Manana (6):
>    btrfs: reduce critical section at btrfs_wait_ordered_roots()
>    btrfs: reduce critical section at btrfs_wait_ordered_extents()
>    btrfs: add comment about locking to btrfs_split_ordered_extent()
>    btrfs: avoid removal and re-insertion of split ordered extent
>    btrfs: mark ordered extent insertion failure checks as unlikely
>    btrfs: update panic message when splitting ordered extent
> 
>   fs/btrfs/ordered-data.c | 51 ++++++++++++++++++++++++-----------------
>   1 file changed, 30 insertions(+), 21 deletions(-)
> 

For the series,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-06-05  7:35 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-04 11:08 [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents fdmanana
2024-06-04 11:08 ` [PATCH 1/6] btrfs: reduce critical section at btrfs_wait_ordered_roots() fdmanana
2024-06-04 11:08 ` [PATCH 2/6] btrfs: reduce critical section at btrfs_wait_ordered_extents() fdmanana
2024-06-04 11:08 ` [PATCH 3/6] btrfs: add comment about locking to btrfs_split_ordered_extent() fdmanana
2024-06-04 11:08 ` [PATCH 4/6] btrfs: avoid removal and re-insertion of split ordered extent fdmanana
2024-06-04 11:08 ` [PATCH 5/6] btrfs: mark ordered extent insertion failure checks as unlikely fdmanana
2024-06-04 11:08 ` [PATCH 6/6] btrfs: update panic message when splitting ordered extent fdmanana
2024-06-04 16:02 ` [PATCH 0/6] btrfs: some tweaks and cleanups around ordered extents Josef Bacik
2024-06-04 22:50 ` Qu Wenruo
2024-06-05  7:35 ` Johannes Thumshirn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox