All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups
@ 2025-10-23 15:59 fdmanana
  2025-10-23 15:59 ` [PATCH 01/28] btrfs: return real error when failing tickets in maybe_fail_all_tickets() fdmanana
                   ` (28 more replies)
  0 siblings, 29 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

Several optimizations to reduce critical sections delimited by a
space_info's spinlock and several cleanups mostly arround space
reservation and flushing. Details in the changelogs.

Filipe Manana (28):
  btrfs: return real error when failing tickets in maybe_fail_all_tickets()
  btrfs: avoid recomputing used space in btrfs_try_granting_tickets()
  btrfs: make btrfs_can_overcommit() return bool instead of int
  btrfs: avoid used space computation when trying to grant tickets
  btrfs: avoid used space computation when reserving space
  btrfs: inline btrfs_space_info_used()
  btrfs: bail out earlier from need_preemptive_reclaim() if we have tickets
  btrfs: increment loop count outside critical section during metadata reclaim
  btrfs: shorten critical section in btrfs_preempt_reclaim_metadata_space()
  btrfs: avoid unnecessary reclaim calculation in priority_reclaim_metadata_space()
  btrfs: assert space_info is locked in steal_from_global_rsv()
  btrfs: assign booleans to global reserve's full field
  btrfs: process ticket outside global reserve critical section
  btrfs: remove double underscore prefix from __reserve_bytes()
  btrfs: reduce space_info critical section in btrfs_chunk_alloc()
  btrfs: reduce block group critical section in btrfs_free_reserved_bytes()
  btrfs: reduce block group critical section in btrfs_add_reserved_bytes()
  btrfs: reduce block group critical section in do_trimming()
  btrfs: reduce block group critical section in pin_down_extent()
  btrfs: use local variable for space_info in pin_down_extent()
  btrfs: remove 'reserved' argument from btrfs_pin_extent()
  btrfs: change 'reserved' argument from pin_down_extent() to bool
  btrfs: reduce block group critical section in unpin_extent_range()
  btrfs: remove pointless label and goto from unpin_extent_range()
  btrfs: add data_race() in btrfs_account_ro_block_groups_free_space()
  btrfs: move ticket wakeup and finalization to remove_ticket()
  btrfs: avoid space_info locking when checking if tickets are served
  btrfs: tag as unlikely fs aborted checks in space flushing code

 fs/btrfs/block-group.c      |  41 +++---
 fs/btrfs/extent-tree.c      |  72 ++++++-----
 fs/btrfs/extent-tree.h      |   3 +-
 fs/btrfs/free-space-cache.c |  20 +--
 fs/btrfs/space-info.c       | 243 +++++++++++++++++++-----------------
 fs/btrfs/space-info.h       |  18 ++-
 6 files changed, 219 insertions(+), 178 deletions(-)

-- 
2.47.2


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 01/28] btrfs: return real error when failing tickets in maybe_fail_all_tickets()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 02/28] btrfs: avoid recomputing used space in btrfs_try_granting_tickets() fdmanana
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

In case we had a transaction abort we set a ticket's error to -EIO, but we
have the real error that caused the transaction to be aborted returned by
the macro BTRFS_FS_ERROR(). So use that real error instead of -EIO.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 69237f5d6078..8b1cf7f6c223 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1082,7 +1082,7 @@ static bool maybe_fail_all_tickets(struct btrfs_space_info *space_info)
 	struct btrfs_fs_info *fs_info = space_info->fs_info;
 	struct reserve_ticket *ticket;
 	u64 tickets_id = space_info->tickets_id;
-	const bool aborted = BTRFS_FS_ERROR(fs_info);
+	const int abort_error = BTRFS_FS_ERROR(fs_info);
 
 	trace_btrfs_fail_all_tickets(fs_info, space_info);
 
@@ -1096,16 +1096,16 @@ static bool maybe_fail_all_tickets(struct btrfs_space_info *space_info)
 		ticket = list_first_entry(&space_info->tickets,
 					  struct reserve_ticket, list);
 
-		if (!aborted && steal_from_global_rsv(space_info, ticket))
+		if (!abort_error && steal_from_global_rsv(space_info, ticket))
 			return true;
 
-		if (!aborted && btrfs_test_opt(fs_info, ENOSPC_DEBUG))
+		if (!abort_error && btrfs_test_opt(fs_info, ENOSPC_DEBUG))
 			btrfs_info(fs_info, "failing ticket with %llu bytes",
 				   ticket->bytes);
 
 		remove_ticket(space_info, ticket);
-		if (aborted)
-			ticket->error = -EIO;
+		if (abort_error)
+			ticket->error = abort_error;
 		else
 			ticket->error = -ENOSPC;
 		wake_up(&ticket->wait);
@@ -1116,7 +1116,7 @@ static bool maybe_fail_all_tickets(struct btrfs_space_info *space_info)
 		 * here to see if we can make progress with the next ticket in
 		 * the list.
 		 */
-		if (!aborted)
+		if (!abort_error)
 			btrfs_try_granting_tickets(space_info);
 	}
 	return (tickets_id != space_info->tickets_id);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 02/28] btrfs: avoid recomputing used space in btrfs_try_granting_tickets()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
  2025-10-23 15:59 ` [PATCH 01/28] btrfs: return real error when failing tickets in maybe_fail_all_tickets() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 03/28] btrfs: make btrfs_can_overcommit() return bool instead of int fdmanana
                   ` (26 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

In every iteration of the loop we call btrfs_space_info_used() which sums
a bunch of fields from a space_info object. This implies doing a function
call besides the sum, and we are holding the space_info's spinlock while
we do this, so we want to keep the critical section as short as possible
since that spinlock is used in all the code for space reservarion and
flushing (therefore it's heavily used).

So call btrfs_try_granting_tickets() only once, before entering the loop,
and then update it as we remove tickets.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 8b1cf7f6c223..c0bad6914bb7 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -526,6 +526,7 @@ void btrfs_try_granting_tickets(struct btrfs_space_info *space_info)
 {
 	struct list_head *head;
 	enum btrfs_reserve_flush_enum flush = BTRFS_RESERVE_NO_FLUSH;
+	u64 used = btrfs_space_info_used(space_info, true);
 
 	lockdep_assert_held(&space_info->lock);
 
@@ -533,18 +534,20 @@ void btrfs_try_granting_tickets(struct btrfs_space_info *space_info)
 again:
 	while (!list_empty(head)) {
 		struct reserve_ticket *ticket;
-		u64 used = btrfs_space_info_used(space_info, true);
+		u64 used_after;
 
 		ticket = list_first_entry(head, struct reserve_ticket, list);
+		used_after = used + ticket->bytes;
 
 		/* Check and see if our ticket can be satisfied now. */
-		if ((used + ticket->bytes <= space_info->total_bytes) ||
+		if (used_after <= space_info->total_bytes ||
 		    btrfs_can_overcommit(space_info, ticket->bytes, flush)) {
 			btrfs_space_info_update_bytes_may_use(space_info, ticket->bytes);
 			remove_ticket(space_info, ticket);
 			ticket->bytes = 0;
 			space_info->tickets_id++;
 			wake_up(&ticket->wait);
+			used = used_after;
 		} else {
 			break;
 		}
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 03/28] btrfs: make btrfs_can_overcommit() return bool instead of int
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
  2025-10-23 15:59 ` [PATCH 01/28] btrfs: return real error when failing tickets in maybe_fail_all_tickets() fdmanana
  2025-10-23 15:59 ` [PATCH 02/28] btrfs: avoid recomputing used space in btrfs_try_granting_tickets() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 04/28] btrfs: avoid used space computation when trying to grant tickets fdmanana
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

It's a boolean function, so switch its return type to bool.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 10 ++++------
 fs/btrfs/space-info.h |  4 ++--
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index c0bad6914bb7..0fdf60f05228 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -490,22 +490,20 @@ static u64 calc_available_free_space(const struct btrfs_space_info *space_info,
 	return avail;
 }
 
-int btrfs_can_overcommit(const struct btrfs_space_info *space_info, u64 bytes,
-			 enum btrfs_reserve_flush_enum flush)
+bool btrfs_can_overcommit(const struct btrfs_space_info *space_info, u64 bytes,
+			  enum btrfs_reserve_flush_enum flush)
 {
 	u64 avail;
 	u64 used;
 
 	/* Don't overcommit when in mixed mode */
 	if (space_info->flags & BTRFS_BLOCK_GROUP_DATA)
-		return 0;
+		return false;
 
 	used = btrfs_space_info_used(space_info, true);
 	avail = calc_available_free_space(space_info, flush);
 
-	if (used + bytes < space_info->total_bytes + avail)
-		return 1;
-	return 0;
+	return (used + bytes < space_info->total_bytes + avail);
 }
 
 static void remove_ticket(struct btrfs_space_info *space_info,
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index 2fad2e4c2252..d97b0799649f 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -282,8 +282,8 @@ int btrfs_reserve_metadata_bytes(struct btrfs_space_info *space_info,
 				 u64 orig_bytes,
 				 enum btrfs_reserve_flush_enum flush);
 void btrfs_try_granting_tickets(struct btrfs_space_info *space_info);
-int btrfs_can_overcommit(const struct btrfs_space_info *space_info, u64 bytes,
-			 enum btrfs_reserve_flush_enum flush);
+bool btrfs_can_overcommit(const struct btrfs_space_info *space_info, u64 bytes,
+			  enum btrfs_reserve_flush_enum flush);
 
 static inline void btrfs_space_info_free_bytes_may_use(
 				struct btrfs_space_info *space_info,
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 04/28] btrfs: avoid used space computation when trying to grant tickets
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (2 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 03/28] btrfs: make btrfs_can_overcommit() return bool instead of int fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 05/28] btrfs: avoid used space computation when reserving space fdmanana
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

In btrfs_try_granting_tickets(), we call btrfs_can_overcommit() and that
calls btrfs_space_info_used(). But we already keep track, in the 'used'
local variable, of the used space in the space_info, so we are just
repeating the same computation and doing an extra function call while we
are holding the space_info's spinlock, which is heavily used by the space
reservation and flushing code.

So add a local variant of btrfs_can_overcommit() that takes in the used
space as an argument and therefore does not call btrfs_space_info_used(),
and use it in btrfs_try_granting_tickets().

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 0fdf60f05228..f5ff51680f41 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -490,10 +490,29 @@ static u64 calc_available_free_space(const struct btrfs_space_info *space_info,
 	return avail;
 }
 
+static inline bool check_can_overcommit(const struct btrfs_space_info *space_info,
+					u64 space_info_used_bytes, u64 bytes,
+					enum btrfs_reserve_flush_enum flush)
+{
+	const u64 avail = calc_available_free_space(space_info, flush);
+
+	return (space_info_used_bytes + bytes < space_info->total_bytes + avail);
+}
+
+static inline bool can_overcommit(const struct btrfs_space_info *space_info,
+				  u64 space_info_used_bytes, u64 bytes,
+				  enum btrfs_reserve_flush_enum flush)
+{
+	/* Don't overcommit when in mixed mode. */
+	if (space_info->flags & BTRFS_BLOCK_GROUP_DATA)
+		return false;
+
+	return check_can_overcommit(space_info, space_info_used_bytes, bytes, flush);
+}
+
 bool btrfs_can_overcommit(const struct btrfs_space_info *space_info, u64 bytes,
 			  enum btrfs_reserve_flush_enum flush)
 {
-	u64 avail;
 	u64 used;
 
 	/* Don't overcommit when in mixed mode */
@@ -501,9 +520,8 @@ bool btrfs_can_overcommit(const struct btrfs_space_info *space_info, u64 bytes,
 		return false;
 
 	used = btrfs_space_info_used(space_info, true);
-	avail = calc_available_free_space(space_info, flush);
 
-	return (used + bytes < space_info->total_bytes + avail);
+	return check_can_overcommit(space_info, used, bytes, flush);
 }
 
 static void remove_ticket(struct btrfs_space_info *space_info,
@@ -539,7 +557,7 @@ void btrfs_try_granting_tickets(struct btrfs_space_info *space_info)
 
 		/* Check and see if our ticket can be satisfied now. */
 		if (used_after <= space_info->total_bytes ||
-		    btrfs_can_overcommit(space_info, ticket->bytes, flush)) {
+		    can_overcommit(space_info, used, ticket->bytes, flush)) {
 			btrfs_space_info_update_bytes_may_use(space_info, ticket->bytes);
 			remove_ticket(space_info, ticket);
 			ticket->bytes = 0;
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 05/28] btrfs: avoid used space computation when reserving space
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (3 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 04/28] btrfs: avoid used space computation when trying to grant tickets fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 06/28] btrfs: inline btrfs_space_info_used() fdmanana
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

In __reserve_bytes() we have 3 repeated calls to btrfs_space_info_used(),
one early on as soon as take the space_info's spinlock, another one when
we call btrfs_can_overcommit(), which calls btrfs_space_info_used() again,
and a final one when we are reserving for a flush emergency.

During all these calls we are holding the space_info's spinlock, which is
heavily used by the space reservation and flushing code, so it's desirable
to make the critical sections as short as possible.

So make this more efficient by:

1) Instead of calling btrfs_can_overcommit() call the new variant
   can_overcommit() which takes the space_info's used space as an argument
   and pass the value we already computed and have in the 'used' variable;

2) Instead of calling btrfs_space_info_used() with its second argument as
   false when we are doing a flush emergency, decrement the space_info's
   bytes_may_use counter from the 'used' variable, as the difference
   between passing true or false as the second argument to
   btrfs_space_info_used() is whether or not to include the space_info's
   bytes_may_use counter in the computation.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index f5ff51680f41..6c2769044b55 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1783,7 +1783,7 @@ static int __reserve_bytes(struct btrfs_space_info *space_info, u64 orig_bytes,
 	 */
 	if (!pending_tickets &&
 	    ((used + orig_bytes <= space_info->total_bytes) ||
-	     btrfs_can_overcommit(space_info, orig_bytes, flush))) {
+	     can_overcommit(space_info, used, orig_bytes, flush))) {
 		btrfs_space_info_update_bytes_may_use(space_info, orig_bytes);
 		ret = 0;
 	}
@@ -1794,7 +1794,7 @@ static int __reserve_bytes(struct btrfs_space_info *space_info, u64 orig_bytes,
 	 * left to allocate for the block.
 	 */
 	if (ret && unlikely(flush == BTRFS_RESERVE_FLUSH_EMERGENCY)) {
-		used = btrfs_space_info_used(space_info, false);
+		used -= space_info->bytes_may_use;
 		if (used + orig_bytes <= space_info->total_bytes) {
 			btrfs_space_info_update_bytes_may_use(space_info, orig_bytes);
 			ret = 0;
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 06/28] btrfs: inline btrfs_space_info_used()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (4 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 05/28] btrfs: avoid used space computation when reserving space fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-24  6:52   ` Johannes Thumshirn
  2025-10-23 15:59 ` [PATCH 07/28] btrfs: bail out earlier from need_preemptive_reclaim() if we have tickets fdmanana
                   ` (22 subsequent siblings)
  28 siblings, 1 reply; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

The function is simple enough to be inlined and in fact doing it even
reduces the object code. In x86_64 with 14.2.0-19 from Debian the results
were the following:

  Before this change

    $ size fs/btrfs/btrfs.ko
      text	   data	    bss	    dec	    hex	filename
    1919410	 161703	  15592	2096705	 1ffe41	fs/btrfs/btrfs.ko

  After this change

    $ size fs/btrfs/btrfs.ko
      text	   data	    bss	    dec	    hex	filename
    1918991	 161675	  15592	2096258	 1ffc82	fs/btrfs/btrfs.ko

Also remove the ASSERT() that checks the space_info argument is not NULL,
as it's odd to be there since it can never be NULL and in case that ever
happens during development, a stack trace from a NULL pointer dereference
will be obvious. It was originally added when btrfs_space_info_used() was
introduced in commit 4136135b080f ("Btrfs: use helper to get used bytes
of space_info").

Also add a lockdep assertion to check the space_info's lock is being held
by the calling task.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 10 ----------
 fs/btrfs/space-info.h | 13 +++++++++++--
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 6c2769044b55..53677ecb8c15 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -172,16 +172,6 @@
  *   thing with or without extra unallocated space.
  */
 
-u64 __pure btrfs_space_info_used(const struct btrfs_space_info *s_info,
-			  bool may_use_included)
-{
-	ASSERT(s_info);
-	return s_info->bytes_used + s_info->bytes_reserved +
-		s_info->bytes_pinned + s_info->bytes_readonly +
-		s_info->bytes_zone_unusable +
-		(may_use_included ? s_info->bytes_may_use : 0);
-}
-
 /*
  * after adding space to the filesystem, we need to clear the full flags
  * on all the space infos.
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index d97b0799649f..7e16d4c116c8 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -266,6 +266,17 @@ DECLARE_SPACE_INFO_UPDATE(bytes_may_use, "space_info");
 DECLARE_SPACE_INFO_UPDATE(bytes_pinned, "pinned");
 DECLARE_SPACE_INFO_UPDATE(bytes_zone_unusable, "zone_unusable");
 
+static inline u64 btrfs_space_info_used(const struct btrfs_space_info *s_info,
+					bool may_use_included)
+{
+	lockdep_assert_held(&s_info->lock);
+
+	return s_info->bytes_used + s_info->bytes_reserved +
+		s_info->bytes_pinned + s_info->bytes_readonly +
+		s_info->bytes_zone_unusable +
+		(may_use_included ? s_info->bytes_may_use : 0);
+}
+
 int btrfs_init_space_info(struct btrfs_fs_info *fs_info);
 void btrfs_add_bg_to_space_info(struct btrfs_fs_info *info,
 				struct btrfs_block_group *block_group);
@@ -273,8 +284,6 @@ void btrfs_update_space_info_chunk_size(struct btrfs_space_info *space_info,
 					u64 chunk_size);
 struct btrfs_space_info *btrfs_find_space_info(struct btrfs_fs_info *info,
 					       u64 flags);
-u64 __pure btrfs_space_info_used(const struct btrfs_space_info *s_info,
-			  bool may_use_included);
 void btrfs_clear_space_info_full(struct btrfs_fs_info *info);
 void btrfs_dump_space_info(struct btrfs_space_info *info, u64 bytes,
 			   bool dump_block_groups);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 07/28] btrfs: bail out earlier from need_preemptive_reclaim() if we have tickets
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (5 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 06/28] btrfs: inline btrfs_space_info_used() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 08/28] btrfs: increment loop count outside critical section during metadata reclaim fdmanana
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

Instead of doing some calculations and then return false if it turns out
we have queued tickets, check first if we have tickets and return false
immediately if we have tickets, without wasting time on doing those
computations.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 53677ecb8c15..bd206fc300e7 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -937,10 +937,17 @@ static bool need_preemptive_reclaim(const struct btrfs_space_info *space_info)
 	u64 thresh;
 	u64 used;
 
-	thresh = mult_perc(space_info->total_bytes, 90);
-
 	lockdep_assert_held(&space_info->lock);
 
+	/*
+	 * We have tickets queued, bail so we don't compete with the async
+	 * flushers.
+	 */
+	if (space_info->reclaim_size)
+		return false;
+
+	thresh = mult_perc(space_info->total_bytes, 90);
+
 	/* If we're just plain full then async reclaim just slows us down. */
 	if ((space_info->bytes_used + space_info->bytes_reserved +
 	     global_rsv_size) >= thresh)
@@ -960,13 +967,6 @@ static bool need_preemptive_reclaim(const struct btrfs_space_info *space_info)
 	if (used - global_rsv_size <= SZ_128M)
 		return false;
 
-	/*
-	 * We have tickets queued, bail so we don't compete with the async
-	 * flushers.
-	 */
-	if (space_info->reclaim_size)
-		return false;
-
 	/*
 	 * If we have over half of the free space occupied by reservations or
 	 * pinned then we want to start flushing.
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 08/28] btrfs: increment loop count outside critical section during metadata reclaim
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (6 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 07/28] btrfs: bail out earlier from need_preemptive_reclaim() if we have tickets fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 09/28] btrfs: shorten critical section in btrfs_preempt_reclaim_metadata_space() fdmanana
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

In btrfs_preempt_reclaim_metadata_space() there's no need to increment the
local variable that tracks the number of iterations of the while loop
while inside the critical section delimited by the space_info's spinlock.
That spinlock is heavily used by space reservation and flushing code, so
it's desirable to have its critical sections as short as possible.
So move the loop count incremented outside the critical section.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index bd206fc300e7..2dd9d4e5c2c2 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1264,8 +1264,6 @@ static void btrfs_preempt_reclaim_metadata_space(struct work_struct *work)
 		u64 to_reclaim, block_rsv_size;
 		const u64 global_rsv_size = btrfs_block_rsv_reserved(global_rsv);
 
-		loops++;
-
 		/*
 		 * We don't have a precise counter for the metadata being
 		 * reserved for delalloc, so we'll approximate it by subtracting
@@ -1311,6 +1309,8 @@ static void btrfs_preempt_reclaim_metadata_space(struct work_struct *work)
 
 		spin_unlock(&space_info->lock);
 
+		loops++;
+
 		/*
 		 * We don't want to reclaim everything, just a portion, so scale
 		 * down the to_reclaim by 1/4.  If it takes us down to 0,
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 09/28] btrfs: shorten critical section in btrfs_preempt_reclaim_metadata_space()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (7 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 08/28] btrfs: increment loop count outside critical section during metadata reclaim fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 10/28] btrfs: avoid unnecessary reclaim calculation in priority_reclaim_metadata_space() fdmanana
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

We are doing a lot of small calculations and assignments while holding the
space_info's spinlock, which is a heavily used lock for space reservation
and flushing. There's no point in holding the lock for so long when all we
want is to call need_preemptive_reclaim() and get a consistent value for a
couple of counters from the space_info. Instead, grab the counters into
local variables, release the lock and then use the local variables.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 2dd9d4e5c2c2..9a072009eec8 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1263,7 +1263,10 @@ static void btrfs_preempt_reclaim_metadata_space(struct work_struct *work)
 		u64 delalloc_size = 0;
 		u64 to_reclaim, block_rsv_size;
 		const u64 global_rsv_size = btrfs_block_rsv_reserved(global_rsv);
+		const u64 bytes_may_use = space_info->bytes_may_use;
+		const u64 bytes_pinned = space_info->bytes_pinned;
 
+		spin_unlock(&space_info->lock);
 		/*
 		 * We don't have a precise counter for the metadata being
 		 * reserved for delalloc, so we'll approximate it by subtracting
@@ -1275,8 +1278,8 @@ static void btrfs_preempt_reclaim_metadata_space(struct work_struct *work)
 			btrfs_block_rsv_reserved(delayed_block_rsv) +
 			btrfs_block_rsv_reserved(delayed_refs_rsv) +
 			btrfs_block_rsv_reserved(trans_rsv);
-		if (block_rsv_size < space_info->bytes_may_use)
-			delalloc_size = space_info->bytes_may_use - block_rsv_size;
+		if (block_rsv_size < bytes_may_use)
+			delalloc_size = bytes_may_use - block_rsv_size;
 
 		/*
 		 * We don't want to include the global_rsv in our calculation,
@@ -1293,10 +1296,10 @@ static void btrfs_preempt_reclaim_metadata_space(struct work_struct *work)
 		if (delalloc_size > block_rsv_size) {
 			to_reclaim = delalloc_size;
 			flush = FLUSH_DELALLOC;
-		} else if (space_info->bytes_pinned >
+		} else if (bytes_pinned >
 			   (btrfs_block_rsv_reserved(delayed_block_rsv) +
 			    btrfs_block_rsv_reserved(delayed_refs_rsv))) {
-			to_reclaim = space_info->bytes_pinned;
+			to_reclaim = bytes_pinned;
 			flush = COMMIT_TRANS;
 		} else if (btrfs_block_rsv_reserved(delayed_block_rsv) >
 			   btrfs_block_rsv_reserved(delayed_refs_rsv)) {
@@ -1307,8 +1310,6 @@ static void btrfs_preempt_reclaim_metadata_space(struct work_struct *work)
 			flush = FLUSH_DELAYED_REFS_NR;
 		}
 
-		spin_unlock(&space_info->lock);
-
 		loops++;
 
 		/*
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 10/28] btrfs: avoid unnecessary reclaim calculation in priority_reclaim_metadata_space()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (8 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 09/28] btrfs: shorten critical section in btrfs_preempt_reclaim_metadata_space() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-24  7:06   ` Johannes Thumshirn
  2025-10-23 15:59 ` [PATCH 11/28] btrfs: assert space_info is locked in steal_from_global_rsv() fdmanana
                   ` (18 subsequent siblings)
  28 siblings, 1 reply; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

If the given ticket was already served (its ->bytes is 0), then we wasted
time calculating the metadata reclaim size. So calculate it only after we
checked the ticket was not yet served.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 9a072009eec8..b03c015d5d51 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1501,7 +1501,6 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
 	int flush_state = 0;
 
 	spin_lock(&space_info->lock);
-	to_reclaim = btrfs_calc_reclaim_metadata_size(space_info);
 	/*
 	 * This is the priority reclaim path, so to_reclaim could be >0 still
 	 * because we may have only satisfied the priority tickets and still
@@ -1513,6 +1512,8 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
 		return;
 	}
 
+	to_reclaim = btrfs_calc_reclaim_metadata_size(space_info);
+
 	while (flush_state < states_nr) {
 		spin_unlock(&space_info->lock);
 		flush_space(space_info, to_reclaim, states[flush_state], false);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 11/28] btrfs: assert space_info is locked in steal_from_global_rsv()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (9 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 10/28] btrfs: avoid unnecessary reclaim calculation in priority_reclaim_metadata_space() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 12/28] btrfs: assign booleans to global reserve's full field fdmanana
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

The caller is supposed to have locked the space_info, so assert that.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index b03c015d5d51..a2af55178c69 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1047,6 +1047,8 @@ static bool steal_from_global_rsv(struct btrfs_space_info *space_info,
 	struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
 	u64 min_bytes;
 
+	lockdep_assert_held(&space_info->lock);
+
 	if (!ticket->steal)
 		return false;
 
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 12/28] btrfs: assign booleans to global reserve's full field
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (10 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 11/28] btrfs: assert space_info is locked in steal_from_global_rsv() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 13/28] btrfs: process ticket outside global reserve critical section fdmanana
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

We have a couple places that are assigning 0 and 1 to the full field of
the global reserve. This is harmless since 0 is converted to false and
1 converted to true, but for better readability, replace these with true
and false since the field is of type bool.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index a2af55178c69..62e1ba7f09c0 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1067,7 +1067,7 @@ static bool steal_from_global_rsv(struct btrfs_space_info *space_info,
 	wake_up(&ticket->wait);
 	space_info->tickets_id++;
 	if (global_rsv->reserved < global_rsv->size)
-		global_rsv->full = 0;
+		global_rsv->full = false;
 	spin_unlock(&global_rsv->lock);
 
 	return true;
@@ -2186,7 +2186,7 @@ void btrfs_return_free_space(struct btrfs_space_info *space_info, u64 len)
 		global_rsv->reserved += to_add;
 		btrfs_space_info_update_bytes_may_use(space_info, to_add);
 		if (global_rsv->reserved >= global_rsv->size)
-			global_rsv->full = 1;
+			global_rsv->full = true;
 		len -= to_add;
 	}
 	spin_unlock(&global_rsv->lock);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 13/28] btrfs: process ticket outside global reserve critical section
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (11 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 12/28] btrfs: assign booleans to global reserve's full field fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 14/28] btrfs: remove double underscore prefix from __reserve_bytes() fdmanana
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

In steal_from_global_rsv() there's no need to process the ticket inside
the critical section of the global reserve. Move the ticket processing to
happen after the critical section. This helps reduce contention on the
global reserve's spinlock.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 62e1ba7f09c0..957477e5f01f 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1062,13 +1062,14 @@ static bool steal_from_global_rsv(struct btrfs_space_info *space_info,
 		return false;
 	}
 	global_rsv->reserved -= ticket->bytes;
+	if (global_rsv->reserved < global_rsv->size)
+		global_rsv->full = false;
+	spin_unlock(&global_rsv->lock);
+
 	remove_ticket(space_info, ticket);
 	ticket->bytes = 0;
 	wake_up(&ticket->wait);
 	space_info->tickets_id++;
-	if (global_rsv->reserved < global_rsv->size)
-		global_rsv->full = false;
-	spin_unlock(&global_rsv->lock);
 
 	return true;
 }
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 14/28] btrfs: remove double underscore prefix from __reserve_bytes()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (12 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 13/28] btrfs: process ticket outside global reserve critical section fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 15/28] btrfs: reduce space_info critical section in btrfs_chunk_alloc() fdmanana
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

The use of a double underscore prefix is discouraged and we have no
justification at all for it all since there's no reserved_bytes() counter
part. So remove the prefix.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 957477e5f01f..edeb46f1aa33 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -67,7 +67,7 @@
  *   Assume we are unable to simply make the reservation because we do not have
  *   enough space
  *
- *   -> __reserve_bytes
+ *   -> reserve_bytes
  *     create a reserve_ticket with ->bytes set to our reservation, add it to
  *     the tail of space_info->tickets, kick async flush thread
  *
@@ -1728,8 +1728,8 @@ static inline bool can_ticket(enum btrfs_reserve_flush_enum flush)
  * regain reservations will be made and this will fail if there is not enough
  * space already.
  */
-static int __reserve_bytes(struct btrfs_space_info *space_info, u64 orig_bytes,
-			   enum btrfs_reserve_flush_enum flush)
+static int reserve_bytes(struct btrfs_space_info *space_info, u64 orig_bytes,
+			 enum btrfs_reserve_flush_enum flush)
 {
 	struct btrfs_fs_info *fs_info = space_info->fs_info;
 	struct work_struct *async_work;
@@ -1879,7 +1879,7 @@ int btrfs_reserve_metadata_bytes(struct btrfs_space_info *space_info,
 {
 	int ret;
 
-	ret = __reserve_bytes(space_info, orig_bytes, flush);
+	ret = reserve_bytes(space_info, orig_bytes, flush);
 	if (ret == -ENOSPC) {
 		struct btrfs_fs_info *fs_info = space_info->fs_info;
 
@@ -1913,7 +1913,7 @@ int btrfs_reserve_data_bytes(struct btrfs_space_info *space_info, u64 bytes,
 	       flush == BTRFS_RESERVE_NO_FLUSH);
 	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
 
-	ret = __reserve_bytes(space_info, bytes, flush);
+	ret = reserve_bytes(space_info, bytes, flush);
 	if (ret == -ENOSPC) {
 		trace_btrfs_space_reservation(fs_info, "space_info:enospc",
 					      space_info->flags, bytes, 1);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 15/28] btrfs: reduce space_info critical section in btrfs_chunk_alloc()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (13 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 14/28] btrfs: remove double underscore prefix from __reserve_bytes() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 16/28] btrfs: reduce block group critical section in btrfs_free_reserved_bytes() fdmanana
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

There's no need to update local variables while holding the space_info's
spinlock, since the update isn't using anything from the space_info. So
move these updates outside the critical section to shorten it.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/block-group.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index ec1e4fc0cd51..ebd4c514c2c8 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -4191,11 +4191,11 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans,
 		should_alloc = should_alloc_chunk(fs_info, space_info, force);
 		if (space_info->full) {
 			/* No more free physical space */
+			spin_unlock(&space_info->lock);
 			if (should_alloc)
 				ret = -ENOSPC;
 			else
 				ret = 0;
-			spin_unlock(&space_info->lock);
 			return ret;
 		} else if (!should_alloc) {
 			spin_unlock(&space_info->lock);
@@ -4207,16 +4207,16 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans,
 			 * recheck if we should continue with our allocation
 			 * attempt.
 			 */
+			spin_unlock(&space_info->lock);
 			wait_for_alloc = true;
 			force = CHUNK_ALLOC_NO_FORCE;
-			spin_unlock(&space_info->lock);
 			mutex_lock(&fs_info->chunk_mutex);
 			mutex_unlock(&fs_info->chunk_mutex);
 		} else {
 			/* Proceed with allocation */
 			space_info->chunk_alloc = true;
-			wait_for_alloc = false;
 			spin_unlock(&space_info->lock);
+			wait_for_alloc = false;
 		}
 
 		cond_resched();
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 16/28] btrfs: reduce block group critical section in btrfs_free_reserved_bytes()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (14 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 15/28] btrfs: reduce space_info critical section in btrfs_chunk_alloc() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 17/28] btrfs: reduce block group critical section in btrfs_add_reserved_bytes() fdmanana
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

There's no need to update the space_info fields (bytes_reserved,
max_extent_size, bytes_readonly, bytes_zone_unusable) while holding the
block group's spinlock. So move those updates to happen after we unlock
the block group (and while holding the space_info locked of course), so
that all we do under the block group's critical section is to update the
block group itself.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/block-group.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index ebd4c514c2c8..856bda9c99d9 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -3858,21 +3858,24 @@ void btrfs_free_reserved_bytes(struct btrfs_block_group *cache, u64 num_bytes,
 			       bool is_delalloc)
 {
 	struct btrfs_space_info *space_info = cache->space_info;
+	bool bg_ro;
 
 	spin_lock(&space_info->lock);
 	spin_lock(&cache->lock);
-	if (cache->ro)
+	bg_ro = cache->ro;
+	cache->reserved -= num_bytes;
+	if (is_delalloc)
+		cache->delalloc_bytes -= num_bytes;
+	spin_unlock(&cache->lock);
+
+	if (bg_ro)
 		space_info->bytes_readonly += num_bytes;
 	else if (btrfs_is_zoned(cache->fs_info))
 		space_info->bytes_zone_unusable += num_bytes;
-	cache->reserved -= num_bytes;
+
 	space_info->bytes_reserved -= num_bytes;
 	space_info->max_extent_size = 0;
 
-	if (is_delalloc)
-		cache->delalloc_bytes -= num_bytes;
-	spin_unlock(&cache->lock);
-
 	btrfs_try_granting_tickets(space_info);
 	spin_unlock(&space_info->lock);
 }
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 17/28] btrfs: reduce block group critical section in btrfs_add_reserved_bytes()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (15 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 16/28] btrfs: reduce block group critical section in btrfs_free_reserved_bytes() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 18/28] btrfs: reduce block group critical section in do_trimming() fdmanana
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

We are doing some things inside the block group's critical section that
are relevant only to the space_info: updating the space_info counters
bytes_reserved and bytes_may_use as well as trying to grant tickets
(calling btrfs_try_granting_tickets()), and this later can take some
time. So move all those updates to outside the block group's critical
section and still inside the space_info's critical section. Like this
we keep the block group's critical section only for block group updates
and can help reduce contention on a block group's lock.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/block-group.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 856bda9c99d9..b964eacc1610 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -3813,22 +3813,26 @@ int btrfs_add_reserved_bytes(struct btrfs_block_group *cache,
 	spin_lock(&cache->lock);
 	if (cache->ro) {
 		ret = -EAGAIN;
-		goto out;
+		goto out_error;
 	}
 
 	if (btrfs_block_group_should_use_size_class(cache)) {
 		size_class = btrfs_calc_block_group_size_class(num_bytes);
 		ret = btrfs_use_block_group_size_class(cache, size_class, force_wrong_size_class);
 		if (ret)
-			goto out;
+			goto out_error;
 	}
+
 	cache->reserved += num_bytes;
-	space_info->bytes_reserved += num_bytes;
+	if (delalloc)
+		cache->delalloc_bytes += num_bytes;
+
 	trace_btrfs_space_reservation(cache->fs_info, "space_info",
 				      space_info->flags, num_bytes, 1);
+	spin_unlock(&cache->lock);
+
+	space_info->bytes_reserved += num_bytes;
 	btrfs_space_info_update_bytes_may_use(space_info, -ram_bytes);
-	if (delalloc)
-		cache->delalloc_bytes += num_bytes;
 
 	/*
 	 * Compression can use less space than we reserved, so wake tickets if
@@ -3836,7 +3840,11 @@ int btrfs_add_reserved_bytes(struct btrfs_block_group *cache,
 	 */
 	if (num_bytes < ram_bytes)
 		btrfs_try_granting_tickets(space_info);
-out:
+	spin_unlock(&space_info->lock);
+
+	return 0;
+
+out_error:
 	spin_unlock(&cache->lock);
 	spin_unlock(&space_info->lock);
 	return ret;
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 18/28] btrfs: reduce block group critical section in do_trimming()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (16 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 17/28] btrfs: reduce block group critical section in btrfs_add_reserved_bytes() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-24  7:19   ` Johannes Thumshirn
  2025-10-23 15:59 ` [PATCH 19/28] btrfs: reduce block group critical section in pin_down_extent() fdmanana
                   ` (10 subsequent siblings)
  28 siblings, 1 reply; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

There's no need to update the bytes_reserved and bytes_readonly fields of
the space_info while holding the block group's spinlock. We are only
making the critical section longer than necessary. So move the space_info
updates outside of the block group's critical section.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/free-space-cache.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index ab873bd67192..6ccb492eae8e 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -3656,7 +3656,7 @@ static int do_trimming(struct btrfs_block_group *block_group,
 	struct btrfs_fs_info *fs_info = block_group->fs_info;
 	struct btrfs_free_space_ctl *ctl = block_group->free_space_ctl;
 	int ret;
-	int update = 0;
+	bool bg_ro;
 	const u64 end = start + bytes;
 	const u64 reserved_end = reserved_start + reserved_bytes;
 	enum btrfs_trim_state trim_state = BTRFS_TRIM_STATE_UNTRIMMED;
@@ -3664,12 +3664,14 @@ static int do_trimming(struct btrfs_block_group *block_group,
 
 	spin_lock(&space_info->lock);
 	spin_lock(&block_group->lock);
-	if (!block_group->ro) {
+	bg_ro = block_group->ro;
+	if (!bg_ro) {
 		block_group->reserved += reserved_bytes;
+		spin_unlock(&block_group->lock);
 		space_info->bytes_reserved += reserved_bytes;
-		update = 1;
+	} else {
+		spin_unlock(&block_group->lock);
 	}
-	spin_unlock(&block_group->lock);
 	spin_unlock(&space_info->lock);
 
 	ret = btrfs_discard_extent(fs_info, start, bytes, &trimmed);
@@ -3690,14 +3692,16 @@ static int do_trimming(struct btrfs_block_group *block_group,
 	list_del(&trim_entry->list);
 	mutex_unlock(&ctl->cache_writeout_mutex);
 
-	if (update) {
+	if (!bg_ro) {
 		spin_lock(&space_info->lock);
 		spin_lock(&block_group->lock);
-		if (block_group->ro)
-			space_info->bytes_readonly += reserved_bytes;
+		bg_ro = block_group->ro;
 		block_group->reserved -= reserved_bytes;
-		space_info->bytes_reserved -= reserved_bytes;
 		spin_unlock(&block_group->lock);
+
+		space_info->bytes_reserved -= reserved_bytes;
+		if (bg_ro)
+			space_info->bytes_readonly += reserved_bytes;
 		spin_unlock(&space_info->lock);
 	}
 
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 19/28] btrfs: reduce block group critical section in pin_down_extent()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (17 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 18/28] btrfs: reduce block group critical section in do_trimming() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-24  7:21   ` Johannes Thumshirn
  2025-10-23 15:59 ` [PATCH 20/28] btrfs: use local variable for space_info " fdmanana
                   ` (9 subsequent siblings)
  28 siblings, 1 reply; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

There's no need to update the bytes_reserved and bytes_may_use fields of
the space_info while holding the block group's spinlock. We are only
making the critical section longer than necessary. So move the space_info
updates outside of the block group's critical section.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/extent-tree.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index ae2c3dc9957e..70b77fe21b9f 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2594,15 +2594,15 @@ static int pin_down_extent(struct btrfs_trans_handle *trans,
 			   struct btrfs_block_group *cache,
 			   u64 bytenr, u64 num_bytes, int reserved)
 {
+	const u64 reserved_bytes = (reserved ? num_bytes : 0);
+
 	spin_lock(&cache->space_info->lock);
 	spin_lock(&cache->lock);
 	cache->pinned += num_bytes;
-	btrfs_space_info_update_bytes_pinned(cache->space_info, num_bytes);
-	if (reserved) {
-		cache->reserved -= num_bytes;
-		cache->space_info->bytes_reserved -= num_bytes;
-	}
+	cache->reserved -= reserved_bytes;
 	spin_unlock(&cache->lock);
+	cache->space_info->bytes_reserved -= reserved_bytes;
+	btrfs_space_info_update_bytes_pinned(cache->space_info, num_bytes);
 	spin_unlock(&cache->space_info->lock);
 
 	btrfs_set_extent_bit(&trans->transaction->pinned_extents, bytenr,
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 20/28] btrfs: use local variable for space_info in pin_down_extent()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (18 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 19/28] btrfs: reduce block group critical section in pin_down_extent() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 21/28] btrfs: remove 'reserved' argument from btrfs_pin_extent() fdmanana
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

Instead of derefercing the block group multiple times to access its
space_info, use a local variable to shorten the code horizontal wise and
make it easier to read. Also, while at it, also rename the block group
argument from 'cache' to 'bg', as the cache name is confusing and it's
from the old days where the block group structure was named as
'btrfs_block_group_cache'.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/extent-tree.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 70b77fe21b9f..4be20949f0ba 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2591,19 +2591,20 @@ static u64 first_logical_byte(struct btrfs_fs_info *fs_info)
 }
 
 static int pin_down_extent(struct btrfs_trans_handle *trans,
-			   struct btrfs_block_group *cache,
+			   struct btrfs_block_group *bg,
 			   u64 bytenr, u64 num_bytes, int reserved)
 {
+	struct btrfs_space_info *space_info = bg->space_info;
 	const u64 reserved_bytes = (reserved ? num_bytes : 0);
 
-	spin_lock(&cache->space_info->lock);
-	spin_lock(&cache->lock);
-	cache->pinned += num_bytes;
-	cache->reserved -= reserved_bytes;
-	spin_unlock(&cache->lock);
-	cache->space_info->bytes_reserved -= reserved_bytes;
-	btrfs_space_info_update_bytes_pinned(cache->space_info, num_bytes);
-	spin_unlock(&cache->space_info->lock);
+	spin_lock(&space_info->lock);
+	spin_lock(&bg->lock);
+	bg->pinned += num_bytes;
+	bg->reserved -= reserved_bytes;
+	spin_unlock(&bg->lock);
+	space_info->bytes_reserved -= reserved_bytes;
+	btrfs_space_info_update_bytes_pinned(space_info, num_bytes);
+	spin_unlock(&space_info->lock);
 
 	btrfs_set_extent_bit(&trans->transaction->pinned_extents, bytenr,
 			     bytenr + num_bytes - 1, EXTENT_DIRTY, NULL);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 21/28] btrfs: remove 'reserved' argument from btrfs_pin_extent()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (19 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 20/28] btrfs: use local variable for space_info " fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 22/28] btrfs: change 'reserved' argument from pin_down_extent() to bool fdmanana
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

All callers pass a value of 1 (true) to it, so remove it.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/extent-tree.c | 15 +++++++--------
 fs/btrfs/extent-tree.h |  3 +--
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 4be20949f0ba..21420dc26a50 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1764,7 +1764,7 @@ static int run_one_delayed_ref(struct btrfs_trans_handle *trans,
 
 	if (TRANS_ABORTED(trans)) {
 		if (insert_reserved) {
-			btrfs_pin_extent(trans, node->bytenr, node->num_bytes, 1);
+			btrfs_pin_extent(trans, node->bytenr, node->num_bytes);
 			free_head_ref_squota_rsv(trans->fs_info, href);
 		}
 		return 0;
@@ -1783,7 +1783,7 @@ static int run_one_delayed_ref(struct btrfs_trans_handle *trans,
 	else
 		BUG();
 	if (ret && insert_reserved)
-		btrfs_pin_extent(trans, node->bytenr, node->num_bytes, 1);
+		btrfs_pin_extent(trans, node->bytenr, node->num_bytes);
 	if (ret < 0)
 		btrfs_err(trans->fs_info,
 "failed to run delayed ref for logical %llu num_bytes %llu type %u action %u ref_mod %d: %d",
@@ -1890,7 +1890,7 @@ static int cleanup_ref_head(struct btrfs_trans_handle *trans,
 	spin_unlock(&delayed_refs->lock);
 
 	if (head->must_insert_reserved) {
-		btrfs_pin_extent(trans, head->bytenr, head->num_bytes, 1);
+		btrfs_pin_extent(trans, head->bytenr, head->num_bytes);
 		if (head->is_data) {
 			struct btrfs_root *csum_root;
 
@@ -2611,15 +2611,14 @@ static int pin_down_extent(struct btrfs_trans_handle *trans,
 	return 0;
 }
 
-int btrfs_pin_extent(struct btrfs_trans_handle *trans,
-		     u64 bytenr, u64 num_bytes, int reserved)
+int btrfs_pin_extent(struct btrfs_trans_handle *trans, u64 bytenr, u64 num_bytes)
 {
 	struct btrfs_block_group *cache;
 
 	cache = btrfs_lookup_block_group(trans->fs_info, bytenr);
 	BUG_ON(!cache); /* Logic error */
 
-	pin_down_extent(trans, cache, bytenr, num_bytes, reserved);
+	pin_down_extent(trans, cache, bytenr, num_bytes, 1);
 
 	btrfs_put_block_group(cache);
 	return 0;
@@ -3538,7 +3537,7 @@ int btrfs_free_extent(struct btrfs_trans_handle *trans, struct btrfs_ref *ref)
 	 * tree, just update pinning info and exit early.
 	 */
 	if (ref->ref_root == BTRFS_TREE_LOG_OBJECTID) {
-		btrfs_pin_extent(trans, ref->bytenr, ref->num_bytes, 1);
+		btrfs_pin_extent(trans, ref->bytenr, ref->num_bytes);
 		ret = 0;
 	} else if (ref->type == BTRFS_REF_METADATA) {
 		ret = btrfs_add_delayed_tree_ref(trans, ref, NULL);
@@ -5022,7 +5021,7 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
 	ret = alloc_reserved_file_extent(trans, 0, root_objectid, 0, owner,
 					 offset, ins, 1, root_objectid);
 	if (ret)
-		btrfs_pin_extent(trans, ins->objectid, ins->offset, 1);
+		btrfs_pin_extent(trans, ins->objectid, ins->offset);
 	ret = btrfs_record_squota_delta(fs_info, &delta);
 	btrfs_put_block_group(block_group);
 	return ret;
diff --git a/fs/btrfs/extent-tree.h b/fs/btrfs/extent-tree.h
index e970ac42a871..e573509c5a71 100644
--- a/fs/btrfs/extent-tree.h
+++ b/fs/btrfs/extent-tree.h
@@ -110,8 +110,7 @@ int btrfs_lookup_extent_info(struct btrfs_trans_handle *trans,
 			     struct btrfs_fs_info *fs_info, u64 bytenr,
 			     u64 offset, int metadata, u64 *refs, u64 *flags,
 			     u64 *owner_root);
-int btrfs_pin_extent(struct btrfs_trans_handle *trans, u64 bytenr, u64 num,
-		     int reserved);
+int btrfs_pin_extent(struct btrfs_trans_handle *trans, u64 bytenr, u64 num);
 int btrfs_pin_extent_for_log_replay(struct btrfs_trans_handle *trans,
 				    const struct extent_buffer *eb);
 int btrfs_exclude_logged_extents(struct extent_buffer *eb);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 22/28] btrfs: change 'reserved' argument from pin_down_extent() to bool
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (20 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 21/28] btrfs: remove 'reserved' argument from btrfs_pin_extent() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 23/28] btrfs: reduce block group critical section in unpin_extent_range() fdmanana
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

It's used as a boolean, so convert it from int type to bool type.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/extent-tree.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 21420dc26a50..36963b4a6303 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2592,7 +2592,7 @@ static u64 first_logical_byte(struct btrfs_fs_info *fs_info)
 
 static int pin_down_extent(struct btrfs_trans_handle *trans,
 			   struct btrfs_block_group *bg,
-			   u64 bytenr, u64 num_bytes, int reserved)
+			   u64 bytenr, u64 num_bytes, bool reserved)
 {
 	struct btrfs_space_info *space_info = bg->space_info;
 	const u64 reserved_bytes = (reserved ? num_bytes : 0);
@@ -2618,7 +2618,7 @@ int btrfs_pin_extent(struct btrfs_trans_handle *trans, u64 bytenr, u64 num_bytes
 	cache = btrfs_lookup_block_group(trans->fs_info, bytenr);
 	BUG_ON(!cache); /* Logic error */
 
-	pin_down_extent(trans, cache, bytenr, num_bytes, 1);
+	pin_down_extent(trans, cache, bytenr, num_bytes, true);
 
 	btrfs_put_block_group(cache);
 	return 0;
@@ -2642,7 +2642,7 @@ int btrfs_pin_extent_for_log_replay(struct btrfs_trans_handle *trans,
 	if (ret)
 		goto out;
 
-	pin_down_extent(trans, cache, eb->start, eb->len, 0);
+	pin_down_extent(trans, cache, eb->start, eb->len, false);
 
 	/* remove us from the free space cache (if we're there at all) */
 	ret = btrfs_remove_free_space(cache, eb->start, eb->len);
@@ -3483,7 +3483,7 @@ int btrfs_free_tree_block(struct btrfs_trans_handle *trans,
 	bg = btrfs_lookup_block_group(fs_info, buf->start);
 
 	if (btrfs_header_flag(buf, BTRFS_HEADER_FLAG_WRITTEN)) {
-		pin_down_extent(trans, bg, buf->start, buf->len, 1);
+		pin_down_extent(trans, bg, buf->start, buf->len, true);
 		btrfs_put_block_group(bg);
 		goto out;
 	}
@@ -3507,7 +3507,7 @@ int btrfs_free_tree_block(struct btrfs_trans_handle *trans,
 
 	if (test_bit(BTRFS_FS_TREE_MOD_LOG_USERS, &fs_info->flags)
 		     || btrfs_is_zoned(fs_info)) {
-		pin_down_extent(trans, bg, buf->start, buf->len, 1);
+		pin_down_extent(trans, bg, buf->start, buf->len, true);
 		btrfs_put_block_group(bg);
 		goto out;
 	}
@@ -4775,7 +4775,7 @@ int btrfs_pin_reserved_extent(struct btrfs_trans_handle *trans,
 		return -ENOSPC;
 	}
 
-	ret = pin_down_extent(trans, cache, eb->start, eb->len, 1);
+	ret = pin_down_extent(trans, cache, eb->start, eb->len, true);
 	btrfs_put_block_group(cache);
 	return ret;
 }
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 23/28] btrfs: reduce block group critical section in unpin_extent_range()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (21 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 22/28] btrfs: change 'reserved' argument from pin_down_extent() to bool fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 24/28] btrfs: remove pointless label and goto from unpin_extent_range() fdmanana
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

There's no need to update the bytes_pinned, bytes_readonly and
max_extent_size fields of the space_info while inside the critical section
delimited by the block group's lock. So move that out of the block group's
critical section, but sill inside the space_info's critical section.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/extent-tree.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 36963b4a6303..d839d8d32412 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2747,13 +2747,12 @@ static int unpin_extent_range(struct btrfs_fs_info *fs_info,
 	struct btrfs_free_cluster *cluster = NULL;
 	u64 total_unpinned = 0;
 	u64 empty_cluster = 0;
-	bool readonly;
 	int ret = 0;
 
 	while (start <= end) {
 		u64 len;
+		bool readonly;
 
-		readonly = false;
 		if (!cache ||
 		    start >= cache->start + cache->length) {
 			if (cache)
@@ -2797,20 +2796,21 @@ static int unpin_extent_range(struct btrfs_fs_info *fs_info,
 
 		spin_lock(&space_info->lock);
 		spin_lock(&cache->lock);
+		readonly = cache->ro;
 		cache->pinned -= len;
+		spin_unlock(&cache->lock);
+
 		btrfs_space_info_update_bytes_pinned(space_info, -len);
 		space_info->max_extent_size = 0;
-		if (cache->ro) {
+
+		if (readonly) {
 			space_info->bytes_readonly += len;
-			readonly = true;
 		} else if (btrfs_is_zoned(fs_info)) {
 			/* Need reset before reusing in a zoned block group */
 			btrfs_space_info_update_bytes_zone_unusable(space_info, len);
-			readonly = true;
-		}
-		spin_unlock(&cache->lock);
-		if (!readonly && return_free_space)
+		} else if (return_free_space) {
 			btrfs_return_free_space(space_info, len);
+		}
 		spin_unlock(&space_info->lock);
 	}
 
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 24/28] btrfs: remove pointless label and goto from unpin_extent_range()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (22 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 23/28] btrfs: reduce block group critical section in unpin_extent_range() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 25/28] btrfs: add data_race() in btrfs_account_ro_block_groups_free_space() fdmanana
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

There's no need to have an 'out' label and jump there in case we can
not find a block group. We can simply return directly since there are no
resources to release, removing the need for the label and the 'ret'
variable.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/extent-tree.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index d839d8d32412..f981ff72fb98 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2747,7 +2747,6 @@ static int unpin_extent_range(struct btrfs_fs_info *fs_info,
 	struct btrfs_free_cluster *cluster = NULL;
 	u64 total_unpinned = 0;
 	u64 empty_cluster = 0;
-	int ret = 0;
 
 	while (start <= end) {
 		u64 len;
@@ -2761,8 +2760,7 @@ static int unpin_extent_range(struct btrfs_fs_info *fs_info,
 			cache = btrfs_lookup_block_group(fs_info, start);
 			if (unlikely(cache == NULL)) {
 				/* Logic error, something removed the block group. */
-				ret = -EUCLEAN;
-				goto out;
+				return -EUCLEAN;
 			}
 
 			cluster = fetch_cluster_info(fs_info,
@@ -2816,8 +2814,8 @@ static int unpin_extent_range(struct btrfs_fs_info *fs_info,
 
 	if (cache)
 		btrfs_put_block_group(cache);
-out:
-	return ret;
+
+	return 0;
 }
 
 int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans)
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 25/28] btrfs: add data_race() in btrfs_account_ro_block_groups_free_space()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (23 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 24/28] btrfs: remove pointless label and goto from unpin_extent_range() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 15:59 ` [PATCH 26/28] btrfs: move ticket wakeup and finalization to remove_ticket() fdmanana
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

Surround the intentional empty list check with the data_race() annotation
so that tools like KCSAN don't report a data race. The race is intentional
as it's harmless and we want to avoid lock contention of the space_info
since its lock is heavily used (space reservation, space flushing, extent
allocation and deallocation, etc, etc).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index edeb46f1aa33..be58f702cc61 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1948,7 +1948,7 @@ u64 btrfs_account_ro_block_groups_free_space(struct btrfs_space_info *sinfo)
 	int factor;
 
 	/* It's df, we don't care if it's racy */
-	if (list_empty(&sinfo->ro_bgs))
+	if (data_race(list_empty(&sinfo->ro_bgs)))
 		return 0;
 
 	spin_lock(&sinfo->lock);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 26/28] btrfs: move ticket wakeup and finalization to remove_ticket()
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (24 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 25/28] btrfs: add data_race() in btrfs_account_ro_block_groups_free_space() fdmanana
@ 2025-10-23 15:59 ` fdmanana
  2025-10-23 16:00 ` [PATCH 27/28] btrfs: avoid space_info locking when checking if tickets are served fdmanana
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 15:59 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

Instead of repeating the wakeup and setup of the ->bytes or ->error field,
move those steps to remove_ticket() to avoid duplication. This is also
needed for the next patch in the series, so that we avoid duplicating more
logic.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 40 ++++++++++++++++++----------------------
 1 file changed, 18 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index be58f702cc61..86cd87c5884a 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -515,13 +515,20 @@ bool btrfs_can_overcommit(const struct btrfs_space_info *space_info, u64 bytes,
 }
 
 static void remove_ticket(struct btrfs_space_info *space_info,
-			  struct reserve_ticket *ticket)
+			  struct reserve_ticket *ticket, int error)
 {
 	if (!list_empty(&ticket->list)) {
 		list_del_init(&ticket->list);
 		ASSERT(space_info->reclaim_size >= ticket->bytes);
 		space_info->reclaim_size -= ticket->bytes;
 	}
+
+	if (error)
+		ticket->error = error;
+	else
+		ticket->bytes = 0;
+
+	wake_up(&ticket->wait);
 }
 
 /*
@@ -549,10 +556,8 @@ void btrfs_try_granting_tickets(struct btrfs_space_info *space_info)
 		if (used_after <= space_info->total_bytes ||
 		    can_overcommit(space_info, used, ticket->bytes, flush)) {
 			btrfs_space_info_update_bytes_may_use(space_info, ticket->bytes);
-			remove_ticket(space_info, ticket);
-			ticket->bytes = 0;
+			remove_ticket(space_info, ticket, 0);
 			space_info->tickets_id++;
-			wake_up(&ticket->wait);
 			used = used_after;
 		} else {
 			break;
@@ -1066,9 +1071,7 @@ static bool steal_from_global_rsv(struct btrfs_space_info *space_info,
 		global_rsv->full = false;
 	spin_unlock(&global_rsv->lock);
 
-	remove_ticket(space_info, ticket);
-	ticket->bytes = 0;
-	wake_up(&ticket->wait);
+	remove_ticket(space_info, ticket, 0);
 	space_info->tickets_id++;
 
 	return true;
@@ -1115,12 +1118,10 @@ static bool maybe_fail_all_tickets(struct btrfs_space_info *space_info)
 			btrfs_info(fs_info, "failing ticket with %llu bytes",
 				   ticket->bytes);
 
-		remove_ticket(space_info, ticket);
 		if (abort_error)
-			ticket->error = abort_error;
+			remove_ticket(space_info, ticket, abort_error);
 		else
-			ticket->error = -ENOSPC;
-		wake_up(&ticket->wait);
+			remove_ticket(space_info, ticket, -ENOSPC);
 
 		/*
 		 * We're just throwing tickets away, so more flushing may not
@@ -1536,13 +1537,10 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
 	 * just to have caller fail immediately instead of later when trying to
 	 * modify the fs, making it easier to debug -ENOSPC problems.
 	 */
-	if (BTRFS_FS_ERROR(fs_info)) {
-		ticket->error = BTRFS_FS_ERROR(fs_info);
-		remove_ticket(space_info, ticket);
-	} else if (!steal_from_global_rsv(space_info, ticket)) {
-		ticket->error = -ENOSPC;
-		remove_ticket(space_info, ticket);
-	}
+	if (BTRFS_FS_ERROR(fs_info))
+		remove_ticket(space_info, ticket, BTRFS_FS_ERROR(fs_info));
+	else if (!steal_from_global_rsv(space_info, ticket))
+		remove_ticket(space_info, ticket, -ENOSPC);
 
 	/*
 	 * We must run try_granting_tickets here because we could be a large
@@ -1574,8 +1572,7 @@ static void priority_reclaim_data_space(struct btrfs_space_info *space_info,
 		}
 	}
 
-	ticket->error = -ENOSPC;
-	remove_ticket(space_info, ticket);
+	remove_ticket(space_info, ticket, -ENOSPC);
 	btrfs_try_granting_tickets(space_info);
 	spin_unlock(&space_info->lock);
 }
@@ -1599,8 +1596,7 @@ static void wait_reserve_ticket(struct btrfs_space_info *space_info,
 			 * despite getting an error, resulting in a space leak
 			 * (bytes_may_use counter of our space_info).
 			 */
-			remove_ticket(space_info, ticket);
-			ticket->error = -EINTR;
+			remove_ticket(space_info, ticket, -EINTR);
 			break;
 		}
 		spin_unlock(&space_info->lock);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 27/28] btrfs: avoid space_info locking when checking if tickets are served
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (25 preceding siblings ...)
  2025-10-23 15:59 ` [PATCH 26/28] btrfs: move ticket wakeup and finalization to remove_ticket() fdmanana
@ 2025-10-23 16:00 ` fdmanana
  2025-10-24  7:29   ` Johannes Thumshirn
  2025-11-06 11:29   ` Daniel Vacek
  2025-10-23 16:00 ` [PATCH 28/28] btrfs: tag as unlikely fs aborted checks in space flushing code fdmanana
  2025-10-24  7:30 ` [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups Johannes Thumshirn
  28 siblings, 2 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 16:00 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

When checking if a ticket was served, we take the space_info's spinlock.
If the ticket was served (its ->bytes is 0) or had an error (its ->error
it not 0) then we just unlock the space_info and return.

This however causes contention on the space_info's spinlock, which is
heavily used (space reservation, space flushing, allocating and
deallocating an extent from a block group (btrfs_update_block_group()),
etc).

Instead of using the space_info's spinlock to check if a ticket was
served, use a per ticket spinlock which isn't used by anyone other than
the task that created the ticket (stack allocated) and the task that
serves the ticket (a reclaim task or any task deallocating space that
ends up at btrfs_try_granting_tickets()).

After applying this patch and all previous patches from the same patchset
(many attempt to reduce space_info critical sections), lockstat showed
some improvements for a fs_mark test regarding the space_info's spinlock
'lock'. The lockstat results:

Before patchset:

  con-bounces:     13733858
  contentions:     15902322
  waittime-total:  264902529.72
  acq-bounces:     28161791
  acquisitions:    38679282

After patchset:

  con-bounces:     12032220
  contentions:     13598034
  waittime-total:  221806127.28
  acq-bounces:     24717947
  acquisitions:    34103281

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 56 +++++++++++++++++++++++++------------------
 fs/btrfs/space-info.h |  1 +
 2 files changed, 34 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 86cd87c5884a..cce53a452fd3 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -517,18 +517,22 @@ bool btrfs_can_overcommit(const struct btrfs_space_info *space_info, u64 bytes,
 static void remove_ticket(struct btrfs_space_info *space_info,
 			  struct reserve_ticket *ticket, int error)
 {
+	lockdep_assert_held(&space_info->lock);
+
 	if (!list_empty(&ticket->list)) {
 		list_del_init(&ticket->list);
 		ASSERT(space_info->reclaim_size >= ticket->bytes);
 		space_info->reclaim_size -= ticket->bytes;
 	}
 
+	spin_lock(&ticket->lock);
 	if (error)
 		ticket->error = error;
 	else
 		ticket->bytes = 0;
 
 	wake_up(&ticket->wait);
+	spin_unlock(&ticket->lock);
 }
 
 /*
@@ -1495,6 +1499,17 @@ static const enum btrfs_flush_state evict_flush_states[] = {
 	RESET_ZONES,
 };
 
+static bool is_ticket_served(struct reserve_ticket *ticket)
+{
+	bool ret;
+
+	spin_lock(&ticket->lock);
+	ret = (ticket->bytes == 0);
+	spin_unlock(&ticket->lock);
+
+	return ret;
+}
+
 static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
 					    struct reserve_ticket *ticket,
 					    const enum btrfs_flush_state *states,
@@ -1504,29 +1519,25 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
 	u64 to_reclaim;
 	int flush_state = 0;
 
-	spin_lock(&space_info->lock);
 	/*
 	 * This is the priority reclaim path, so to_reclaim could be >0 still
 	 * because we may have only satisfied the priority tickets and still
 	 * left non priority tickets on the list.  We would then have
 	 * to_reclaim but ->bytes == 0.
 	 */
-	if (ticket->bytes == 0) {
-		spin_unlock(&space_info->lock);
+	if (is_ticket_served(ticket))
 		return;
-	}
 
+	spin_lock(&space_info->lock);
 	to_reclaim = btrfs_calc_reclaim_metadata_size(space_info);
 
 	while (flush_state < states_nr) {
 		spin_unlock(&space_info->lock);
 		flush_space(space_info, to_reclaim, states[flush_state], false);
+		if (is_ticket_served(ticket))
+			return;
 		flush_state++;
 		spin_lock(&space_info->lock);
-		if (ticket->bytes == 0) {
-			spin_unlock(&space_info->lock);
-			return;
-		}
 	}
 
 	/*
@@ -1554,22 +1565,17 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
 static void priority_reclaim_data_space(struct btrfs_space_info *space_info,
 					struct reserve_ticket *ticket)
 {
-	spin_lock(&space_info->lock);
-
 	/* We could have been granted before we got here. */
-	if (ticket->bytes == 0) {
-		spin_unlock(&space_info->lock);
+	if (is_ticket_served(ticket))
 		return;
-	}
 
+	spin_lock(&space_info->lock);
 	while (!space_info->full) {
 		spin_unlock(&space_info->lock);
 		flush_space(space_info, U64_MAX, ALLOC_CHUNK_FORCE, false);
-		spin_lock(&space_info->lock);
-		if (ticket->bytes == 0) {
-			spin_unlock(&space_info->lock);
+		if (is_ticket_served(ticket))
 			return;
-		}
+		spin_lock(&space_info->lock);
 	}
 
 	remove_ticket(space_info, ticket, -ENOSPC);
@@ -1582,11 +1588,13 @@ static void wait_reserve_ticket(struct btrfs_space_info *space_info,
 
 {
 	DEFINE_WAIT(wait);
-	int ret = 0;
 
-	spin_lock(&space_info->lock);
+	spin_lock(&ticket->lock);
 	while (ticket->bytes > 0 && ticket->error == 0) {
+		int ret;
+
 		ret = prepare_to_wait_event(&ticket->wait, &wait, TASK_KILLABLE);
+		spin_unlock(&ticket->lock);
 		if (ret) {
 			/*
 			 * Delete us from the list. After we unlock the space
@@ -1596,17 +1604,18 @@ static void wait_reserve_ticket(struct btrfs_space_info *space_info,
 			 * despite getting an error, resulting in a space leak
 			 * (bytes_may_use counter of our space_info).
 			 */
+			spin_lock(&space_info->lock);
 			remove_ticket(space_info, ticket, -EINTR);
-			break;
+			spin_unlock(&space_info->lock);
+			return;
 		}
-		spin_unlock(&space_info->lock);
 
 		schedule();
 
 		finish_wait(&ticket->wait, &wait);
-		spin_lock(&space_info->lock);
+		spin_lock(&ticket->lock);
 	}
-	spin_unlock(&space_info->lock);
+	spin_unlock(&ticket->lock);
 }
 
 /*
@@ -1804,6 +1813,7 @@ static int reserve_bytes(struct btrfs_space_info *space_info, u64 orig_bytes,
 		ticket.error = 0;
 		space_info->reclaim_size += ticket.bytes;
 		init_waitqueue_head(&ticket.wait);
+		spin_lock_init(&ticket.lock);
 		ticket.steal = can_steal(flush);
 		if (trace_btrfs_reserve_ticket_enabled())
 			start_ns = ktime_get_ns();
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index 7e16d4c116c8..a4c2a3c8b388 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -230,6 +230,7 @@ struct reserve_ticket {
 	bool steal;
 	struct list_head list;
 	wait_queue_head_t wait;
+	spinlock_t lock;
 };
 
 static inline bool btrfs_mixed_space_info(const struct btrfs_space_info *space_info)
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 28/28] btrfs: tag as unlikely fs aborted checks in space flushing code
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (26 preceding siblings ...)
  2025-10-23 16:00 ` [PATCH 27/28] btrfs: avoid space_info locking when checking if tickets are served fdmanana
@ 2025-10-23 16:00 ` fdmanana
  2025-10-24  7:30 ` [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups Johannes Thumshirn
  28 siblings, 0 replies; 41+ messages in thread
From: fdmanana @ 2025-10-23 16:00 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

It's not expected to have the fs in an aborted state, so surround the
abortion checks with unlikely to make it clear it's unexpected and to
hint the compiler to generate better code.

Also at maybe_fail_all_tickets() untangle all repeated checks for the
abortion into a single if-then-else. This makes things more readable
and makes the compiler generate less code. On x86_64 with gcc 14.2.0-19
from Debian I got the following object size differentes.

Before this change:

  $ size fs/btrfs/btrfs.ko
     text	   data	    bss	    dec	    hex	filename
  2021606	 179704	  25088	2226398	 21f8de	fs/btrfs/btrfs.ko

After this change:

  $ size fs/btrfs/btrfs.ko
     text	   data	    bss	    dec	    hex	filename
  2021458	 179704	  25088	2226250	 21f84a	fs/btrfs/btrfs.ko

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/space-info.c | 37 ++++++++++++++++++-------------------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index cce53a452fd3..cc8015c8b1ff 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1114,27 +1114,26 @@ static bool maybe_fail_all_tickets(struct btrfs_space_info *space_info)
 	       tickets_id == space_info->tickets_id) {
 		ticket = list_first_entry(&space_info->tickets,
 					  struct reserve_ticket, list);
+		if (unlikely(abort_error)) {
+			remove_ticket(space_info, ticket, abort_error);
+		} else {
+			if (steal_from_global_rsv(space_info, ticket))
+				return true;
 
-		if (!abort_error && steal_from_global_rsv(space_info, ticket))
-			return true;
-
-		if (!abort_error && btrfs_test_opt(fs_info, ENOSPC_DEBUG))
-			btrfs_info(fs_info, "failing ticket with %llu bytes",
-				   ticket->bytes);
+			if (btrfs_test_opt(fs_info, ENOSPC_DEBUG))
+				btrfs_info(fs_info, "failing ticket with %llu bytes",
+					   ticket->bytes);
 
-		if (abort_error)
-			remove_ticket(space_info, ticket, abort_error);
-		else
 			remove_ticket(space_info, ticket, -ENOSPC);
 
-		/*
-		 * We're just throwing tickets away, so more flushing may not
-		 * trip over btrfs_try_granting_tickets, so we need to call it
-		 * here to see if we can make progress with the next ticket in
-		 * the list.
-		 */
-		if (!abort_error)
+			/*
+			 * We're just throwing tickets away, so more flushing may
+			 * not trip over btrfs_try_granting_tickets, so we need
+			 * to call it here to see if we can make progress with
+			 * the next ticket in the list.
+			 */
 			btrfs_try_granting_tickets(space_info);
+		}
 	}
 	return (tickets_id != space_info->tickets_id);
 }
@@ -1410,7 +1409,7 @@ static void do_async_reclaim_data_space(struct btrfs_space_info *space_info)
 		}
 
 		/* Something happened, fail everything and bail. */
-		if (BTRFS_FS_ERROR(fs_info))
+		if (unlikely(BTRFS_FS_ERROR(fs_info)))
 			goto aborted_fs;
 		last_tickets_id = space_info->tickets_id;
 		spin_unlock(&space_info->lock);
@@ -1444,7 +1443,7 @@ static void do_async_reclaim_data_space(struct btrfs_space_info *space_info)
 			}
 
 			/* Something happened, fail everything and bail. */
-			if (BTRFS_FS_ERROR(fs_info))
+			if (unlikely(BTRFS_FS_ERROR(fs_info)))
 				goto aborted_fs;
 
 		}
@@ -1548,7 +1547,7 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
 	 * just to have caller fail immediately instead of later when trying to
 	 * modify the fs, making it easier to debug -ENOSPC problems.
 	 */
-	if (BTRFS_FS_ERROR(fs_info))
+	if (unlikely(BTRFS_FS_ERROR(fs_info)))
 		remove_ticket(space_info, ticket, BTRFS_FS_ERROR(fs_info));
 	else if (!steal_from_global_rsv(space_info, ticket))
 		remove_ticket(space_info, ticket, -ENOSPC);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH 06/28] btrfs: inline btrfs_space_info_used()
  2025-10-23 15:59 ` [PATCH 06/28] btrfs: inline btrfs_space_info_used() fdmanana
@ 2025-10-24  6:52   ` Johannes Thumshirn
  2025-10-24  7:55     ` Filipe Manana
  0 siblings, 1 reply; 41+ messages in thread
From: Johannes Thumshirn @ 2025-10-24  6:52 UTC (permalink / raw)
  To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org

On 10/23/25 6:01 PM, fdmanana@kernel.org wrote:
> reduces the object code. In x86_64 with 14.2.0-19 from Debian the results

with GCC 14.2.0-19 from Debian?


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 10/28] btrfs: avoid unnecessary reclaim calculation in priority_reclaim_metadata_space()
  2025-10-23 15:59 ` [PATCH 10/28] btrfs: avoid unnecessary reclaim calculation in priority_reclaim_metadata_space() fdmanana
@ 2025-10-24  7:06   ` Johannes Thumshirn
  2025-10-24  7:55     ` Filipe Manana
  0 siblings, 1 reply; 41+ messages in thread
From: Johannes Thumshirn @ 2025-10-24  7:06 UTC (permalink / raw)
  To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org

On 10/23/25 6:01 PM, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> If the given ticket was already served (its ->bytes is 0), then we wasted
> time calculating the metadata reclaim size. So calculate it only after we
> checked the ticket was not yet served.
>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>   fs/btrfs/space-info.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 9a072009eec8..b03c015d5d51 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -1501,7 +1501,6 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
>   	int flush_state = 0;
>   
>   	spin_lock(&space_info->lock);
> -	to_reclaim = btrfs_calc_reclaim_metadata_size(space_info);
>   	/*
>   	 * This is the priority reclaim path, so to_reclaim could be >0 still
>   	 * because we may have only satisfied the priority tickets and still
Is the ticket (or ticket->bytes) also protected by the space_info lock? 
If yes that would be an odd dependency TBH. If not we can move the 
spin_lock() call below the ticket->bytes == 0 check.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 18/28] btrfs: reduce block group critical section in do_trimming()
  2025-10-23 15:59 ` [PATCH 18/28] btrfs: reduce block group critical section in do_trimming() fdmanana
@ 2025-10-24  7:19   ` Johannes Thumshirn
  2025-10-24  8:00     ` Filipe Manana
  0 siblings, 1 reply; 41+ messages in thread
From: Johannes Thumshirn @ 2025-10-24  7:19 UTC (permalink / raw)
  To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org

On 10/23/25 6:01 PM, fdmanana@kernel.org wrote:
>   	spin_lock(&space_info->lock);
>   	spin_lock(&block_group->lock);
> -	if (!block_group->ro) {
> +	bg_ro = block_group->ro;
> +	if (!bg_ro) {
>   		block_group->reserved += reserved_bytes;
> +		spin_unlock(&block_group->lock);
>   		space_info->bytes_reserved += reserved_bytes;
> -		update = 1;
> +	} else {
> +		spin_unlock(&block_group->lock);
>   	}
> -	spin_unlock(&block_group->lock);
>   	spin_unlock(&space_info->lock);

Hmm maybe something like:

spin_lock(&block_group->lock);

bg_ro = block_group->ro;

if (!bg_ro)

     block_group->reserved += reserved_bytes;

spin_unlock(&block_group->lock);

spin_lock(&space_info->lock);

space_info->bytes_reserved += reserved_bytes;

update = 1;
spin_unlock(&space_info->lock);

This will not only get rid of the else path but also the nested locking


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 19/28] btrfs: reduce block group critical section in pin_down_extent()
  2025-10-23 15:59 ` [PATCH 19/28] btrfs: reduce block group critical section in pin_down_extent() fdmanana
@ 2025-10-24  7:21   ` Johannes Thumshirn
  2025-10-24  8:02     ` Filipe Manana
  0 siblings, 1 reply; 41+ messages in thread
From: Johannes Thumshirn @ 2025-10-24  7:21 UTC (permalink / raw)
  To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org

On 10/23/25 6:02 PM, fdmanana@kernel.org wrote:
>   {
> +	const u64 reserved_bytes = (reserved ? num_bytes : 0);
> +
>   	spin_lock(&cache->space_info->lock);
>   	spin_lock(&cache->lock);
>   	cache->pinned += num_bytes;
> -	btrfs_space_info_update_bytes_pinned(cache->space_info, num_bytes);
> -	if (reserved) {
> -		cache->reserved -= num_bytes;
> -		cache->space_info->bytes_reserved -= num_bytes;
> -	}
> +	cache->reserved -= reserved_bytes;
>   	spin_unlock(&cache->lock);
> +	cache->space_info->bytes_reserved -= reserved_bytes;
> +	btrfs_space_info_update_bytes_pinned(cache->space_info, num_bytes);
>   	spin_unlock(&cache->space_info->lock);

Again do we need to have cache->lock and space_info->lock to be nested?


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 27/28] btrfs: avoid space_info locking when checking if tickets are served
  2025-10-23 16:00 ` [PATCH 27/28] btrfs: avoid space_info locking when checking if tickets are served fdmanana
@ 2025-10-24  7:29   ` Johannes Thumshirn
  2025-11-06 11:29   ` Daniel Vacek
  1 sibling, 0 replies; 41+ messages in thread
From: Johannes Thumshirn @ 2025-10-24  7:29 UTC (permalink / raw)
  To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org

On 10/23/25 6:01 PM, fdmanana@kernel.org wrote:
> @@ -1504,29 +1519,25 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
>   	u64 to_reclaim;
>   	int flush_state = 0;
>   
> -	spin_lock(&space_info->lock);
>   	/*
>   	 * This is the priority reclaim path, so to_reclaim could be >0 still
>   	 * because we may have only satisfied the priority tickets and still
>   	 * left non priority tickets on the list.  We would then have
>   	 * to_reclaim but ->bytes == 0.
>   	 */
> -	if (ticket->bytes == 0) {
> -		spin_unlock(&space_info->lock);
> +	if (is_ticket_served(ticket))
>   		return;
> -	}
>   
> +	spin_lock(&space_info->lock);
>   	to_reclaim = btrfs_calc_reclaim_metadata_size(space_info);
>   
>   	while (flush_state < states_nr) {
>   		spin_unlock(&space_info->lock);
>   		flush_space(space_info, to_reclaim, states[flush_state], false);
> +		if (is_ticket_served(ticket))
> +			return;
>   		flush_state++;
>   		spin_lock(&space_info->lock);
> -		if (ticket->bytes == 0) {
> -			spin_unlock(&space_info->lock);
> -			return;
> -		}
>   	}

Ah this answers the question on 10/28.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups
  2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
                   ` (27 preceding siblings ...)
  2025-10-23 16:00 ` [PATCH 28/28] btrfs: tag as unlikely fs aborted checks in space flushing code fdmanana
@ 2025-10-24  7:30 ` Johannes Thumshirn
  28 siblings, 0 replies; 41+ messages in thread
From: Johannes Thumshirn @ 2025-10-24  7:30 UTC (permalink / raw)
  To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org

On 10/23/25 6:00 PM, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> Several optimizations to reduce critical sections delimited by a
> space_info's spinlock and several cleanups mostly arround space
> reservation and flushing. Details in the changelogs.
>

Apart from the small locking questions and commit log fix

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 10/28] btrfs: avoid unnecessary reclaim calculation in priority_reclaim_metadata_space()
  2025-10-24  7:06   ` Johannes Thumshirn
@ 2025-10-24  7:55     ` Filipe Manana
  0 siblings, 0 replies; 41+ messages in thread
From: Filipe Manana @ 2025-10-24  7:55 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: linux-btrfs@vger.kernel.org

On Fri, Oct 24, 2025 at 8:06 AM Johannes Thumshirn
<Johannes.Thumshirn@wdc.com> wrote:
>
> On 10/23/25 6:01 PM, fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > If the given ticket was already served (its ->bytes is 0), then we wasted
> > time calculating the metadata reclaim size. So calculate it only after we
> > checked the ticket was not yet served.
> >
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > ---
> >   fs/btrfs/space-info.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> > index 9a072009eec8..b03c015d5d51 100644
> > --- a/fs/btrfs/space-info.c
> > +++ b/fs/btrfs/space-info.c
> > @@ -1501,7 +1501,6 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
> >       int flush_state = 0;
> >
> >       spin_lock(&space_info->lock);
> > -     to_reclaim = btrfs_calc_reclaim_metadata_size(space_info);
> >       /*
> >        * This is the priority reclaim path, so to_reclaim could be >0 still
> >        * because we may have only satisfied the priority tickets and still
> Is the ticket (or ticket->bytes) also protected by the space_info lock?
> If yes that would be an odd dependency TBH. If not we can move the
> spin_lock() call below the ticket->bytes == 0 check.

It is dependent and has to be done under space_info->lock.

Besides ensuring the waiter sees the updated value (->bytes set to 0),
they're also needed to synchronize a waiter and a reclaim task.
And that's because tickets are stack allocated - so while the stack
must not be destroyed in the middle of a wakeup or you get an invalid
memory access.

Removing the dependency on the space_info->lock, to reduce contention,
is eliminated in patch 27.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 06/28] btrfs: inline btrfs_space_info_used()
  2025-10-24  6:52   ` Johannes Thumshirn
@ 2025-10-24  7:55     ` Filipe Manana
  0 siblings, 0 replies; 41+ messages in thread
From: Filipe Manana @ 2025-10-24  7:55 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: linux-btrfs@vger.kernel.org

On Fri, Oct 24, 2025 at 7:52 AM Johannes Thumshirn
<Johannes.Thumshirn@wdc.com> wrote:
>
> On 10/23/25 6:01 PM, fdmanana@kernel.org wrote:
> > reduces the object code. In x86_64 with 14.2.0-19 from Debian the results
>
> with GCC 14.2.0-19 from Debian?

Yes, I missed "gcc" before pasting the version.

>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 18/28] btrfs: reduce block group critical section in do_trimming()
  2025-10-24  7:19   ` Johannes Thumshirn
@ 2025-10-24  8:00     ` Filipe Manana
  0 siblings, 0 replies; 41+ messages in thread
From: Filipe Manana @ 2025-10-24  8:00 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: linux-btrfs@vger.kernel.org

On Fri, Oct 24, 2025 at 8:19 AM Johannes Thumshirn
<Johannes.Thumshirn@wdc.com> wrote:
>
> On 10/23/25 6:01 PM, fdmanana@kernel.org wrote:
> >       spin_lock(&space_info->lock);
> >       spin_lock(&block_group->lock);
> > -     if (!block_group->ro) {
> > +     bg_ro = block_group->ro;
> > +     if (!bg_ro) {
> >               block_group->reserved += reserved_bytes;
> > +             spin_unlock(&block_group->lock);
> >               space_info->bytes_reserved += reserved_bytes;
> > -             update = 1;
> > +     } else {
> > +             spin_unlock(&block_group->lock);
> >       }
> > -     spin_unlock(&block_group->lock);
> >       spin_unlock(&space_info->lock);
>
> Hmm maybe something like:
>
> spin_lock(&block_group->lock);
>
> bg_ro = block_group->ro;
>
> if (!bg_ro)
>
>      block_group->reserved += reserved_bytes;
>
> spin_unlock(&block_group->lock);
>
> spin_lock(&space_info->lock);
>
> space_info->bytes_reserved += reserved_bytes;
>
> update = 1;
> spin_unlock(&space_info->lock);
>
> This will not only get rid of the else path but also the nested locking

No, that's not ok.
The block group must be updated not just under the block group's lock
but also under the space_info's lock. We have many places that depend
on that.

Thanks.
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 19/28] btrfs: reduce block group critical section in pin_down_extent()
  2025-10-24  7:21   ` Johannes Thumshirn
@ 2025-10-24  8:02     ` Filipe Manana
  0 siblings, 0 replies; 41+ messages in thread
From: Filipe Manana @ 2025-10-24  8:02 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: linux-btrfs@vger.kernel.org

On Fri, Oct 24, 2025 at 8:21 AM Johannes Thumshirn
<Johannes.Thumshirn@wdc.com> wrote:
>
> On 10/23/25 6:02 PM, fdmanana@kernel.org wrote:
> >   {
> > +     const u64 reserved_bytes = (reserved ? num_bytes : 0);
> > +
> >       spin_lock(&cache->space_info->lock);
> >       spin_lock(&cache->lock);
> >       cache->pinned += num_bytes;
> > -     btrfs_space_info_update_bytes_pinned(cache->space_info, num_bytes);
> > -     if (reserved) {
> > -             cache->reserved -= num_bytes;
> > -             cache->space_info->bytes_reserved -= num_bytes;
> > -     }
> > +     cache->reserved -= reserved_bytes;
> >       spin_unlock(&cache->lock);
> > +     cache->space_info->bytes_reserved -= reserved_bytes;
> > +     btrfs_space_info_update_bytes_pinned(cache->space_info, num_bytes);
> >       spin_unlock(&cache->space_info->lock);
>
> Again do we need to have cache->lock and space_info->lock to be nested?

Yes, many places depend on that.
The goal here, and other similar patches, is just to reduce the block
group's critical section.

Getting rid of this dependency, locking space_info, then locking bg
and then update bg, might be doable but will require broader changes
which are outside this patchset's scope.

Thanks.

>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 27/28] btrfs: avoid space_info locking when checking if tickets are served
  2025-10-23 16:00 ` [PATCH 27/28] btrfs: avoid space_info locking when checking if tickets are served fdmanana
  2025-10-24  7:29   ` Johannes Thumshirn
@ 2025-11-06 11:29   ` Daniel Vacek
  2025-11-06 12:05     ` Filipe Manana
  1 sibling, 1 reply; 41+ messages in thread
From: Daniel Vacek @ 2025-11-06 11:29 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Thu, 23 Oct 2025 at 18:02, <fdmanana@kernel.org> wrote:
>
> From: Filipe Manana <fdmanana@suse.com>
>
> When checking if a ticket was served, we take the space_info's spinlock.
> If the ticket was served (its ->bytes is 0) or had an error (its ->error
> it not 0) then we just unlock the space_info and return.
>
> This however causes contention on the space_info's spinlock, which is
> heavily used (space reservation, space flushing, allocating and
> deallocating an extent from a block group (btrfs_update_block_group()),
> etc).
>
> Instead of using the space_info's spinlock to check if a ticket was
> served, use a per ticket spinlock which isn't used by anyone other than
> the task that created the ticket (stack allocated) and the task that
> serves the ticket (a reclaim task or any task deallocating space that
> ends up at btrfs_try_granting_tickets()).
>
> After applying this patch and all previous patches from the same patchset
> (many attempt to reduce space_info critical sections), lockstat showed
> some improvements for a fs_mark test regarding the space_info's spinlock
> 'lock'. The lockstat results:
>
> Before patchset:
>
>   con-bounces:     13733858
>   contentions:     15902322
>   waittime-total:  264902529.72
>   acq-bounces:     28161791
>   acquisitions:    38679282
>
> After patchset:
>
>   con-bounces:     12032220
>   contentions:     13598034
>   waittime-total:  221806127.28
>   acq-bounces:     24717947
>   acquisitions:    34103281
>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>  fs/btrfs/space-info.c | 56 +++++++++++++++++++++++++------------------
>  fs/btrfs/space-info.h |  1 +
>  2 files changed, 34 insertions(+), 23 deletions(-)
>
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 86cd87c5884a..cce53a452fd3 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -517,18 +517,22 @@ bool btrfs_can_overcommit(const struct btrfs_space_info *space_info, u64 bytes,
>  static void remove_ticket(struct btrfs_space_info *space_info,
>                           struct reserve_ticket *ticket, int error)
>  {
> +       lockdep_assert_held(&space_info->lock);
> +
>         if (!list_empty(&ticket->list)) {
>                 list_del_init(&ticket->list);
>                 ASSERT(space_info->reclaim_size >= ticket->bytes);
>                 space_info->reclaim_size -= ticket->bytes;
>         }
>
> +       spin_lock(&ticket->lock);
>         if (error)
>                 ticket->error = error;
>         else
>                 ticket->bytes = 0;
>
>         wake_up(&ticket->wait);
> +       spin_unlock(&ticket->lock);
>  }
>
>  /*
> @@ -1495,6 +1499,17 @@ static const enum btrfs_flush_state evict_flush_states[] = {
>         RESET_ZONES,
>  };
>
> +static bool is_ticket_served(struct reserve_ticket *ticket)
> +{
> +       bool ret;
> +
> +       spin_lock(&ticket->lock);
> +       ret = (ticket->bytes == 0);
> +       spin_unlock(&ticket->lock);
> +
> +       return ret;
> +}
> +
>  static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
>                                             struct reserve_ticket *ticket,
>                                             const enum btrfs_flush_state *states,
> @@ -1504,29 +1519,25 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
>         u64 to_reclaim;
>         int flush_state = 0;
>
> -       spin_lock(&space_info->lock);
>         /*
>          * This is the priority reclaim path, so to_reclaim could be >0 still
>          * because we may have only satisfied the priority tickets and still
>          * left non priority tickets on the list.  We would then have
>          * to_reclaim but ->bytes == 0.
>          */
> -       if (ticket->bytes == 0) {
> -               spin_unlock(&space_info->lock);
> +       if (is_ticket_served(ticket))
>                 return;
> -       }
>
> +       spin_lock(&space_info->lock);
>         to_reclaim = btrfs_calc_reclaim_metadata_size(space_info);
>
>         while (flush_state < states_nr) {
>                 spin_unlock(&space_info->lock);
>                 flush_space(space_info, to_reclaim, states[flush_state], false);
> +               if (is_ticket_served(ticket))
> +                       return;
>                 flush_state++;
>                 spin_lock(&space_info->lock);
> -               if (ticket->bytes == 0) {
> -                       spin_unlock(&space_info->lock);
> -                       return;
> -               }
>         }

The space_info->lock should be unlocked before the loop and grabbed
again only after it. There's no need to take that lock inside with
your change.

--nX

>
>         /*
> @@ -1554,22 +1565,17 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
>  static void priority_reclaim_data_space(struct btrfs_space_info *space_info,
>                                         struct reserve_ticket *ticket)
>  {
> -       spin_lock(&space_info->lock);
> -
>         /* We could have been granted before we got here. */
> -       if (ticket->bytes == 0) {
> -               spin_unlock(&space_info->lock);
> +       if (is_ticket_served(ticket))
>                 return;
> -       }
>
> +       spin_lock(&space_info->lock);
>         while (!space_info->full) {
>                 spin_unlock(&space_info->lock);
>                 flush_space(space_info, U64_MAX, ALLOC_CHUNK_FORCE, false);
> -               spin_lock(&space_info->lock);
> -               if (ticket->bytes == 0) {
> -                       spin_unlock(&space_info->lock);
> +               if (is_ticket_served(ticket))
>                         return;
> -               }
> +               spin_lock(&space_info->lock);
>         }
>
>         remove_ticket(space_info, ticket, -ENOSPC);
> @@ -1582,11 +1588,13 @@ static void wait_reserve_ticket(struct btrfs_space_info *space_info,
>
>  {
>         DEFINE_WAIT(wait);
> -       int ret = 0;
>
> -       spin_lock(&space_info->lock);
> +       spin_lock(&ticket->lock);
>         while (ticket->bytes > 0 && ticket->error == 0) {
> +               int ret;
> +
>                 ret = prepare_to_wait_event(&ticket->wait, &wait, TASK_KILLABLE);
> +               spin_unlock(&ticket->lock);
>                 if (ret) {
>                         /*
>                          * Delete us from the list. After we unlock the space
> @@ -1596,17 +1604,18 @@ static void wait_reserve_ticket(struct btrfs_space_info *space_info,
>                          * despite getting an error, resulting in a space leak
>                          * (bytes_may_use counter of our space_info).
>                          */
> +                       spin_lock(&space_info->lock);
>                         remove_ticket(space_info, ticket, -EINTR);
> -                       break;
> +                       spin_unlock(&space_info->lock);
> +                       return;
>                 }
> -               spin_unlock(&space_info->lock);
>
>                 schedule();
>
>                 finish_wait(&ticket->wait, &wait);
> -               spin_lock(&space_info->lock);
> +               spin_lock(&ticket->lock);
>         }
> -       spin_unlock(&space_info->lock);
> +       spin_unlock(&ticket->lock);
>  }
>
>  /*
> @@ -1804,6 +1813,7 @@ static int reserve_bytes(struct btrfs_space_info *space_info, u64 orig_bytes,
>                 ticket.error = 0;
>                 space_info->reclaim_size += ticket.bytes;
>                 init_waitqueue_head(&ticket.wait);
> +               spin_lock_init(&ticket.lock);
>                 ticket.steal = can_steal(flush);
>                 if (trace_btrfs_reserve_ticket_enabled())
>                         start_ns = ktime_get_ns();
> diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
> index 7e16d4c116c8..a4c2a3c8b388 100644
> --- a/fs/btrfs/space-info.h
> +++ b/fs/btrfs/space-info.h
> @@ -230,6 +230,7 @@ struct reserve_ticket {
>         bool steal;
>         struct list_head list;
>         wait_queue_head_t wait;
> +       spinlock_t lock;
>  };
>
>  static inline bool btrfs_mixed_space_info(const struct btrfs_space_info *space_info)
> --
> 2.47.2
>
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 27/28] btrfs: avoid space_info locking when checking if tickets are served
  2025-11-06 11:29   ` Daniel Vacek
@ 2025-11-06 12:05     ` Filipe Manana
  0 siblings, 0 replies; 41+ messages in thread
From: Filipe Manana @ 2025-11-06 12:05 UTC (permalink / raw)
  To: Daniel Vacek; +Cc: linux-btrfs

On Thu, Nov 6, 2025 at 11:29 AM Daniel Vacek <neelx@suse.com> wrote:
>
> On Thu, 23 Oct 2025 at 18:02, <fdmanana@kernel.org> wrote:
> >
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > When checking if a ticket was served, we take the space_info's spinlock.
> > If the ticket was served (its ->bytes is 0) or had an error (its ->error
> > it not 0) then we just unlock the space_info and return.
> >
> > This however causes contention on the space_info's spinlock, which is
> > heavily used (space reservation, space flushing, allocating and
> > deallocating an extent from a block group (btrfs_update_block_group()),
> > etc).
> >
> > Instead of using the space_info's spinlock to check if a ticket was
> > served, use a per ticket spinlock which isn't used by anyone other than
> > the task that created the ticket (stack allocated) and the task that
> > serves the ticket (a reclaim task or any task deallocating space that
> > ends up at btrfs_try_granting_tickets()).
> >
> > After applying this patch and all previous patches from the same patchset
> > (many attempt to reduce space_info critical sections), lockstat showed
> > some improvements for a fs_mark test regarding the space_info's spinlock
> > 'lock'. The lockstat results:
> >
> > Before patchset:
> >
> >   con-bounces:     13733858
> >   contentions:     15902322
> >   waittime-total:  264902529.72
> >   acq-bounces:     28161791
> >   acquisitions:    38679282
> >
> > After patchset:
> >
> >   con-bounces:     12032220
> >   contentions:     13598034
> >   waittime-total:  221806127.28
> >   acq-bounces:     24717947
> >   acquisitions:    34103281
> >
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > ---
> >  fs/btrfs/space-info.c | 56 +++++++++++++++++++++++++------------------
> >  fs/btrfs/space-info.h |  1 +
> >  2 files changed, 34 insertions(+), 23 deletions(-)
> >
> > diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> > index 86cd87c5884a..cce53a452fd3 100644
> > --- a/fs/btrfs/space-info.c
> > +++ b/fs/btrfs/space-info.c
> > @@ -517,18 +517,22 @@ bool btrfs_can_overcommit(const struct btrfs_space_info *space_info, u64 bytes,
> >  static void remove_ticket(struct btrfs_space_info *space_info,
> >                           struct reserve_ticket *ticket, int error)
> >  {
> > +       lockdep_assert_held(&space_info->lock);
> > +
> >         if (!list_empty(&ticket->list)) {
> >                 list_del_init(&ticket->list);
> >                 ASSERT(space_info->reclaim_size >= ticket->bytes);
> >                 space_info->reclaim_size -= ticket->bytes;
> >         }
> >
> > +       spin_lock(&ticket->lock);
> >         if (error)
> >                 ticket->error = error;
> >         else
> >                 ticket->bytes = 0;
> >
> >         wake_up(&ticket->wait);
> > +       spin_unlock(&ticket->lock);
> >  }
> >
> >  /*
> > @@ -1495,6 +1499,17 @@ static const enum btrfs_flush_state evict_flush_states[] = {
> >         RESET_ZONES,
> >  };
> >
> > +static bool is_ticket_served(struct reserve_ticket *ticket)
> > +{
> > +       bool ret;
> > +
> > +       spin_lock(&ticket->lock);
> > +       ret = (ticket->bytes == 0);
> > +       spin_unlock(&ticket->lock);
> > +
> > +       return ret;
> > +}
> > +
> >  static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
> >                                             struct reserve_ticket *ticket,
> >                                             const enum btrfs_flush_state *states,
> > @@ -1504,29 +1519,25 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
> >         u64 to_reclaim;
> >         int flush_state = 0;
> >
> > -       spin_lock(&space_info->lock);
> >         /*
> >          * This is the priority reclaim path, so to_reclaim could be >0 still
> >          * because we may have only satisfied the priority tickets and still
> >          * left non priority tickets on the list.  We would then have
> >          * to_reclaim but ->bytes == 0.
> >          */
> > -       if (ticket->bytes == 0) {
> > -               spin_unlock(&space_info->lock);
> > +       if (is_ticket_served(ticket))
> >                 return;
> > -       }
> >
> > +       spin_lock(&space_info->lock);
> >         to_reclaim = btrfs_calc_reclaim_metadata_size(space_info);
> >
> >         while (flush_state < states_nr) {
> >                 spin_unlock(&space_info->lock);
> >                 flush_space(space_info, to_reclaim, states[flush_state], false);
> > +               if (is_ticket_served(ticket))
> > +                       return;
> >                 flush_state++;
> >                 spin_lock(&space_info->lock);
> > -               if (ticket->bytes == 0) {
> > -                       spin_unlock(&space_info->lock);
> > -                       return;
> > -               }
> >         }
>
> The space_info->lock should be unlocked before the loop and grabbed
> again only after it. There's no need to take that lock inside with
> your change.

Yes, I'll fold that change into the patch. Thanks.

>
> --nX
>
> >
> >         /*
> > @@ -1554,22 +1565,17 @@ static void priority_reclaim_metadata_space(struct btrfs_space_info *space_info,
> >  static void priority_reclaim_data_space(struct btrfs_space_info *space_info,
> >                                         struct reserve_ticket *ticket)
> >  {
> > -       spin_lock(&space_info->lock);
> > -
> >         /* We could have been granted before we got here. */
> > -       if (ticket->bytes == 0) {
> > -               spin_unlock(&space_info->lock);
> > +       if (is_ticket_served(ticket))
> >                 return;
> > -       }
> >
> > +       spin_lock(&space_info->lock);
> >         while (!space_info->full) {
> >                 spin_unlock(&space_info->lock);
> >                 flush_space(space_info, U64_MAX, ALLOC_CHUNK_FORCE, false);
> > -               spin_lock(&space_info->lock);
> > -               if (ticket->bytes == 0) {
> > -                       spin_unlock(&space_info->lock);
> > +               if (is_ticket_served(ticket))
> >                         return;
> > -               }
> > +               spin_lock(&space_info->lock);
> >         }
> >
> >         remove_ticket(space_info, ticket, -ENOSPC);
> > @@ -1582,11 +1588,13 @@ static void wait_reserve_ticket(struct btrfs_space_info *space_info,
> >
> >  {
> >         DEFINE_WAIT(wait);
> > -       int ret = 0;
> >
> > -       spin_lock(&space_info->lock);
> > +       spin_lock(&ticket->lock);
> >         while (ticket->bytes > 0 && ticket->error == 0) {
> > +               int ret;
> > +
> >                 ret = prepare_to_wait_event(&ticket->wait, &wait, TASK_KILLABLE);
> > +               spin_unlock(&ticket->lock);
> >                 if (ret) {
> >                         /*
> >                          * Delete us from the list. After we unlock the space
> > @@ -1596,17 +1604,18 @@ static void wait_reserve_ticket(struct btrfs_space_info *space_info,
> >                          * despite getting an error, resulting in a space leak
> >                          * (bytes_may_use counter of our space_info).
> >                          */
> > +                       spin_lock(&space_info->lock);
> >                         remove_ticket(space_info, ticket, -EINTR);
> > -                       break;
> > +                       spin_unlock(&space_info->lock);
> > +                       return;
> >                 }
> > -               spin_unlock(&space_info->lock);
> >
> >                 schedule();
> >
> >                 finish_wait(&ticket->wait, &wait);
> > -               spin_lock(&space_info->lock);
> > +               spin_lock(&ticket->lock);
> >         }
> > -       spin_unlock(&space_info->lock);
> > +       spin_unlock(&ticket->lock);
> >  }
> >
> >  /*
> > @@ -1804,6 +1813,7 @@ static int reserve_bytes(struct btrfs_space_info *space_info, u64 orig_bytes,
> >                 ticket.error = 0;
> >                 space_info->reclaim_size += ticket.bytes;
> >                 init_waitqueue_head(&ticket.wait);
> > +               spin_lock_init(&ticket.lock);
> >                 ticket.steal = can_steal(flush);
> >                 if (trace_btrfs_reserve_ticket_enabled())
> >                         start_ns = ktime_get_ns();
> > diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
> > index 7e16d4c116c8..a4c2a3c8b388 100644
> > --- a/fs/btrfs/space-info.h
> > +++ b/fs/btrfs/space-info.h
> > @@ -230,6 +230,7 @@ struct reserve_ticket {
> >         bool steal;
> >         struct list_head list;
> >         wait_queue_head_t wait;
> > +       spinlock_t lock;
> >  };
> >
> >  static inline bool btrfs_mixed_space_info(const struct btrfs_space_info *space_info)
> > --
> > 2.47.2
> >
> >

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2025-11-06 12:05 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-23 15:59 [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups fdmanana
2025-10-23 15:59 ` [PATCH 01/28] btrfs: return real error when failing tickets in maybe_fail_all_tickets() fdmanana
2025-10-23 15:59 ` [PATCH 02/28] btrfs: avoid recomputing used space in btrfs_try_granting_tickets() fdmanana
2025-10-23 15:59 ` [PATCH 03/28] btrfs: make btrfs_can_overcommit() return bool instead of int fdmanana
2025-10-23 15:59 ` [PATCH 04/28] btrfs: avoid used space computation when trying to grant tickets fdmanana
2025-10-23 15:59 ` [PATCH 05/28] btrfs: avoid used space computation when reserving space fdmanana
2025-10-23 15:59 ` [PATCH 06/28] btrfs: inline btrfs_space_info_used() fdmanana
2025-10-24  6:52   ` Johannes Thumshirn
2025-10-24  7:55     ` Filipe Manana
2025-10-23 15:59 ` [PATCH 07/28] btrfs: bail out earlier from need_preemptive_reclaim() if we have tickets fdmanana
2025-10-23 15:59 ` [PATCH 08/28] btrfs: increment loop count outside critical section during metadata reclaim fdmanana
2025-10-23 15:59 ` [PATCH 09/28] btrfs: shorten critical section in btrfs_preempt_reclaim_metadata_space() fdmanana
2025-10-23 15:59 ` [PATCH 10/28] btrfs: avoid unnecessary reclaim calculation in priority_reclaim_metadata_space() fdmanana
2025-10-24  7:06   ` Johannes Thumshirn
2025-10-24  7:55     ` Filipe Manana
2025-10-23 15:59 ` [PATCH 11/28] btrfs: assert space_info is locked in steal_from_global_rsv() fdmanana
2025-10-23 15:59 ` [PATCH 12/28] btrfs: assign booleans to global reserve's full field fdmanana
2025-10-23 15:59 ` [PATCH 13/28] btrfs: process ticket outside global reserve critical section fdmanana
2025-10-23 15:59 ` [PATCH 14/28] btrfs: remove double underscore prefix from __reserve_bytes() fdmanana
2025-10-23 15:59 ` [PATCH 15/28] btrfs: reduce space_info critical section in btrfs_chunk_alloc() fdmanana
2025-10-23 15:59 ` [PATCH 16/28] btrfs: reduce block group critical section in btrfs_free_reserved_bytes() fdmanana
2025-10-23 15:59 ` [PATCH 17/28] btrfs: reduce block group critical section in btrfs_add_reserved_bytes() fdmanana
2025-10-23 15:59 ` [PATCH 18/28] btrfs: reduce block group critical section in do_trimming() fdmanana
2025-10-24  7:19   ` Johannes Thumshirn
2025-10-24  8:00     ` Filipe Manana
2025-10-23 15:59 ` [PATCH 19/28] btrfs: reduce block group critical section in pin_down_extent() fdmanana
2025-10-24  7:21   ` Johannes Thumshirn
2025-10-24  8:02     ` Filipe Manana
2025-10-23 15:59 ` [PATCH 20/28] btrfs: use local variable for space_info " fdmanana
2025-10-23 15:59 ` [PATCH 21/28] btrfs: remove 'reserved' argument from btrfs_pin_extent() fdmanana
2025-10-23 15:59 ` [PATCH 22/28] btrfs: change 'reserved' argument from pin_down_extent() to bool fdmanana
2025-10-23 15:59 ` [PATCH 23/28] btrfs: reduce block group critical section in unpin_extent_range() fdmanana
2025-10-23 15:59 ` [PATCH 24/28] btrfs: remove pointless label and goto from unpin_extent_range() fdmanana
2025-10-23 15:59 ` [PATCH 25/28] btrfs: add data_race() in btrfs_account_ro_block_groups_free_space() fdmanana
2025-10-23 15:59 ` [PATCH 26/28] btrfs: move ticket wakeup and finalization to remove_ticket() fdmanana
2025-10-23 16:00 ` [PATCH 27/28] btrfs: avoid space_info locking when checking if tickets are served fdmanana
2025-10-24  7:29   ` Johannes Thumshirn
2025-11-06 11:29   ` Daniel Vacek
2025-11-06 12:05     ` Filipe Manana
2025-10-23 16:00 ` [PATCH 28/28] btrfs: tag as unlikely fs aborted checks in space flushing code fdmanana
2025-10-24  7:30 ` [PATCH 00/28] btrfs: some small optimizations around space_info locking and cleanups Johannes Thumshirn

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.