public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/4] btrfs: fix transaction commit blocking during trim of unallocated space
@ 2026-01-28  7:06 jinbaohong
  2026-01-28  7:06 ` [PATCH v4 1/4] btrfs: continue trimming remaining devices on failure jinbaohong
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: jinbaohong @ 2026-01-28  7:06 UTC (permalink / raw)
  To: linux-btrfs; +Cc: fdmanana, dsterba, jinbaohong

When trimming unallocated space, btrfs_trim_fs() holds device_list_mutex
for the entire duration while iterating through all devices. On large
filesystems with significant unallocated space, this operation can take
minutes to hours.

This causes a problem because btrfs_run_dev_stats(), called during
transaction commit, also requires device_list_mutex. While trim is
running, all transaction commits are blocked waiting for the mutex.

This series fixes the mutex blocking issue by refactoring
btrfs_trim_free_extents() to process devices in bounded chunks and
release device_list_mutex between chunks.

Additionally, this series includes related fixes and improvements to
error handling in btrfs_trim_fs():
- Fix a bug where device failure stops the entire trim instead of
  continuing to remaining devices
- Preserve the first error rather than the last, as it's typically more
  useful for debugging
- Handle user interrupts (-ERESTARTSYS/-EINTR) properly by returning
  immediately without counting them as device failures

Patches:
  1. Fix the break vs continue bug in device trimming loop
  2. Preserve first error instead of last error
  3. Handle user interrupts properly
  4. Refactor to release device_list_mutex periodically during trim

Changes since v3:
- Split patch 1 into three separate patches (patches 1-3) for easier
  review:
  - Patch 1: The minimal bug fix (break -> continue)
  - Patch 2: Preserve first error behavior
  - Patch 3: Proper user interrupt handling
- Fix patch 4 to preserve first error (consistent with patch 2) instead
  of overwriting with last error

Changes since v2:
- Set dev_ret properly on user interrupt so callers get proper
notification
- Convert -EINTR from mutex operations to -ERESTARTSYS for consistent
handling
- Restructure btrfs_trim_free_extents() for better code clarity

Changes since v1:
- Add #define BTRFS_MAX_TRIM_LENGTH; remove maxlen parameter
- Move loop-scoped variables into proper scope
- Fix comment formatting per style guidelines
- Replace goto with direct return statements

jinbaohong (4):
  btrfs: continue trimming remaining devices on failure
  btrfs: preserve first error in btrfs_trim_fs
  btrfs: handle user interrupt properly in btrfs_trim_fs
  btrfs: fix transaction commit blocking during trim of unallocated
    space

 fs/btrfs/extent-tree.c | 174 +++++++++++++++++++++++++++++++++++------
 1 file changed, 152 insertions(+), 22 deletions(-)

-- 
2.34.1


Disclaimer: The contents of this e-mail message and any attachments are confidential and are intended solely for addressee. The information may also be legally privileged. This transmission is sent in trust, for the sole purpose of delivery to the intended recipient. If you have received this transmission in error, any use, reproduction or dissemination of this transmission is strictly prohibited. If you are not the intended recipient, please immediately notify the sender by reply e-mail or phone and delete this message and its attachments, if any.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v4 1/4] btrfs: continue trimming remaining devices on failure
  2026-01-28  7:06 [PATCH v4 0/4] btrfs: fix transaction commit blocking during trim of unallocated space jinbaohong
@ 2026-01-28  7:06 ` jinbaohong
  2026-01-28  8:09   ` Qu Wenruo
  2026-01-28  7:06 ` [PATCH v4 2/4] btrfs: preserve first error in btrfs_trim_fs jinbaohong
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: jinbaohong @ 2026-01-28  7:06 UTC (permalink / raw)
  To: linux-btrfs; +Cc: fdmanana, dsterba, jinbaohong, Robbie Ko

Commit 93bba24d4b5a ("btrfs: Enhance btrfs_trim_fs function to handle
error better") intended to make device trimming continue even if one
device fails, tracking failures and reporting them at the end. However,
it used 'break' instead of 'continue', causing the loop to exit on the
first device failure.

Fix this by replacing 'break' with 'continue'.

Fixes: 93bba24d4b5a ("btrfs: Enhance btrfs_trim_fs function to handle error better")
Signed-off-by: Robbie Ko <robbieko@synology.com>
Signed-off-by: jinbaohong <jinbaohong@synology.com>
---
 fs/btrfs/extent-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 0ce2a7def0f3..bd167466b770 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -6688,7 +6688,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 		if (ret) {
 			dev_failed++;
 			dev_ret = ret;
-			break;
+			continue;
 		}
 	}
 	mutex_unlock(&fs_devices->device_list_mutex);
-- 
2.34.1


Disclaimer: The contents of this e-mail message and any attachments are confidential and are intended solely for addressee. The information may also be legally privileged. This transmission is sent in trust, for the sole purpose of delivery to the intended recipient. If you have received this transmission in error, any use, reproduction or dissemination of this transmission is strictly prohibited. If you are not the intended recipient, please immediately notify the sender by reply e-mail or phone and delete this message and its attachments, if any.

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 2/4] btrfs: preserve first error in btrfs_trim_fs
  2026-01-28  7:06 [PATCH v4 0/4] btrfs: fix transaction commit blocking during trim of unallocated space jinbaohong
  2026-01-28  7:06 ` [PATCH v4 1/4] btrfs: continue trimming remaining devices on failure jinbaohong
@ 2026-01-28  7:06 ` jinbaohong
  2026-01-28  8:15   ` Qu Wenruo
  2026-01-28  7:06 ` [PATCH v4 3/4] btrfs: handle user interrupt properly " jinbaohong
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: jinbaohong @ 2026-01-28  7:06 UTC (permalink / raw)
  To: linux-btrfs; +Cc: fdmanana, dsterba, jinbaohong, Robbie Ko

When multiple block groups or devices fail during trim, preserve the
first error encountered rather than the last one. The first error is
typically more useful for debugging as it represents the original
failure, while subsequent errors may be cascading effects.

Signed-off-by: Robbie Ko <robbieko@synology.com>
Signed-off-by: jinbaohong <jinbaohong@synology.com>
---
 fs/btrfs/extent-tree.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index bd167466b770..6c49465c0632 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -6653,7 +6653,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 				ret = btrfs_cache_block_group(cache, true);
 				if (ret) {
 					bg_failed++;
-					bg_ret = ret;
+					if (!bg_ret)
+						bg_ret = ret;
 					continue;
 				}
 			}
@@ -6666,7 +6667,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 			trimmed += group_trimmed;
 			if (ret) {
 				bg_failed++;
-				bg_ret = ret;
+				if (!bg_ret)
+					bg_ret = ret;
 				continue;
 			}
 		}
@@ -6674,7 +6676,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 
 	if (bg_failed)
 		btrfs_warn(fs_info,
-			"failed to trim %llu block group(s), last error %d",
+			"failed to trim %llu block group(s), first error %d",
 			bg_failed, bg_ret);
 
 	mutex_lock(&fs_devices->device_list_mutex);
@@ -6687,7 +6689,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 		trimmed += group_trimmed;
 		if (ret) {
 			dev_failed++;
-			dev_ret = ret;
+			if (!dev_ret)
+				dev_ret = ret;
 			continue;
 		}
 	}
@@ -6695,7 +6698,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 
 	if (dev_failed)
 		btrfs_warn(fs_info,
-			"failed to trim %llu device(s), last error %d",
+			"failed to trim %llu device(s), first error %d",
 			dev_failed, dev_ret);
 	range->len = trimmed;
 	if (bg_ret)
-- 
2.34.1


Disclaimer: The contents of this e-mail message and any attachments are confidential and are intended solely for addressee. The information may also be legally privileged. This transmission is sent in trust, for the sole purpose of delivery to the intended recipient. If you have received this transmission in error, any use, reproduction or dissemination of this transmission is strictly prohibited. If you are not the intended recipient, please immediately notify the sender by reply e-mail or phone and delete this message and its attachments, if any.

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 3/4] btrfs: handle user interrupt properly in btrfs_trim_fs
  2026-01-28  7:06 [PATCH v4 0/4] btrfs: fix transaction commit blocking during trim of unallocated space jinbaohong
  2026-01-28  7:06 ` [PATCH v4 1/4] btrfs: continue trimming remaining devices on failure jinbaohong
  2026-01-28  7:06 ` [PATCH v4 2/4] btrfs: preserve first error in btrfs_trim_fs jinbaohong
@ 2026-01-28  7:06 ` jinbaohong
  2026-01-28  8:20   ` Qu Wenruo
  2026-01-28  7:06 ` [PATCH v4 4/4] btrfs: fix transaction commit blocking during trim of unallocated space jinbaohong
  2026-01-28 16:49 ` [PATCH v4 0/4] " Filipe Manana
  4 siblings, 1 reply; 12+ messages in thread
From: jinbaohong @ 2026-01-28  7:06 UTC (permalink / raw)
  To: linux-btrfs; +Cc: fdmanana, dsterba, jinbaohong, Robbie Ko

When a fatal signal is pending or the process is freezing,
btrfs_trim_block_group and btrfs_trim_free_extents return -ERESTARTSYS.
Currently this is treated as a regular error: the loops continue to the
next iteration and count it as a block group or device failure.

Instead, break out of the loops immediately and return -ERESTARTSYS to
userspace without counting it as a failure. Also skip the device loop
entirely if the block group loop was interrupted.

Signed-off-by: Robbie Ko <robbieko@synology.com>
Signed-off-by: jinbaohong <jinbaohong@synology.com>
---
 fs/btrfs/extent-tree.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 6c49465c0632..633c7c0f9d92 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -6665,6 +6665,10 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 						     range->minlen);
 
 			trimmed += group_trimmed;
+			if (ret == -ERESTARTSYS || ret == -EINTR) {
+				btrfs_put_block_group(cache);
+				break;
+			}
 			if (ret) {
 				bg_failed++;
 				if (!bg_ret)
@@ -6679,6 +6683,9 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 			"failed to trim %llu block group(s), first error %d",
 			bg_failed, bg_ret);
 
+	if (ret == -ERESTARTSYS || ret == -EINTR)
+		return ret;
+
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry(device, &fs_devices->devices, dev_list) {
 		if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
@@ -6687,6 +6694,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 		ret = btrfs_trim_free_extents(device, &group_trimmed);
 
 		trimmed += group_trimmed;
+		if (ret == -ERESTARTSYS || ret == -EINTR)
+			break;
 		if (ret) {
 			dev_failed++;
 			if (!dev_ret)
@@ -6701,6 +6710,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 			"failed to trim %llu device(s), first error %d",
 			dev_failed, dev_ret);
 	range->len = trimmed;
+	if (ret == -ERESTARTSYS || ret == -EINTR)
+		return ret;
 	if (bg_ret)
 		return bg_ret;
 	return dev_ret;
-- 
2.34.1


Disclaimer: The contents of this e-mail message and any attachments are confidential and are intended solely for addressee. The information may also be legally privileged. This transmission is sent in trust, for the sole purpose of delivery to the intended recipient. If you have received this transmission in error, any use, reproduction or dissemination of this transmission is strictly prohibited. If you are not the intended recipient, please immediately notify the sender by reply e-mail or phone and delete this message and its attachments, if any.

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 4/4] btrfs: fix transaction commit blocking during trim of unallocated space
  2026-01-28  7:06 [PATCH v4 0/4] btrfs: fix transaction commit blocking during trim of unallocated space jinbaohong
                   ` (2 preceding siblings ...)
  2026-01-28  7:06 ` [PATCH v4 3/4] btrfs: handle user interrupt properly " jinbaohong
@ 2026-01-28  7:06 ` jinbaohong
  2026-01-29  2:16   ` David Sterba
  2026-01-28 16:49 ` [PATCH v4 0/4] " Filipe Manana
  4 siblings, 1 reply; 12+ messages in thread
From: jinbaohong @ 2026-01-28  7:06 UTC (permalink / raw)
  To: linux-btrfs; +Cc: fdmanana, dsterba, jinbaohong, robbieko

When trimming unallocated space, btrfs_trim_fs() holds device_list_mutex
for the entire duration while iterating through all devices. On large
filesystems with significant unallocated space, this operation can take
minutes to hours on large storage systems.

This causes a problem because btrfs_run_dev_stats(), which is called
during transaction commit, also requires device_list_mutex:

  btrfs_trim_fs()
    mutex_lock(&fs_devices->device_list_mutex)
    list_for_each_entry(device, ...)
      btrfs_trim_free_extents(device)
    mutex_unlock(&fs_devices->device_list_mutex)

  commit_transaction()
    btrfs_run_dev_stats()
      mutex_lock(&fs_devices->device_list_mutex)  // blocked!
      ...

While trim is running, all transaction commits are blocked waiting for
the mutex.

Fix this by refactoring btrfs_trim_free_extents() to process devices in
bounded chunks (up to 2GB per iteration) and release device_list_mutex
between chunks.

Signed-off-by: robbieko <robbieko@synology.com>
Signed-off-by: jinbaohong <jinbaohong@synology.com>
---
 fs/btrfs/extent-tree.c | 160 +++++++++++++++++++++++++++++++++++------
 1 file changed, 138 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 633c7c0f9d92..5c12cb226ef9 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -6493,6 +6493,9 @@ void btrfs_error_unpin_extent_range(struct btrfs_fs_info *fs_info, u64 start, u6
 	unpin_extent_range(fs_info, start, end, false);
 }
 
+/* Maximum length to trim in a single iteration to avoid holding mutex too long. */
+#define BTRFS_MAX_TRIM_LENGTH SZ_2G
+
 /*
  * It used to be that old block groups would be left around forever.
  * Iterating over them would be enough to trim unused space.  Since we
@@ -6513,10 +6516,12 @@ void btrfs_error_unpin_extent_range(struct btrfs_fs_info *fs_info, u64 start, u6
  * it while performing the free space search since we have already
  * held back allocations.
  */
-static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
+static int btrfs_trim_free_extents_throttle(struct btrfs_device *device,
+		u64 *trimmed, u64 pos, u64 *ret_next_pos)
 {
-	u64 start = BTRFS_DEVICE_RANGE_RESERVED, len = 0, end = 0;
 	int ret;
+	u64 start = pos;
+	u64 trim_len = 0;
 
 	*trimmed = 0;
 
@@ -6536,15 +6541,20 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
 
 	while (1) {
 		struct btrfs_fs_info *fs_info = device->fs_info;
+		u64 cur_start;
+		u64 end;
+		u64 len;
 		u64 bytes;
 
 		ret = mutex_lock_interruptible(&fs_info->chunk_mutex);
 		if (ret)
 			break;
 
+		cur_start = start;
 		btrfs_find_first_clear_extent_bit(&device->alloc_state, start,
 						  &start, &end,
 						  CHUNK_TRIMMED | CHUNK_ALLOCATED);
+		start = max(start, cur_start);
 
 		/* Check if there are any CHUNK_* bits left */
 		if (start > device->total_bytes) {
@@ -6570,6 +6580,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
 		end = min(end, device->total_bytes - 1);
 
 		len = end - start + 1;
+		len = min(len, BTRFS_MAX_TRIM_LENGTH);
 
 		/* We didn't find any extents */
 		if (!len) {
@@ -6590,6 +6601,12 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
 
 		start += len;
 		*trimmed += bytes;
+		trim_len += len;
+		if (trim_len >= BTRFS_MAX_TRIM_LENGTH) {
+			*ret_next_pos = start;
+			ret = -EAGAIN;
+			break;
+		}
 
 		if (btrfs_trim_interrupted()) {
 			ret = -ERESTARTSYS;
@@ -6602,6 +6619,122 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
 	return ret;
 }
 
+static int btrfs_trim_free_extents(struct btrfs_fs_info *fs_info, u64 *trimmed,
+				   u64 *dev_failed, int *dev_ret)
+{
+	struct btrfs_device *dev;
+	struct btrfs_device *working_dev = NULL;
+	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
+	u8 uuid[BTRFS_UUID_SIZE];
+	u64 start = BTRFS_DEVICE_RANGE_RESERVED;
+
+	*trimmed = 0;
+	*dev_failed = 0;
+	*dev_ret = 0;
+
+	/* Find the device with the smallest UUID to start. */
+	mutex_lock(&fs_devices->device_list_mutex);
+	list_for_each_entry(dev, &fs_devices->devices, dev_list) {
+		if (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state))
+			continue;
+		if (!working_dev ||
+		    memcmp(dev->uuid, working_dev->uuid, BTRFS_UUID_SIZE) < 0)
+			working_dev = dev;
+	}
+	if (working_dev)
+		memcpy(uuid, working_dev->uuid, BTRFS_UUID_SIZE);
+	mutex_unlock(&fs_devices->device_list_mutex);
+
+	if (!working_dev)
+		return 0;
+
+	while (1) {
+		u64 group_trimmed = 0;
+		u64 next_pos = 0;
+		int ret = 0;
+
+		mutex_lock(&fs_devices->device_list_mutex);
+
+		/* Find and trim the current device. */
+		list_for_each_entry(dev, &fs_devices->devices, dev_list) {
+			if (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state))
+				continue;
+			if (dev == working_dev) {
+				ret = btrfs_trim_free_extents_throttle(working_dev,
+					&group_trimmed, start, &next_pos);
+				break;
+			}
+		}
+
+		/* Throttle: continue same device from new position. */
+		if (ret == -EAGAIN && next_pos > start) {
+			mutex_unlock(&fs_devices->device_list_mutex);
+			*trimmed += group_trimmed;
+			start = next_pos;
+			cond_resched();
+			continue;
+		}
+
+		/* User interrupted. */
+		if (ret == -ERESTARTSYS || ret == -EINTR) {
+			mutex_unlock(&fs_devices->device_list_mutex);
+			*trimmed += group_trimmed;
+			return ret;
+		}
+
+		/*
+		 * Device completed (ret == 0), failed, or EAGAIN with no progress.
+		 * Record error if any, then move to next device.
+		 */
+		if (ret == -EAGAIN) {
+			/* No progress - log and skip device. */
+			btrfs_warn(fs_info,
+				   "trim throttle: no progress, offset=%llu device %s, skipping",
+				   start, btrfs_dev_name(working_dev));
+			(*dev_failed)++;
+			if (!*dev_ret)
+				*dev_ret = ret;
+		} else if (ret) {
+			/* Device failed with error. */
+			(*dev_failed)++;
+			if (!*dev_ret)
+				*dev_ret = ret;
+		}
+
+		/*
+		 * Find next device: smallest UUID larger than current.
+		 * Devices added during trim with smaller UUID will be skipped.
+		 */
+		working_dev = NULL;
+		list_for_each_entry(dev, &fs_devices->devices, dev_list) {
+			if (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state))
+				continue;
+			/* Must larger than current uuid. */
+			if (memcmp(dev->uuid, uuid, BTRFS_UUID_SIZE) <= 0)
+				continue;
+			/* Find the smallest. */
+			if (!working_dev ||
+			    memcmp(dev->uuid, working_dev->uuid, BTRFS_UUID_SIZE) < 0)
+				working_dev = dev;
+		}
+		if (working_dev)
+			memcpy(uuid, working_dev->uuid, BTRFS_UUID_SIZE);
+
+		mutex_unlock(&fs_devices->device_list_mutex);
+
+		*trimmed += group_trimmed;
+		start = BTRFS_DEVICE_RANGE_RESERVED;
+
+		/* No more devices. */
+		if (!working_dev)
+			break;
+
+		cond_resched();
+	}
+
+	return 0;
+}
+
 /*
  * Trim the whole filesystem by:
  * 1) trimming the free space in each block group
@@ -6613,9 +6746,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
  */
 int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 {
-	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
 	struct btrfs_block_group *cache = NULL;
-	struct btrfs_device *device;
 	u64 group_trimmed;
 	u64 range_end = U64_MAX;
 	u64 start;
@@ -6686,24 +6817,9 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 	if (ret == -ERESTARTSYS || ret == -EINTR)
 		return ret;
 
-	mutex_lock(&fs_devices->device_list_mutex);
-	list_for_each_entry(device, &fs_devices->devices, dev_list) {
-		if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
-			continue;
-
-		ret = btrfs_trim_free_extents(device, &group_trimmed);
-
-		trimmed += group_trimmed;
-		if (ret == -ERESTARTSYS || ret == -EINTR)
-			break;
-		if (ret) {
-			dev_failed++;
-			if (!dev_ret)
-				dev_ret = ret;
-			continue;
-		}
-	}
-	mutex_unlock(&fs_devices->device_list_mutex);
+	ret = btrfs_trim_free_extents(fs_info, &group_trimmed,
+				&dev_failed, &dev_ret);
+	trimmed += group_trimmed;
 
 	if (dev_failed)
 		btrfs_warn(fs_info,
-- 
2.34.1


Disclaimer: The contents of this e-mail message and any attachments are confidential and are intended solely for addressee. The information may also be legally privileged. This transmission is sent in trust, for the sole purpose of delivery to the intended recipient. If you have received this transmission in error, any use, reproduction or dissemination of this transmission is strictly prohibited. If you are not the intended recipient, please immediately notify the sender by reply e-mail or phone and delete this message and its attachments, if any.

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/4] btrfs: continue trimming remaining devices on failure
  2026-01-28  7:06 ` [PATCH v4 1/4] btrfs: continue trimming remaining devices on failure jinbaohong
@ 2026-01-28  8:09   ` Qu Wenruo
  0 siblings, 0 replies; 12+ messages in thread
From: Qu Wenruo @ 2026-01-28  8:09 UTC (permalink / raw)
  To: jinbaohong, linux-btrfs; +Cc: fdmanana, dsterba, Robbie Ko



在 2026/1/28 17:36, jinbaohong 写道:
> Commit 93bba24d4b5a ("btrfs: Enhance btrfs_trim_fs function to handle
> error better") intended to make device trimming continue even if one
> device fails, tracking failures and reporting them at the end. However,
> it used 'break' instead of 'continue', causing the loop to exit on the
> first device failure.
> 
> Fix this by replacing 'break' with 'continue'.
> 
> Fixes: 93bba24d4b5a ("btrfs: Enhance btrfs_trim_fs function to handle error better")
> Signed-off-by: Robbie Ko <robbieko@synology.com>
> Signed-off-by: jinbaohong <jinbaohong@synology.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>   fs/btrfs/extent-tree.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 0ce2a7def0f3..bd167466b770 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -6688,7 +6688,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>   		if (ret) {
>   			dev_failed++;
>   			dev_ret = ret;
> -			break;
> +			continue;
>   		}
>   	}
>   	mutex_unlock(&fs_devices->device_list_mutex);


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/4] btrfs: preserve first error in btrfs_trim_fs
  2026-01-28  7:06 ` [PATCH v4 2/4] btrfs: preserve first error in btrfs_trim_fs jinbaohong
@ 2026-01-28  8:15   ` Qu Wenruo
  0 siblings, 0 replies; 12+ messages in thread
From: Qu Wenruo @ 2026-01-28  8:15 UTC (permalink / raw)
  To: jinbaohong, linux-btrfs; +Cc: fdmanana, dsterba, Robbie Ko



在 2026/1/28 17:36, jinbaohong 写道:
> When multiple block groups or devices fail during trim, preserve the
> first error encountered rather than the last one. The first error is
> typically more useful for debugging as it represents the original
> failure, while subsequent errors may be cascading effects.
> 
> Signed-off-by: Robbie Ko <robbieko@synology.com>
> Signed-off-by: jinbaohong <jinbaohong@synology.com>

The comment of btrfs_trim_fs() is still saying the return value is the 
last error.

Otherwise looks good to me.

Thanks,
Qu

> ---
>   fs/btrfs/extent-tree.c | 13 ++++++++-----
>   1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index bd167466b770..6c49465c0632 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -6653,7 +6653,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>   				ret = btrfs_cache_block_group(cache, true);
>   				if (ret) {
>   					bg_failed++;
> -					bg_ret = ret;
> +					if (!bg_ret)
> +						bg_ret = ret;
>   					continue;
>   				}
>   			}
> @@ -6666,7 +6667,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>   			trimmed += group_trimmed;
>   			if (ret) {
>   				bg_failed++;
> -				bg_ret = ret;
> +				if (!bg_ret)
> +					bg_ret = ret;
>   				continue;
>   			}
>   		}
> @@ -6674,7 +6676,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>   
>   	if (bg_failed)
>   		btrfs_warn(fs_info,
> -			"failed to trim %llu block group(s), last error %d",
> +			"failed to trim %llu block group(s), first error %d",
>   			bg_failed, bg_ret);
>   
>   	mutex_lock(&fs_devices->device_list_mutex);
> @@ -6687,7 +6689,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>   		trimmed += group_trimmed;
>   		if (ret) {
>   			dev_failed++;
> -			dev_ret = ret;
> +			if (!dev_ret)
> +				dev_ret = ret;
>   			continue;
>   		}
>   	}
> @@ -6695,7 +6698,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>   
>   	if (dev_failed)
>   		btrfs_warn(fs_info,
> -			"failed to trim %llu device(s), last error %d",
> +			"failed to trim %llu device(s), first error %d",
>   			dev_failed, dev_ret);
>   	range->len = trimmed;
>   	if (bg_ret)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 3/4] btrfs: handle user interrupt properly in btrfs_trim_fs
  2026-01-28  7:06 ` [PATCH v4 3/4] btrfs: handle user interrupt properly " jinbaohong
@ 2026-01-28  8:20   ` Qu Wenruo
  2026-01-28  9:21     ` jinbaohong
  0 siblings, 1 reply; 12+ messages in thread
From: Qu Wenruo @ 2026-01-28  8:20 UTC (permalink / raw)
  To: jinbaohong, linux-btrfs; +Cc: fdmanana, dsterba, Robbie Ko



在 2026/1/28 17:36, jinbaohong 写道:
> When a fatal signal is pending or the process is freezing,
> btrfs_trim_block_group and btrfs_trim_free_extents return -ERESTARTSYS.
> Currently this is treated as a regular error: the loops continue to the
> next iteration and count it as a block group or device failure.
> 
> Instead, break out of the loops immediately and return -ERESTARTSYS to
> userspace without counting it as a failure. Also skip the device loop
> entirely if the block group loop was interrupted.
> 
> Signed-off-by: Robbie Ko <robbieko@synology.com>
> Signed-off-by: jinbaohong <jinbaohong@synology.com>
> ---
>   fs/btrfs/extent-tree.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 6c49465c0632..633c7c0f9d92 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -6665,6 +6665,10 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>   						     range->minlen);
>   
>   			trimmed += group_trimmed;
> +			if (ret == -ERESTARTSYS || ret == -EINTR) {
> +				btrfs_put_block_group(cache);
> +				break;
> +			}
>   			if (ret) {
>   				bg_failed++;
>   				if (!bg_ret)
> @@ -6679,6 +6683,9 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>   			"failed to trim %llu block group(s), first error %d",
>   			bg_failed, bg_ret);
>   
> +	if (ret == -ERESTARTSYS || ret == -EINTR)
> +		return ret;
> +

If your idea is to exit without counting it as an error, it's better to 
put the early return before the error message, or it will still acts 
like an error.

>   	mutex_lock(&fs_devices->device_list_mutex);
>   	list_for_each_entry(device, &fs_devices->devices, dev_list) {
>   		if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
> @@ -6687,6 +6694,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>   		ret = btrfs_trim_free_extents(device, &group_trimmed);
>   
>   		trimmed += group_trimmed;
> +		if (ret == -ERESTARTSYS || ret == -EINTR)
> +			break;

Here you're only to break and then catching the same error code just to 
return.

Why not just unlock the mutex and return? That also skips the error message.

Thanks,
Qu

>   		if (ret) {
>   			dev_failed++;
>   			if (!dev_ret)
> @@ -6701,6 +6710,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>   			"failed to trim %llu device(s), first error %d",
>   			dev_failed, dev_ret);
>   	range->len = trimmed;
> +	if (ret == -ERESTARTSYS || ret == -EINTR)
> +		return ret;
>   	if (bg_ret)
>   		return bg_ret;
>   	return dev_ret;


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 3/4] btrfs: handle user interrupt properly in btrfs_trim_fs
  2026-01-28  8:20   ` Qu Wenruo
@ 2026-01-28  9:21     ` jinbaohong
  2026-01-28  9:37       ` Qu Wenruo
  0 siblings, 1 reply; 12+ messages in thread
From: jinbaohong @ 2026-01-28  9:21 UTC (permalink / raw)
  To: wqu; +Cc: dsterba, fdmanana, jinbaohong, linux-btrfs, robbieko

Hi Qu,

Thanks for the review.

> If your idea is to exit without counting it as an error, it's better to
> put the early return before the error message, or it will still acts
> like an error.

If some block groups or devices failed before the user interrupted, I
think we should still report those failures to the user. The warning
message tells the user what actually went wrong before they interrupted,
which could be useful information.

So the current placement is intentional: report any real errors that
occurred, then return the interrupt code.

What do you think?

Thanks,
Jinbao


Disclaimer: The contents of this e-mail message and any attachments are confidential and are intended solely for addressee. The information may also be legally privileged. This transmission is sent in trust, for the sole purpose of delivery to the intended recipient. If you have received this transmission in error, any use, reproduction or dissemination of this transmission is strictly prohibited. If you are not the intended recipient, please immediately notify the sender by reply e-mail or phone and delete this message and its attachments, if any.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 3/4] btrfs: handle user interrupt properly in btrfs_trim_fs
  2026-01-28  9:21     ` jinbaohong
@ 2026-01-28  9:37       ` Qu Wenruo
  0 siblings, 0 replies; 12+ messages in thread
From: Qu Wenruo @ 2026-01-28  9:37 UTC (permalink / raw)
  To: jinbaohong; +Cc: dsterba, fdmanana, linux-btrfs, robbieko



在 2026/1/28 19:51, jinbaohong 写道:
> Hi Qu,
> 
> Thanks for the review.
> 
>> If your idea is to exit without counting it as an error, it's better to
>> put the early return before the error message, or it will still acts
>> like an error.
> 
> If some block groups or devices failed before the user interrupted, I
> think we should still report those failures to the user. The warning
> message tells the user what actually went wrong before they interrupted,
> which could be useful information.
> 
> So the current placement is intentional: report any real errors that
> occurred, then return the interrupt code.
> 
> What do you think?

OK, then it looks good to me.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> 
> Thanks,
> Jinbao
> 
> 
> Disclaimer: The contents of this e-mail message and any attachments are confidential and are intended solely for addressee. The information may also be legally privileged. This transmission is sent in trust, for the sole purpose of delivery to the intended recipient. If you have received this transmission in error, any use, reproduction or dissemination of this transmission is strictly prohibited. If you are not the intended recipient, please immediately notify the sender by reply e-mail or phone and delete this message and its attachments, if any.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 0/4] btrfs: fix transaction commit blocking during trim of unallocated space
  2026-01-28  7:06 [PATCH v4 0/4] btrfs: fix transaction commit blocking during trim of unallocated space jinbaohong
                   ` (3 preceding siblings ...)
  2026-01-28  7:06 ` [PATCH v4 4/4] btrfs: fix transaction commit blocking during trim of unallocated space jinbaohong
@ 2026-01-28 16:49 ` Filipe Manana
  4 siblings, 0 replies; 12+ messages in thread
From: Filipe Manana @ 2026-01-28 16:49 UTC (permalink / raw)
  To: jinbaohong; +Cc: linux-btrfs, dsterba

On Wed, Jan 28, 2026 at 7:06 AM jinbaohong <jinbaohong@synology.com> wrote:
>
> When trimming unallocated space, btrfs_trim_fs() holds device_list_mutex
> for the entire duration while iterating through all devices. On large
> filesystems with significant unallocated space, this operation can take
> minutes to hours.
>
> This causes a problem because btrfs_run_dev_stats(), called during
> transaction commit, also requires device_list_mutex. While trim is
> running, all transaction commits are blocked waiting for the mutex.
>
> This series fixes the mutex blocking issue by refactoring
> btrfs_trim_free_extents() to process devices in bounded chunks and
> release device_list_mutex between chunks.
>
> Additionally, this series includes related fixes and improvements to
> error handling in btrfs_trim_fs():
> - Fix a bug where device failure stops the entire trim instead of
>   continuing to remaining devices
> - Preserve the first error rather than the last, as it's typically more
>   useful for debugging
> - Handle user interrupts (-ERESTARTSYS/-EINTR) properly by returning
>   immediately without counting them as device failures
>
> Patches:
>   1. Fix the break vs continue bug in device trimming loop
>   2. Preserve first error instead of last error
>   3. Handle user interrupts properly
>   4. Refactor to release device_list_mutex periodically during trim
>
> Changes since v3:
> - Split patch 1 into three separate patches (patches 1-3) for easier
>   review:
>   - Patch 1: The minimal bug fix (break -> continue)
>   - Patch 2: Preserve first error behavior
>   - Patch 3: Proper user interrupt handling
> - Fix patch 4 to preserve first error (consistent with patch 2) instead
>   of overwriting with last error
>
> Changes since v2:
> - Set dev_ret properly on user interrupt so callers get proper
> notification
> - Convert -EINTR from mutex operations to -ERESTARTSYS for consistent
> handling
> - Restructure btrfs_trim_free_extents() for better code clarity
>
> Changes since v1:
> - Add #define BTRFS_MAX_TRIM_LENGTH; remove maxlen parameter
> - Move loop-scoped variables into proper scope
> - Fix comment formatting per style guidelines
> - Replace goto with direct return statements
>
> jinbaohong (4):
>   btrfs: continue trimming remaining devices on failure
>   btrfs: preserve first error in btrfs_trim_fs
>   btrfs: handle user interrupt properly in btrfs_trim_fs
>   btrfs: fix transaction commit blocking during trim of unallocated
>     space

Reviewed-by: Filipe Manana <fdmanana@suse.com>

I pushed these to the for-next branch on github.
I updated patch 2 to replace "last" with "first" in the comment above
btrfs_trim_fs(), as Qu pointed out, and fixed a couple odd
indentations in patch 4.

Thanks.

>
>  fs/btrfs/extent-tree.c | 174 +++++++++++++++++++++++++++++++++++------
>  1 file changed, 152 insertions(+), 22 deletions(-)
>
> --
> 2.34.1
>
>
> Disclaimer: The contents of this e-mail message and any attachments are confidential and are intended solely for addressee. The information may also be legally privileged. This transmission is sent in trust, for the sole purpose of delivery to the intended recipient. If you have received this transmission in error, any use, reproduction or dissemination of this transmission is strictly prohibited. If you are not the intended recipient, please immediately notify the sender by reply e-mail or phone and delete this message and its attachments, if any.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 4/4] btrfs: fix transaction commit blocking during trim of unallocated space
  2026-01-28  7:06 ` [PATCH v4 4/4] btrfs: fix transaction commit blocking during trim of unallocated space jinbaohong
@ 2026-01-29  2:16   ` David Sterba
  0 siblings, 0 replies; 12+ messages in thread
From: David Sterba @ 2026-01-29  2:16 UTC (permalink / raw)
  To: jinbaohong; +Cc: linux-btrfs, fdmanana, dsterba, robbieko

On Wed, Jan 28, 2026 at 07:06:41AM +0000, jinbaohong wrote:
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -6493,6 +6493,9 @@ void btrfs_error_unpin_extent_range(struct btrfs_fs_info *fs_info, u64 start, u6
>  	unpin_extent_range(fs_info, start, end, false);
>  }
>  
> +/* Maximum length to trim in a single iteration to avoid holding mutex too long. */
> +#define BTRFS_MAX_TRIM_LENGTH SZ_2G

We have such constants in fs.h so it's in one place and more visible,
I'll update the commit. Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-01-29  2:16 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-28  7:06 [PATCH v4 0/4] btrfs: fix transaction commit blocking during trim of unallocated space jinbaohong
2026-01-28  7:06 ` [PATCH v4 1/4] btrfs: continue trimming remaining devices on failure jinbaohong
2026-01-28  8:09   ` Qu Wenruo
2026-01-28  7:06 ` [PATCH v4 2/4] btrfs: preserve first error in btrfs_trim_fs jinbaohong
2026-01-28  8:15   ` Qu Wenruo
2026-01-28  7:06 ` [PATCH v4 3/4] btrfs: handle user interrupt properly " jinbaohong
2026-01-28  8:20   ` Qu Wenruo
2026-01-28  9:21     ` jinbaohong
2026-01-28  9:37       ` Qu Wenruo
2026-01-28  7:06 ` [PATCH v4 4/4] btrfs: fix transaction commit blocking during trim of unallocated space jinbaohong
2026-01-29  2:16   ` David Sterba
2026-01-28 16:49 ` [PATCH v4 0/4] " Filipe Manana

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox