public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] btrfs: Don't block suspend during fstrim
@ 2024-09-03  7:16 Luca Stefani
  2024-09-03  7:16 ` [PATCH v3 1/3] btrfs: Always update fstrim_range on failure Luca Stefani
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Luca Stefani @ 2024-09-03  7:16 UTC (permalink / raw)
  Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs,
	linux-kernel

Changes since v2:
* Use blk_alloc_discard_bio directly
* Reset ret to ERESTARTSYS

Changes since v1:
* Use bio_discard_limit to calculate chunk size
* Makes use of the split chunks

v1: https://lore.kernel.org/lkml/20240902114303.922472-1-luca.stefani.ge1@gmail.com/
v2: https://lore.kernel.org/lkml/20240902205828.943155-1-luca.stefani.ge1@gmail.com/
Original discussion: https://lore.kernel.org/lkml/20240822164908.4957-1-luca.stefani.ge1@gmail.com/

Luca Stefani (3):
  btrfs: Always update fstrim_range on failure
  btrfs: Split remaining space to discard in chunks
  btrfs: Don't block system suspend during fstrim

 fs/btrfs/extent-tree.c | 52 +++++++++++++++++++++++++++++++++++-------
 fs/btrfs/ioctl.c       |  4 +---
 2 files changed, 45 insertions(+), 11 deletions(-)

-- 
2.46.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v3 1/3] btrfs: Always update fstrim_range on failure
  2024-09-03  7:16 [PATCH v3 0/2] btrfs: Don't block suspend during fstrim Luca Stefani
@ 2024-09-03  7:16 ` Luca Stefani
  2024-09-03  7:16 ` [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks Luca Stefani
  2024-09-03  7:16 ` [PATCH v3 3/3] btrfs: Don't block system suspend during fstrim Luca Stefani
  2 siblings, 0 replies; 14+ messages in thread
From: Luca Stefani @ 2024-09-03  7:16 UTC (permalink / raw)
  Cc: Luca Stefani, Qu Wenruo, Chris Mason, Josef Bacik, David Sterba,
	linux-btrfs, linux-kernel

Even in case of failure we could've discarded
some data and userspace should be made aware of it,
so copy fstrim_range to userspace regardless.

Also make sure to update the trimmed bytes amount
even if btrfs_trim_free_extents fails.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
---
 fs/btrfs/extent-tree.c | 4 ++--
 fs/btrfs/ioctl.c       | 4 +---
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index feec49e6f9c8..a5966324607d 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -6551,13 +6551,13 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 			continue;
 
 		ret = btrfs_trim_free_extents(device, &group_trimmed);
+
+		trimmed += group_trimmed;
 		if (ret) {
 			dev_failed++;
 			dev_ret = ret;
 			break;
 		}
-
-		trimmed += group_trimmed;
 	}
 	mutex_unlock(&fs_devices->device_list_mutex);
 
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e0a664b8a46a..94d8f29b04c5 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -543,13 +543,11 @@ static noinline int btrfs_ioctl_fitrim(struct btrfs_fs_info *fs_info,
 
 	range.minlen = max(range.minlen, minlen);
 	ret = btrfs_trim_fs(fs_info, &range);
-	if (ret < 0)
-		return ret;
 
 	if (copy_to_user(arg, &range, sizeof(range)))
 		return -EFAULT;
 
-	return 0;
+	return ret;
 }
 
 int __pure btrfs_is_empty_uuid(const u8 *uuid)
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks
  2024-09-03  7:16 [PATCH v3 0/2] btrfs: Don't block suspend during fstrim Luca Stefani
  2024-09-03  7:16 ` [PATCH v3 1/3] btrfs: Always update fstrim_range on failure Luca Stefani
@ 2024-09-03  7:16 ` Luca Stefani
  2024-09-03  7:27   ` Christoph Hellwig
                     ` (3 more replies)
  2024-09-03  7:16 ` [PATCH v3 3/3] btrfs: Don't block system suspend during fstrim Luca Stefani
  2 siblings, 4 replies; 14+ messages in thread
From: Luca Stefani @ 2024-09-03  7:16 UTC (permalink / raw)
  Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs,
	linux-kernel

Per Qu Wenruo in case we have a very large disk, e.g. 8TiB device,
mostly empty although we will do the split according to our super block
locations, the last super block ends at 256G, we can submit a huge
discard for the range [256G, 8T), causing a super large delay.

We now split the space left to discard based the block discard limit
in preparation of introduction of cancellation signals handling.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
---
 fs/btrfs/extent-tree.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a5966324607d..9c1ddf13659e 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1301,12 +1301,26 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
 	}
 
 	if (bytes_left) {
-		ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
-					   bytes_left >> SECTOR_SHIFT,
-					   GFP_NOFS);
-		if (!ret)
-			*discarded_bytes += bytes_left;
+		u64 bytes_to_discard;
+		struct bio *bio = NULL;
+		sector_t sector = start >> SECTOR_SHIFT;
+		sector_t nr_sects = bytes_left >> SECTOR_SHIFT;
+
+		while ((bio = blk_alloc_discard_bio(bdev, &sector, &nr_sects,
+				GFP_NOFS))) {
+			ret = submit_bio_wait(bio);
+			bio_put(bio);
+
+			if (!ret)
+				bytes_to_discard = bio->bi_iter.bi_size;
+			else if (ret != -EOPNOTSUPP)
+				return ret;
+
+			start += bytes_to_discard;
+			bytes_left -= bytes_to_discard;
+		}
 	}
+
 	return ret;
 }
 
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 3/3] btrfs: Don't block system suspend during fstrim
  2024-09-03  7:16 [PATCH v3 0/2] btrfs: Don't block suspend during fstrim Luca Stefani
  2024-09-03  7:16 ` [PATCH v3 1/3] btrfs: Always update fstrim_range on failure Luca Stefani
  2024-09-03  7:16 ` [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks Luca Stefani
@ 2024-09-03  7:16 ` Luca Stefani
  2 siblings, 0 replies; 14+ messages in thread
From: Luca Stefani @ 2024-09-03  7:16 UTC (permalink / raw)
  Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs,
	linux-kernel

Sometimes the system isn't able to suspend because the task
responsible for trimming the device isn't able to finish in
time, especially since we have a free extent discarding phase,
which can trim a lot of unallocated space, and there is no
limits on the trim size (unlike the block group part).

Since discard isn't a critical call it can be interrupted
at any time, in such cases we stop the trim, report the amount
of discarded bytes and return failure.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
---
 fs/btrfs/extent-tree.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 9c1ddf13659e..ee4daaa56b95 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -16,6 +16,7 @@
 #include <linux/percpu_counter.h>
 #include <linux/lockdep.h>
 #include <linux/crc32c.h>
+#include <linux/freezer.h>
 #include "ctree.h"
 #include "extent-tree.h"
 #include "transaction.h"
@@ -1235,6 +1236,11 @@ static int remove_extent_backref(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
+static bool btrfs_trim_interrupted(void)
+{
+	return fatal_signal_pending(current) || freezing(current);
+}
+
 static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
 			       u64 *discarded_bytes)
 {
@@ -1318,6 +1324,11 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
 
 			start += bytes_to_discard;
 			bytes_left -= bytes_to_discard;
+
+			if (btrfs_trim_interrupted()) {
+				ret = -ERESTARTSYS;
+				break;
+			}
 		}
 	}
 
@@ -6473,7 +6484,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
 		start += len;
 		*trimmed += bytes;
 
-		if (fatal_signal_pending(current)) {
+		if (btrfs_trim_interrupted()) {
 			ret = -ERESTARTSYS;
 			break;
 		}
@@ -6522,6 +6533,9 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 
 	cache = btrfs_lookup_first_block_group(fs_info, range->start);
 	for (; cache; cache = btrfs_next_block_group(cache)) {
+		if (btrfs_trim_interrupted())
+			break;
+
 		if (cache->start >= range_end) {
 			btrfs_put_block_group(cache);
 			break;
@@ -6561,6 +6575,9 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry(device, &fs_devices->devices, dev_list) {
+		if (btrfs_trim_interrupted())
+			break;
+
 		if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
 			continue;
 
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks
  2024-09-03  7:16 ` [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks Luca Stefani
@ 2024-09-03  7:27   ` Christoph Hellwig
  2024-09-03  7:39     ` [PATCH] block: Export blk_alloc_discard_bio Luca Stefani
  2024-09-03  9:43     ` [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks Qu Wenruo
  2024-09-03 10:39   ` Luca Stefani
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Christoph Hellwig @ 2024-09-03  7:27 UTC (permalink / raw)
  To: Luca Stefani
  Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel

You'll also need add a EXPORT_SYMBOL_GPL for blk_alloc_discard_bio.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] block: Export blk_alloc_discard_bio
  2024-09-03  7:27   ` Christoph Hellwig
@ 2024-09-03  7:39     ` Luca Stefani
  2024-09-03 15:49       ` Jens Axboe
  2024-09-03  9:43     ` [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks Qu Wenruo
  1 sibling, 1 reply; 14+ messages in thread
From: Luca Stefani @ 2024-09-03  7:39 UTC (permalink / raw)
  Cc: Luca Stefani, Jens Axboe, linux-block, linux-kernel

The fs trim loops over ranges and sends discard requests, some ranges
can be large so it's all transparently handled by blkdev_issue_discard()
and processed in smaller chunks.

To support cancellation (or suspend) requests we need to insert checks
into the the loop, exporting the symbol allows to reimplement
such loop with the desired behavior.

Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
---
 block/blk-lib.c        | 1 +
 include/linux/blkdev.h | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 4c9f20a689f7..ebaef47d8ce7 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -59,6 +59,7 @@ struct bio *blk_alloc_discard_bio(struct block_device *bdev,
 	cond_resched();
 	return bio;
 }
+EXPORT_SYMBOL_GPL(blk_alloc_discard_bio);
 
 int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask, struct bio **biop)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b7664d593486..f3631044d905 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1088,6 +1088,8 @@ static inline long nr_blockdev_pages(void)
 
 extern void blk_io_schedule(void);
 
+struct bio *blk_alloc_discard_bio(struct block_device *bdev,
+		sector_t *sector, sector_t *nr_sects, gfp_t gfp_mask);
 int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask);
 int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks
  2024-09-03  7:27   ` Christoph Hellwig
  2024-09-03  7:39     ` [PATCH] block: Export blk_alloc_discard_bio Luca Stefani
@ 2024-09-03  9:43     ` Qu Wenruo
  2024-09-04  4:13       ` Christoph Hellwig
  1 sibling, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2024-09-03  9:43 UTC (permalink / raw)
  To: Christoph Hellwig, Luca Stefani
  Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel



在 2024/9/3 16:57, Christoph Hellwig 写道:
> You'll also need add a EXPORT_SYMBOL_GPL for blk_alloc_discard_bio.

Just curious, shouldn't blk_ioctl_discard() also check frozen status to
prevent block device trim from block suspension?

And if that's the case, would it make more sense to move the signal and
freezing checks into blk_alloc_discard_bio()?

Thanks,
Qu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks
  2024-09-03  7:16 ` [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks Luca Stefani
  2024-09-03  7:27   ` Christoph Hellwig
@ 2024-09-03 10:39   ` Luca Stefani
  2024-09-03 19:39   ` David Sterba
  2024-09-06 20:10   ` kernel test robot
  3 siblings, 0 replies; 14+ messages in thread
From: Luca Stefani @ 2024-09-03 10:39 UTC (permalink / raw)
  Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel



On 03/09/24 09:16, Luca Stefani wrote:
> Per Qu Wenruo in case we have a very large disk, e.g. 8TiB device,
> mostly empty although we will do the split according to our super block
> locations, the last super block ends at 256G, we can submit a huge
> discard for the range [256G, 8T), causing a super large delay.
> 
> We now split the space left to discard based the block discard limit
> in preparation of introduction of cancellation signals handling.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
> Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
> Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
> ---
>   fs/btrfs/extent-tree.c | 24 +++++++++++++++++++-----
>   1 file changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index a5966324607d..9c1ddf13659e 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -1301,12 +1301,26 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>   	}
>   
>   	if (bytes_left) {
> -		ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
> -					   bytes_left >> SECTOR_SHIFT,
> -					   GFP_NOFS);
> -		if (!ret)
> -			*discarded_bytes += bytes_left;
I removed this by mistake, will be re-added.
> +		u64 bytes_to_discard;
> +		struct bio *bio = NULL;
> +		sector_t sector = start >> SECTOR_SHIFT;
> +		sector_t nr_sects = bytes_left >> SECTOR_SHIFT;
> +
> +		while ((bio = blk_alloc_discard_bio(bdev, &sector, &nr_sects,
> +				GFP_NOFS))) {
> +			ret = submit_bio_wait(bio);
> +			bio_put(bio);
> +
> +			if (!ret)
> +				bytes_to_discard = bio->bi_iter.bi_size;
> +			else if (ret != -EOPNOTSUPP)
> +				return ret;
I think I got the logic wrong, we probably want to `continue` in case 
ret is set, but it's not -EOPNOTSUPP, otherwise bytes_to_discard might 
be left uninitialized.
bio->bi_iter.bi_size can be used directly for all those cases, so I'll 
remove bytes_to_discard as a whole.
> +
> +			start += bytes_to_discard;
> +			bytes_left -= bytes_to_discard;
> +		}
>   	}
> +
>   	return ret;
>   }
>   

I'll fix those up for v4, but I'll wait for more comments before doing so.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Export blk_alloc_discard_bio
  2024-09-03  7:39     ` [PATCH] block: Export blk_alloc_discard_bio Luca Stefani
@ 2024-09-03 15:49       ` Jens Axboe
  2024-09-03 16:02         ` Luca Stefani
  0 siblings, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2024-09-03 15:49 UTC (permalink / raw)
  To: Luca Stefani; +Cc: linux-block, linux-kernel

On 9/3/24 1:39 AM, Luca Stefani wrote:
> The fs trim loops over ranges and sends discard requests, some ranges
> can be large so it's all transparently handled by blkdev_issue_discard()
> and processed in smaller chunks.
> 
> To support cancellation (or suspend) requests we need to insert checks
> into the the loop, exporting the symbol allows to reimplement
> such loop with the desired behavior.
> 
> Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
> ---
>  block/blk-lib.c        | 1 +
>  include/linux/blkdev.h | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 4c9f20a689f7..ebaef47d8ce7 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -59,6 +59,7 @@ struct bio *blk_alloc_discard_bio(struct block_device *bdev,
>  	cond_resched();
>  	return bio;
>  }
> +EXPORT_SYMBOL_GPL(blk_alloc_discard_bio);
>  
>  int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>  		sector_t nr_sects, gfp_t gfp_mask, struct bio **biop)
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index b7664d593486..f3631044d905 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1088,6 +1088,8 @@ static inline long nr_blockdev_pages(void)
>  
>  extern void blk_io_schedule(void);
>  
> +struct bio *blk_alloc_discard_bio(struct block_device *bdev,
> +		sector_t *sector, sector_t *nr_sects, gfp_t gfp_mask);
>  int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>  		sector_t nr_sects, gfp_t gfp_mask);
>  int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,

Since blk_alloc_discard_bio() is already defined in a header (otherwise
it would've been static and your export symbol above would have failed
miserably), why add it to another header?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Export blk_alloc_discard_bio
  2024-09-03 15:49       ` Jens Axboe
@ 2024-09-03 16:02         ` Luca Stefani
  0 siblings, 0 replies; 14+ messages in thread
From: Luca Stefani @ 2024-09-03 16:02 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel



On 03/09/24 17:49, Jens Axboe wrote:
> On 9/3/24 1:39 AM, Luca Stefani wrote:
>> The fs trim loops over ranges and sends discard requests, some ranges
>> can be large so it's all transparently handled by blkdev_issue_discard()
>> and processed in smaller chunks.
>>
>> To support cancellation (or suspend) requests we need to insert checks
>> into the the loop, exporting the symbol allows to reimplement
>> such loop with the desired behavior.
>>
>> Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
>> ---
>>   block/blk-lib.c        | 1 +
>>   include/linux/blkdev.h | 2 ++
>>   2 files changed, 3 insertions(+)
>>
>> diff --git a/block/blk-lib.c b/block/blk-lib.c
>> index 4c9f20a689f7..ebaef47d8ce7 100644
>> --- a/block/blk-lib.c
>> +++ b/block/blk-lib.c
>> @@ -59,6 +59,7 @@ struct bio *blk_alloc_discard_bio(struct block_device *bdev,
>>   	cond_resched();
>>   	return bio;
>>   }
>> +EXPORT_SYMBOL_GPL(blk_alloc_discard_bio);
>>   
>>   int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>>   		sector_t nr_sects, gfp_t gfp_mask, struct bio **biop)
>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> index b7664d593486..f3631044d905 100644
>> --- a/include/linux/blkdev.h
>> +++ b/include/linux/blkdev.h
>> @@ -1088,6 +1088,8 @@ static inline long nr_blockdev_pages(void)
>>   
>>   extern void blk_io_schedule(void);
>>   
>> +struct bio *blk_alloc_discard_bio(struct block_device *bdev,
>> +		sector_t *sector, sector_t *nr_sects, gfp_t gfp_mask);
>>   int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>>   		sector_t nr_sects, gfp_t gfp_mask);
>>   int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> 
> Since blk_alloc_discard_bio() is already defined in a header (otherwise
> it would've been static and your export symbol above would have failed
> miserably), why add it to another header?
> 

ACK, will remove the header change in v4,

Thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks
  2024-09-03  7:16 ` [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks Luca Stefani
  2024-09-03  7:27   ` Christoph Hellwig
  2024-09-03 10:39   ` Luca Stefani
@ 2024-09-03 19:39   ` David Sterba
  2024-09-06 20:10   ` kernel test robot
  3 siblings, 0 replies; 14+ messages in thread
From: David Sterba @ 2024-09-03 19:39 UTC (permalink / raw)
  To: Luca Stefani
  Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel

On Tue, Sep 03, 2024 at 09:16:11AM +0200, Luca Stefani wrote:
> Per Qu Wenruo in case we have a very large disk, e.g. 8TiB device,
> mostly empty although we will do the split according to our super block
> locations, the last super block ends at 256G, we can submit a huge
> discard for the range [256G, 8T), causing a super large delay.
> 
> We now split the space left to discard based the block discard limit
> in preparation of introduction of cancellation signals handling.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
> Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
> Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
> ---
>  fs/btrfs/extent-tree.c | 24 +++++++++++++++++++-----
>  1 file changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index a5966324607d..9c1ddf13659e 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -1301,12 +1301,26 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>  	}
>  
>  	if (bytes_left) {
> -		ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
> -					   bytes_left >> SECTOR_SHIFT,
> -					   GFP_NOFS);
> -		if (!ret)
> -			*discarded_bytes += bytes_left;
> +		u64 bytes_to_discard;
> +		struct bio *bio = NULL;
> +		sector_t sector = start >> SECTOR_SHIFT;
> +		sector_t nr_sects = bytes_left >> SECTOR_SHIFT;
> +
> +		while ((bio = blk_alloc_discard_bio(bdev, &sector, &nr_sects,
> +				GFP_NOFS))) {
> +			ret = submit_bio_wait(bio);
> +			bio_put(bio);
> +
> +			if (!ret)
> +				bytes_to_discard = bio->bi_iter.bi_size;
> +			else if (ret != -EOPNOTSUPP)
> +				return ret;
> +
> +			start += bytes_to_discard;
> +			bytes_left -= bytes_to_discard;
> +		}

This is not what I anticipated, we only wanted to know the optimal
request size but now it's reimplementing the bio submission and compared
to blkdev_issue_discard() it lacks blk_start_plug/blk_finish_plug.

As we won't get the bio_discard_limit() export for some reason I suggest
to go back to setting the maximum chunk limit in our code and set it to
something like 8G.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks
  2024-09-03  9:43     ` [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks Qu Wenruo
@ 2024-09-04  4:13       ` Christoph Hellwig
  2024-09-04  5:34         ` Qu Wenruo
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2024-09-04  4:13 UTC (permalink / raw)
  To: Qu Wenruo
  Cc: Christoph Hellwig, Luca Stefani, Chris Mason, Josef Bacik,
	David Sterba, linux-btrfs, linux-kernel

On Tue, Sep 03, 2024 at 07:13:29PM +0930, Qu Wenruo wrote:
> 
> 
> 在 2024/9/3 16:57, Christoph Hellwig 写道:
> > You'll also need add a EXPORT_SYMBOL_GPL for blk_alloc_discard_bio.
> 
> Just curious, shouldn't blk_ioctl_discard() also check frozen status to
> prevent block device trim from block suspension?

Someone needs to explain what 'block suspension' is for that first.

> And if that's the case, would it make more sense to move the signal and
> freezing checks into blk_alloc_discard_bio()?

No.  Look at that the recent history of the dicard support.  We tried
that and it badly breaks callers not expecting the signal handling.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks
  2024-09-04  4:13       ` Christoph Hellwig
@ 2024-09-04  5:34         ` Qu Wenruo
  0 siblings, 0 replies; 14+ messages in thread
From: Qu Wenruo @ 2024-09-04  5:34 UTC (permalink / raw)
  To: Christoph Hellwig, Qu Wenruo
  Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs,
	linux-kernel



在 2024/9/4 13:43, Christoph Hellwig 写道:
> On Tue, Sep 03, 2024 at 07:13:29PM +0930, Qu Wenruo wrote:
>>
>>
>> 在 2024/9/3 16:57, Christoph Hellwig 写道:
>>> You'll also need add a EXPORT_SYMBOL_GPL for blk_alloc_discard_bio.
>>
>> Just curious, shouldn't blk_ioctl_discard() also check frozen status to
>> prevent block device trim from block suspension?
> 
> Someone needs to explain what 'block suspension' is for that first.

E.g. a long running full device trim preventing PM suspension/hibernation.

The original report shows the fstrim of btrfs is preventing the system 
entering suspension/hibernation:

  PM: suspend entry (deep)
  Filesystems sync: 0.060 seconds
  Freezing user space processes
  Freezing user space processes failed after 20.007 seconds (1 tasks 
refusing to freeze, wq_busy=0):
  task:fstrim          state:D stack:0     pid:15564 tgid:15564 ppid:1 
    flags:0x00004006
  Call Trace:
   <TASK>
   __schedule+0x381/0x1540
   schedule+0x24/0xb0
   schedule_timeout+0x1ea/0x2a0
   io_schedule_timeout+0x19/0x50
   wait_for_completion_io+0x78/0x140
   submit_bio_wait+0xaa/0xc0
   blkdev_issue_discard+0x65/0xb0
   btrfs_issue_discard+0xcf/0x160 [btrfs 
7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
   btrfs_discard_extent+0x120/0x2a0 [btrfs 
7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
   do_trimming+0xd4/0x220 [btrfs 7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
   trim_bitmaps+0x418/0x520 [btrfs 7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
   btrfs_trim_block_group+0xcb/0x130 [btrfs 
7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
   btrfs_trim_fs+0x119/0x460 [btrfs 
7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
   btrfs_ioctl_fitrim+0xfb/0x160 [btrfs 
7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
   btrfs_ioctl+0x11cc/0x29f0 [btrfs 
7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
   __x64_sys_ioctl+0x92/0xd0
   do_syscall_64+0x5b/0x80
   entry_SYSCALL_64_after_hwframe+0x7c/0xe6
  RIP: 0033:0x7f5f3b529f9b
  RSP: 002b:00007fff279ebc20 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00007fff279ebd60 RCX: 00007f5f3b529f9b
  RDX: 00007fff279ebc90 RSI: 00000000c0185879 RDI: 0000000000000003
  RBP: 000055748718b2d0 R08: 00005574871899e8 R09: 00007fff279eb010
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
  R13: 000055748718ac40 R14: 000055748718b290 R15: 000055748718b290
   </TASK>
  OOM killer enabled.
  Restarting tasks ... done.
  random: crng reseeded on system resumption
  PM: suspend exit
  PM: suspend entry (s2idle)
  Filesystems sync: 0.047 seconds

Considering blkdev_issue_discard() is already doing cond_sched() and 
btrfs is also doing cond_sched() for each block group, it looks like PM 
freezing needs its own freezing() checks, just like what ext4 is doing.

> 
>> And if that's the case, would it make more sense to move the signal and
>> freezing checks into blk_alloc_discard_bio()?
> 
> No.  Look at that the recent history of the dicard support.  We tried
> that and it badly breaks callers not expecting the signal handling.

OK, got it, at least for btrfs we would go the blk_alloc_discard_bio() 
and do signal and freezing checks.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks
  2024-09-03  7:16 ` [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks Luca Stefani
                     ` (2 preceding siblings ...)
  2024-09-03 19:39   ` David Sterba
@ 2024-09-06 20:10   ` kernel test robot
  3 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2024-09-06 20:10 UTC (permalink / raw)
  To: Luca Stefani
  Cc: llvm, oe-kbuild-all, Luca Stefani, Chris Mason, Josef Bacik,
	David Sterba, linux-btrfs, linux-kernel

Hi Luca,

kernel test robot noticed the following build errors:

[auto build test ERROR on v6.11-rc6]
[also build test ERROR on linus/master]
[cannot apply to kdave/for-next next-20240906]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Luca-Stefani/btrfs-Always-update-fstrim_range-on-failure/20240903-191851
base:   v6.11-rc6
patch link:    https://lore.kernel.org/r/20240903071625.957275-3-luca.stefani.ge1%40gmail.com
patch subject: [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks
config: x86_64-rhel-8.3-rust (https://download.01.org/0day-ci/archive/20240907/202409070311.SB9wmgSx-lkp@intel.com/config)
compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 617a15a9eac96088ae5e9134248d8236e34b91b1)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240907/202409070311.SB9wmgSx-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409070311.SB9wmgSx-lkp@intel.com/

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "blk_alloc_discard_bio" [fs/btrfs/btrfs.ko] undefined!

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-09-06 20:10 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-03  7:16 [PATCH v3 0/2] btrfs: Don't block suspend during fstrim Luca Stefani
2024-09-03  7:16 ` [PATCH v3 1/3] btrfs: Always update fstrim_range on failure Luca Stefani
2024-09-03  7:16 ` [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks Luca Stefani
2024-09-03  7:27   ` Christoph Hellwig
2024-09-03  7:39     ` [PATCH] block: Export blk_alloc_discard_bio Luca Stefani
2024-09-03 15:49       ` Jens Axboe
2024-09-03 16:02         ` Luca Stefani
2024-09-03  9:43     ` [PATCH v3 2/3] btrfs: Split remaining space to discard in chunks Qu Wenruo
2024-09-04  4:13       ` Christoph Hellwig
2024-09-04  5:34         ` Qu Wenruo
2024-09-03 10:39   ` Luca Stefani
2024-09-03 19:39   ` David Sterba
2024-09-06 20:10   ` kernel test robot
2024-09-03  7:16 ` [PATCH v3 3/3] btrfs: Don't block system suspend during fstrim Luca Stefani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox