* [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim
@ 2024-09-17 20:33 Luca Stefani
2024-09-17 20:33 ` [PATCH v6 1/2] btrfs: Split remaining space to discard in chunks Luca Stefani
` (5 more replies)
0 siblings, 6 replies; 10+ messages in thread
From: Luca Stefani @ 2024-09-17 20:33 UTC (permalink / raw)
Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs,
linux-kernel
Changes since v5:
* Make chunk size a define
* Remove superfluous trim_interrupted checks
after moving them to trim_no_bitmap/trim_bitmaps
Changes since v4:
* Set chunk size to 1G
* Set proper error return codes in case of interruption
* Dropped fstrim_range fixup as pulled in -next
Changes since v3:
* Went back to manual chunk size
Changes since v2:
* Use blk_alloc_discard_bio directly
* Reset ret to ERESTARTSYS
Changes since v1:
* Use bio_discard_limit to calculate chunk size
* Makes use of the split chunks
Original discussion: https://lore.kernel.org/lkml/20240822164908.4957-1-luca.stefani.ge1@gmail.com/
v1: https://lore.kernel.org/lkml/20240902114303.922472-1-luca.stefani.ge1@gmail.com/
v2: https://lore.kernel.org/lkml/20240902205828.943155-1-luca.stefani.ge1@gmail.com/
v3: https://lore.kernel.org/lkml/20240903071625.957275-4-luca.stefani.ge1@gmail.com/
v4: https://lore.kernel.org/lkml/20240916101615.116164-1-luca.stefani.ge1@gmail.com/
v5: https://lore.kernel.org/lkml/20240916125707.127118-1-luca.stefani.ge1@gmail.com/
---
NB: I didn't change btrfs_discard_workfn yet to add error checks
as I don't know what semantics we should have in that case.
The work queue is always re-scheduled and created with WQ_FREEZABLE
so it should be automatically frozen. Shall I simply add some logs?
---
Luca Stefani (2):
btrfs: Split remaining space to discard in chunks
btrfs: Don't block system suspend during fstrim
fs/btrfs/extent-tree.c | 26 +++++++++++++++++++++-----
fs/btrfs/free-space-cache.c | 4 ++--
fs/btrfs/free-space-cache.h | 6 ++++++
fs/btrfs/volumes.h | 1 +
4 files changed, 30 insertions(+), 7 deletions(-)
--
2.46.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v6 1/2] btrfs: Split remaining space to discard in chunks
2024-09-17 20:33 [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim Luca Stefani
@ 2024-09-17 20:33 ` Luca Stefani
2024-09-17 20:33 ` [PATCH v6 2/2] btrfs: Don't block system suspend during fstrim Luca Stefani
` (4 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Luca Stefani @ 2024-09-17 20:33 UTC (permalink / raw)
Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs,
linux-kernel
Per Qu Wenruo in case we have a very large disk, e.g. 8TiB device,
mostly empty although we will do the split according to our super block
locations, the last super block ends at 256G, we can submit a huge
discard for the range [256G, 8T), causing a super large delay.
We now split the space left to discard based on BTRFS_MAX_DISCARD_CHUNK_SIZE
in preparation of introduction of cancellation signals handling.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
---
fs/btrfs/extent-tree.c | 19 +++++++++++++++----
fs/btrfs/volumes.h | 1 +
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a5966324607d..ad70548d1f72 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1300,13 +1300,24 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
bytes_left = end - start;
}
- if (bytes_left) {
+ while (bytes_left) {
+ u64 bytes_to_discard = min(BTRFS_MAX_DISCARD_CHUNK_SIZE, bytes_left);
+
ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
- bytes_left >> SECTOR_SHIFT,
+ bytes_to_discard >> SECTOR_SHIFT,
GFP_NOFS);
- if (!ret)
- *discarded_bytes += bytes_left;
+
+ if (ret) {
+ if (ret != -EOPNOTSUPP)
+ break;
+ continue;
+ }
+
+ start += bytes_to_discard;
+ bytes_left -= bytes_to_discard;
+ *discarded_bytes += bytes_to_discard;
}
+
return ret;
}
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 03d2d60afe0c..c15062a435bd 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -29,6 +29,7 @@ struct btrfs_trans_handle;
struct btrfs_zoned_device_info;
#define BTRFS_MAX_DATA_CHUNK_SIZE (10ULL * SZ_1G)
+#define BTRFS_MAX_DISCARD_CHUNK_SIZE (1ULL * SZ_1G)
extern struct mutex uuid_mutex;
--
2.46.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v6 2/2] btrfs: Don't block system suspend during fstrim
2024-09-17 20:33 [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim Luca Stefani
2024-09-17 20:33 ` [PATCH v6 1/2] btrfs: Split remaining space to discard in chunks Luca Stefani
@ 2024-09-17 20:33 ` Luca Stefani
2024-11-19 7:54 ` git
2024-09-18 11:52 ` [PATCH v6 0/2] " David Sterba
` (3 subsequent siblings)
5 siblings, 1 reply; 10+ messages in thread
From: Luca Stefani @ 2024-09-17 20:33 UTC (permalink / raw)
Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs,
linux-kernel
Sometimes the system isn't able to suspend because the task
responsible for trimming the device isn't able to finish in
time, especially since we have a free extent discarding phase,
which can trim a lot of unallocated space, and there is no
limits on the trim size (unlike the block group part).
Since discard isn't a critical call it can be interrupted
at any time, in such cases we stop the trim, report the amount
of discarded bytes and return failure.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
---
fs/btrfs/extent-tree.c | 7 ++++++-
fs/btrfs/free-space-cache.c | 4 ++--
fs/btrfs/free-space-cache.h | 6 ++++++
3 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index ad70548d1f72..d9f511babd89 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1316,6 +1316,11 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
start += bytes_to_discard;
bytes_left -= bytes_to_discard;
*discarded_bytes += bytes_to_discard;
+
+ if (btrfs_trim_interrupted()) {
+ ret = -ERESTARTSYS;
+ break;
+ }
}
return ret;
@@ -6470,7 +6475,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
start += len;
*trimmed += bytes;
- if (fatal_signal_pending(current)) {
+ if (btrfs_trim_interrupted()) {
ret = -ERESTARTSYS;
break;
}
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index eaa1dbd31352..f4bcb2530660 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -3809,7 +3809,7 @@ static int trim_no_bitmap(struct btrfs_block_group *block_group,
if (async && *total_trimmed)
break;
- if (fatal_signal_pending(current)) {
+ if (btrfs_trim_interrupted()) {
ret = -ERESTARTSYS;
break;
}
@@ -4000,7 +4000,7 @@ static int trim_bitmaps(struct btrfs_block_group *block_group,
}
block_group->discard_cursor = start;
- if (fatal_signal_pending(current)) {
+ if (btrfs_trim_interrupted()) {
if (start != offset)
reset_trimming_bitmap(ctl, offset);
ret = -ERESTARTSYS;
diff --git a/fs/btrfs/free-space-cache.h b/fs/btrfs/free-space-cache.h
index 83774bfd7b3b..9f1dbfdee8ca 100644
--- a/fs/btrfs/free-space-cache.h
+++ b/fs/btrfs/free-space-cache.h
@@ -10,6 +10,7 @@
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/mutex.h>
+#include <linux/freezer.h>
#include "fs.h"
struct inode;
@@ -56,6 +57,11 @@ static inline bool btrfs_free_space_trimming_bitmap(
return (info->trim_state == BTRFS_TRIM_STATE_TRIMMING);
}
+static inline bool btrfs_trim_interrupted(void)
+{
+ return fatal_signal_pending(current) || freezing(current);
+}
+
/*
* Deltas are an effective way to populate global statistics. Give macro names
* to make it clear what we're doing. An example is discard_extents in
--
2.46.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim
2024-09-17 20:33 [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim Luca Stefani
2024-09-17 20:33 ` [PATCH v6 1/2] btrfs: Split remaining space to discard in chunks Luca Stefani
2024-09-17 20:33 ` [PATCH v6 2/2] btrfs: Don't block system suspend during fstrim Luca Stefani
@ 2024-09-18 11:52 ` David Sterba
2024-09-18 11:53 ` David Sterba
` (2 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2024-09-18 11:52 UTC (permalink / raw)
To: Luca Stefani
Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel
On Tue, Sep 17, 2024 at 10:33:03PM +0200, Luca Stefani wrote:
> Changes since v5:
> * Make chunk size a define
> * Remove superfluous trim_interrupted checks
> after moving them to trim_no_bitmap/trim_bitmaps
Thanks, I haven't spotted anything left to fix or update. The patches
are short enough to be backported so that can happen around 6.12-rc1
time.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim
2024-09-17 20:33 [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim Luca Stefani
` (2 preceding siblings ...)
2024-09-18 11:52 ` [PATCH v6 0/2] " David Sterba
@ 2024-09-18 11:53 ` David Sterba
2024-09-18 21:34 ` Qu Wenruo
2024-09-19 12:55 ` David Sterba
5 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2024-09-18 11:53 UTC (permalink / raw)
To: Luca Stefani
Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel
On Tue, Sep 17, 2024 at 10:33:03PM +0200, Luca Stefani wrote:
> Changes since v5:
> * Make chunk size a define
> * Remove superfluous trim_interrupted checks
> after moving them to trim_no_bitmap/trim_bitmaps
>
> Changes since v4:
> * Set chunk size to 1G
> * Set proper error return codes in case of interruption
> * Dropped fstrim_range fixup as pulled in -next
>
> Changes since v3:
> * Went back to manual chunk size
>
> Changes since v2:
> * Use blk_alloc_discard_bio directly
> * Reset ret to ERESTARTSYS
>
> Changes since v1:
> * Use bio_discard_limit to calculate chunk size
> * Makes use of the split chunks
>
> Original discussion: https://lore.kernel.org/lkml/20240822164908.4957-1-luca.stefani.ge1@gmail.com/
> v1: https://lore.kernel.org/lkml/20240902114303.922472-1-luca.stefani.ge1@gmail.com/
> v2: https://lore.kernel.org/lkml/20240902205828.943155-1-luca.stefani.ge1@gmail.com/
> v3: https://lore.kernel.org/lkml/20240903071625.957275-4-luca.stefani.ge1@gmail.com/
> v4: https://lore.kernel.org/lkml/20240916101615.116164-1-luca.stefani.ge1@gmail.com/
> v5: https://lore.kernel.org/lkml/20240916125707.127118-1-luca.stefani.ge1@gmail.com/
>
> ---
>
> NB: I didn't change btrfs_discard_workfn yet to add error checks
> as I don't know what semantics we should have in that case.
> The work queue is always re-scheduled and created with WQ_FREEZABLE
> so it should be automatically frozen. Shall I simply add some logs?
About that, this may be a bit tricky, interrupting trim there is handled
but the accounting may get wrong. This is related but a different
problem than the suspend vs trim.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim
2024-09-17 20:33 [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim Luca Stefani
` (3 preceding siblings ...)
2024-09-18 11:53 ` David Sterba
@ 2024-09-18 21:34 ` Qu Wenruo
2024-09-19 12:55 ` David Sterba
5 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2024-09-18 21:34 UTC (permalink / raw)
To: Luca Stefani
Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel
在 2024/9/18 06:03, Luca Stefani 写道:
> Changes since v5:
> * Make chunk size a define
> * Remove superfluous trim_interrupted checks
> after moving them to trim_no_bitmap/trim_bitmaps
>
> Changes since v4:
> * Set chunk size to 1G
> * Set proper error return codes in case of interruption
> * Dropped fstrim_range fixup as pulled in -next
>
> Changes since v3:
> * Went back to manual chunk size
>
> Changes since v2:
> * Use blk_alloc_discard_bio directly
> * Reset ret to ERESTARTSYS
>
> Changes since v1:
> * Use bio_discard_limit to calculate chunk size
> * Makes use of the split chunks
>
> Original discussion: https://lore.kernel.org/lkml/20240822164908.4957-1-luca.stefani.ge1@gmail.com/
> v1: https://lore.kernel.org/lkml/20240902114303.922472-1-luca.stefani.ge1@gmail.com/
> v2: https://lore.kernel.org/lkml/20240902205828.943155-1-luca.stefani.ge1@gmail.com/
> v3: https://lore.kernel.org/lkml/20240903071625.957275-4-luca.stefani.ge1@gmail.com/
> v4: https://lore.kernel.org/lkml/20240916101615.116164-1-luca.stefani.ge1@gmail.com/
> v5: https://lore.kernel.org/lkml/20240916125707.127118-1-luca.stefani.ge1@gmail.com/
>
Reviewed-by: Qu Wenruo <wqu@suse.com>
> ---
>
> NB: I didn't change btrfs_discard_workfn yet to add error checks
> as I don't know what semantics we should have in that case.
> The work queue is always re-scheduled and created with WQ_FREEZABLE
> so it should be automatically frozen. Shall I simply add some logs?
And that is for async discard, which has a much smaller discard size and
iops limits.
The main part of large discard is the free extents discarding, which is
already addressed by your patch correctly now.
IIRC we do not need to bother that much for it.
Thanks,
Qu
>
> ---
>
> Luca Stefani (2):
> btrfs: Split remaining space to discard in chunks
> btrfs: Don't block system suspend during fstrim
>
> fs/btrfs/extent-tree.c | 26 +++++++++++++++++++++-----
> fs/btrfs/free-space-cache.c | 4 ++--
> fs/btrfs/free-space-cache.h | 6 ++++++
> fs/btrfs/volumes.h | 1 +
> 4 files changed, 30 insertions(+), 7 deletions(-)
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim
2024-09-17 20:33 [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim Luca Stefani
` (4 preceding siblings ...)
2024-09-18 21:34 ` Qu Wenruo
@ 2024-09-19 12:55 ` David Sterba
5 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2024-09-19 12:55 UTC (permalink / raw)
To: Luca Stefani
Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel
On Tue, Sep 17, 2024 at 10:33:03PM +0200, Luca Stefani wrote:
> Changes since v5:
> * Make chunk size a define
> * Remove superfluous trim_interrupted checks
> after moving them to trim_no_bitmap/trim_bitmaps
>
> Changes since v4:
> * Set chunk size to 1G
> * Set proper error return codes in case of interruption
> * Dropped fstrim_range fixup as pulled in -next
>
> Changes since v3:
> * Went back to manual chunk size
>
> Changes since v2:
> * Use blk_alloc_discard_bio directly
> * Reset ret to ERESTARTSYS
>
> Changes since v1:
> * Use bio_discard_limit to calculate chunk size
> * Makes use of the split chunks
>
> Original discussion: https://lore.kernel.org/lkml/20240822164908.4957-1-luca.stefani.ge1@gmail.com/
> v1: https://lore.kernel.org/lkml/20240902114303.922472-1-luca.stefani.ge1@gmail.com/
> v2: https://lore.kernel.org/lkml/20240902205828.943155-1-luca.stefani.ge1@gmail.com/
> v3: https://lore.kernel.org/lkml/20240903071625.957275-4-luca.stefani.ge1@gmail.com/
> v4: https://lore.kernel.org/lkml/20240916101615.116164-1-luca.stefani.ge1@gmail.com/
> v5: https://lore.kernel.org/lkml/20240916125707.127118-1-luca.stefani.ge1@gmail.com/
>
> ---
>
> NB: I didn't change btrfs_discard_workfn yet to add error checks
> as I don't know what semantics we should have in that case.
> The work queue is always re-scheduled and created with WQ_FREEZABLE
> so it should be automatically frozen. Shall I simply add some logs?
>
> ---
>
> Luca Stefani (2):
> btrfs: Split remaining space to discard in chunks
> btrfs: Don't block system suspend during fstrim
Added to for-next, with some minor updates to changelogs. Thanks.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v6 2/2] btrfs: Don't block system suspend during fstrim
2024-09-17 20:33 ` [PATCH v6 2/2] btrfs: Don't block system suspend during fstrim Luca Stefani
@ 2024-11-19 7:54 ` git
2024-11-19 13:13 ` Greg Kroah-Hartman
0 siblings, 1 reply; 10+ messages in thread
From: git @ 2024-11-19 7:54 UTC (permalink / raw)
To: Luca Stefani
Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel,
Greg Kroah-Hartman
[-- Attachment #1.1: Type: text/plain, Size: 370 bytes --]
Hi,
forgive my ignorance of the kernel patch workflow but it appears this second patch was only ported to 6.11? Only the first patch here was ported to LTS kernels (i.e. f00545e8386e228aac739588b40a6f200e0f0ffc).
I just hit the bug this fixes on 6.6.59 (I need to update, I know) but also checked Greg's v6.6.62 tree and it's not in there either.
Thanks,
Atemu
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 249 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v6 2/2] btrfs: Don't block system suspend during fstrim
2024-11-19 7:54 ` git
@ 2024-11-19 13:13 ` Greg Kroah-Hartman
2024-11-20 7:41 ` Atemu
0 siblings, 1 reply; 10+ messages in thread
From: Greg Kroah-Hartman @ 2024-11-19 13:13 UTC (permalink / raw)
To: git
Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs,
linux-kernel
On Tue, Nov 19, 2024 at 07:54:58AM +0000, git@atemu.net wrote:
> Hi,
>
> forgive my ignorance of the kernel patch workflow but it appears this second patch was only ported to 6.11? Only the first patch here was ported to LTS kernels (i.e. f00545e8386e228aac739588b40a6f200e0f0ffc).
>
> I just hit the bug this fixes on 6.6.59 (I need to update, I know) but also checked Greg's v6.6.62 tree and it's not in there either.
What is the git commit id in LInus's tree you are referring to here?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v6 2/2] btrfs: Don't block system suspend during fstrim
2024-11-19 13:13 ` Greg Kroah-Hartman
@ 2024-11-20 7:41 ` Atemu
0 siblings, 0 replies; 10+ messages in thread
From: Atemu @ 2024-11-20 7:41 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs,
linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 189 bytes --]
Hi,
> What is the git commit id in LInus's tree you are referring to here?
It's 69313850dce33ce8c24b38576a279421f4c60996. Apparently marked for backport to 5.15+.
Thanks,
Atemu
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 249 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-11-20 7:41 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-17 20:33 [PATCH v6 0/2] btrfs: Don't block system suspend during fstrim Luca Stefani
2024-09-17 20:33 ` [PATCH v6 1/2] btrfs: Split remaining space to discard in chunks Luca Stefani
2024-09-17 20:33 ` [PATCH v6 2/2] btrfs: Don't block system suspend during fstrim Luca Stefani
2024-11-19 7:54 ` git
2024-11-19 13:13 ` Greg Kroah-Hartman
2024-11-20 7:41 ` Atemu
2024-09-18 11:52 ` [PATCH v6 0/2] " David Sterba
2024-09-18 11:53 ` David Sterba
2024-09-18 21:34 ` Qu Wenruo
2024-09-19 12:55 ` David Sterba
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox