* [PATCH v5 0/2] btrfs: Don't block system suspend during fstrim
From: Luca Stefani @ 2024-09-16 12:56 UTC
Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel

Changes since v4:
* Set chunk size to 1G
* Set proper error return codes in case of interruption
* Dropped fstrim_range fixup as pulled in -next

Changes since v3:
* Went back to manual chunk size

Changes since v2:
* Use blk_alloc_discard_bio directly
* Reset ret to ERESTARTSYS

Changes since v1:
* Use bio_discard_limit to calculate chunk size
* Makes use of the split chunks

Original discussion: https://lore.kernel.org/lkml/20240822164908.4957-1-luca.stefani.ge1@gmail.com/
v1: https://lore.kernel.org/lkml/20240902114303.922472-1-luca.stefani.ge1@gmail.com/
v2: https://lore.kernel.org/lkml/20240902205828.943155-1-luca.stefani.ge1@gmail.com/
v3: https://lore.kernel.org/lkml/20240903071625.957275-4-luca.stefani.ge1@gmail.com/
v4: https://lore.kernel.org/lkml/20240916101615.116164-1-luca.stefani.ge1@gmail.com/

Luca Stefani (2):
  btrfs: Split remaining space to discard in chunks
  btrfs: Don't block system suspend during fstrim

 fs/btrfs/extent-tree.c | 42 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 37 insertions(+), 5 deletions(-)

-- 
2.46.0

* [PATCH v5 1/2] btrfs: Split remaining space to discard in chunks
From: Luca Stefani @ 2024-09-16 12:56 UTC
Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel

Per Qu Wenruo, in case we have a very large disk, e.g. an 8TiB device,
mostly empty, although we will do the split according to our super block
locations, the last super block ends at 256G, so we can submit a huge
discard for the range [256G, 8T), causing a very large delay.

We now split the space left to discard based on BTRFS_MAX_DATA_CHUNK_SIZE
in preparation for the introduction of cancellation signal handling.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
---
 fs/btrfs/extent-tree.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a5966324607d..79b9243c9cd6 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1300,13 +1300,24 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
 		bytes_left = end - start;
 	}
 
-	if (bytes_left) {
+	while (bytes_left) {
+		u64 bytes_to_discard = min(SZ_1G, bytes_left);
+
 		ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
-					   bytes_left >> SECTOR_SHIFT,
+					   bytes_to_discard >> SECTOR_SHIFT,
 					   GFP_NOFS);
-		if (!ret)
-			*discarded_bytes += bytes_left;
+
+		if (ret) {
+			if (ret != -EOPNOTSUPP)
+				break;
+			continue;
+		}
+
+		start += bytes_to_discard;
+		bytes_left -= bytes_to_discard;
+		*discarded_bytes += bytes_to_discard;
 	}
+
 	return ret;
 }
 
-- 
2.46.0

* Re: [PATCH v5 1/2] btrfs: Split remaining space to discard in chunks
From: David Sterba @ 2024-09-17 16:25 UTC
To: Luca Stefani
Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel

On Mon, Sep 16, 2024 at 02:56:14PM +0200, Luca Stefani wrote:
> Per Qu Wenruo, in case we have a very large disk, e.g. an 8TiB device,
> mostly empty, although we will do the split according to our super block
> locations, the last super block ends at 256G, so we can submit a huge
> discard for the range [256G, 8T), causing a very large delay.
>
> We now split the space left to discard based on BTRFS_MAX_DATA_CHUNK_SIZE
> in preparation for the introduction of cancellation signal handling.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
> Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
> Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
> ---
>  fs/btrfs/extent-tree.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index a5966324607d..79b9243c9cd6 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -1300,13 +1300,24 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>  		bytes_left = end - start;
>  	}
>
> -	if (bytes_left) {
> +	while (bytes_left) {
> +		u64 bytes_to_discard = min(SZ_1G, bytes_left);

Please define a separate constant for that and also mention it in the
changelog instead of BTRFS_MAX_DATA_CHUNK_SIZE.

> +
>  		ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
> -					   bytes_left >> SECTOR_SHIFT,
> +					   bytes_to_discard >> SECTOR_SHIFT,
>  					   GFP_NOFS);
> -		if (!ret)
> -			*discarded_bytes += bytes_left;
> +
> +		if (ret) {
> +			if (ret != -EOPNOTSUPP)
> +				break;
> +			continue;
> +		}
> +
> +		start += bytes_to_discard;
> +		bytes_left -= bytes_to_discard;
> +		*discarded_bytes += bytes_to_discard;
>  	}
> +
>  	return ret;
>  }
>
> --
> 2.46.0
>

* [PATCH v5 2/2] btrfs: Don't block system suspend during fstrim
From: Luca Stefani @ 2024-09-16 12:56 UTC
Cc: Luca Stefani, Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel

Sometimes the system isn't able to suspend because the task
responsible for trimming the device isn't able to finish in
time, especially since we have a free extent discarding phase,
which can trim a lot of unallocated space, and there are no
limits on the trim size (unlike the block group part).

Since discard isn't a critical call, it can be interrupted
at any time; in such cases we stop the trim, report the amount
of discarded bytes and return failure.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
---
 fs/btrfs/extent-tree.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 79b9243c9cd6..cef368a30731 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -16,6 +16,7 @@
 #include <linux/percpu_counter.h>
 #include <linux/lockdep.h>
 #include <linux/crc32c.h>
+#include <linux/freezer.h>
 #include "ctree.h"
 #include "extent-tree.h"
 #include "transaction.h"
@@ -1235,6 +1236,11 @@ static int remove_extent_backref(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
+static bool btrfs_trim_interrupted(void)
+{
+	return fatal_signal_pending(current) || freezing(current);
+}
+
 static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
 			       u64 *discarded_bytes)
 {
@@ -1316,6 +1322,11 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
 		start += bytes_to_discard;
 		bytes_left -= bytes_to_discard;
 		*discarded_bytes += bytes_to_discard;
+
+		if (btrfs_trim_interrupted()) {
+			ret = -ERESTARTSYS;
+			break;
+		}
 	}
 
 	return ret;
@@ -6470,7 +6481,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
 		start += len;
 		*trimmed += bytes;
 
-		if (fatal_signal_pending(current)) {
+		if (btrfs_trim_interrupted()) {
 			ret = -ERESTARTSYS;
 			break;
 		}
@@ -6519,6 +6530,11 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 
 	cache = btrfs_lookup_first_block_group(fs_info, range->start);
 	for (; cache; cache = btrfs_next_block_group(cache)) {
+		if (btrfs_trim_interrupted()) {
+			bg_ret = -ERESTARTSYS;
+			break;
+		}
+
 		if (cache->start >= range_end) {
 			btrfs_put_block_group(cache);
 			break;
@@ -6558,6 +6574,11 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry(device, &fs_devices->devices, dev_list) {
+		if (btrfs_trim_interrupted()) {
+			dev_ret = -ERESTARTSYS;
+			break;
+		}
+
 		if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
 			continue;
 
-- 
2.46.0

* Re: [PATCH v5 2/2] btrfs: Don't block system suspend during fstrim
From: David Sterba @ 2024-09-17 16:24 UTC
To: Luca Stefani
Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel

On Mon, Sep 16, 2024 at 02:56:15PM +0200, Luca Stefani wrote:
> Sometimes the system isn't able to suspend because the task
> responsible for trimming the device isn't able to finish in
> time, especially since we have a free extent discarding phase,
> which can trim a lot of unallocated space, and there are no
> limits on the trim size (unlike the block group part).
>
> Since discard isn't a critical call, it can be interrupted
> at any time; in such cases we stop the trim, report the amount
> of discarded bytes and return failure.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
> Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
> Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>

I went through the cancellation points; some of them don't seem to be
necessary, e.g. in a big loop when some function is called to do the trim
(extents, bitmaps) and then again does the signal and freezing check.

Next, some of the functions are called from async discard and errors are
not checked: btrfs_trim_block_group_bitmaps() called from
btrfs_discard_workfn().

There's also a check for signals pending in trim_bitmaps() in
free-space-cache.c. Given that the space cache code is on the way out we
don't necessarily need to fix it, but if the patch gets backported to
older kernels it still makes sense.

> ---
>  fs/btrfs/extent-tree.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 79b9243c9cd6..cef368a30731 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -16,6 +16,7 @@
>  #include <linux/percpu_counter.h>
>  #include <linux/lockdep.h>
>  #include <linux/crc32c.h>
> +#include <linux/freezer.h>
>  #include "ctree.h"
>  #include "extent-tree.h"
>  #include "transaction.h"
> @@ -1235,6 +1236,11 @@ static int remove_extent_backref(struct btrfs_trans_handle *trans,
>  	return ret;
>  }
>
> +static bool btrfs_trim_interrupted(void)
> +{
> +	return fatal_signal_pending(current) || freezing(current);
> +}
> +
>  static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>  			       u64 *discarded_bytes)
>  {
> @@ -1316,6 +1322,11 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>  		start += bytes_to_discard;
>  		bytes_left -= bytes_to_discard;
>  		*discarded_bytes += bytes_to_discard;
> +
> +		if (btrfs_trim_interrupted()) {
> +			ret = -ERESTARTSYS;
> +			break;
> +		}
>  	}
>
>  	return ret;
> @@ -6470,7 +6481,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
>  		start += len;
>  		*trimmed += bytes;
>
> -		if (fatal_signal_pending(current)) {
> +		if (btrfs_trim_interrupted()) {
>  			ret = -ERESTARTSYS;
>  			break;
>  		}
> @@ -6519,6 +6530,11 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>
>  	cache = btrfs_lookup_first_block_group(fs_info, range->start);
>  	for (; cache; cache = btrfs_next_block_group(cache)) {
> +		if (btrfs_trim_interrupted()) {
> +			bg_ret = -ERESTARTSYS;
> +			break;
> +		}
> +
>  		if (cache->start >= range_end) {
>  			btrfs_put_block_group(cache);
>  			break;
> @@ -6558,6 +6574,11 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>
>  	mutex_lock(&fs_devices->device_list_mutex);
>  	list_for_each_entry(device, &fs_devices->devices, dev_list) {
> +		if (btrfs_trim_interrupted()) {
> +			dev_ret = -ERESTARTSYS;

This one seems redundant.

> +			break;
> +		}
> +
>  		if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
>  			continue;
>
> --
> 2.46.0
>

* Re: [PATCH v5 2/2] btrfs: Don't block system suspend during fstrim
From: Luca Stefani @ 2024-09-17 17:38 UTC
To: dsterba
Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel

On 17/09/24 18:24, David Sterba wrote:
> On Mon, Sep 16, 2024 at 02:56:15PM +0200, Luca Stefani wrote:
>> Sometimes the system isn't able to suspend because the task
>> responsible for trimming the device isn't able to finish in
>> time, especially since we have a free extent discarding phase,
>> which can trim a lot of unallocated space, and there are no
>> limits on the trim size (unlike the block group part).
>>
>> Since discard isn't a critical call, it can be interrupted
>> at any time; in such cases we stop the trim, report the amount
>> of discarded bytes and return failure.
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
>> Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
>> Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
>
> I went through the cancellation points; some of them don't seem to be
> necessary, e.g. in a big loop when some function is called to do the trim
> (extents, bitmaps) and then again does the signal and freezing check.
>
> Next, some of the functions are called from async discard and errors are
> not checked: btrfs_trim_block_group_bitmaps() called from
> btrfs_discard_workfn().

Indeed, the return codes of both btrfs_trim_block_group_bitmaps and
btrfs_trim_block_group_extents are never checked in
btrfs_discard_workfn. I'll fix that up in another CL.

>
> There's also a check for signals pending in trim_bitmaps() in
> free-space-cache.c. Given that the space cache code is on the way out we
> don't necessarily need to fix it, but if the patch gets backported to
> older kernels it still makes sense.

Ah, I missed this one, will fix it.
There are a few more instances of fatal_signal_pending, but I don't know
whether they should be converted or not. I will focus on the one you
mentioned and on trim_no_bitmap, which seems to do similar checks for
fatal signals.

>
>> ---
>>  fs/btrfs/extent-tree.c | 23 ++++++++++++++++++++++-
>>  1 file changed, 22 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index 79b9243c9cd6..cef368a30731 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -16,6 +16,7 @@
>>  #include <linux/percpu_counter.h>
>>  #include <linux/lockdep.h>
>>  #include <linux/crc32c.h>
>> +#include <linux/freezer.h>
>>  #include "ctree.h"
>>  #include "extent-tree.h"
>>  #include "transaction.h"
>> @@ -1235,6 +1236,11 @@ static int remove_extent_backref(struct btrfs_trans_handle *trans,
>>  	return ret;
>>  }
>>
>> +static bool btrfs_trim_interrupted(void)
>> +{
>> +	return fatal_signal_pending(current) || freezing(current);
>> +}
>> +
>>  static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>>  			       u64 *discarded_bytes)
>>  {
>> @@ -1316,6 +1322,11 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>>  		start += bytes_to_discard;
>>  		bytes_left -= bytes_to_discard;
>>  		*discarded_bytes += bytes_to_discard;
>> +
>> +		if (btrfs_trim_interrupted()) {
>> +			ret = -ERESTARTSYS;
>> +			break;
>> +		}
>>  	}
>>
>>  	return ret;
>> @@ -6470,7 +6481,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
>>  		start += len;
>>  		*trimmed += bytes;
>>
>> -		if (fatal_signal_pending(current)) {
>> +		if (btrfs_trim_interrupted()) {
>>  			ret = -ERESTARTSYS;
>>  			break;
>>  		}
>> @@ -6519,6 +6530,11 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>
>>  	cache = btrfs_lookup_first_block_group(fs_info, range->start);
>>  	for (; cache; cache = btrfs_next_block_group(cache)) {
>> +		if (btrfs_trim_interrupted()) {
>> +			bg_ret = -ERESTARTSYS;
>> +			break;
>> +		}
>> +
>>  		if (cache->start >= range_end) {
>>  			btrfs_put_block_group(cache);
>>  			break;
>> @@ -6558,6 +6574,11 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>
>>  	mutex_lock(&fs_devices->device_list_mutex);
>>  	list_for_each_entry(device, &fs_devices->devices, dev_list) {
>> +		if (btrfs_trim_interrupted()) {
>> +			dev_ret = -ERESTARTSYS;
>
> This one seems redundant.
>
>> +			break;
>> +		}
>> +
>>  		if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
>>  			continue;
>>
>> --
>> 2.46.0
>>