* [PATCH 0/2] Preemptive flushing fixes
@ 2021-08-11 18:37 Josef Bacik
2021-08-11 18:37 ` [PATCH 1/2] btrfs: reduce the preemptive flushing threshold to 90% Josef Bacik
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Josef Bacik @ 2021-08-11 18:37 UTC
To: linux-btrfs, kernel-team
Hello,
I thought I had fixed the problem of preemptive flushing burning CPU with my
previous set of fixes, but I was wrong. However, those tracepoints gave me the
information I needed to fix the problem properly. The first patch
btrfs: reduce the preemptive flushing threshold to 90%
can go back to stable and make its way into the distros to stop the pain for
current users who are having problems. The second patch augments the fix with
a somewhat less blunt hammer.
The problem is that very full file systems on slower disks end up with a
very small threshold to start preemptive flushing. We were relying on sanity
checks to bail out ahead of time, but they were not strong enough. The
problematic cases fell in the narrow window where there was enough space to
operate without needing synchronous flushing, but not enough space to
avoid flushing all of the time.
The fix is to adjust the sanity checks to something more reasonable to account
for these cases and avoid spinning doing preemptive flushing constantly.
Thanks,
Josef
Josef Bacik (2):
btrfs: reduce the preemptive flushing threshold to 90%
btrfs: do not do preemptive flushing if the majority is global rsv
fs/btrfs/space-info.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
--
2.26.3
* [PATCH 1/2] btrfs: reduce the preemptive flushing threshold to 90%
2021-08-11 18:37 [PATCH 0/2] Preemptive flushing fixes Josef Bacik
@ 2021-08-11 18:37 ` Josef Bacik
2021-08-11 18:37 ` [PATCH 2/2] btrfs: do not do preemptive flushing if the majority is global rsv Josef Bacik
2021-08-16 13:55 ` [PATCH 0/2] Preemptive flushing fixes David Sterba
2 siblings, 0 replies; 5+ messages in thread
From: Josef Bacik @ 2021-08-11 18:37 UTC
To: linux-btrfs, kernel-team; +Cc: stable
The preemptive flushing code was added to avoid needing to
synchronously wait for ENOSPC flushing to recover space. Once we're
almost full, however, we can essentially flush constantly. We were using
98% as the threshold to determine if we were simply full, but in
practice this is a really high bar to hit. For example, the systems
reported as running into this problem had around 94% usage and thus
continued to flush. Fix this by lowering the threshold to 90%, which is
a saner value, especially for smaller file systems.
cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=212185
Fixes: 576fa34830af ("btrfs: improve preemptive background space flushing")
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/space-info.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index d9c8d738678f..ddb4878e94df 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -733,7 +733,7 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
{
u64 global_rsv_size = fs_info->global_block_rsv.reserved;
u64 ordered, delalloc;
- u64 thresh = div_factor_fine(space_info->total_bytes, 98);
+ u64 thresh = div_factor(space_info->total_bytes, 9);
u64 used;
/* If we're just plain full then async reclaim just slows us down. */
--
2.26.3
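
For readers following the arithmetic, here is a minimal userspace sketch of
the threshold change. The two div_factor helpers are stand-ins written for
this note (assumed to compute num * factor / 10 and num * factor / 100, as
the names and the patch suggest), and plain_full() is a hypothetical name
for the "just plain full" comparison visible in the diff context. It is an
illustration only, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in (assumption): returns num * factor / 10. */
static uint64_t div_factor(uint64_t num, int factor)
{
	assert(factor >= 0 && factor <= 10);
	return num * (uint64_t)factor / 10;
}

/* Userspace stand-in (assumption): returns num * factor / 100. */
static uint64_t div_factor_fine(uint64_t num, int factor)
{
	assert(factor >= 0 && factor <= 100);
	return num * (uint64_t)factor / 100;
}

/*
 * Hypothetical helper mirroring the "just plain full" check: once the
 * committed bytes (used + reserved + global rsv) reach the threshold,
 * preemptive reclaim is skipped.
 */
static bool plain_full(uint64_t committed, uint64_t thresh)
{
	return committed >= thresh;
}
```

With a 1 GiB space_info at 94% committed, plain_full() is false against the
old div_factor_fine(total, 98) threshold, so preemptive flushing keeps
running, and true against the new div_factor(total, 9) threshold, so it
bails out.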
* [PATCH 2/2] btrfs: do not do preemptive flushing if the majority is global rsv
2021-08-11 18:37 [PATCH 0/2] Preemptive flushing fixes Josef Bacik
2021-08-11 18:37 ` [PATCH 1/2] btrfs: reduce the preemptive flushing threshold to 90% Josef Bacik
@ 2021-08-11 18:37 ` Josef Bacik
2021-08-17 8:39 ` Nikolay Borisov
2021-08-16 13:55 ` [PATCH 0/2] Preemptive flushing fixes David Sterba
2 siblings, 1 reply; 5+ messages in thread
From: Josef Bacik @ 2021-08-11 18:37 UTC
To: linux-btrfs, kernel-team
A common characteristic of the bug reports where preemptive flushing was
going full tilt was that the vast majority of the free metadata
space was used up by the global reserve. The hard 90% threshold would
cover the majority of these cases, but to be even smarter we should take
into account how much of the outstanding reservations is covered by the
global block reserve. If the global block reserve accounts for the vast
majority of outstanding reservations, skip preemptive flushing, as it
will likely just cause churn and pain.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=212185
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/space-info.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index ddb4878e94df..2fce15d58b55 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -741,6 +741,20 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
global_rsv_size) >= thresh)
return false;
+ used = space_info->bytes_may_use + space_info->bytes_pinned;
+
+ /* The total reservation belongs to the global rsv, don't flush. */
+ if (global_rsv_size >= used)
+ return false;
+
+ /*
+ * 128m is 1/4 of the maximum global rsv size. If we have less than
+ * that devoted to other reservations then there's no sense in flushing,
+ * we don't have a lot of things that need flushing.
+ */
+ if ((used - global_rsv_size) <= SZ_128M)
+ return false;
+
/*
* We have tickets queued, bail so we don't compete with the async
* flushers.
--
2.26.3
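
The two new bail-out checks can be tried in isolation with the userspace
sketch below. The struct space_info_sketch type and the function name
global_rsv_dominates() are hypothetical and exist only for this
illustration; the field names and the SZ_128M cutoff are taken from the
diff above. It is a sketch under those assumptions, not the kernel
implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_128M (128ULL * 1024 * 1024)

/* Hypothetical subset of struct btrfs_space_info, for illustration. */
struct space_info_sketch {
	uint64_t bytes_may_use;
	uint64_t bytes_pinned;
};

/*
 * Returns true when the patch would skip preemptive flushing because
 * the global reserve accounts for (nearly) all outstanding reservations.
 */
static bool global_rsv_dominates(const struct space_info_sketch *si,
				 uint64_t global_rsv_size)
{
	uint64_t used = si->bytes_may_use + si->bytes_pinned;

	/* The total reservation belongs to the global rsv. */
	if (global_rsv_size >= used)
		return true;

	/* Safe from unsigned underflow: used > global_rsv_size here. */
	return (used - global_rsv_size) <= SZ_128M;
}
```

The order of the checks matters: testing global_rsv_size >= used first
guarantees that the later subtraction cannot wrap around. With a 512M
global reserve, 88M of other reservations still skips flushing, while
129M of other reservations lets it proceed to the remaining checks.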
* Re: [PATCH 0/2] Preemptive flushing fixes
2021-08-11 18:37 [PATCH 0/2] Preemptive flushing fixes Josef Bacik
2021-08-11 18:37 ` [PATCH 1/2] btrfs: reduce the preemptive flushing threshold to 90% Josef Bacik
2021-08-11 18:37 ` [PATCH 2/2] btrfs: do not do preemptive flushing if the majority is global rsv Josef Bacik
@ 2021-08-16 13:55 ` David Sterba
2 siblings, 0 replies; 5+ messages in thread
From: David Sterba @ 2021-08-16 13:55 UTC
To: Josef Bacik; +Cc: linux-btrfs, kernel-team
On Wed, Aug 11, 2021 at 02:37:14PM -0400, Josef Bacik wrote:
> Hello,
>
> I thought I had fixed the preemptive flushing burning CPU's problem with my
> previous set of fixes, but I was wrong. However those tracepoints gave me the
> information I needed to fix the problem properly. The first patch
>
> btrfs: reduce the preemptive flushing threshold to 90%
>
> can go back to stable and make its way into the distros to stop the pain for the
> current users having problems. The second patch augments the fix with a little
> less of a strong hammer.
>
> The problem is for very full file systems on slower disks will end up with a
> very small threshold to start preemptive flushing. We were relying on sanity
> checks to bail out ahead of time, however they were not strong enough. These
> problematic cases existed in the short area where there was enough space to
> operate without needing to do synchronous flushing, but not enough space to
> avoid flushing all of the time.
>
> The fix is to adjust the sanity checks to something more reasonable to account
> for these cases and avoid spinning doing preemptive flushing constantly.
> Thanks,
>
> Josef
>
> Josef Bacik (2):
> btrfs: reduce the preemptive flushing threshold to 90%
> btrfs: do not do preemptive flushing if the majority is global rsv
Added to misc-next, thanks.
* Re: [PATCH 2/2] btrfs: do not do preemptive flushing if the majority is global rsv
2021-08-11 18:37 ` [PATCH 2/2] btrfs: do not do preemptive flushing if the majority is global rsv Josef Bacik
@ 2021-08-17 8:39 ` Nikolay Borisov
0 siblings, 0 replies; 5+ messages in thread
From: Nikolay Borisov @ 2021-08-17 8:39 UTC
To: Josef Bacik, linux-btrfs, kernel-team
On 11.08.21 21:37, Josef Bacik wrote:
> A common characteristic of the bug report where preemptive flushing was
> going full tilt was the fact that the vast majority of the free metadata
> space was used up by the global reserve. The hard 90% threshold would
> cover the majority of these cases, but to be even smarter we should take
> into account how much of the outstanding reservations are covered by the
> global block reserve. If the global block reserve accounts for the vast
> majority of outstanding reservations, skip preemptive flushing, as it
> will likely just cause churn and pain.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=212185
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
> fs/btrfs/space-info.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index ddb4878e94df..2fce15d58b55 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -741,6 +741,20 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
> global_rsv_size) >= thresh)
> return false;
>
> + used = space_info->bytes_may_use + space_info->bytes_pinned;
But global_rsv_size is accounted entirely in bytes_may_use (per
btrfs_update_global_block_rsv logic), so why add bytes_pinned?
> +
> + /* The total reservation belongs to the global rsv, don't flush. */
> + if (global_rsv_size >= used)
> + return false;
> +
> + /*
> + * 128m is 1/4 of the maximum global rsv size. If we have less than
> + * that devoted to other reservations then there's no sense in flushing,
> + * we don't have a lot of things that need flushing.
> + */
> + if ((used - global_rsv_size) <= SZ_128M)
> + return false;
> +
> /*
> * We have tickets queued, bail so we don't compete with the async
> * flushers.
>