From: Dave Chinner <david@fromorbit.com>
To: Chris Mason <clm@fb.com>
Cc: djwong@kernel.org, hch@infradead.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, hannes@cmpxchg.org
Subject: Re: [PATCH v2] iomap: skip pages past eof in iomap_do_writepage()
Date: Thu, 9 Jun 2022 10:53:13 +1000
Message-ID: <20220609005313.GX227878@dread.disaster.area>
In-Reply-To: <20220608004228.3658429-1-clm@fb.com>

On Tue, Jun 07, 2022 at 05:42:29PM -0700, Chris Mason wrote:
> iomap_do_writepage() sends pages past i_size through
> folio_redirty_for_writepage(), which normally isn't a problem because
> truncate and friends clean them very quickly.
>
> When the system has cgroups configured, we can end up in situations
> where one cgroup has almost no dirty pages at all, and other cgroups
> consume the entire background dirty limit. This is especially common in
> our XFS workloads in production because they have cgroups using O_DIRECT
> for almost all of the IO mixed in with cgroups that do more traditional
> buffered IO work.
>
> We've hit storms where the redirty path hits millions of times in a few
> seconds, all on a single file that's only ~40 pages long. This leads to
> long tail latencies for file writes because the pdflush workers are
> hogging the CPU from some kworkers bound to the same CPU.
>
> Reproducing this on 5.18 was tricky because 869ae85dae ("xfs: flush new
> eof page on truncate...") ends up writing/waiting most of these dirty pages
> before truncate gets a chance to wait on them.

That commit went into 5.10, so this would mean it's not easily
reproducible on kernels released since then?

> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 8ce8720093b9..64d1476c457d 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1482,10 +1482,10 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>  	pgoff_t end_index = isize >> PAGE_SHIFT;
>
>  	/*
> -	 * Skip the page if it's fully outside i_size, e.g. due to a
> -	 * truncate operation that's in progress. We must redirty the
> -	 * page so that reclaim stops reclaiming it. Otherwise
> -	 * iomap_vm_releasepage() is called on it and gets confused.
> +	 * Skip the page if it's fully outside i_size, e.g.
> +	 * due to a truncate operation that's in progress. We've
> +	 * cleaned this page and truncate will finish things off for
> +	 * us.
>  	 *
>  	 * Note that the end_index is unsigned long. If the given
>  	 * offset is greater than 16TB on a 32-bit system then if we
> @@ -1500,7 +1500,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>  	 */
>  	if (folio->index > end_index ||
>  	    (folio->index == end_index && poff == 0))
> -		goto redirty;
> +		goto unlock;
>
>  	/*
>  	 * The page straddles i_size. It must be zeroed out on each
> @@ -1518,6 +1518,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>
>  redirty:
>  	folio_redirty_for_writepage(wbc, folio);
> +unlock:
>  	folio_unlock(folio);
>  	return 0;
>  }

Regardless, the change looks fine.

Reviewed-by: Dave Chinner <dchinner@redhat.com>

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
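
The condition changed by the second hunk can be modelled in userspace to see
which pages count as "fully outside i_size". This is only a sketch under
stated assumptions: 4 KiB pages (SKETCH_PAGE_SHIFT is a made-up name, not the
kernel's PAGE_SHIFT), single-page folios, and poff taken to be the offset of
i_size within its page, which the quoted context does not show. The function
name page_fully_past_eof() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SHIFT is per-architecture. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical userspace model of the test in the quoted hunk.
 * Returns true when the page at page_index lies entirely at or beyond
 * isize, i.e. the case the patch unlocks without redirtying.
 */
static bool page_fully_past_eof(uint64_t page_index, uint64_t isize)
{
	/* Index of the page that contains the EOF byte offset. */
	uint64_t end_index = isize >> SKETCH_PAGE_SHIFT;
	/* Assumed meaning of poff: offset of isize within that page. */
	uint64_t poff = isize & (SKETCH_PAGE_SIZE - 1);

	return page_index > end_index ||
	       (page_index == end_index && poff == 0);
}
```

For a file of exactly 40 pages (isize = 40 * 4096), page 39 is inside the
file and page 40 is fully past EOF. With one more byte of data, page 40
straddles i_size and the check no longer matches it, so it takes the zeroing
path described in the following comment block.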