From: Johannes Weiner <hannes@cmpxchg.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Chris Mason <clm@fb.com>, Christoph Hellwig <hch@infradead.org>,
"Darrick J. Wong" <djwong@kernel.org>,
xfs <linux-xfs@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
"dchinner@redhat.com" <dchinner@redhat.com>
Subject: Re: [PATCH RFC] iomap: invalidate pages past eof in iomap_do_writepage()
Date: Thu, 2 Jun 2022 11:32:30 -0400
Message-ID: <YpjYDjeR2Wpx3ImB@cmpxchg.org>
In-Reply-To: <20220602065252.GD1098723@dread.disaster.area>

On Thu, Jun 02, 2022 at 04:52:52PM +1000, Dave Chinner wrote:
> On Wed, Jun 01, 2022 at 02:13:42PM +0000, Chris Mason wrote:
> > In prod, bpftrace showed looping on a single inode inside a mysql
> > cgroup. That inode was usually in the middle of being deleted,
> > i_size set to zero, but it still had 40-90 pages sitting in the
> > xarray waiting for truncation. We’d loop through the whole call
> > path above over and over again, mostly because writepages() was
> > returning progress had been made on this one inode. The
> > redirty_page_for_writepage() path does drop wbc->nr_to_write, so
> > the rest of the writepages machinery believes real work is being
> > done. nr_to_write is LONG_MAX, so we’ve got a while to loop.
>
> Yup, this code relies on truncate making progress to avoid looping
> forever. Truncate should only block on the page while it locks it
> and waits for writeback to complete, then it gets forcibly
> invalidated and removed from the page cache.

It's not looping forever; truncate can just take a relatively long
time, during which the flusher is busy-spinning full bore on a
relatively small number of unflushable pages (range_cyclic).

But you raise a good point asking "why is truncate stuck?". I first
thought they might be cannibalizing each other over the page locks,
but that wasn't it (and wouldn't explain the clear asymmetry between
truncate and flusher). That leaves the waiting for writeback. I just
confirmed with tracing that that's exactly where truncate sits while
the flusher goes bananas on the same inode. So the race must be this:

truncate:                       flusher
                                put a subset of pages under writeback
i_size_write(0)
wait_on_page_writeback()
                                loop with range_cyclic over remaining
                                dirty >EOF pages

> Hence I think we can remove the redirtying completely - it's not
> needed and hasn't been for some time.
>
> Further, I don't think we need to invalidate the folio, either. If
> it's beyond EOF, then it is because a truncate is in progress that
> means it is somebody else's problem to clean up. Hence we should
> leave it to the truncate to deal with, just like the pre-2013 code
> did....

Perfect, that works.
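
To make the difference between the two behaviours concrete, here is a
toy user-space model in C. It is a sketch under stated assumptions, not
kernel code: every toy_* name is hypothetical, the "dirty bit is cleared
before the page is handed to writepage" ordering is an assumption about
the writeback loop, and the budget charge follows the description quoted
at the top of this mail. The 40-page figure comes from the bpftrace
report; the 1000-pass truncate delay is made up.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define TOY_MAX_PAGES 128

/* What the toy writepage does with a dirty page that lies past EOF. */
enum toy_eof_policy {
	TOY_EOF_REDIRTY,	/* mark the page dirty again, write nothing */
	TOY_EOF_SKIP,		/* leave the page clean for truncate to remove */
};

/* Hypothetical stand-in for an inode with i_size == 0 and leftover pages. */
struct toy_inode {
	int nr_pages;
	bool dirty[TOY_MAX_PAGES];
};

/* Hypothetical stand-in for the writeback budget. */
struct toy_wbc {
	long nr_to_write;
};

static void toy_inode_init(struct toy_inode *inode, int nr_pages)
{
	assert(nr_pages >= 0 && nr_pages <= TOY_MAX_PAGES);
	inode->nr_pages = nr_pages;
	for (int i = 0; i < nr_pages; i++)
		inode->dirty[i] = true;
}

/* One writeback pass; returns how many pages were charged to the budget. */
static long toy_writepages(struct toy_inode *inode, struct toy_wbc *wbc,
			   enum toy_eof_policy policy)
{
	long before = wbc->nr_to_write;

	for (int i = 0; i < inode->nr_pages && wbc->nr_to_write > 0; i++) {
		if (!inode->dirty[i])
			continue;
		inode->dirty[i] = false;	/* assumed: cleared before writepage */
		if (policy == TOY_EOF_REDIRTY)
			inode->dirty[i] = true;	/* past EOF: dirty again, no I/O */
		wbc->nr_to_write--;		/* counted as progress either way */
	}
	return before - wbc->nr_to_write;
}

/*
 * Count flusher passes over one inode while truncate stays blocked for
 * truncate_delay passes. The flusher stops early once a pass finds
 * nothing dirty to charge.
 */
static int toy_flusher_passes(int nr_pages, enum toy_eof_policy policy,
			      int truncate_delay)
{
	struct toy_inode inode;
	struct toy_wbc wbc = { .nr_to_write = LONG_MAX };
	int passes = 0;

	toy_inode_init(&inode, nr_pages);
	while (passes < truncate_delay) {
		if (toy_writepages(&inode, &wbc, policy) == 0)
			break;
		passes++;
	}
	return passes;
}
```

In this model, toy_flusher_passes(40, TOY_EOF_REDIRTY, 1000) spins for
all 1000 passes because every pass finds the same 40 pages dirty again,
while toy_flusher_passes(40, TOY_EOF_SKIP, 1000) makes a single pass and
then has nothing left to do until truncate removes the pages.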