public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Chris Mason <clm@fb.com>
Cc: djwong@kernel.org, hch@infradead.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, hannes@cmpxchg.org
Subject: Re: [PATCH v2] iomap: skip pages past eof in iomap_do_writepage()
Date: Thu, 9 Jun 2022 10:53:13 +1000
Message-ID: <20220609005313.GX227878@dread.disaster.area>
In-Reply-To: <20220608004228.3658429-1-clm@fb.com>

On Tue, Jun 07, 2022 at 05:42:29PM -0700, Chris Mason wrote:
> iomap_do_writepage() sends pages past i_size through
> folio_redirty_for_writepage(), which normally isn't a problem because
> truncate and friends clean them very quickly.
> 
> When the system has cgroups configured, we can end up in situations
> where one cgroup has almost no dirty pages at all, and other cgroups
> consume the entire background dirty limit.  This is especially common in
> our XFS workloads in production because they have cgroups using O_DIRECT
> for almost all of the IO mixed in with cgroups that do more traditional
> buffered IO work.
> 
> We've hit storms where the redirty path hits millions of times in a few
> seconds, on all a single file that's only ~40 pages long.  This leads to
> long tail latencies for file writes because the pdflush workers are
> hogging the CPU from some kworkers bound to the same CPU.
> 
> Reproducing this on 5.18 was tricky because 869ae85dae ("xfs: flush new
> eof page on truncate...") ends up writing/waiting most of these dirty pages
> before truncate gets a chance to wait on them.

That commit went into 5.10, so does this mean the problem is not
easily reproducible on any kernel released since then?

> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 8ce8720093b9..64d1476c457d 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1482,10 +1482,10 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>  		pgoff_t end_index = isize >> PAGE_SHIFT;
>  
>  		/*
> -		 * Skip the page if it's fully outside i_size, e.g. due to a
> -		 * truncate operation that's in progress. We must redirty the
> -		 * page so that reclaim stops reclaiming it. Otherwise
> -		 * iomap_vm_releasepage() is called on it and gets confused.
> +		 * Skip the page if it's fully outside i_size, e.g.
> +		 * due to a truncate operation that's in progress.  We've
> +		 * cleaned this page and truncate will finish things off for
> +		 * us.
>  		 *
>  		 * Note that the end_index is unsigned long.  If the given
>  		 * offset is greater than 16TB on a 32-bit system then if we
> @@ -1500,7 +1500,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>  		 */
>  		if (folio->index > end_index ||
>  		    (folio->index == end_index && poff == 0))
> -			goto redirty;
> +			goto unlock;
>  
>  		/*
>  		 * The page straddles i_size.  It must be zeroed out on each
> @@ -1518,6 +1518,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>  
>  redirty:
>  	folio_redirty_for_writepage(wbc, folio);
> +unlock:
>  	folio_unlock(folio);
>  	return 0;
>  }

Regardless, the change looks fine.

Reviewed-by: Dave Chinner <dchinner@redhat.com>

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
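
For readers following the hunk above, the test that decides whether a page is
"fully outside i_size" can be modelled in userspace. This is only a minimal
sketch under stated assumptions, not the kernel code: page_fully_past_eof()
and the SKETCH_* macros are hypothetical names, a fixed 4 KiB page size is
assumed, each folio is assumed to be a single page, and 64-bit integers are
used throughout, so the 32-bit wraparound case mentioned in the patch's
comment does not arise here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SHIFT depends on the architecture. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper, not a kernel function.
 *
 * Returns true when the page at page-cache index `index` holds no bytes
 * below `isize`, i.e. it starts at or beyond end of file. This mirrors the
 * condition in the quoted hunk:
 *
 *     folio->index > end_index ||
 *     (folio->index == end_index && poff == 0)
 */
static bool page_fully_past_eof(uint64_t index, uint64_t isize)
{
	/* Index of the page that contains the byte at offset isize. */
	uint64_t end_index = isize >> SKETCH_PAGE_SHIFT;
	/* Offset of isize within that page; 0 means isize is page aligned. */
	uint64_t poff = isize & (SKETCH_PAGE_SIZE - 1);

	return index > end_index || (index == end_index && poff == 0);
}
```

With a file of exactly 40 pages (isize = 40 * 4096), page 39 is the last page
with data and page 40 is fully past EOF. If isize grows by one byte, page 40
straddles i_size instead and is no longer skipped. With the patch applied,
pages for which this test is true go to the unlock label rather than being
redirtied.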
