From: Matthew Wilcox <willy@infradead.org>
To: linux-fsdevel@vger.kernel.org
Cc: linux-mm@kvack.org
Subject: Re: [PATCH 2/3] mm: Don't hold a page reference while waiting for unlock
Date: Tue, 13 Oct 2020 14:32:39 +0100 [thread overview]
Message-ID: <20201013133239.GH20115@casper.infradead.org> (raw)
In-Reply-To: <20201013030008.27219-3-willy@infradead.org>
This patch was inadvertently included. I blame my tools. The actual
patch 2/3 is the other one.
On Tue, Oct 13, 2020 at 04:00:06AM +0100, Matthew Wilcox (Oracle) wrote:
> In the upcoming THP patch series, if we find a !Uptodate page, it
> is because of a read error. In this case, we want to split the THP
> into smaller pages so we can handle the error in as granular a fashion
> as possible. But xfstests generic/273 defeats this strategy by having
> 500 threads all sleeping on the same page, each waiting for their turn
> to split the page. None of them will ever succeed because splitting a
> page requires that you hold the only reference to it.
>
> To fix this, use put_and_wait_on_page_locked() to sleep without holding
> a reference. Each of the readers will then go back and retry the
> page lookup.
>
> This requires a few changes since we now get the page lock a little
> earlier in generic_file_buffered_read(). This is unlikely to affect any
> normal workloads as pages in the page cache are generally uptodate and
> will not hit this path. With the THP patch set and the readahead error
> injector, I see about a 25% performance improvement with this patch over
> an alternate approach which moves the page locking down.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> mm/filemap.c | 51 ++++++++++++++-------------------------------------
> 1 file changed, 14 insertions(+), 37 deletions(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index f70227941627..9916353f0f0d 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1254,14 +1254,6 @@ static int __wait_on_page_locked_async(struct page *page,
> return ret;
> }
>
> -static int wait_on_page_locked_async(struct page *page,
> - struct wait_page_queue *wait)
> -{
> - if (!PageLocked(page))
> - return 0;
> - return __wait_on_page_locked_async(compound_head(page), wait, false);
> -}
> -
> /**
> * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked
> * @page: The page to wait for.
> @@ -2128,19 +2120,21 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
> put_page(page);
> goto out;
> }
> - error = wait_on_page_locked_async(page,
> - iocb->ki_waitq);
> + error = lock_page_async(page, iocb->ki_waitq);
> + if (error)
> + goto readpage_error;
> + } else if (iocb->ki_flags & IOCB_NOWAIT) {
> + put_page(page);
> + goto would_block;
> } else {
> - if (iocb->ki_flags & IOCB_NOWAIT) {
> - put_page(page);
> - goto would_block;
> + if (!trylock_page(page)) {
> + put_and_wait_on_page_locked(page,
> + TASK_KILLABLE);
> + goto find_page;
> }
> - error = wait_on_page_locked_killable(page);
> }
> - if (unlikely(error))
> - goto readpage_error;
> if (PageUptodate(page))
> - goto page_ok;
> + goto uptodate;
>
> if (inode->i_blkbits == PAGE_SHIFT ||
> !mapping->a_ops->is_partially_uptodate)
> @@ -2148,14 +2142,13 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
> /* pipes can't handle partially uptodate pages */
> if (unlikely(iov_iter_is_pipe(iter)))
> goto page_not_up_to_date;
> - if (!trylock_page(page))
> - goto page_not_up_to_date;
> /* Did it get truncated before we got the lock? */
> if (!page->mapping)
> - goto page_not_up_to_date_locked;
> + goto page_not_up_to_date;
> if (!mapping->a_ops->is_partially_uptodate(page,
> offset, iter->count))
> - goto page_not_up_to_date_locked;
> + goto page_not_up_to_date;
> +uptodate:
> unlock_page(page);
> }
> page_ok:
> @@ -2223,28 +2216,12 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
> continue;
>
> page_not_up_to_date:
> - /* Get exclusive access to the page ... */
> - if (iocb->ki_flags & IOCB_WAITQ)
> - error = lock_page_async(page, iocb->ki_waitq);
> - else
> - error = lock_page_killable(page);
> - if (unlikely(error))
> - goto readpage_error;
> -
> -page_not_up_to_date_locked:
> /* Did it get truncated before we got the lock? */
> if (!page->mapping) {
> unlock_page(page);
> put_page(page);
> continue;
> }
> -
> - /* Did somebody else fill it already? */
> - if (PageUptodate(page)) {
> - unlock_page(page);
> - goto page_ok;
> - }
> -
> readpage:
> if (iocb->ki_flags & (IOCB_NOIO | IOCB_NOWAIT)) {
> unlock_page(page);
> --
> 2.28.0
>
next prev parent reply other threads:[~2020-10-13 13:32 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-13 3:00 [PATCH 0/3] Wait for I/O without holding a page reference Matthew Wilcox (Oracle)
2020-10-13 3:00 ` [PATCH 1/3] mm: Pass a sleep state to put_and_wait_on_page_locked Matthew Wilcox (Oracle)
2020-10-13 3:00 ` [PATCH 2/3] mm: Don't hold a page reference while waiting for unlock Matthew Wilcox (Oracle)
2020-10-13 13:32 ` Matthew Wilcox [this message]
2020-10-13 3:00 ` [PATCH 2/3] mm/filemap: " Matthew Wilcox (Oracle)
2020-10-13 3:00 ` [PATCH 3/3] mm: Inline __wait_on_page_locked_async into caller Matthew Wilcox (Oracle)
2020-10-13 13:32 ` [PATCH 0/3] Wait for I/O without holding a page reference William Kucharski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201013133239.GH20115@casper.infradead.org \
--to=willy@infradead.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).