linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, David Hildenbrand <david@redhat.com>,
	Mike Rapoport <rppt@kernel.org>, Theodore Ts'o <tytso@mit.edu>,
	Vishal Moola <vishal.moola@gmail.com>,
	Peter Collingbourne <pcc@google.com>
Subject: Re: [RFC] mm: page-flags.h: remove the bias against tail pages
Date: Mon, 25 Mar 2024 05:24:33 +0000	[thread overview]
Message-ID: <ZgEKkd9nc9rdfzCK@casper.infradead.org> (raw)
In-Reply-To: <20240325045519.222458-1-jhubbard@nvidia.com>

On Sun, Mar 24, 2024 at 09:55:19PM -0700, John Hubbard wrote:
> commit 1d798ca3f1643 ("mm: make compound_head() robust") added
> page->compound_head and the associated "unlikely" check for a tail page
> in compound_head():
> 
> 	if (unlikely(head & 1))
> 		return (struct page *) (head - 1);
> 	return page;
> 
> That worked nicely in 2015. However, in the 8.5 years since then, things
> have changed: folios and huge pages are heavily used, with more uses
> coming. See for example the various THP enhancements being proposed. And
> hugetlbfs remains alive and well. And large folios are being plumbed
> into everything.
> 
> With that in mind, remove the "unlikely" attribute when checking for a
> tail page in compound_head(), and let normal CPU branch prediction do
> what it may.

> Is this reasonable? I haven't gone out and gathered test data, because
> the original patch to create this just assumed that compound pages were
> uncommon, and so now it's time to stop making that assumption. I think
> that's sufficient reasoning here to leave out the compiler hint, right?

It's complicated.  On the one hand, it's "more likely" because there are
more tail pages than there are head pages or order-0 pages.  On the
other hand, a _lot_ of the time we call compound_head(), it's done with
a non-tail page because we tend to pass around head pages (eg,
pmd_page() on hugetlbfs, or looking up a folio in the page cache and
passing &folio->page to some function that's not yet converted.

On the third hand, does the compiler really do much with the annotation?

Before your patch:

    27d6:       a8 01                   test   $0x1,%al
    27d8:       75 02                   jne    27dc <clear_refs_pte_range+0x9c>
    27da:       eb 59                   jmp    2835 <clear_refs_pte_range+0xf5>
    27dc:       49 8b 44 24 08          mov    0x8(%r12),%rax
    27e1:       a8 01                   test   $0x1,%al
    27e3:       75 6f                   jne    2854 <clear_refs_pte_range+0x114>
    27e5:       eb 73                   jmp    285a <clear_refs_pte_range+0x11a>

With your patch:

    1ee6:       a8 01                   test   $0x1,%al
    1ee8:       75 02                   jne    1eec <clear_refs_pte_range+0x9c>
    1eea:       eb 5f                   jmp    1f4b <clear_refs_pte_range+0xfb>
    1eec:       49 8b 44 24 08          mov    0x8(%r12),%rax
    1ef1:       a8 01                   test   $0x1,%al
    1ef3:       75 50                   jne    1f45 <clear_refs_pte_range+0xf5>
    1ef5:       eb 6c                   jmp    1f63 <clear_refs_pte_range+0x113>

Looks pretty much the same.  bloat-o-meter says:

$ ./scripts/bloat-o-meter before.o after.o
add/remove: 0/0 grow/shrink: 2/4 up/down: 32/-48 (-16)
Function                                     old     new   delta
gather_stats.constprop                       730     753     +23
smaps_hugetlb_range                          635     644      +9
smaps_page_accumulate                        342     338      -4
clear_refs_pte_range                         339     328     -11
pagemap_hugetlb_range                        422     407     -15
smaps_pte_range                             1406    1388     -18
Total: Before=20066, After=20050, chg -0.08%

(I was looking at clear_refs_pte_range above).  This seems marginal.
The benefits of removing a call to compound_head are much less
ambiguous:

$ ./scripts/bloat-o-meter before.o .build/fs/proc/task_mmu.o
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-101 (-101)
Function                                     old     new   delta
clear_refs_pte_range                         339     238    -101
Total: Before=20066, After=19965, chg -0.50%

I'd describe that as replacing four calls to compound_head() with two:

-               page = pmd_page(*pmd);
+               folio = page_folio(pmd_page(*pmd));

                /* Clear accessed and referenced bits. */
                pmdp_test_and_clear_young(vma, addr, pmd);
-               test_and_clear_page_young(page);
-               ClearPageReferenced(page);
+               folio_test_clear_young(folio);
+               folio_clear_referenced(folio);
...
-               page = vm_normal_page(vma, addr, ptent);
-               if (!page)
+               folio = vm_normal_folio(vma, addr, ptent);
+               if (!folio)
                        continue;

                /* Clear accessed and referenced bits. */
                ptep_test_and_clear_young(vma, addr, pte);
-               test_and_clear_page_young(page);
-               ClearPageReferenced(page);
+               folio_test_clear_young(folio);
+               folio_clear_referenced(folio);

I'm not saying this patch is necessarily wrong, I just think it's
"not proven".


  reply	other threads:[~2024-03-25  5:24 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-25  4:55 [RFC] mm: page-flags.h: remove the bias against tail pages John Hubbard
2024-03-25  5:24 ` Matthew Wilcox [this message]
2024-03-26  3:21   ` John Hubbard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZgEKkd9nc9rdfzCK@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=pcc@google.com \
    --cc=rppt@kernel.org \
    --cc=tytso@mit.edu \
    --cc=vishal.moola@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).