From: Andrew Morton <akpm@linux-foundation.org>
To: mm-commits@vger.kernel.org, zokeefe@google.com,
willy@infradead.org, vbabka@suse.cz,
torvalds@linux-foundation.org, songmuchun@bytedance.com,
sidhartha.kumar@oracle.com, shy828301@gmail.com,
peterx@redhat.com, naoya.horiguchi@linux.dev,
mike.kravetz@oracle.com, linmiaohe@huawei.com,
kirill@shutemov.name, jthoughton@google.com, jhubbard@nvidia.com,
hannes@cmpxchg.org, david@redhat.com, almasrymina@google.com,
hughd@google.com, akpm@linux-foundation.org
Subject: + mmthprmap-subpages_mapcount-compound_mapped-if-pmd-mapped.patch added to mm-unstable branch
Date: Fri, 18 Nov 2022 14:01:16 -0800 [thread overview]
Message-ID: <20221118220116.DBF19C433C1@smtp.kernel.org> (raw)
The patch titled
Subject: mm,thp,rmap: subpages_mapcount COMPOUND_MAPPED if PMD-mapped
has been added to the -mm mm-unstable branch. Its filename is
mmthprmap-subpages_mapcount-compound_mapped-if-pmd-mapped.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mmthprmap-subpages_mapcount-compound_mapped-if-pmd-mapped.patch
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd@google.com>
Subject: mm,thp,rmap: subpages_mapcount COMPOUND_MAPPED if PMD-mapped
Date: Fri, 18 Nov 2022 01:14:17 -0800 (PST)
Can the lock_compound_mapcount() bit_spin_lock apparatus be removed now?
Yes. Not by atomic64_t or cmpxchg games, those get difficult on 32-bit;
but if we slightly abuse subpages_mapcount by additionally demanding that
one bit be set there when the compound page is PMD-mapped, then a cascade
of two atomic ops is able to maintain the stats without bit_spin_lock.
This is harder to reason about than when bit_spin_locked, but I believe
safe; and no drift in stats detected when testing. When there are racing
removes and adds, of course the sequence of operations is less well-
defined; but each operation on subpages_mapcount is atomically good. What
might be disastrous, is if subpages_mapcount could ever fleetingly appear
negative: but the pte lock (or pmd lock) these rmap functions are called
under, ensures that a last remove cannot race ahead of a first add.
Continue to make an exception for hugetlb (PageHuge) pages, though that
exception can be easily removed by a further commit if necessary: leave
subpages_mapcount 0, don't bother with COMPOUND_MAPPED in its case, just
carry on checking compound_mapcount too in folio_mapped(), page_mapped().
Evidence is that this way goes slightly faster than the previous
implementation in all cases (pmds after ptes now taking around 103ms); and
relieves us of worrying about contention on the bit_spin_lock.
Link: https://lkml.kernel.org/r/25a09a7a-81a9-e9c2-7567-c94ce18ac2@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/mm/transhuge.rst | 7 -
include/linux/mm.h | 19 ++++
include/linux/rmap.h | 12 +--
mm/debug.c | 2
mm/rmap.c | 124 ++++++-------------------------
5 files changed, 52 insertions(+), 112 deletions(-)
--- a/Documentation/mm/transhuge.rst~mmthprmap-subpages_mapcount-compound_mapped-if-pmd-mapped
+++ a/Documentation/mm/transhuge.rst
@@ -118,15 +118,14 @@ pages:
succeeds on tail pages.
- map/unmap of PMD entry for the whole compound page increment/decrement
- ->compound_mapcount, stored in the first tail page of the compound page.
+ ->compound_mapcount, stored in the first tail page of the compound page;
+ and also increment/decrement ->subpages_mapcount (also in the first tail)
+ by COMPOUND_MAPPED when compound_mapcount goes from -1 to 0 or 0 to -1.
- map/unmap of sub-pages with PTE entry increment/decrement ->_mapcount
on relevant sub-page of the compound page, and also increment/decrement
->subpages_mapcount, stored in first tail page of the compound page, when
_mapcount goes from -1 to 0 or 0 to -1: counting sub-pages mapped by PTE.
- In order to have race-free accounting of sub-pages mapped, changes to
- sub-page ->_mapcount, ->subpages_mapcount and ->compound_mapcount are
- are all locked by bit_spin_lock of PG_locked in the first tail ->flags.
split_huge_page internally has to distribute the refcounts in the head
page to the tail pages before clearing all PG_head/tail bits from the page
--- a/include/linux/mm.h~mmthprmap-subpages_mapcount-compound_mapped-if-pmd-mapped
+++ a/include/linux/mm.h
@@ -837,7 +837,16 @@ static inline int head_compound_mapcount
}
/*
- * Number of sub-pages mapped by PTE, does not include compound mapcount.
+ * If a 16GB hugetlb page were mapped by PTEs of all of its 4kB sub-pages,
+ * its subpages_mapcount would be 0x400000: choose the COMPOUND_MAPPED bit
+ * above that range, instead of 2*(PMD_SIZE/PAGE_SIZE). Hugetlb currently
+ * leaves subpages_mapcount at 0, but avoid surprise if it participates later.
+ */
+#define COMPOUND_MAPPED 0x800000
+#define SUBPAGES_MAPPED (COMPOUND_MAPPED - 1)
+
+/*
+ * Number of sub-pages mapped by PTE, plus COMPOUND_MAPPED if compound mapped.
* Must be called only on head of compound page.
*/
static inline int head_subpages_mapcount(struct page *head)
@@ -902,8 +911,12 @@ static inline int total_mapcount(struct
static inline bool folio_large_is_mapped(struct folio *folio)
{
- return atomic_read(folio_mapcount_ptr(folio)) +
- atomic_read(folio_subpages_mapcount_ptr(folio)) >= 0;
+ /*
+ * Reading folio_mapcount_ptr() below could be omitted if hugetlb
+ * participated in incrementing subpages_mapcount when compound mapped.
+ */
+ return atomic_read(folio_mapcount_ptr(folio)) >= 0 ||
+ atomic_read(folio_subpages_mapcount_ptr(folio)) > 0;
}
/**
--- a/include/linux/rmap.h~mmthprmap-subpages_mapcount-compound_mapped-if-pmd-mapped
+++ a/include/linux/rmap.h
@@ -204,14 +204,14 @@ void hugepage_add_anon_rmap(struct page
void hugepage_add_new_anon_rmap(struct page *, struct vm_area_struct *,
unsigned long address);
-void page_dup_compound_rmap(struct page *page);
+static inline void __page_dup_rmap(struct page *page, bool compound)
+{
+ atomic_inc(compound ? compound_mapcount_ptr(page) : &page->_mapcount);
+}
static inline void page_dup_file_rmap(struct page *page, bool compound)
{
- if (likely(!compound /* page is mapped by PTE */))
- atomic_inc(&page->_mapcount);
- else
- page_dup_compound_rmap(page);
+ __page_dup_rmap(page, compound);
}
/**
@@ -260,7 +260,7 @@ static inline int page_try_dup_anon_rmap
* the page R/O into both processes.
*/
dup:
- page_dup_file_rmap(page, compound);
+ __page_dup_rmap(page, compound);
return 0;
}
--- a/mm/debug.c~mmthprmap-subpages_mapcount-compound_mapped-if-pmd-mapped
+++ a/mm/debug.c
@@ -97,7 +97,7 @@ static void __dump_page(struct page *pag
pr_warn("head:%p order:%u compound_mapcount:%d subpages_mapcount:%d compound_pincount:%d\n",
head, compound_order(head),
head_compound_mapcount(head),
- head_subpages_mapcount(head),
+ head_subpages_mapcount(head) & SUBPAGES_MAPPED,
head_compound_pincount(head));
}
--- a/mm/rmap.c~mmthprmap-subpages_mapcount-compound_mapped-if-pmd-mapped
+++ a/mm/rmap.c
@@ -1085,38 +1085,6 @@ int pfn_mkclean_range(unsigned long pfn,
return page_vma_mkclean_one(&pvmw);
}
-struct compound_mapcounts {
- unsigned int compound_mapcount;
- unsigned int subpages_mapcount;
-};
-
-/*
- * lock_compound_mapcounts() first locks, then copies subpages_mapcount and
- * compound_mapcount from head[1].compound_mapcount and subpages_mapcount,
- * converting from struct page's internal representation to logical count
- * (that is, adding 1 to compound_mapcount to hide its offset by -1).
- */
-static void lock_compound_mapcounts(struct page *head,
- struct compound_mapcounts *local)
-{
- bit_spin_lock(PG_locked, &head[1].flags);
- local->compound_mapcount = atomic_read(compound_mapcount_ptr(head)) + 1;
- local->subpages_mapcount = atomic_read(subpages_mapcount_ptr(head));
-}
-
-/*
- * After caller has updated subpage._mapcount, local subpages_mapcount and
- * local compound_mapcount, as necessary, unlock_compound_mapcounts() converts
- * and copies them back to the compound head[1] fields, and then unlocks.
- */
-static void unlock_compound_mapcounts(struct page *head,
- struct compound_mapcounts *local)
-{
- atomic_set(compound_mapcount_ptr(head), local->compound_mapcount - 1);
- atomic_set(subpages_mapcount_ptr(head), local->subpages_mapcount);
- bit_spin_unlock(PG_locked, &head[1].flags);
-}
-
int total_compound_mapcount(struct page *head)
{
int mapcount = head_compound_mapcount(head);
@@ -1124,7 +1092,7 @@ int total_compound_mapcount(struct page
int i;
/* In the common case, avoid the loop when no subpages mapped by PTE */
- if (head_subpages_mapcount(head) == 0)
+ if ((head_subpages_mapcount(head) & SUBPAGES_MAPPED) == 0)
return mapcount;
/*
* Add all the PTE mappings of those subpages mapped by PTE.
@@ -1140,35 +1108,6 @@ int total_compound_mapcount(struct page
return mapcount;
}
-/*
- * page_dup_compound_rmap(), used when copying mm,
- * provides a simple example of using lock_ and unlock_compound_mapcounts().
- */
-void page_dup_compound_rmap(struct page *head)
-{
- struct compound_mapcounts mapcounts;
-
- /*
- * Hugetlb pages could use lock_compound_mapcounts(), like THPs do;
- * but at present they are still being managed by atomic operations:
- * which are likely to be somewhat faster, so don't rush to convert
- * them over without evaluating the effect.
- *
- * Note that hugetlb does not call page_add_file_rmap():
- * here is where hugetlb shared page mapcount is raised.
- */
- if (PageHuge(head)) {
- atomic_inc(compound_mapcount_ptr(head));
-
- } else if (PageTransHuge(head)) {
- /* That test is redundant: it's for safety or to optimize out */
-
- lock_compound_mapcounts(head, &mapcounts);
- mapcounts.compound_mapcount++;
- unlock_compound_mapcounts(head, &mapcounts);
- }
-}
-
/**
* page_move_anon_rmap - move a page to our anon_vma
* @page: the page to move to our anon_vma
@@ -1278,7 +1217,7 @@ static void __page_check_anon_rmap(struc
void page_add_anon_rmap(struct page *page,
struct vm_area_struct *vma, unsigned long address, rmap_t flags)
{
- struct compound_mapcounts mapcounts;
+ atomic_t *mapped;
int nr = 0, nr_pmdmapped = 0;
bool compound = flags & RMAP_COMPOUND;
bool first;
@@ -1290,24 +1229,20 @@ void page_add_anon_rmap(struct page *pag
first = atomic_inc_and_test(&page->_mapcount);
nr = first;
if (first && PageCompound(page)) {
- struct page *head = compound_head(page);
-
- lock_compound_mapcounts(head, &mapcounts);
- mapcounts.subpages_mapcount++;
- nr = !mapcounts.compound_mapcount;
- unlock_compound_mapcounts(head, &mapcounts);
+ mapped = subpages_mapcount_ptr(compound_head(page));
+ nr = atomic_inc_return_relaxed(mapped);
+ nr = !(nr & COMPOUND_MAPPED);
}
} else if (PageTransHuge(page)) {
/* That test is redundant: it's for safety or to optimize out */
- lock_compound_mapcounts(page, &mapcounts);
- first = !mapcounts.compound_mapcount;
- mapcounts.compound_mapcount++;
+ first = atomic_inc_and_test(compound_mapcount_ptr(page));
if (first) {
+ mapped = subpages_mapcount_ptr(page);
+ nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
nr_pmdmapped = thp_nr_pages(page);
- nr = nr_pmdmapped - mapcounts.subpages_mapcount;
+ nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
}
- unlock_compound_mapcounts(page, &mapcounts);
}
VM_BUG_ON_PAGE(!first && (flags & RMAP_EXCLUSIVE), page);
@@ -1360,6 +1295,7 @@ void page_add_new_anon_rmap(struct page
VM_BUG_ON_PAGE(!PageTransHuge(page), page);
/* increment count (starts at -1) */
atomic_set(compound_mapcount_ptr(page), 0);
+ atomic_set(subpages_mapcount_ptr(page), COMPOUND_MAPPED);
nr = thp_nr_pages(page);
__mod_lruvec_page_state(page, NR_ANON_THPS, nr);
}
@@ -1379,7 +1315,7 @@ void page_add_new_anon_rmap(struct page
void page_add_file_rmap(struct page *page,
struct vm_area_struct *vma, bool compound)
{
- struct compound_mapcounts mapcounts;
+ atomic_t *mapped;
int nr = 0, nr_pmdmapped = 0;
bool first;
@@ -1390,24 +1326,20 @@ void page_add_file_rmap(struct page *pag
first = atomic_inc_and_test(&page->_mapcount);
nr = first;
if (first && PageCompound(page)) {
- struct page *head = compound_head(page);
-
- lock_compound_mapcounts(head, &mapcounts);
- mapcounts.subpages_mapcount++;
- nr = !mapcounts.compound_mapcount;
- unlock_compound_mapcounts(head, &mapcounts);
+ mapped = subpages_mapcount_ptr(compound_head(page));
+ nr = atomic_inc_return_relaxed(mapped);
+ nr = !(nr & COMPOUND_MAPPED);
}
} else if (PageTransHuge(page)) {
/* That test is redundant: it's for safety or to optimize out */
- lock_compound_mapcounts(page, &mapcounts);
- first = !mapcounts.compound_mapcount;
- mapcounts.compound_mapcount++;
+ first = atomic_inc_and_test(compound_mapcount_ptr(page));
if (first) {
+ mapped = subpages_mapcount_ptr(page);
+ nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
nr_pmdmapped = thp_nr_pages(page);
- nr = nr_pmdmapped - mapcounts.subpages_mapcount;
+ nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
}
- unlock_compound_mapcounts(page, &mapcounts);
}
if (nr_pmdmapped)
@@ -1431,7 +1363,7 @@ void page_add_file_rmap(struct page *pag
void page_remove_rmap(struct page *page,
struct vm_area_struct *vma, bool compound)
{
- struct compound_mapcounts mapcounts;
+ atomic_t *mapped;
int nr = 0, nr_pmdmapped = 0;
bool last;
@@ -1451,24 +1383,20 @@ void page_remove_rmap(struct page *page,
last = atomic_add_negative(-1, &page->_mapcount);
nr = last;
if (last && PageCompound(page)) {
- struct page *head = compound_head(page);
-
- lock_compound_mapcounts(head, &mapcounts);
- mapcounts.subpages_mapcount--;
- nr = !mapcounts.compound_mapcount;
- unlock_compound_mapcounts(head, &mapcounts);
+ mapped = subpages_mapcount_ptr(compound_head(page));
+ nr = atomic_dec_return_relaxed(mapped);
+ nr = !(nr & COMPOUND_MAPPED);
}
} else if (PageTransHuge(page)) {
/* That test is redundant: it's for safety or to optimize out */
- lock_compound_mapcounts(page, &mapcounts);
- mapcounts.compound_mapcount--;
- last = !mapcounts.compound_mapcount;
+ last = atomic_add_negative(-1, compound_mapcount_ptr(page));
if (last) {
+ mapped = subpages_mapcount_ptr(page);
+ nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped);
nr_pmdmapped = thp_nr_pages(page);
- nr = nr_pmdmapped - mapcounts.subpages_mapcount;
+ nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
}
- unlock_compound_mapcounts(page, &mapcounts);
}
if (nr_pmdmapped) {
_
Patches currently in -mm which might be from hughd@google.com are
mmhugetlb-use-folio-fields-in-second-tail-page.patch
mmhugetlb-use-folio-fields-in-second-tail-page-fix.patch
mmthprmap-simplify-compound-page-mapcount-handling.patch
mmthprmap-lock_compound_mapcounts-on-thp-mapcounts.patch
mmthprmap-handle-the-normal-pagecompound-case-first.patch
mmthprmap-subpages_mapcount-of-pte-mapped-subpages.patch
mmthprmap-subpages_mapcount-compound_mapped-if-pmd-mapped.patch
mmthprmap-clean-up-the-end-of-__split_huge_pmd_locked.patch
next reply other threads:[~2022-11-18 22:01 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-18 22:01 Andrew Morton [this message]
-- strict thread matches above, loose matches on Subject: below --
2022-11-22 22:52 + mmthprmap-subpages_mapcount-compound_mapped-if-pmd-mapped.patch added to mm-unstable branch Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221118220116.DBF19C433C1@smtp.kernel.org \
--to=akpm@linux-foundation.org \
--cc=almasrymina@google.com \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=jhubbard@nvidia.com \
--cc=jthoughton@google.com \
--cc=kirill@shutemov.name \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mike.kravetz@oracle.com \
--cc=mm-commits@vger.kernel.org \
--cc=naoya.horiguchi@linux.dev \
--cc=peterx@redhat.com \
--cc=shy828301@gmail.com \
--cc=sidhartha.kumar@oracle.com \
--cc=songmuchun@bytedance.com \
--cc=torvalds@linux-foundation.org \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
--cc=zokeefe@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.