From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Hugh Dickins <hughd@google.com>,
Jiri Kosina <jkosina@suse.cz>
Subject: [PATCH 4.4 17/37] kaiser: vmstat show NR_KAISERTABLE as nr_overhead
Date: Wed, 3 Jan 2018 21:11:23 +0100
Message-ID: <20180103195057.727940597@linuxfoundation.org>
In-Reply-To: <20180103195056.837404126@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hugh Dickins <hughd@google.com>

The kaiser update made an interesting choice, never to free any shadow
page tables. Contention on global spinlock was worrying, particularly
with it held across page table scans when freeing. Something had to be
done: I was going to add refcounting; but simply never to free them is
an appealing choice, minimizing contention without complicating the code
(the more a page table is found already, the less the spinlock is used).
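
As an illustration of why never freeing keeps the lock cheap, here is a
minimal userspace sketch of the install-if-absent pattern used in
kaiser_pagetable_walk(). It is not the kernel code: the pthread mutex
stands in for shadow_table_allocation_lock, and the names table_lock,
nr_tables and get_or_install_table() are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins, not kernel APIs. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long nr_tables;	/* plays the role of NR_KAISERTABLE */

/*
 * Return the table installed in *slot, allocating one if the slot is
 * empty.  Returns NULL only if the allocation fails.
 */
static void *get_or_install_table(void **slot, size_t table_size)
{
	void *new_table;

	/*
	 * Fast path: table already present, no lock taken.  Because
	 * tables are never freed, a non-NULL slot stays valid.  (A real
	 * multithreaded program would use an atomic load here.)
	 */
	if (*slot)
		return *slot;

	/* Allocate outside the lock to keep the critical section short. */
	new_table = calloc(1, table_size);
	if (!new_table)
		return NULL;

	pthread_mutex_lock(&table_lock);
	if (!*slot) {
		*slot = new_table;	/* we won: install it and count it */
		nr_tables++;
		new_table = NULL;
	}
	pthread_mutex_unlock(&table_lock);

	free(new_table);	/* lost the race; free(NULL) is a no-op */
	return *slot;
}
```

The counter only ever goes up, which matches the monotonic increase the
patch relies on.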
But leaking pages in this way is also a worry: can we get away with it?
At the very least, we need a count to show how bad it actually gets:
in principle, one might end up wasting about 1/256 of memory that way
(1/512 for when direct-mapped pages have to be user-mapped, plus 1/512
for when they are user-mapped from the vmalloc area on another occasion;
but we don't have vmalloc'ed stacks, so only large ldts are vmalloc'ed).
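
The 1/256 figure can be checked with a small worked example. This is
only a sketch, assuming 4 KiB pages and 8-byte page table entries (so
512 entries per table, as on x86-64) and counting only the lowest-level
tables; the helper names are hypothetical.

```c
enum {
	PAGE_BYTES = 4096,	/* assumed base page size */
	ENTRY_BYTES = 8,	/* assumed size of one page table entry */
	ENTRIES_PER_TABLE = PAGE_BYTES / ENTRY_BYTES,
};

_Static_assert(ENTRIES_PER_TABLE == 512, "one table page maps 512 pages");

/* Lowest-level table pages needed to map nr_pages once (rounded up). */
static unsigned long tables_needed(unsigned long nr_pages)
{
	return (nr_pages + ENTRIES_PER_TABLE - 1) / ENTRIES_PER_TABLE;
}

/*
 * Worst case from the text: every page is user-mapped twice, once via
 * the direct map and once via the vmalloc area.
 */
static unsigned long worst_case_tables(unsigned long nr_pages)
{
	return 2 * tables_needed(nr_pages);
}
```

For 4 GiB of memory (1048576 pages) this gives 4096 table pages, or
16 MiB, which is 1/256 of the total.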
Add per-cpu stat NR_KAISERTABLE: including 256 at startup for the
shared pgd entries, and 1 for each intermediate page table added
thereafter for user-mapping - but leave out the 1 per mm, for its
shadow pgd, because that distracts from the monotonic increase.
Shown in /proc/vmstat as nr_overhead (0 if kaiser not enabled).
In practice, it doesn't look so bad so far: more like 1/12000 after
nine hours of gtests below; and movable pageblock segregation should
tend to cluster the kaiser tables into a subset of the address space
(if not, they will be bad for compaction too). But production may
tell a different story: keep an eye on this number, and bring back
lighter freeing if it gets out of control (maybe a shrinker).
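
One way to keep an eye on the number from userspace is to read it back
from /proc/vmstat. The following is a minimal sketch, assuming the usual
"name value" line format of that file; read_vmstat_counter() and
read_nr_overhead() are hypothetical helper names, not an existing API.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan "name value" pairs from f and return the value recorded for
 * `name`, or -1 if the counter is absent or the input is malformed.
 */
static long read_vmstat_counter(FILE *f, const char *name)
{
	char key[64];
	long value;

	while (fscanf(f, "%63s %ld", key, &value) == 2) {
		if (strcmp(key, name) == 0)
			return value;
	}
	return -1;
}

/* Convenience wrapper for the real file; -1 if it cannot be read. */
static long read_nr_overhead(void)
{
	FILE *f = fopen("/proc/vmstat", "r");
	long value;

	if (!f)
		return -1;
	value = read_vmstat_counter(f, "nr_overhead");
	fclose(f);
	return value;
}
```

Dividing the total number of pages by the returned value gives a figure
that can be compared against the 1/256 bound and the 1/12000 observed
above.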
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/mm/kaiser.c   | 16 +++++++++++-----
 include/linux/mmzone.h |  3 ++-
 mm/vmstat.c            |  1 +
 3 files changed, 14 insertions(+), 6 deletions(-)

--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -122,9 +122,11 @@ static pte_t *kaiser_pagetable_walk(unsi
 		if (!new_pmd_page)
 			return NULL;
 		spin_lock(&shadow_table_allocation_lock);
-		if (pud_none(*pud))
+		if (pud_none(*pud)) {
 			set_pud(pud, __pud(_KERNPG_TABLE | __pa(new_pmd_page)));
-		else
+			__inc_zone_page_state(virt_to_page((void *)
+						new_pmd_page), NR_KAISERTABLE);
+		} else
 			free_page(new_pmd_page);
 		spin_unlock(&shadow_table_allocation_lock);
 	}
@@ -140,9 +142,11 @@ static pte_t *kaiser_pagetable_walk(unsi
 		if (!new_pte_page)
 			return NULL;
 		spin_lock(&shadow_table_allocation_lock);
-		if (pmd_none(*pmd))
+		if (pmd_none(*pmd)) {
 			set_pmd(pmd, __pmd(_KERNPG_TABLE | __pa(new_pte_page)));
-		else
+			__inc_zone_page_state(virt_to_page((void *)
+						new_pte_page), NR_KAISERTABLE);
+		} else
 			free_page(new_pte_page);
 		spin_unlock(&shadow_table_allocation_lock);
 	}
@@ -206,11 +210,13 @@ static void __init kaiser_init_all_pgds(
 	pgd = native_get_shadow_pgd(pgd_offset_k((unsigned long )0));
 	for (i = PTRS_PER_PGD / 2; i < PTRS_PER_PGD; i++) {
 		pgd_t new_pgd;
-		pud_t *pud = pud_alloc_one(&init_mm, PAGE_OFFSET + i * PGDIR_SIZE);
+		pud_t *pud = pud_alloc_one(&init_mm,
+					   PAGE_OFFSET + i * PGDIR_SIZE);
 		if (!pud) {
 			WARN_ON(1);
 			break;
 		}
+		inc_zone_page_state(virt_to_page(pud), NR_KAISERTABLE);
 		new_pgd = __pgd(_KERNPG_TABLE |__pa(pud));
 		/*
 		 * Make sure not to stomp on some other pgd entry.
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -131,8 +131,9 @@ enum zone_stat_item {
 	NR_SLAB_RECLAIMABLE,
 	NR_SLAB_UNRECLAIMABLE,
 	NR_PAGETABLE,		/* used for pagetables */
-	NR_KERNEL_STACK,
 	/* Second 128 byte cacheline */
+	NR_KERNEL_STACK,
+	NR_KAISERTABLE,
 	NR_UNSTABLE_NFS,	/* NFS unstable pages */
 	NR_BOUNCE,
 	NR_VMSCAN_WRITE,
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -736,6 +736,7 @@ const char * const vmstat_text[] = {
 	"nr_slab_unreclaimable",
 	"nr_page_table_pages",
 	"nr_kernel_stack",
+	"nr_overhead",
 	"nr_unstable",
 	"nr_bounce",
 	"nr_vmscan_write",