* [PATCH v4 0/4] KVM: mm: count KVM mmu usage in memory stats
@ 2022-04-29 20:11 Yosry Ahmed
[not found] ` <20220429201131.3397875-1-yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 27+ messages in thread
From: Yosry Ahmed @ 2022-04-29 20:11 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Zefan Li, Marc Zyngier, James Morse,
Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini,
Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin,
Shakeel Butt, Oliver Upton
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
kvmarm-FPEHb7Xf0XXUo1n7N8X6UoWGPAHP3yOg,
kvm-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
Yosry Ahmed
We keep track of several kernel memory stats (total kernel memory, page
tables, stack, vmalloc, etc) on multiple levels (global, per-node,
per-memcg, etc). These stats give insights to users to how much memory
is used by the kernel and for what purposes.
Currently, memory used by kvm mmu is not accounted in any of those
kernel memory stats. This patch series accounts the memory pages
used by KVM for page tables in those stats in a new
NR_SECONDARY_PAGETABLE stat.
---
Changes in V4:
- Changed accounting hooks in arm64 to only account s2 page tables and
refactored them to a much cleaner form, based on recommendations from
Oliver Upton and Marc Zyngier.
- Dropped patches for mips and riscv. I am not interested in those archs
anyway and don't have the resources to test them. I posted them for
completeness but it doesn't seem like anyone was interested.
Changes in V3:
- Added NR_SECONDARY_PAGETABLE instead of piggybacking on NR_PAGETABLE
stats.
Changes in V2:
- Added accounting stats for other archs than x86.
- Changed locations in the code where x86 KVM page table stats were
accounted based on suggestions from Sean Christopherson.
---
Yosry Ahmed (4):
mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses.
KVM: mmu: add a helper to account memory used by KVM mmu.
KVM: x86/mmu: count KVM mmu usage in secondary pagetable stats.
KVM: arm64/mmu: count KVM s2 mmu usage in secondary pagetable stats
Documentation/admin-guide/cgroup-v2.rst | 5 ++++
Documentation/filesystems/proc.rst | 4 +++
arch/arm64/kvm/mmu.c | 35 ++++++++++++++++++++++---
arch/x86/kvm/mmu/mmu.c | 16 +++++++++--
arch/x86/kvm/mmu/tdp_mmu.c | 16 +++++++++--
drivers/base/node.c | 2 ++
fs/proc/meminfo.c | 2 ++
include/linux/kvm_host.h | 9 +++++++
include/linux/mmzone.h | 1 +
mm/memcontrol.c | 1 +
mm/page_alloc.c | 6 ++++-
mm/vmstat.c | 1 +
12 files changed, 89 insertions(+), 9 deletions(-)
--
2.36.0.464.gb9c8b46e94-goog
^ permalink raw reply [flat|nested] 27+ messages in thread[parent not found: <20220429201131.3397875-1-yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>]
* [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <20220429201131.3397875-1-yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> @ 2022-04-29 20:11 ` Yosry Ahmed 2022-05-02 10:01 ` Marc Zyngier 2022-04-29 20:11 ` [PATCH v4 2/4] KVM: mmu: add a helper to account memory used by KVM mmu Yosry Ahmed ` (2 subsequent siblings) 3 siblings, 1 reply; 27+ messages in thread From: Yosry Ahmed @ 2022-04-29 20:11 UTC (permalink / raw) To: Tejun Heo, Johannes Weiner, Zefan Li, Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, kvmarm-FPEHb7Xf0XXUo1n7N8X6UoWGPAHP3yOg, kvm-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Yosry Ahmed Add NR_SECONDARY_PAGETABLE stat to count secondary page table uses, e.g. KVM mmu. This provides more insights on the kernel memory used by a workload. This stat will be used by subsequent patches to count KVM mmu memory usage. Signed-off-by: Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> --- Documentation/admin-guide/cgroup-v2.rst | 5 +++++ Documentation/filesystems/proc.rst | 4 ++++ drivers/base/node.c | 2 ++ fs/proc/meminfo.c | 2 ++ include/linux/mmzone.h | 1 + mm/memcontrol.c | 1 + mm/page_alloc.c | 6 +++++- mm/vmstat.c | 1 + 8 files changed, 21 insertions(+), 1 deletion(-) diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst index 69d7a6983f78..828cb6b6f918 100644 --- a/Documentation/admin-guide/cgroup-v2.rst +++ b/Documentation/admin-guide/cgroup-v2.rst @@ -1312,6 +1312,11 @@ PAGE_SIZE multiple when read back. pagetables Amount of memory allocated for page tables. + secondary_pagetables + Amount of memory allocated for secondary page tables, + this currently includes KVM mmu allocations on x86 + and arm64. + percpu (npn) Amount of memory used for storing per-cpu kernel data structures. diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst index 061744c436d9..894d6317f3bd 100644 --- a/Documentation/filesystems/proc.rst +++ b/Documentation/filesystems/proc.rst @@ -973,6 +973,7 @@ You may not have all of these fields. SReclaimable: 159856 kB SUnreclaim: 124508 kB PageTables: 24448 kB + SecPageTables: 0 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB @@ -1067,6 +1068,9 @@ SUnreclaim PageTables amount of memory dedicated to the lowest level of page tables. +SecPageTables + amount of memory dedicated to secondary page tables, this + currently includes KVM mmu allocations on x86 and arm64. NFS_Unstable Always zero. Previous counted pages which had been written to the server, but has not been committed to stable storage. diff --git a/drivers/base/node.c b/drivers/base/node.c index ec8bb24a5a22..9fe716832546 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -433,6 +433,7 @@ static ssize_t node_read_meminfo(struct device *dev, "Node %d ShadowCallStack:%8lu kB\n" #endif "Node %d PageTables: %8lu kB\n" + "Node %d SecPageTables: %8lu kB\n" "Node %d NFS_Unstable: %8lu kB\n" "Node %d Bounce: %8lu kB\n" "Node %d WritebackTmp: %8lu kB\n" @@ -459,6 +460,7 @@ static ssize_t node_read_meminfo(struct device *dev, nid, node_page_state(pgdat, NR_KERNEL_SCS_KB), #endif nid, K(node_page_state(pgdat, NR_PAGETABLE)), + nid, K(node_page_state(pgdat, NR_SECONDARY_PAGETABLE)), nid, 0UL, nid, K(sum_zone_node_page_state(nid, NR_BOUNCE)), nid, K(node_page_state(pgdat, NR_WRITEBACK_TEMP)), diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index 6fa761c9cc78..fad29024eb2e 100644 --- a/fs/proc/meminfo.c +++ b/fs/proc/meminfo.c @@ -108,6 +108,8 @@ static int meminfo_proc_show(struct seq_file *m, void *v) #endif show_val_kb(m, "PageTables: ", global_node_page_state(NR_PAGETABLE)); + show_val_kb(m, "SecPageTables: ", + global_node_page_state(NR_SECONDARY_PAGETABLE)); show_val_kb(m, "NFS_Unstable: ", 0); show_val_kb(m, "Bounce: ", diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 962b14d403e8..35f57f2578c0 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -219,6 +219,7 @@ enum node_stat_item { NR_KERNEL_SCS_KB, /* measured in KiB */ #endif NR_PAGETABLE, /* used for pagetables */ + NR_SECONDARY_PAGETABLE, /* secondary pagetables, e.g. kvm shadow pagetables */ #ifdef CONFIG_SWAP NR_SWAPCACHE, #endif diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 725f76723220..89fbd1793960 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1388,6 +1388,7 @@ static const struct memory_stat memory_stats[] = { { "kernel", MEMCG_KMEM }, { "kernel_stack", NR_KERNEL_STACK_KB }, { "pagetables", NR_PAGETABLE }, + { "secondary_pagetables", NR_SECONDARY_PAGETABLE }, { "percpu", MEMCG_PERCPU_B }, { "sock", MEMCG_SOCK }, { "vmalloc", MEMCG_VMALLOC }, diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 2db95780e003..96d00ae9d5c1 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5932,7 +5932,8 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask) " active_file:%lu inactive_file:%lu isolated_file:%lu\n" " unevictable:%lu dirty:%lu writeback:%lu\n" " slab_reclaimable:%lu slab_unreclaimable:%lu\n" - " mapped:%lu shmem:%lu pagetables:%lu bounce:%lu\n" + " mapped:%lu shmem:%lu pagetables:%lu\n" + " secondary_pagetables:%lu bounce:%lu\n" " kernel_misc_reclaimable:%lu\n" " free:%lu free_pcp:%lu free_cma:%lu\n", global_node_page_state(NR_ACTIVE_ANON), @@ -5949,6 +5950,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask) global_node_page_state(NR_FILE_MAPPED), global_node_page_state(NR_SHMEM), global_node_page_state(NR_PAGETABLE), + global_node_page_state(NR_SECONDARY_PAGETABLE), global_zone_page_state(NR_BOUNCE), global_node_page_state(NR_KERNEL_MISC_RECLAIMABLE), global_zone_page_state(NR_FREE_PAGES), @@ -5982,6 +5984,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask) " shadow_call_stack:%lukB" #endif " pagetables:%lukB" + " secondary_pagetables:%lukB" " all_unreclaimable? %s" "\n", pgdat->node_id, @@ -6007,6 +6010,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask) node_page_state(pgdat, NR_KERNEL_SCS_KB), #endif K(node_page_state(pgdat, NR_PAGETABLE)), + K(node_page_state(pgdat, NR_SECONDARY_PAGETABLE)), pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES ? "yes" : "no"); } diff --git a/mm/vmstat.c b/mm/vmstat.c index b75b1a64b54c..50bbec73809b 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1240,6 +1240,7 @@ const char * const vmstat_text[] = { "nr_shadow_call_stack", #endif "nr_page_table_pages", + "nr_secondary_page_table_pages", #ifdef CONFIG_SWAP "nr_swapcached", #endif -- 2.36.0.464.gb9c8b46e94-goog ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. 2022-04-29 20:11 ` [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses Yosry Ahmed @ 2022-05-02 10:01 ` Marc Zyngier [not found] ` <87ilqoi77b.wl-maz-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> 0 siblings, 1 reply; 27+ messages in thread From: Marc Zyngier @ 2022-05-02 10:01 UTC (permalink / raw) To: Yosry Ahmed Cc: Wanpeng Li, kvm, Roman Gushchin, Michal Hocko, linux-mm, Zefan Li, kvmarm, Joerg Roedel, Shakeel Butt, cgroups, Andrew Morton, linux-arm-kernel, Jim Mattson, linux-kernel, Johannes Weiner, Tejun Heo, Paolo Bonzini, Vitaly Kuznetsov On Fri, 29 Apr 2022 21:11:28 +0100, Yosry Ahmed <yosryahmed@google.com> wrote: > > Add NR_SECONDARY_PAGETABLE stat to count secondary page table uses, e.g. > KVM mmu. This provides more insights on the kernel memory used > by a workload. > > This stat will be used by subsequent patches to count KVM mmu > memory usage. > > Signed-off-by: Yosry Ahmed <yosryahmed@google.com> > --- > Documentation/admin-guide/cgroup-v2.rst | 5 +++++ > Documentation/filesystems/proc.rst | 4 ++++ > drivers/base/node.c | 2 ++ > fs/proc/meminfo.c | 2 ++ > include/linux/mmzone.h | 1 + > mm/memcontrol.c | 1 + > mm/page_alloc.c | 6 +++++- > mm/vmstat.c | 1 + > 8 files changed, 21 insertions(+), 1 deletion(-) > > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst > index 69d7a6983f78..828cb6b6f918 100644 > --- a/Documentation/admin-guide/cgroup-v2.rst > +++ b/Documentation/admin-guide/cgroup-v2.rst > @@ -1312,6 +1312,11 @@ PAGE_SIZE multiple when read back. > pagetables > Amount of memory allocated for page tables. > > + secondary_pagetables > + Amount of memory allocated for secondary page tables, > + this currently includes KVM mmu allocations on x86 > + and arm64. Can you please explain what the rationale is for this? We already account for the (arm64) S2 PTs as a userspace allocation (see 115bae923ac8bb29ee635). You are saying that this is related to a 'workload', but given that the accounting is global, I fail to see how you can attribute these allocations on a particular VM. What do you plan to do for IOMMU page tables? After all, they serve the exact same purpose, and I'd expect these to be handled the same way (i.e. why is this KVM specific?). Thanks, M. -- Without deviation from the norm, progress is not possible. ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <87ilqoi77b.wl-maz-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <87ilqoi77b.wl-maz-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> @ 2022-05-02 18:46 ` Yosry Ahmed [not found] ` <CAJD7tkY7JF25XXUFq2mGroetMkfo-2zGOaQC94pjZE3D42+oaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 27+ messages in thread From: Yosry Ahmed @ 2022-05-02 18:46 UTC (permalink / raw) To: Marc Zyngier Cc: Tejun Heo, Johannes Weiner, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton, cgroups-u79uwXL29TY76Z2rM5mHXA, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, kvma On Mon, May 2, 2022 at 3:01 AM Marc Zyngier <maz-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > > On Fri, 29 Apr 2022 21:11:28 +0100, > Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > > > > Add NR_SECONDARY_PAGETABLE stat to count secondary page table uses, e.g. > > KVM mmu. This provides more insights on the kernel memory used > > by a workload. > > > > This stat will be used by subsequent patches to count KVM mmu > > memory usage. > > > > Signed-off-by: Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > > --- > > Documentation/admin-guide/cgroup-v2.rst | 5 +++++ > > Documentation/filesystems/proc.rst | 4 ++++ > > drivers/base/node.c | 2 ++ > > fs/proc/meminfo.c | 2 ++ > > include/linux/mmzone.h | 1 + > > mm/memcontrol.c | 1 + > > mm/page_alloc.c | 6 +++++- > > mm/vmstat.c | 1 + > > 8 files changed, 21 insertions(+), 1 deletion(-) > > > > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst > > index 69d7a6983f78..828cb6b6f918 100644 > > --- a/Documentation/admin-guide/cgroup-v2.rst > > +++ b/Documentation/admin-guide/cgroup-v2.rst > > @@ -1312,6 +1312,11 @@ PAGE_SIZE multiple when read back. > > pagetables > > Amount of memory allocated for page tables. > > > > + secondary_pagetables > > + Amount of memory allocated for secondary page tables, > > + this currently includes KVM mmu allocations on x86 > > + and arm64. > > Can you please explain what the rationale is for this? We already > account for the (arm64) S2 PTs as a userspace allocation (see This can be considered as continuation for that work. The mentioned commit accounts S2 PTs to the VM process cgroup kernel memory. We have stats for the total kernel memory, and some fine-grained categories of that, like (pagetables, stack, slab, etc.). This patch just adds another category to give further insights into what exactly is using kernel memory. > 115bae923ac8bb29ee635). You are saying that this is related to a > 'workload', but given that the accounting is global, I fail to see how > you can attribute these allocations on a particular VM. The main motivation is having the memcg stats, which give attribution to workloads. If you think it's more appropriate, we can add it as a memcg-only stat, like MEMCG_VMALLOC (see 4e5aa1f4c2b4 ("memcg: add per-memcg vmalloc stat")). The only reason I made this as a global stat too is to be consistent with NR_PAGETABLE. > > What do you plan to do for IOMMU page tables? After all, they serve > the exact same purpose, and I'd expect these to be handled the same > way (i.e. why is this KVM specific?). The reason this was named NR_SECONDARY_PAGTABLE instead of NR_KVM_PAGETABLE is exactly that. To leave room to incrementally account other types of secondary page tables to this stat. It is just that we are currently interested in the KVM MMU usage. > > Thanks, > > M. > > -- > Without deviation from the norm, progress is not possible. ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <CAJD7tkY7JF25XXUFq2mGroetMkfo-2zGOaQC94pjZE3D42+oaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <CAJD7tkY7JF25XXUFq2mGroetMkfo-2zGOaQC94pjZE3D42+oaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2022-05-09 16:38 ` Yosry Ahmed [not found] ` <CAJD7tkbfT-FRs3LE2GPddqrQSWw_eC1R6k3z04x=z9Zvt5yLpg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2022-05-12 23:07 ` Johannes Weiner 1 sibling, 1 reply; 27+ messages in thread From: Yosry Ahmed @ 2022-05-09 16:38 UTC (permalink / raw) To: Marc Zyngier Cc: Tejun Heo, Johannes Weiner, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton, cgroups-u79uwXL29TY76Z2rM5mHXA, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, kvma On Mon, May 2, 2022 at 11:46 AM Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > > On Mon, May 2, 2022 at 3:01 AM Marc Zyngier <maz-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > > > > On Fri, 29 Apr 2022 21:11:28 +0100, > > Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > > > > > > Add NR_SECONDARY_PAGETABLE stat to count secondary page table uses, e.g. > > > KVM mmu. This provides more insights on the kernel memory used > > > by a workload. > > > > > > This stat will be used by subsequent patches to count KVM mmu > > > memory usage. > > > > > > Signed-off-by: Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > > > --- > > > Documentation/admin-guide/cgroup-v2.rst | 5 +++++ > > > Documentation/filesystems/proc.rst | 4 ++++ > > > drivers/base/node.c | 2 ++ > > > fs/proc/meminfo.c | 2 ++ > > > include/linux/mmzone.h | 1 + > > > mm/memcontrol.c | 1 + > > > mm/page_alloc.c | 6 +++++- > > > mm/vmstat.c | 1 + > > > 8 files changed, 21 insertions(+), 1 deletion(-) > > > > > > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst > > > index 69d7a6983f78..828cb6b6f918 100644 > > > --- a/Documentation/admin-guide/cgroup-v2.rst > > > +++ b/Documentation/admin-guide/cgroup-v2.rst > > > @@ -1312,6 +1312,11 @@ PAGE_SIZE multiple when read back. > > > pagetables > > > Amount of memory allocated for page tables. > > > > > > + secondary_pagetables > > > + Amount of memory allocated for secondary page tables, > > > + this currently includes KVM mmu allocations on x86 > > > + and arm64. > > > > Can you please explain what the rationale is for this? We already > > account for the (arm64) S2 PTs as a userspace allocation (see > > This can be considered as continuation for that work. The mentioned > commit accounts S2 PTs to the VM process cgroup kernel memory. We have > stats for the total kernel memory, and some fine-grained categories of > that, like (pagetables, stack, slab, etc.). > > This patch just adds another category to give further insights into > what exactly is using kernel memory. > > > 115bae923ac8bb29ee635). You are saying that this is related to a > > 'workload', but given that the accounting is global, I fail to see how > > you can attribute these allocations on a particular VM. > > The main motivation is having the memcg stats, which give attribution > to workloads. If you think it's more appropriate, we can add it as a > memcg-only stat, like MEMCG_VMALLOC (see 4e5aa1f4c2b4 ("memcg: add > per-memcg vmalloc stat")). The only reason I made this as a global > stat too is to be consistent with NR_PAGETABLE. > > > > > What do you plan to do for IOMMU page tables? After all, they serve > > the exact same purpose, and I'd expect these to be handled the same > > way (i.e. why is this KVM specific?). > > The reason this was named NR_SECONDARY_PAGTABLE instead of > NR_KVM_PAGETABLE is exactly that. To leave room to incrementally > account other types of secondary page tables to this stat. It is just > that we are currently interested in the KVM MMU usage. > Any thoughts on this? Do you think MEMCG_SECONDARY_PAGETABLE would be more appropriate here? ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <CAJD7tkbfT-FRs3LE2GPddqrQSWw_eC1R6k3z04x=z9Zvt5yLpg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <CAJD7tkbfT-FRs3LE2GPddqrQSWw_eC1R6k3z04x=z9Zvt5yLpg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2022-05-12 20:36 ` Shakeel Butt 0 siblings, 0 replies; 27+ messages in thread From: Shakeel Butt @ 2022-05-12 20:36 UTC (permalink / raw) To: Yosry Ahmed Cc: Marc Zyngier, Tejun Heo, Johannes Weiner, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Oliver Upton, Cgroups, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWPdpHbCvnp+Ag On Mon, May 9, 2022 at 9:38 AM Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > [...] > > > > > > What do you plan to do for IOMMU page tables? After all, they serve > > > the exact same purpose, and I'd expect these to be handled the same > > > way (i.e. why is this KVM specific?). > > > > The reason this was named NR_SECONDARY_PAGTABLE instead of > > NR_KVM_PAGETABLE is exactly that. To leave room to incrementally > > account other types of secondary page tables to this stat. It is just > > that we are currently interested in the KVM MMU usage. > > > > > Any thoughts on this? Do you think MEMCG_SECONDARY_PAGETABLE would be > more appropriate here? I think NR_SECONDARY_PAGTABLE is good. Later it can include pagetables from other subsystems. The only nit (which you can ignore) I have is the very long memcg stat and vmstat names. Other than that the patch looks good to me. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <CAJD7tkY7JF25XXUFq2mGroetMkfo-2zGOaQC94pjZE3D42+oaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2022-05-09 16:38 ` Yosry Ahmed @ 2022-05-12 23:07 ` Johannes Weiner [not found] ` <Yn2TGJ4vZ/fst+CY-druUgvl0LCNAfugRpC6u6w@public.gmane.org> 1 sibling, 1 reply; 27+ messages in thread From: Johannes Weiner @ 2022-05-12 23:07 UTC (permalink / raw) To: Yosry Ahmed Cc: Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton, cgroups-u79uwXL29TY76Z2rM5mHXA, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, kvmarm Hey Yosry, On Mon, May 02, 2022 at 11:46:26AM -0700, Yosry Ahmed wrote: > On Mon, May 2, 2022 at 3:01 AM Marc Zyngier <maz-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > > 115bae923ac8bb29ee635). You are saying that this is related to a > > 'workload', but given that the accounting is global, I fail to see how > > you can attribute these allocations on a particular VM. > > The main motivation is having the memcg stats, which give attribution > to workloads. If you think it's more appropriate, we can add it as a > memcg-only stat, like MEMCG_VMALLOC (see 4e5aa1f4c2b4 ("memcg: add > per-memcg vmalloc stat")). The only reason I made this as a global > stat too is to be consistent with NR_PAGETABLE. Please no memcg-specific stats if a regular vmstat item is possible and useful at the system level as well, like in this case. It's extra memcg code, extra callbacks, and it doesn't have NUMA node awareness. > > What do you plan to do for IOMMU page tables? After all, they serve > > the exact same purpose, and I'd expect these to be handled the same > > way (i.e. why is this KVM specific?). > > The reason this was named NR_SECONDARY_PAGTABLE instead of > NR_KVM_PAGETABLE is exactly that. To leave room to incrementally > account other types of secondary page tables to this stat. It is just > that we are currently interested in the KVM MMU usage. Do you actually care at the supervisor level that this memory is used for guest page tables? It seems to me you primarily care that it is reported *somewhere* (hence the piggybacking off of NR_PAGETABLE at first). And whether it's page tables or iommu tables or whatever else allocated for the purpose of virtualization, it doesn't make much of a difference to the host/cgroup that is tracking it, right? (The proximity to nr_pagetable could also be confusing. A high page table count can be a hint to userspace to enable THP. It seems actionable in a different way than a high number of kvm page tables or iommu page tables.) How about NR_VIRT? It's shorter, seems descriptive enough, less room for confusion, and is more easily extensible in the future. ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <Yn2TGJ4vZ/fst+CY-druUgvl0LCNAfugRpC6u6w@public.gmane.org>]
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <Yn2TGJ4vZ/fst+CY-druUgvl0LCNAfugRpC6u6w@public.gmane.org> @ 2022-05-12 23:29 ` Sean Christopherson [not found] ` <Yn2YYl98Vhh/UL0w-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 27+ messages in thread From: Sean Christopherson @ 2022-05-12 23:29 UTC (permalink / raw) To: Johannes Weiner Cc: Yosry Ahmed, Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton, cgroups-u79uwXL29TY76Z2rM5mHXA, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, kvmarm-FPEHb7Xf0XXUo1n7N8X6Ug On Thu, May 12, 2022, Johannes Weiner wrote: > Hey Yosry, > > On Mon, May 02, 2022 at 11:46:26AM -0700, Yosry Ahmed wrote: > > On Mon, May 2, 2022 at 3:01 AM Marc Zyngier <maz-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > > > 115bae923ac8bb29ee635). You are saying that this is related to a > > > 'workload', but given that the accounting is global, I fail to see how > > > you can attribute these allocations on a particular VM. > > > > The main motivation is having the memcg stats, which give attribution > > to workloads. If you think it's more appropriate, we can add it as a > > memcg-only stat, like MEMCG_VMALLOC (see 4e5aa1f4c2b4 ("memcg: add > > per-memcg vmalloc stat")). The only reason I made this as a global > > stat too is to be consistent with NR_PAGETABLE. > > Please no memcg-specific stats if a regular vmstat item is possible > and useful at the system level as well, like in this case. It's extra > memcg code, extra callbacks, and it doesn't have NUMA node awareness. > > > > What do you plan to do for IOMMU page tables? After all, they serve > > > the exact same purpose, and I'd expect these to be handled the same > > > way (i.e. why is this KVM specific?). > > > > The reason this was named NR_SECONDARY_PAGTABLE instead of > > NR_KVM_PAGETABLE is exactly that. To leave room to incrementally > > account other types of secondary page tables to this stat. It is just > > that we are currently interested in the KVM MMU usage. > > Do you actually care at the supervisor level that this memory is used > for guest page tables? Hmm, yes? KVM does have a decent number of large-ish allocations that aren't for page tables, but except for page tables, the number/size of those allocations scales linearly with either the number of vCPUs or the amount of memory assigned to the VM (with no room for improvement barring KVM changes). Off the top of my head, KVM's secondary page tables are the only allocations that don't scale linearly, especially when nested virtualization is in use. > It seems to me you primarily care that it is reported *somewhere* > (hence the piggybacking off of NR_PAGETABLE at first). And whether > it's page tables or iommu tables or whatever else allocated for the > purpose of virtualization, it doesn't make much of a difference to the > host/cgroup that is tracking it, right? > > (The proximity to nr_pagetable could also be confusing. A high page > table count can be a hint to userspace to enable THP. It seems > actionable in a different way than a high number of kvm page tables or > iommu page tables.) I don't know about iommu page tables, but on the KVM side a high count can also be a good signal that enabling THP would be beneficial. It's definitely actionable in a different way though too. > How about NR_VIRT? It's shorter, seems descriptive enough, less room > for confusion, and is more easily extensible in the future. I don't like NR_VIRT because VFIO/iommu can be used for non-virtualization things, and we'd be lying by omission unless KVM (and other users) updates all of its large-ish allocations to account them correctly. ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <Yn2YYl98Vhh/UL0w-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <Yn2YYl98Vhh/UL0w-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> @ 2022-05-13 15:50 ` Johannes Weiner 2022-05-13 16:12 ` Sean Christopherson 0 siblings, 1 reply; 27+ messages in thread From: Johannes Weiner @ 2022-05-13 15:50 UTC (permalink / raw) To: Sean Christopherson Cc: Yosry Ahmed, Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton, cgroups-u79uwXL29TY76Z2rM5mHXA, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, kvmarm-FPEHb7Xf0XXUo1n7N8X6Ug On Thu, May 12, 2022 at 11:29:38PM +0000, Sean Christopherson wrote: > On Thu, May 12, 2022, Johannes Weiner wrote: > > On Mon, May 02, 2022 at 11:46:26AM -0700, Yosry Ahmed wrote: > > > On Mon, May 2, 2022 at 3:01 AM Marc Zyngier <maz-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > > > > What do you plan to do for IOMMU page tables? After all, they serve > > > > the exact same purpose, and I'd expect these to be handled the same > > > > way (i.e. why is this KVM specific?). > > > > > > The reason this was named NR_SECONDARY_PAGTABLE instead of > > > NR_KVM_PAGETABLE is exactly that. To leave room to incrementally > > > account other types of secondary page tables to this stat. It is just > > > that we are currently interested in the KVM MMU usage. > > > > Do you actually care at the supervisor level that this memory is used > > for guest page tables? > > Hmm, yes? KVM does have a decent number of large-ish allocations that aren't > for page tables, but except for page tables, the number/size of those allocations > scales linearly with either the number of vCPUs or the amount of memory assigned > to the VM (with no room for improvement barring KVM changes). > > Off the top of my head, KVM's secondary page tables are the only allocations that > don't scale linearly, especially when nested virtualization is in use. Thanks, that's useful information. Are these other allocations accounted somewhere? If not, are they potential containment holes that will need fixing eventually? > > It seems to me you primarily care that it is reported *somewhere* > > (hence the piggybacking off of NR_PAGETABLE at first). And whether > > it's page tables or iommu tables or whatever else allocated for the > > purpose of virtualization, it doesn't make much of a difference to the > > host/cgroup that is tracking it, right? > > > > (The proximity to nr_pagetable could also be confusing. A high page > > table count can be a hint to userspace to enable THP. It seems > > actionable in a different way than a high number of kvm page tables or > > iommu page tables.) > > I don't know about iommu page tables, but on the KVM side a high count can also > be a good signal that enabling THP would be beneficial. Well, maybe. It might help, but ultimately it's the process that's in control in all cases: it's unmovable kernel memory allocated to manage virtual address space inside the task. So I'm still a bit at a loss whether these things should all be lumped in together or kept separately. meminfo and memory.stat are permanent ABI, so we should try to establish in advance whether the new itme is really a first-class consumer or part of something bigger. The patch initially piggybacked on NR_PAGETABLE. I found an email of you asking why it couldn't be a separate item, but it didn't provide a reasoning for that decision. Could you share your thoughts on that? Thanks ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. 2022-05-13 15:50 ` Johannes Weiner @ 2022-05-13 16:12 ` Sean Christopherson [not found] ` <Yn6DeEGLyR4Q0cDp-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 27+ messages in thread From: Sean Christopherson @ 2022-05-13 16:12 UTC (permalink / raw) To: Johannes Weiner Cc: Wanpeng Li, kvm, Roman Gushchin, Michal Hocko, Yosry Ahmed, Linux-MM, Zefan Li, kvmarm, Marc Zyngier, Joerg Roedel, Shakeel Butt, cgroups, Andrew Morton, linux-arm-kernel, Jim Mattson, Linux Kernel Mailing List, Tejun Heo, Paolo Bonzini, Vitaly Kuznetsov On Fri, May 13, 2022, Johannes Weiner wrote: > On Thu, May 12, 2022 at 11:29:38PM +0000, Sean Christopherson wrote: > > On Thu, May 12, 2022, Johannes Weiner wrote: > > > On Mon, May 02, 2022 at 11:46:26AM -0700, Yosry Ahmed wrote: > > > > On Mon, May 2, 2022 at 3:01 AM Marc Zyngier <maz@kernel.org> wrote: > > > > > What do you plan to do for IOMMU page tables? After all, they serve > > > > > the exact same purpose, and I'd expect these to be handled the same > > > > > way (i.e. why is this KVM specific?). > > > > > > > > The reason this was named NR_SECONDARY_PAGTABLE instead of > > > > NR_KVM_PAGETABLE is exactly that. To leave room to incrementally > > > > account other types of secondary page tables to this stat. It is just > > > > that we are currently interested in the KVM MMU usage. > > > > > > Do you actually care at the supervisor level that this memory is used > > > for guest page tables? > > > > Hmm, yes? KVM does have a decent number of large-ish allocations that aren't > > for page tables, but except for page tables, the number/size of those allocations > > scales linearly with either the number of vCPUs or the amount of memory assigned > > to the VM (with no room for improvement barring KVM changes). > > > > Off the top of my head, KVM's secondary page tables are the only allocations that > > don't scale linearly, especially when nested virtualization is in use. > > Thanks, that's useful information. > > Are these other allocations accounted somewhere? If not, are they > potential containment holes that will need fixing eventually? All allocations that are tied to specific VM/vCPU are tagged GFP_KERNEL_ACCOUNT, so we should be good on that front. > > > It seems to me you primarily care that it is reported *somewhere* > > > (hence the piggybacking off of NR_PAGETABLE at first). And whether > > > it's page tables or iommu tables or whatever else allocated for the > > > purpose of virtualization, it doesn't make much of a difference to the > > > host/cgroup that is tracking it, right? > > > > > > (The proximity to nr_pagetable could also be confusing. A high page > > > table count can be a hint to userspace to enable THP. It seems > > > actionable in a different way than a high number of kvm page tables or > > > iommu page tables.) > > > > I don't know about iommu page tables, but on the KVM side a high count can also > > be a good signal that enabling THP would be beneficial. > > Well, maybe. > > It might help, but ultimately it's the process that's in control in > all cases: it's unmovable kernel memory allocated to manage virtual > address space inside the task. > > So I'm still a bit at a loss whether these things should all be lumped > in together or kept separately. meminfo and memory.stat are permanent > ABI, so we should try to establish in advance whether the new itme is > really a first-class consumer or part of something bigger. > > The patch initially piggybacked on NR_PAGETABLE. I found an email of > you asking why it couldn't be a separate item, but it didn't provide a > reasoning for that decision. Could you share your thoughts on that? It was mostly an honest question, I too am trying to understand what userspace wants to do with this information. I was/am also trying to understand the benefits of doing the tracking through page_state and not a dedicated KVM stat. E.g. KVM already has specific stats for the number of leaf pages mapped into a VM, why not do the same for non-leaf pages? ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <Yn6DeEGLyR4Q0cDp-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <Yn6DeEGLyR4Q0cDp-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> @ 2022-05-13 16:22 ` Yosry Ahmed 2022-05-13 17:13 ` Shakeel Butt 1 sibling, 0 replies; 27+ messages in thread From: Yosry Ahmed @ 2022-05-13 16:22 UTC (permalink / raw) To: Sean Christopherson Cc: Johannes Weiner, Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton, cgroups-u79uwXL29TY76Z2rM5mHXA, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, kvmarm Thanks everyone for participating in this discussion and looking into this. On Fri, May 13, 2022 at 9:12 AM Sean Christopherson <seanjc-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > > On Fri, May 13, 2022, Johannes Weiner wrote: > > On Thu, May 12, 2022 at 11:29:38PM +0000, Sean Christopherson wrote: > > > On Thu, May 12, 2022, Johannes Weiner wrote: > > > > On Mon, May 02, 2022 at 11:46:26AM -0700, Yosry Ahmed wrote: > > > > > On Mon, May 2, 2022 at 3:01 AM Marc Zyngier <maz-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > > > > > > What do you plan to do for IOMMU page tables? After all, they serve > > > > > > the exact same purpose, and I'd expect these to be handled the same > > > > > > way (i.e. why is this KVM specific?). > > > > > > > > > > The reason this was named NR_SECONDARY_PAGTABLE instead of > > > > > NR_KVM_PAGETABLE is exactly that. To leave room to incrementally > > > > > account other types of secondary page tables to this stat. It is just > > > > > that we are currently interested in the KVM MMU usage. > > > > > > > > Do you actually care at the supervisor level that this memory is used > > > > for guest page tables? > > > > > > Hmm, yes? KVM does have a decent number of large-ish allocations that aren't > > > for page tables, but except for page tables, the number/size of those allocations > > > scales linearly with either the number of vCPUs or the amount of memory assigned > > > to the VM (with no room for improvement barring KVM changes). > > > > > > Off the top of my head, KVM's secondary page tables are the only allocations that > > > don't scale linearly, especially when nested virtualization is in use. > > > > Thanks, that's useful information. > > > > Are these other allocations accounted somewhere? If not, are they > > potential containment holes that will need fixing eventually? > > All allocations that are tied to specific VM/vCPU are tagged GFP_KERNEL_ACCOUNT, > so we should be good on that front. > > > > > It seems to me you primarily care that it is reported *somewhere* > > > > (hence the piggybacking off of NR_PAGETABLE at first). And whether > > > > it's page tables or iommu tables or whatever else allocated for the > > > > purpose of virtualization, it doesn't make much of a difference to the > > > > host/cgroup that is tracking it, right? > > > > > > > > (The proximity to nr_pagetable could also be confusing. A high page > > > > table count can be a hint to userspace to enable THP. It seems > > > > actionable in a different way than a high number of kvm page tables or > > > > iommu page tables.) > > > > > > I don't know about iommu page tables, but on the KVM side a high count can also > > > be a good signal that enabling THP would be beneficial. > > > > Well, maybe. > > > > It might help, but ultimately it's the process that's in control in > > all cases: it's unmovable kernel memory allocated to manage virtual > > address space inside the task. > > > > So I'm still a bit at a loss whether these things should all be lumped > > in together or kept separately. meminfo and memory.stat are permanent > > ABI, so we should try to establish in advance whether the new itme is > > really a first-class consumer or part of something bigger. > > > > The patch initially piggybacked on NR_PAGETABLE. I found an email of > > you asking why it couldn't be a separate item, but it didn't provide a > > reasoning for that decision. Could you share your thoughts on that? > > It was mostly an honest question, I too am trying to understand what userspace > wants to do with this information. I was/am also trying to understand the benefits > of doing the tracking through page_state and not a dedicated KVM stat. E.g. KVM > already has specific stats for the number of leaf pages mapped into a VM, why not > do the same for non-leaf pages? Let me cast some light on this. The reason this started being piggybacked on NR_PAGETABLE is that we had a remnant of an old internal implementation of NR_PAGETABLE before it was introduced upstream, that accounted KVM secondary page tables as normal page tables. This made me think this behavior was preferable. Personally, I wanted to make it a separate thing since the beginning. When I found opinions here that also suggested a separate stat I went ahead for that. As for where to put this information, it does not have to be NR_SECONDARY_PAGETABLE. Ultimately, people working on KVM are the ones that will interpret and act upon this data, so if you have somewhere else in mind please let me know, Sean. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <Yn6DeEGLyR4Q0cDp-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> 2022-05-13 16:22 ` Yosry Ahmed @ 2022-05-13 17:13 ` Shakeel Butt 2022-05-20 1:56 ` Yosry Ahmed 1 sibling, 1 reply; 27+ messages in thread From: Shakeel Butt @ 2022-05-13 17:13 UTC (permalink / raw) To: Sean Christopherson Cc: Johannes Weiner, Yosry Ahmed, Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Oliver Upton, Cgroups, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XQ On Fri, May 13, 2022 at 9:12 AM Sean Christopherson <seanjc-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > [...] > > It was mostly an honest question, I too am trying to understand what userspace > wants to do with this information. I was/am also trying to understand the benefits > of doing the tracking through page_state and not a dedicated KVM stat. E.g. KVM > already has specific stats for the number of leaf pages mapped into a VM, why not > do the same for non-leaf pages? Let me answer why a more general stat is useful and the potential userspace reaction: For a memory type which is significant enough, it is useful to expose it in the general interfaces, so that the general data/stat collection infra can collect them instead of having workload dependent stat collectors. In addition, not necessarily that stat has to have a userspace reaction in an online fashion. We do collect stats for offline analysis which greatly influence the priority order of optimization workitems. Next the question is do we really need a separate stat item (secondary_pagetable instead of just plain pagetable) exposed in the stable API? To me secondary_pagetable is general (not kvm specific) enough and can be significant, so having a separate dedicated stat should be ok. Though I am ok with lump it with pagetable stat for now but we do want it to be accounted somewhere. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. 2022-05-13 17:13 ` Shakeel Butt @ 2022-05-20 1:56 ` Yosry Ahmed [not found] ` <CAJD7tka-5+XRkthNV4qCg8woPCpjcwynQoRBame-3GP1L8y+WQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 27+ messages in thread From: Yosry Ahmed @ 2022-05-20 1:56 UTC (permalink / raw) To: Shakeel Butt Cc: Sean Christopherson, Johannes Weiner, Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Oliver Upton, Cgroups, Linux Kernel Mailing List, linux-arm-kernel On Fri, May 13, 2022 at 10:14 AM Shakeel Butt <shakeelb@google.com> wrote: > > On Fri, May 13, 2022 at 9:12 AM Sean Christopherson <seanjc@google.com> wrote: > > > [...] > > > > It was mostly an honest question, I too am trying to understand what userspace > > wants to do with this information. I was/am also trying to understand the benefits > > of doing the tracking through page_state and not a dedicated KVM stat. E.g. KVM > > already has specific stats for the number of leaf pages mapped into a VM, why not > > do the same for non-leaf pages? > > Let me answer why a more general stat is useful and the potential > userspace reaction: > > For a memory type which is significant enough, it is useful to expose > it in the general interfaces, so that the general data/stat collection > infra can collect them instead of having workload dependent stat > collectors. In addition, not necessarily that stat has to have a > userspace reaction in an online fashion. We do collect stats for > offline analysis which greatly influence the priority order of > optimization workitems. > > Next the question is do we really need a separate stat item > (secondary_pagetable instead of just plain pagetable) exposed in the > stable API? To me secondary_pagetable is general (not kvm specific) > enough and can be significant, so having a separate dedicated stat > should be ok. Though I am ok with lump it with pagetable stat for now > but we do want it to be accounted somewhere. Any thoughts on this? Johannes? ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <CAJD7tka-5+XRkthNV4qCg8woPCpjcwynQoRBame-3GP1L8y+WQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <CAJD7tka-5+XRkthNV4qCg8woPCpjcwynQoRBame-3GP1L8y+WQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2022-05-20 14:39 ` Johannes Weiner [not found] ` <YoeoLJNQTam5fJSu-druUgvl0LCNAfugRpC6u6w@public.gmane.org> 0 siblings, 1 reply; 27+ messages in thread From: Johannes Weiner @ 2022-05-20 14:39 UTC (permalink / raw) To: Yosry Ahmed Cc: Shakeel Butt, Sean Christopherson, Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Oliver Upton, Cgroups, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM9gwCICpdfJQ On Thu, May 19, 2022 at 06:56:54PM -0700, Yosry Ahmed wrote: > On Fri, May 13, 2022 at 10:14 AM Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > > > > On Fri, May 13, 2022 at 9:12 AM Sean Christopherson <seanjc-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > > > > > [...] > > > > > > It was mostly an honest question, I too am trying to understand what userspace > > > wants to do with this information. I was/am also trying to understand the benefits > > > of doing the tracking through page_state and not a dedicated KVM stat. E.g. KVM > > > already has specific stats for the number of leaf pages mapped into a VM, why not > > > do the same for non-leaf pages? > > > > Let me answer why a more general stat is useful and the potential > > userspace reaction: > > > > For a memory type which is significant enough, it is useful to expose > > it in the general interfaces, so that the general data/stat collection > > infra can collect them instead of having workload dependent stat > > collectors. In addition, not necessarily that stat has to have a > > userspace reaction in an online fashion. We do collect stats for > > offline analysis which greatly influence the priority order of > > optimization workitems. > > > > Next the question is do we really need a separate stat item > > (secondary_pagetable instead of just plain pagetable) exposed in the > > stable API? To me secondary_pagetable is general (not kvm specific) > > enough and can be significant, so having a separate dedicated stat > > should be ok. Though I am ok with lump it with pagetable stat for now > > but we do want it to be accounted somewhere. > > Any thoughts on this? Johannes? I agree that this memory should show up in vmstat/memory.stat in some form or another. The arguments on whether this should be part of NR_PAGETABLE or a separate entry seem a bit vague to me. I was hoping somebody more familiar with KVM could provide a better picture of memory consumption in that area. Sean had mentioned that these allocations already get tracked through GFP_KERNEL_ACCOUNT. That's good, but if they are significant enough to track, they should be represented in memory.stat in some form. Sean also pointed out though that those allocations tend to scale rather differently than the page tables, so it probably makes sense to keep those two things separate at least. Any thoughts on putting shadow page tables and iommu page tables into the existing NR_PAGETABLE item? If not, what are the cons? And creating (maybe later) a separate NR_VIRT for the other GPF_KERNEL_ACCOUNT allocations in kvm? ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <YoeoLJNQTam5fJSu-druUgvl0LCNAfugRpC6u6w@public.gmane.org>]
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <YoeoLJNQTam5fJSu-druUgvl0LCNAfugRpC6u6w@public.gmane.org> @ 2022-05-24 22:31 ` Yosry Ahmed [not found] ` <CAJD7tkYjcmwBeUx-=MTQeUf78uqFDvfpy7OuKy4OvoS7HiVO1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 27+ messages in thread From: Yosry Ahmed @ 2022-05-24 22:31 UTC (permalink / raw) To: Johannes Weiner Cc: Shakeel Butt, Sean Christopherson, Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Oliver Upton, Cgroups, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM9gwCICpdfJQ On Fri, May 20, 2022 at 7:39 AM Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> wrote: > > On Thu, May 19, 2022 at 06:56:54PM -0700, Yosry Ahmed wrote: > > On Fri, May 13, 2022 at 10:14 AM Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > > > > > > On Fri, May 13, 2022 at 9:12 AM Sean Christopherson <seanjc-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > > > > > > > [...] > > > > > > > > It was mostly an honest question, I too am trying to understand what userspace > > > > wants to do with this information. I was/am also trying to understand the benefits > > > > of doing the tracking through page_state and not a dedicated KVM stat. E.g. KVM > > > > already has specific stats for the number of leaf pages mapped into a VM, why not > > > > do the same for non-leaf pages? > > > > > > Let me answer why a more general stat is useful and the potential > > > userspace reaction: > > > > > > For a memory type which is significant enough, it is useful to expose > > > it in the general interfaces, so that the general data/stat collection > > > infra can collect them instead of having workload dependent stat > > > collectors. In addition, not necessarily that stat has to have a > > > userspace reaction in an online fashion. We do collect stats for > > > offline analysis which greatly influence the priority order of > > > optimization workitems. > > > > > > Next the question is do we really need a separate stat item > > > (secondary_pagetable instead of just plain pagetable) exposed in the > > > stable API? To me secondary_pagetable is general (not kvm specific) > > > enough and can be significant, so having a separate dedicated stat > > > should be ok. Though I am ok with lump it with pagetable stat for now > > > but we do want it to be accounted somewhere. > > > > Any thoughts on this? Johannes? > > I agree that this memory should show up in vmstat/memory.stat in some > form or another. > > The arguments on whether this should be part of NR_PAGETABLE or a > separate entry seem a bit vague to me. I was hoping somebody more > familiar with KVM could provide a better picture of memory consumption > in that area. > > Sean had mentioned that these allocations already get tracked through > GFP_KERNEL_ACCOUNT. That's good, but if they are significant enough to > track, they should be represented in memory.stat in some form. Sean > also pointed out though that those allocations tend to scale rather > differently than the page tables, so it probably makes sense to keep > those two things separate at least. > > Any thoughts on putting shadow page tables and iommu page tables into > the existing NR_PAGETABLE item? If not, what are the cons? > > And creating (maybe later) a separate NR_VIRT for the other > GPF_KERNEL_ACCOUNT allocations in kvm? I agree with Sean that a NR_VIRT stat would be inaccurate by omission, unless we account for all KVM allocations under this stat. This might be an unnecessary burden according to what Sean said, as most other allocations scale linearly with the number of vCPUs or the memory assigned to the VM. I don't have enough context to say whether we should piggyback KVM MMU pages to the existing NR_PAGETABLE item, but from a high level it seems like it would be more helpful if they are a separate stat. Anyway, I am willing to go with whatever Sean thinks is best. ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <CAJD7tkYjcmwBeUx-=MTQeUf78uqFDvfpy7OuKy4OvoS7HiVO1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <CAJD7tkYjcmwBeUx-=MTQeUf78uqFDvfpy7OuKy4OvoS7HiVO1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2022-05-25 11:56 ` Johannes Weiner 2022-05-26 0:38 ` Sean Christopherson 0 siblings, 1 reply; 27+ messages in thread From: Johannes Weiner @ 2022-05-25 11:56 UTC (permalink / raw) To: Yosry Ahmed Cc: Shakeel Butt, Sean Christopherson, Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Oliver Upton, Cgroups, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM9gwCICpdfJQ On Tue, May 24, 2022 at 03:31:52PM -0700, Yosry Ahmed wrote: > On Fri, May 20, 2022 at 7:39 AM Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> wrote: > > I agree that this memory should show up in vmstat/memory.stat in some > > form or another. > > > > The arguments on whether this should be part of NR_PAGETABLE or a > > separate entry seem a bit vague to me. I was hoping somebody more > > familiar with KVM could provide a better picture of memory consumption > > in that area. > > > > Sean had mentioned that these allocations already get tracked through > > GFP_KERNEL_ACCOUNT. That's good, but if they are significant enough to > > track, they should be represented in memory.stat in some form. Sean > > also pointed out though that those allocations tend to scale rather > > differently than the page tables, so it probably makes sense to keep > > those two things separate at least. > > > > Any thoughts on putting shadow page tables and iommu page tables into > > the existing NR_PAGETABLE item? If not, what are the cons? > > > > And creating (maybe later) a separate NR_VIRT for the other > > GPF_KERNEL_ACCOUNT allocations in kvm? > > I agree with Sean that a NR_VIRT stat would be inaccurate by omission, > unless we account for all KVM allocations under this stat. This might > be an unnecessary burden according to what Sean said, as most other > allocations scale linearly with the number of vCPUs or the memory > assigned to the VM. I think it's fine to table the addition of NR_VIRT for now. My conclusion from this discussion was just that if we do want to add more KVM-related allocation sites later on, they likely would be something separate and not share an item with the shadow tables. This simplifies the discussion around how to present the shadow tables. That said, stats can be incremental and still useful. memory.current itself lies by ommission. It's more important to catch what's of significance and allow users to narrow down pathological cases. > I don't have enough context to say whether we should piggyback KVM MMU > pages to the existing NR_PAGETABLE item, but from a high level it > seems like it would be more helpful if they are a separate stat. > Anyway, I am willing to go with whatever Sean thinks is best. Somebody should work this out and put it into a changelog. It's permanent ABI. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. 2022-05-25 11:56 ` Johannes Weiner @ 2022-05-26 0:38 ` Sean Christopherson [not found] ` <Yo7MHA2aUaprvgl8-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 27+ messages in thread From: Sean Christopherson @ 2022-05-26 0:38 UTC (permalink / raw) To: Johannes Weiner Cc: Yosry Ahmed, Shakeel Butt, Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Oliver Upton, Cgroups, Linux Kernel Mailing List, linux-arm-kernel On Wed, May 25, 2022, Johannes Weiner wrote: > On Tue, May 24, 2022 at 03:31:52PM -0700, Yosry Ahmed wrote: > > I don't have enough context to say whether we should piggyback KVM MMU > > pages to the existing NR_PAGETABLE item, but from a high level it > > seems like it would be more helpful if they are a separate stat. > > Anyway, I am willing to go with whatever Sean thinks is best. > > Somebody should work this out and put it into a changelog. It's > permanent ABI. After a lot of waffling, my vote is to add a dedicated NR_SECONDARY_PAGETABLE. It's somewhat redundant from a KVM perspective, as NR_SECONDARY_PAGETABLE will scale with KVM's per-VM pages_{4k,2m,1g} stats unless the guest is doing something bizarre, e.g. accessing only 4kb chunks of 2mb pages so that KVM is forced to allocate a large number of page tables even though the guest isn't accessing that much memory. But, someone would need to either understand how KVM works to make that connection, or know (or be told) to go look at KVM's stats if they're running VMs to better decipher the stats. And even in the little bit of time I played with this, I found having nr_page_table_pages side-by-side with nr_secondary_page_table_pages to be very informative. E.g. when backing a VM with THP versus HugeTLB, nr_secondary_page_table_pages is roughly the same, but nr_page_table_pages is an order of a magnitude higher with THP. I'm guessing the THP behavior is due to something triggering DoubleMap, but now I want to find out why that's happening. So while I'm pretty sure a clever user could glean the same info by cross-referencing NR_PAGETABLE stats with KVM stats, I think having NR_SECONDARY_PAGETABLE will at the very least prove to be helpful for understanding tradeoffs between VM backing types, and likely even steer folks towards potential optimizations. Baseline: # grep page_table /proc/vmstat nr_page_table_pages 2830 nr_secondary_page_table_pages 0 THP: # grep page_table /proc/vmstat nr_page_table_pages 7584 nr_secondary_page_table_pages 140 HugeTLB: # grep page_table /proc/vmstat nr_page_table_pages 3153 nr_secondary_page_table_pages 153 ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <Yo7MHA2aUaprvgl8-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <Yo7MHA2aUaprvgl8-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> @ 2022-05-27 18:33 ` Yosry Ahmed [not found] ` <CAJD7tkYoz=rYvBV3tcp4aLgiyEtr-sBwbncFduZsOq+c8wk5sA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 27+ messages in thread From: Yosry Ahmed @ 2022-05-27 18:33 UTC (permalink / raw) To: Sean Christopherson Cc: Johannes Weiner, Shakeel Butt, Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Oliver Upton, Cgroups, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XWGXanvQGlWp On Wed, May 25, 2022 at 5:39 PM Sean Christopherson <seanjc-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > > On Wed, May 25, 2022, Johannes Weiner wrote: > > On Tue, May 24, 2022 at 03:31:52PM -0700, Yosry Ahmed wrote: > > > I don't have enough context to say whether we should piggyback KVM MMU > > > pages to the existing NR_PAGETABLE item, but from a high level it > > > seems like it would be more helpful if they are a separate stat. > > > Anyway, I am willing to go with whatever Sean thinks is best. > > > > Somebody should work this out and put it into a changelog. It's > > permanent ABI. > > After a lot of waffling, my vote is to add a dedicated NR_SECONDARY_PAGETABLE. > > It's somewhat redundant from a KVM perspective, as NR_SECONDARY_PAGETABLE will > scale with KVM's per-VM pages_{4k,2m,1g} stats unless the guest is doing something > bizarre, e.g. accessing only 4kb chunks of 2mb pages so that KVM is forced to > allocate a large number of page tables even though the guest isn't accessing that > much memory. > > But, someone would need to either understand how KVM works to make that connection, > or know (or be told) to go look at KVM's stats if they're running VMs to better > decipher the stats. > > And even in the little bit of time I played with this, I found having > nr_page_table_pages side-by-side with nr_secondary_page_table_pages to be very > informative. E.g. when backing a VM with THP versus HugeTLB, > nr_secondary_page_table_pages is roughly the same, but nr_page_table_pages is an > order of a magnitude higher with THP. I'm guessing the THP behavior is due to > something triggering DoubleMap, but now I want to find out why that's happening. > > So while I'm pretty sure a clever user could glean the same info by cross-referencing > NR_PAGETABLE stats with KVM stats, I think having NR_SECONDARY_PAGETABLE will at the > very least prove to be helpful for understanding tradeoffs between VM backing types, > and likely even steer folks towards potential optimizations. > > Baseline: > # grep page_table /proc/vmstat > nr_page_table_pages 2830 > nr_secondary_page_table_pages 0 > > THP: > # grep page_table /proc/vmstat > nr_page_table_pages 7584 > nr_secondary_page_table_pages 140 > > HugeTLB: > # grep page_table /proc/vmstat > nr_page_table_pages 3153 > nr_secondary_page_table_pages 153 > Interesting findings! Thanks for taking the time to look into this, Sean! I will refresh this patchset and summarize the discussion in the commit message, and also fix some nits on the KVM side. Does this sound good to everyone? ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <CAJD7tkYoz=rYvBV3tcp4aLgiyEtr-sBwbncFduZsOq+c8wk5sA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses. [not found] ` <CAJD7tkYoz=rYvBV3tcp4aLgiyEtr-sBwbncFduZsOq+c8wk5sA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2022-06-03 16:42 ` Johannes Weiner 0 siblings, 0 replies; 27+ messages in thread From: Johannes Weiner @ 2022-06-03 16:42 UTC (permalink / raw) To: Yosry Ahmed Cc: Sean Christopherson, Shakeel Butt, Marc Zyngier, Tejun Heo, Zefan Li, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Oliver Upton, Cgroups, Linux Kernel Mailing List, linux-arm-kernel-IAPFreCvJWM9gwCICpdfJQ On Fri, May 27, 2022 at 11:33:27AM -0700, Yosry Ahmed wrote: > On Wed, May 25, 2022 at 5:39 PM Sean Christopherson <seanjc-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote: > > > > On Wed, May 25, 2022, Johannes Weiner wrote: > > > On Tue, May 24, 2022 at 03:31:52PM -0700, Yosry Ahmed wrote: > > > > I don't have enough context to say whether we should piggyback KVM MMU > > > > pages to the existing NR_PAGETABLE item, but from a high level it > > > > seems like it would be more helpful if they are a separate stat. > > > > Anyway, I am willing to go with whatever Sean thinks is best. > > > > > > Somebody should work this out and put it into a changelog. It's > > > permanent ABI. > > > > After a lot of waffling, my vote is to add a dedicated NR_SECONDARY_PAGETABLE. > > > > It's somewhat redundant from a KVM perspective, as NR_SECONDARY_PAGETABLE will > > scale with KVM's per-VM pages_{4k,2m,1g} stats unless the guest is doing something > > bizarre, e.g. accessing only 4kb chunks of 2mb pages so that KVM is forced to > > allocate a large number of page tables even though the guest isn't accessing that > > much memory. > > > > But, someone would need to either understand how KVM works to make that connection, > > or know (or be told) to go look at KVM's stats if they're running VMs to better > > decipher the stats. > > > > And even in the little bit of time I played with this, I found having > > nr_page_table_pages side-by-side with nr_secondary_page_table_pages to be very > > informative. E.g. when backing a VM with THP versus HugeTLB, > > nr_secondary_page_table_pages is roughly the same, but nr_page_table_pages is an > > order of a magnitude higher with THP. I'm guessing the THP behavior is due to > > something triggering DoubleMap, but now I want to find out why that's happening. > > > > So while I'm pretty sure a clever user could glean the same info by cross-referencing > > NR_PAGETABLE stats with KVM stats, I think having NR_SECONDARY_PAGETABLE will at the > > very least prove to be helpful for understanding tradeoffs between VM backing types, > > and likely even steer folks towards potential optimizations. > > > > Baseline: > > # grep page_table /proc/vmstat > > nr_page_table_pages 2830 > > nr_secondary_page_table_pages 0 > > > > THP: > > # grep page_table /proc/vmstat > > nr_page_table_pages 7584 > > nr_secondary_page_table_pages 140 > > > > HugeTLB: > > # grep page_table /proc/vmstat > > nr_page_table_pages 3153 > > nr_secondary_page_table_pages 153 > > > > Interesting findings! Thanks for taking the time to look into this, Sean! > I will refresh this patchset and summarize the discussion in the > commit message, and also fix some nits on the KVM side. Does this > sound good to everyone? Yes, thanks for summarizing this. Sounds good to me! ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v4 2/4] KVM: mmu: add a helper to account memory used by KVM mmu. [not found] ` <20220429201131.3397875-1-yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> 2022-04-29 20:11 ` [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses Yosry Ahmed @ 2022-04-29 20:11 ` Yosry Ahmed 2022-04-29 20:11 ` [PATCH v4 3/4] KVM: x86/mmu: count KVM mmu usage in secondary pagetable stats Yosry Ahmed 2022-04-29 20:11 ` [PATCH v4 4/4] KVM: arm64/mmu: count KVM s2 " Yosry Ahmed 3 siblings, 0 replies; 27+ messages in thread From: Yosry Ahmed @ 2022-04-29 20:11 UTC (permalink / raw) To: Tejun Heo, Johannes Weiner, Zefan Li, Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, kvmarm-FPEHb7Xf0XXUo1n7N8X6UoWGPAHP3yOg, kvm-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Yosry Ahmed Add a helper to account pages used by KVM for page tables in secondary pagetable stats. This function will be used by subsequent patches in different archs. Signed-off-by: Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> --- include/linux/kvm_host.h | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 252ee4a61b58..54cc4634053c 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -2221,6 +2221,15 @@ static inline void kvm_handle_signal_exit(struct kvm_vcpu *vcpu) } #endif /* CONFIG_KVM_XFER_TO_GUEST_WORK */ +/* + * If nr > 1, we assume virt is the address of the first page of a block of + * pages that were allocated together (i.e accounted together). + */ +static inline void kvm_account_pgtable_pages(void *virt, int nr) +{ + mod_lruvec_page_state(virt_to_page(virt), NR_SECONDARY_PAGETABLE, nr); +} + /* * This defines how many reserved entries we want to keep before we * kick the vcpu to the userspace to avoid dirty ring full. This -- 2.36.0.464.gb9c8b46e94-goog ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v4 3/4] KVM: x86/mmu: count KVM mmu usage in secondary pagetable stats. [not found] ` <20220429201131.3397875-1-yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> 2022-04-29 20:11 ` [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses Yosry Ahmed 2022-04-29 20:11 ` [PATCH v4 2/4] KVM: mmu: add a helper to account memory used by KVM mmu Yosry Ahmed @ 2022-04-29 20:11 ` Yosry Ahmed [not found] ` <20220429201131.3397875-4-yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> 2022-04-29 20:11 ` [PATCH v4 4/4] KVM: arm64/mmu: count KVM s2 " Yosry Ahmed 3 siblings, 1 reply; 27+ messages in thread From: Yosry Ahmed @ 2022-04-29 20:11 UTC (permalink / raw) To: Tejun Heo, Johannes Weiner, Zefan Li, Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, kvmarm-FPEHb7Xf0XXUo1n7N8X6UoWGPAHP3yOg, kvm-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Yosry Ahmed Count the pages used by KVM mmu on x86 for in secondary pagetable stats. For the legacy mmu, accounting pagetable stats is combined KVM's existing for mmu pages in newly introduced kvm_[un]account_mmu_page() helpers. For tdp mmu, introduce new tdp_[un]account_mmu_page() helpers. That combines accounting pagetable stats with the tdp_mmu_pages counter accounting. tdp_mmu_pages counter introduced in this series [1]. This patch was rebased on top of the first two patches in that series. [1]https://lore.kernel.org/lkml/20220401063636.2414200-1-mizhang-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org/ Signed-off-by: Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> --- arch/x86/kvm/mmu/mmu.c | 16 ++++++++++++++-- arch/x86/kvm/mmu/tdp_mmu.c | 16 ++++++++++++++-- 2 files changed, 28 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 78d8e1d8fb99..e5b0e826445d 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1679,6 +1679,18 @@ static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, long nr) percpu_counter_add(&kvm_total_used_mmu_pages, nr); } +static void kvm_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp) +{ + kvm_mod_used_mmu_pages(kvm, +1); + kvm_account_pgtable_pages((void *)sp->spt, +1); +} + +static void kvm_unaccount_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp) +{ + kvm_mod_used_mmu_pages(kvm, -1); + kvm_account_pgtable_pages((void *)sp->spt, -1); +} + static void kvm_mmu_free_page(struct kvm_mmu_page *sp) { MMU_WARN_ON(!is_empty_shadow_page(sp->spt)); @@ -1734,7 +1746,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct */ sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen; list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages); - kvm_mod_used_mmu_pages(vcpu->kvm, +1); + kvm_account_mmu_page(vcpu->kvm, sp); return sp; } @@ -2363,7 +2375,7 @@ static bool __kvm_mmu_prepare_zap_page(struct kvm *kvm, list_add(&sp->link, invalid_list); else list_move(&sp->link, invalid_list); - kvm_mod_used_mmu_pages(kvm, -1); + kvm_unaccount_mmu_page(kvm, sp); } else { /* * Remove the active root from the active page list, the root diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 3456277ade18..6295c4da5dee 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -371,6 +371,18 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn, } } +static void tdp_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp) +{ + atomic64_inc(&kvm->arch.tdp_mmu_pages); + kvm_account_pgtable_pages((void *)sp->spt, +1); +} + +static void tdp_unaccount_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp) +{ + atomic64_dec(&kvm->arch.tdp_mmu_pages); + kvm_account_pgtable_pages((void *)sp->spt, -1); +} + /** * tdp_mmu_unlink_sp() - Remove a shadow page from the list of used pages * @@ -383,7 +395,7 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn, static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp, bool shared) { - atomic64_dec(&kvm->arch.tdp_mmu_pages); + tdp_unaccount_mmu_page(kvm, sp); if (!sp->lpage_disallowed) return; @@ -1121,7 +1133,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter, tdp_mmu_set_spte(kvm, iter, spte); } - atomic64_inc(&kvm->arch.tdp_mmu_pages); + tdp_account_mmu_page(kvm, sp); return 0; } -- 2.36.0.464.gb9c8b46e94-goog ^ permalink raw reply related [flat|nested] 27+ messages in thread
[parent not found: <20220429201131.3397875-4-yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v4 3/4] KVM: x86/mmu: count KVM mmu usage in secondary pagetable stats. [not found] ` <20220429201131.3397875-4-yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> @ 2022-07-22 20:58 ` Mingwei Zhang [not found] ` <YtsPk5+hZNMEwT0c-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 27+ messages in thread From: Mingwei Zhang @ 2022-07-22 20:58 UTC (permalink / raw) To: Yosry Ahmed Cc: Tejun Heo, Johannes Weiner, Zefan Li, Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r On Fri, Apr 29, 2022, Yosry Ahmed wrote: > Count the pages used by KVM mmu on x86 for in secondary pagetable stats. > > For the legacy mmu, accounting pagetable stats is combined KVM's > existing for mmu pages in newly introduced kvm_[un]account_mmu_page() > helpers. > > For tdp mmu, introduce new tdp_[un]account_mmu_page() helpers. That > combines accounting pagetable stats with the tdp_mmu_pages counter > accounting. > > tdp_mmu_pages counter introduced in this series [1]. This patch was > rebased on top of the first two patches in that series. > > [1]https://lore.kernel.org/lkml/20220401063636.2414200-1-mizhang-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org/ > > Signed-off-by: Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > --- It looks like there are two metrics for mmu in x86: one for shadow mmu and the other for TDP mmu. Is there any plan to merge them together? > arch/x86/kvm/mmu/mmu.c | 16 ++++++++++++++-- > arch/x86/kvm/mmu/tdp_mmu.c | 16 ++++++++++++++-- > 2 files changed, 28 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > index 78d8e1d8fb99..e5b0e826445d 100644 > --- a/arch/x86/kvm/mmu/mmu.c > +++ b/arch/x86/kvm/mmu/mmu.c > @@ -1679,6 +1679,18 @@ static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, long nr) > percpu_counter_add(&kvm_total_used_mmu_pages, nr); > } > > +static void kvm_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp) > +{ > + kvm_mod_used_mmu_pages(kvm, +1); > + kvm_account_pgtable_pages((void *)sp->spt, +1); > +} > + > +static void kvm_unaccount_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp) > +{ > + kvm_mod_used_mmu_pages(kvm, -1); > + kvm_account_pgtable_pages((void *)sp->spt, -1); > +} > + > static void kvm_mmu_free_page(struct kvm_mmu_page *sp) > { > MMU_WARN_ON(!is_empty_shadow_page(sp->spt)); > @@ -1734,7 +1746,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct > */ > sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen; > list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages); > - kvm_mod_used_mmu_pages(vcpu->kvm, +1); > + kvm_account_mmu_page(vcpu->kvm, sp); > return sp; > } > > @@ -2363,7 +2375,7 @@ static bool __kvm_mmu_prepare_zap_page(struct kvm *kvm, > list_add(&sp->link, invalid_list); > else > list_move(&sp->link, invalid_list); > - kvm_mod_used_mmu_pages(kvm, -1); > + kvm_unaccount_mmu_page(kvm, sp); > } else { > /* > * Remove the active root from the active page list, the root > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c > index 3456277ade18..6295c4da5dee 100644 > --- a/arch/x86/kvm/mmu/tdp_mmu.c > +++ b/arch/x86/kvm/mmu/tdp_mmu.c > @@ -371,6 +371,18 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn, > } > } > > +static void tdp_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp) > +{ > + atomic64_inc(&kvm->arch.tdp_mmu_pages); > + kvm_account_pgtable_pages((void *)sp->spt, +1); > +} > + > +static void tdp_unaccount_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp) > +{ > + atomic64_dec(&kvm->arch.tdp_mmu_pages); > + kvm_account_pgtable_pages((void *)sp->spt, -1); > +} > + > /** > * tdp_mmu_unlink_sp() - Remove a shadow page from the list of used pages > * > @@ -383,7 +395,7 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn, > static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp, > bool shared) > { > - atomic64_dec(&kvm->arch.tdp_mmu_pages); > + tdp_unaccount_mmu_page(kvm, sp); > > if (!sp->lpage_disallowed) > return; > @@ -1121,7 +1133,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter, > tdp_mmu_set_spte(kvm, iter, spte); > } > > - atomic64_inc(&kvm->arch.tdp_mmu_pages); > + tdp_account_mmu_page(kvm, sp); > > return 0; > } > -- > 2.36.0.464.gb9c8b46e94-goog > ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <YtsPk5+hZNMEwT0c-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v4 3/4] KVM: x86/mmu: count KVM mmu usage in secondary pagetable stats. [not found] ` <YtsPk5+hZNMEwT0c-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> @ 2022-07-26 18:03 ` Sean Christopherson 0 siblings, 0 replies; 27+ messages in thread From: Sean Christopherson @ 2022-07-26 18:03 UTC (permalink / raw) To: Mingwei Zhang Cc: Yosry Ahmed, Tejun Heo, Johannes Weiner, Zefan Li, Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r On Fri, Jul 22, 2022, Mingwei Zhang wrote: > On Fri, Apr 29, 2022, Yosry Ahmed wrote: > > Count the pages used by KVM mmu on x86 for in secondary pagetable stats. > > > > For the legacy mmu, accounting pagetable stats is combined KVM's > > existing for mmu pages in newly introduced kvm_[un]account_mmu_page() > > helpers. > > > > For tdp mmu, introduce new tdp_[un]account_mmu_page() helpers. That > > combines accounting pagetable stats with the tdp_mmu_pages counter > > accounting. > > > > tdp_mmu_pages counter introduced in this series [1]. This patch was > > rebased on top of the first two patches in that series. > > > > [1]https://lore.kernel.org/lkml/20220401063636.2414200-1-mizhang-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org/ > > > > Signed-off-by: Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > > --- > > It looks like there are two metrics for mmu in x86: one for shadow mmu > and the other for TDP mmu. Is there any plan to merge them together? There aren't two _separate_ metrics per se, rather that the TDP MMU (intentionally) doesn't honor KVM_SET_NR_MMU_PAGES, nor does it play nice the the core mm shrinkers. Thus, the TDP MMU doesn't udpate kvm_mod_used_mmu_pages(), which feeds into both of those things. Long term, I don't think the TDP MMU will ever honor KVM_SET_NR_MMU_PAGES. That particular knob predates proper integration with memcg and probably should be deprecated. As for supporting shrinkers in the TDP MMU, it's unclear whether or not that's truly necessary. And until mmu_shrink_scan() is made a _lot_ smarter, it's somewhat of a moot point because KVM's shrinker implementation is just too naive for it to be a net positive, e.g. it tends to zap upper level entries and wipe out large swaths of KVM's page tables. KVM_SET_NR_MMU_PAGES uses the same naive algorithm, so it's not any better. ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v4 4/4] KVM: arm64/mmu: count KVM s2 mmu usage in secondary pagetable stats [not found] ` <20220429201131.3397875-1-yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> ` (2 preceding siblings ...) 2022-04-29 20:11 ` [PATCH v4 3/4] KVM: x86/mmu: count KVM mmu usage in secondary pagetable stats Yosry Ahmed @ 2022-04-29 20:11 ` Yosry Ahmed 2022-05-02 7:24 ` Oliver Upton 3 siblings, 1 reply; 27+ messages in thread From: Yosry Ahmed @ 2022-04-29 20:11 UTC (permalink / raw) To: Tejun Heo, Johannes Weiner, Zefan Li, Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Oliver Upton Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, kvmarm-FPEHb7Xf0XXUo1n7N8X6UoWGPAHP3yOg, kvm-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Yosry Ahmed Count the pages used by KVM in arm64 for stage2 mmu in secondary pagetable stats. Signed-off-by: Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> --- arch/arm64/kvm/mmu.c | 35 +++++++++++++++++++++++++++++++---- 1 file changed, 31 insertions(+), 4 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 53ae2c0640bc..fc5030307cce 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -92,9 +92,13 @@ static bool kvm_is_device_pfn(unsigned long pfn) static void *stage2_memcache_zalloc_page(void *arg) { struct kvm_mmu_memory_cache *mc = arg; + void *virt; /* Allocated with __GFP_ZERO, so no need to zero */ - return kvm_mmu_memory_cache_alloc(mc); + virt = kvm_mmu_memory_cache_alloc(mc); + if (virt) + kvm_account_pgtable_pages(virt, +1); + return virt; } static void *kvm_host_zalloc_pages_exact(size_t size) @@ -102,6 +106,20 @@ static void *kvm_host_zalloc_pages_exact(size_t size) return alloc_pages_exact(size, GFP_KERNEL_ACCOUNT | __GFP_ZERO); } +static void *kvm_s2_zalloc_pages_exact(size_t size) +{ + void *virt = kvm_host_zalloc_pages_exact(size); + if (virt) + kvm_account_pgtable_pages(virt, +(size >> PAGE_SHIFT)); + return virt; +} + +static void kvm_s2_free_pages_exact(void *virt, size_t size) +{ + kvm_account_pgtable_pages(virt, -(size >> PAGE_SHIFT)); + free_pages_exact(virt, size); +} + static void kvm_host_get_page(void *addr) { get_page(virt_to_page(addr)); @@ -112,6 +130,15 @@ static void kvm_host_put_page(void *addr) put_page(virt_to_page(addr)); } +static void kvm_s2_put_page(void *addr) +{ + struct page *p = virt_to_page(addr); + /* Dropping last refcount, the page will be freed */ + if (page_count(p) == 1) + kvm_account_pgtable_pages(addr, -1); + put_page(p); +} + static int kvm_host_page_count(void *addr) { return page_count(virt_to_page(addr)); @@ -603,10 +630,10 @@ static int get_user_mapping_size(struct kvm *kvm, u64 addr) static struct kvm_pgtable_mm_ops kvm_s2_mm_ops = { .zalloc_page = stage2_memcache_zalloc_page, - .zalloc_pages_exact = kvm_host_zalloc_pages_exact, - .free_pages_exact = free_pages_exact, + .zalloc_pages_exact = kvm_s2_zalloc_pages_exact, + .free_pages_exact = kvm_s2_free_pages_exact, .get_page = kvm_host_get_page, - .put_page = kvm_host_put_page, + .put_page = kvm_s2_put_page, .page_count = kvm_host_page_count, .phys_to_virt = kvm_host_va, .virt_to_phys = kvm_host_pa, -- 2.36.0.464.gb9c8b46e94-goog ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v4 4/4] KVM: arm64/mmu: count KVM s2 mmu usage in secondary pagetable stats 2022-04-29 20:11 ` [PATCH v4 4/4] KVM: arm64/mmu: count KVM s2 " Yosry Ahmed @ 2022-05-02 7:24 ` Oliver Upton 2022-05-02 9:49 ` Marc Zyngier 0 siblings, 1 reply; 27+ messages in thread From: Oliver Upton @ 2022-05-02 7:24 UTC (permalink / raw) To: Yosry Ahmed Cc: Tejun Heo, Johannes Weiner, Zefan Li, Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, cgroups, linux-kernel, linux-arm-kernel, kvmarm, kvm Hi Yosry, On Fri, Apr 29, 2022 at 08:11:31PM +0000, Yosry Ahmed wrote: > Count the pages used by KVM in arm64 for stage2 mmu in secondary pagetable > stats. > > Signed-off-by: Yosry Ahmed <yosryahmed@google.com> > --- > arch/arm64/kvm/mmu.c | 35 +++++++++++++++++++++++++++++++---- > 1 file changed, 31 insertions(+), 4 deletions(-) > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > index 53ae2c0640bc..fc5030307cce 100644 > --- a/arch/arm64/kvm/mmu.c > +++ b/arch/arm64/kvm/mmu.c > @@ -92,9 +92,13 @@ static bool kvm_is_device_pfn(unsigned long pfn) > static void *stage2_memcache_zalloc_page(void *arg) > { > struct kvm_mmu_memory_cache *mc = arg; > + void *virt; > > /* Allocated with __GFP_ZERO, so no need to zero */ > - return kvm_mmu_memory_cache_alloc(mc); > + virt = kvm_mmu_memory_cache_alloc(mc); > + if (virt) > + kvm_account_pgtable_pages(virt, +1); Sorry I didn't say it last time around, would now be a good time to clean up the funky sign convention of kvm_mod_used_mmu_pages()? Or limit the funk to just x86 :) -- Thanks, Oliver ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 4/4] KVM: arm64/mmu: count KVM s2 mmu usage in secondary pagetable stats 2022-05-02 7:24 ` Oliver Upton @ 2022-05-02 9:49 ` Marc Zyngier 2022-05-02 16:41 ` Yosry Ahmed 0 siblings, 1 reply; 27+ messages in thread From: Marc Zyngier @ 2022-05-02 9:49 UTC (permalink / raw) To: Oliver Upton Cc: Wanpeng Li, kvm, Roman Gushchin, Michal Hocko, Yosry Ahmed, linux-mm, Zefan Li, kvmarm, Joerg Roedel, Shakeel Butt, cgroups, Andrew Morton, linux-arm-kernel, Jim Mattson, linux-kernel, Johannes Weiner, Tejun Heo, Paolo Bonzini, Vitaly Kuznetsov On Mon, 02 May 2022 08:24:28 +0100, Oliver Upton <oupton@google.com> wrote: > > Hi Yosry, > > On Fri, Apr 29, 2022 at 08:11:31PM +0000, Yosry Ahmed wrote: > > Count the pages used by KVM in arm64 for stage2 mmu in secondary pagetable > > stats. > > > > Signed-off-by: Yosry Ahmed <yosryahmed@google.com> > > --- > > arch/arm64/kvm/mmu.c | 35 +++++++++++++++++++++++++++++++---- > > 1 file changed, 31 insertions(+), 4 deletions(-) > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > > index 53ae2c0640bc..fc5030307cce 100644 > > --- a/arch/arm64/kvm/mmu.c > > +++ b/arch/arm64/kvm/mmu.c > > @@ -92,9 +92,13 @@ static bool kvm_is_device_pfn(unsigned long pfn) > > static void *stage2_memcache_zalloc_page(void *arg) > > { > > struct kvm_mmu_memory_cache *mc = arg; > > + void *virt; > > > > /* Allocated with __GFP_ZERO, so no need to zero */ > > - return kvm_mmu_memory_cache_alloc(mc); > > + virt = kvm_mmu_memory_cache_alloc(mc); > > + if (virt) > > + kvm_account_pgtable_pages(virt, +1); > > Sorry I didn't say it last time around, would now be a good time to > clean up the funky sign convention of kvm_mod_used_mmu_pages()? Or limit > the funk to just x86 :) Indeed. I pointed this out in my initial review of this series, and expected these to be gone by now. M. -- Without deviation from the norm, progress is not possible. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 4/4] KVM: arm64/mmu: count KVM s2 mmu usage in secondary pagetable stats 2022-05-02 9:49 ` Marc Zyngier @ 2022-05-02 16:41 ` Yosry Ahmed 0 siblings, 0 replies; 27+ messages in thread From: Yosry Ahmed @ 2022-05-02 16:41 UTC (permalink / raw) To: Marc Zyngier Cc: Wanpeng Li, kvm, Roman Gushchin, Michal Hocko, linux-mm, Zefan Li, kvmarm, Joerg Roedel, Shakeel Butt, cgroups, Vitaly Kuznetsov, linux-arm-kernel, Jim Mattson, linux-kernel, Johannes Weiner, Tejun Heo, Paolo Bonzini, Andrew Morton [-- Attachment #1.1: Type: text/plain, Size: 1616 bytes --] On Mon, May 2, 2022 at 2:48 AM Marc Zyngier <maz@kernel.org> wrote: > On Mon, 02 May 2022 08:24:28 +0100, > Oliver Upton <oupton@google.com> wrote: > > > > Hi Yosry, > > > > On Fri, Apr 29, 2022 at 08:11:31PM +0000, Yosry Ahmed wrote: > > > Count the pages used by KVM in arm64 for stage2 mmu in secondary > pagetable > > > stats. > > > > > > Signed-off-by: Yosry Ahmed <yosryahmed@google.com> > > > --- > > > arch/arm64/kvm/mmu.c | 35 +++++++++++++++++++++++++++++++---- > > > 1 file changed, 31 insertions(+), 4 deletions(-) > > > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > > > index 53ae2c0640bc..fc5030307cce 100644 > > > --- a/arch/arm64/kvm/mmu.c > > > +++ b/arch/arm64/kvm/mmu.c > > > @@ -92,9 +92,13 @@ static bool kvm_is_device_pfn(unsigned long pfn) > > > static void *stage2_memcache_zalloc_page(void *arg) > > > { > > > struct kvm_mmu_memory_cache *mc = arg; > > > + void *virt; > > > > > > /* Allocated with __GFP_ZERO, so no need to zero */ > > > - return kvm_mmu_memory_cache_alloc(mc); > > > + virt = kvm_mmu_memory_cache_alloc(mc); > > > + if (virt) > > > + kvm_account_pgtable_pages(virt, +1); > > > > Sorry I didn't say it last time around, would now be a good time to > > clean up the funky sign convention of kvm_mod_used_mmu_pages()? Or limit > > the funk to just x86 :) > > Indeed. I pointed this out in my initial review of this series, and > expected these to be gone by now. Hey everyone, Sorry about that. Will clean that up in the next version. > > M. > > -- > Without deviation from the norm, progress is not possible. > [-- Attachment #1.2: Type: text/html, Size: 2692 bytes --] [-- Attachment #2: Type: text/plain, Size: 151 bytes --] _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2022-07-26 18:03 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-04-29 20:11 [PATCH v4 0/4] KVM: mm: count KVM mmu usage in memory stats Yosry Ahmed
[not found] ` <20220429201131.3397875-1-yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2022-04-29 20:11 ` [PATCH v4 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses Yosry Ahmed
2022-05-02 10:01 ` Marc Zyngier
[not found] ` <87ilqoi77b.wl-maz-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2022-05-02 18:46 ` Yosry Ahmed
[not found] ` <CAJD7tkY7JF25XXUFq2mGroetMkfo-2zGOaQC94pjZE3D42+oaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2022-05-09 16:38 ` Yosry Ahmed
[not found] ` <CAJD7tkbfT-FRs3LE2GPddqrQSWw_eC1R6k3z04x=z9Zvt5yLpg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2022-05-12 20:36 ` Shakeel Butt
2022-05-12 23:07 ` Johannes Weiner
[not found] ` <Yn2TGJ4vZ/fst+CY-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
2022-05-12 23:29 ` Sean Christopherson
[not found] ` <Yn2YYl98Vhh/UL0w-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2022-05-13 15:50 ` Johannes Weiner
2022-05-13 16:12 ` Sean Christopherson
[not found] ` <Yn6DeEGLyR4Q0cDp-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2022-05-13 16:22 ` Yosry Ahmed
2022-05-13 17:13 ` Shakeel Butt
2022-05-20 1:56 ` Yosry Ahmed
[not found] ` <CAJD7tka-5+XRkthNV4qCg8woPCpjcwynQoRBame-3GP1L8y+WQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2022-05-20 14:39 ` Johannes Weiner
[not found] ` <YoeoLJNQTam5fJSu-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
2022-05-24 22:31 ` Yosry Ahmed
[not found] ` <CAJD7tkYjcmwBeUx-=MTQeUf78uqFDvfpy7OuKy4OvoS7HiVO1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2022-05-25 11:56 ` Johannes Weiner
2022-05-26 0:38 ` Sean Christopherson
[not found] ` <Yo7MHA2aUaprvgl8-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2022-05-27 18:33 ` Yosry Ahmed
[not found] ` <CAJD7tkYoz=rYvBV3tcp4aLgiyEtr-sBwbncFduZsOq+c8wk5sA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2022-06-03 16:42 ` Johannes Weiner
2022-04-29 20:11 ` [PATCH v4 2/4] KVM: mmu: add a helper to account memory used by KVM mmu Yosry Ahmed
2022-04-29 20:11 ` [PATCH v4 3/4] KVM: x86/mmu: count KVM mmu usage in secondary pagetable stats Yosry Ahmed
[not found] ` <20220429201131.3397875-4-yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2022-07-22 20:58 ` Mingwei Zhang
[not found] ` <YtsPk5+hZNMEwT0c-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2022-07-26 18:03 ` Sean Christopherson
2022-04-29 20:11 ` [PATCH v4 4/4] KVM: arm64/mmu: count KVM s2 " Yosry Ahmed
2022-05-02 7:24 ` Oliver Upton
2022-05-02 9:49 ` Marc Zyngier
2022-05-02 16:41 ` Yosry Ahmed
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).