* Re: [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
2024-10-23 20:34 [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg Joshua Hahn
@ 2024-10-23 21:15 ` Yosry Ahmed
2024-10-23 21:17 ` Yosry Ahmed
2024-10-23 21:32 ` Joshua Hahn
2024-10-23 22:17 ` Shakeel Butt
` (2 subsequent siblings)
3 siblings, 2 replies; 10+ messages in thread
From: Yosry Ahmed @ 2024-10-23 21:15 UTC (permalink / raw)
To: Joshua Hahn
Cc: hannes, nphamcs, mhocko, roman.gushcin, shakeel.butt, muchun.song,
lnyng, akpm, cgroups
Hi Joshua,
On Wed, Oct 23, 2024 at 1:34 PM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
>
> Changelog
> v2:
> * Enables the feature only if memcg accounts for hugeTLB usage
> * Moves the counter from memcg_stat_item to node_stat_item
> * Expands on motivation & justification in commitlog
> * Added Suggested-by: Nhat Pham
Changelogs usually go at the end, after ---, not as part of the commit
log itself.
>
> This patch introduces a new counter to memory.stat that tracks hugeTLB
> usage, only if hugeTLB accounting is done to memory.current. This
> feature is enabled the same way hugeTLB accounting is enabled, via
> the memory_hugetlb_accounting mount flag for cgroupsv2.
>
> 1. Why is this patch necessary?
> Currently, memcg hugeTLB accounting is an opt-in feature [1] that adds
> hugeTLB usage to memory.current. However, the metric is not reported in
> memory.stat. Given that users often interpret memory.stat as a breakdown
> of the value reported in memory.current, the disparity between the two
> reports can be confusing. This patch solves this problem by including
> the metric in memory.stat as well, but only if it is also reported in
> memory.current (it would also be confusing if the value was reported in
> memory.stat, but not in memory.current)
>
> Aside from the consistentcy between the two files, we also see benefits
consistency*
> in observability. Userspace might be interested in the hugeTLB footprint
> of cgroups for many reasons. For instance, system admins might want to
> verify that hugeTLB usage is distributed as expected across tasks: i.e.
> memory-intensive tasks are using more hugeTLB pages than tasks that
> don't consume a lot of memory, or is seen to fault frequently. Note that
are* seen
> this is separate from wanting to inspect the distribution for limiting
> purposes (in which case, hugeTLB controller makes more sense).
>
> 2. We already have a hugeTLB controller. Why not use that?
> It is true that hugeTLB tracks the exact value that we want. In fact, by
> enabling the hugeTLB controller, we get all of the observability
> benefits that I mentioned above, and users can check the total hugeTLB
> usage, verify if it is distributed as expected, etc.
>
> With this said, there are 2 problems:
> (a) They are still not reported in memory.stat, which means the
> disparity between the memcg reports are still there.
> (b) We cannot reasonably expect users to enable the hugeTLB controller
> just for the sake of hugeTLB usage reporting, especially since
> they don't have any use for hugeTLB usage enforcing [2].
>
> [1] https://lore.kernel.org/all/20231006184629.155543-1-nphamcs@gmail.com/
> [2] Of course, we can't make a new patch for every feature that can be
> duplicated. However, since the exsting solution of enabling the
existing*
> hugeTLB controller is an imperfect solution that still leaves a
> discrepancy between memory.stat and memory.curent, I think that it
> is reasonable to isolate the feature in this case.
>
> Suggested-by: Nhat Pham <nphamcs@gmail.com>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
You should also CC linux-mm on such patches.
>
> ---
> include/linux/mmzone.h | 3 +++
> mm/hugetlb.c | 4 ++++
> mm/memcontrol.c | 11 +++++++++++
> mm/vmstat.c | 3 +++
> 4 files changed, 21 insertions(+)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 17506e4a2835..d3ba49a974b2 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -215,6 +215,9 @@ enum node_stat_item {
> #ifdef CONFIG_NUMA_BALANCING
> PGPROMOTE_SUCCESS, /* promote successfully */
> PGPROMOTE_CANDIDATE, /* candidate pages to promote */
> +#endif
> +#ifdef CONFIG_HUGETLB_PAGE
> + HUGETLB_B,
Why '_B'?
> #endif
> /* PGDEMOTE_*: pages demoted */
> PGDEMOTE_KSWAPD,
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 190fa05635f4..055bc91858e4 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1925,6 +1925,8 @@ void free_huge_folio(struct folio *folio)
> pages_per_huge_page(h), folio);
> hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
> pages_per_huge_page(h), folio);
> + if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
I think we already have a couple of these checks and this patch adds a
few more, perhaps we should add a helper at this point to improve
readability? Maybe something like memcg_accounts_hugetlb()?
> + lruvec_stat_mod_folio(folio, HUGETLB_B, -pages_per_huge_page(h));
> mem_cgroup_uncharge(folio);
> if (restore_reserve)
> h->resv_huge_pages++;
> @@ -3094,6 +3096,8 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
> if (!memcg_charge_ret)
> mem_cgroup_commit_charge(folio, memcg);
> mem_cgroup_put(memcg);
> + if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
> + lruvec_stat_mod_folio(folio, HUGETLB_B, pages_per_huge_page(h));
>
> return folio;
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7845c64a2c57..de5899eb8203 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -306,6 +306,9 @@ static const unsigned int memcg_node_stat_items[] = {
> #endif
> #ifdef CONFIG_NUMA_BALANCING
> PGPROMOTE_SUCCESS,
> +#endif
> +#ifdef CONFIG_HUGETLB_PAGE
> + HUGETLB_B,
> #endif
> PGDEMOTE_KSWAPD,
> PGDEMOTE_DIRECT,
> @@ -1327,6 +1330,9 @@ static const struct memory_stat memory_stats[] = {
> #ifdef CONFIG_ZSWAP
> { "zswap", MEMCG_ZSWAP_B },
> { "zswapped", MEMCG_ZSWAPPED },
> +#endif
> +#ifdef CONFIG_HUGETLB_PAGE
> + { "hugeTLB", HUGETLB_B },
nit: I think we usually use lowercase letters when naming stats.
> #endif
> { "file_mapped", NR_FILE_MAPPED },
> { "file_dirty", NR_FILE_DIRTY },
> @@ -1441,6 +1447,11 @@ static void memcg_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
> for (i = 0; i < ARRAY_SIZE(memory_stats); i++) {
> u64 size;
>
> +#ifdef CONFIG_HUGETLB_PAGE
> + if (unlikely(memory_stats[i].idx == HUGETLB_B) &&
> + !(cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING))
> + continue;
> +#endif
> size = memcg_page_state_output(memcg, memory_stats[i].idx);
> seq_buf_printf(s, "%s %llu\n", memory_stats[i].name, size);
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index b5a4cea423e1..466c40cffeb0 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1269,6 +1269,9 @@ const char * const vmstat_text[] = {
> #ifdef CONFIG_NUMA_BALANCING
> "pgpromote_success",
> "pgpromote_candidate",
> +#endif
> +#ifdef CONFIG_HUGETLB_PAGE
> + "hugeTLB",
> #endif
> "pgdemote_kswapd",
> "pgdemote_direct",
> --
> 2.43.5
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
2024-10-23 21:15 ` Yosry Ahmed
@ 2024-10-23 21:17 ` Yosry Ahmed
2024-10-24 16:57 ` Roman Gushchin
2024-10-23 21:32 ` Joshua Hahn
1 sibling, 1 reply; 10+ messages in thread
From: Yosry Ahmed @ 2024-10-23 21:17 UTC (permalink / raw)
To: Joshua Hahn
Cc: hannes, nphamcs, mhocko, roman.gushcin, shakeel.butt, muchun.song,
lnyng, akpm, cgroups, Roman Gushchin
On Wed, Oct 23, 2024 at 2:15 PM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> Hi Joshua,
>
> On Wed, Oct 23, 2024 at 1:34 PM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> >
> > Changelog
> > v2:
> > * Enables the feature only if memcg accounts for hugeTLB usage
> > * Moves the counter from memcg_stat_item to node_stat_item
> > * Expands on motivation & justification in commitlog
> > * Added Suggested-by: Nhat Pham
>
> Changelogs usually go at the end, after ---, not as part of the commit
> log itself.
>
> >
> > This patch introduces a new counter to memory.stat that tracks hugeTLB
> > usage, only if hugeTLB accounting is done to memory.current. This
> > feature is enabled the same way hugeTLB accounting is enabled, via
> > the memory_hugetlb_accounting mount flag for cgroupsv2.
> >
> > 1. Why is this patch necessary?
> > Currently, memcg hugeTLB accounting is an opt-in feature [1] that adds
> > hugeTLB usage to memory.current. However, the metric is not reported in
> > memory.stat. Given that users often interpret memory.stat as a breakdown
> > of the value reported in memory.current, the disparity between the two
> > reports can be confusing. This patch solves this problem by including
> > the metric in memory.stat as well, but only if it is also reported in
> > memory.current (it would also be confusing if the value was reported in
> > memory.stat, but not in memory.current)
> >
> > Aside from the consistentcy between the two files, we also see benefits
>
> consistency*
>
> > in observability. Userspace might be interested in the hugeTLB footprint
> > of cgroups for many reasons. For instance, system admins might want to
> > verify that hugeTLB usage is distributed as expected across tasks: i.e.
> > memory-intensive tasks are using more hugeTLB pages than tasks that
> > don't consume a lot of memory, or is seen to fault frequently. Note that
>
> are* seen
>
> > this is separate from wanting to inspect the distribution for limiting
> > purposes (in which case, hugeTLB controller makes more sense).
> >
> > 2. We already have a hugeTLB controller. Why not use that?
> > It is true that hugeTLB tracks the exact value that we want. In fact, by
> > enabling the hugeTLB controller, we get all of the observability
> > benefits that I mentioned above, and users can check the total hugeTLB
> > usage, verify if it is distributed as expected, etc.
> >
> > With this said, there are 2 problems:
> > (a) They are still not reported in memory.stat, which means the
> > disparity between the memcg reports are still there.
> > (b) We cannot reasonably expect users to enable the hugeTLB controller
> > just for the sake of hugeTLB usage reporting, especially since
> > they don't have any use for hugeTLB usage enforcing [2].
> >
> > [1] https://lore.kernel.org/all/20231006184629.155543-1-nphamcs@gmail.com/
> > [2] Of course, we can't make a new patch for every feature that can be
> > duplicated. However, since the exsting solution of enabling the
>
> existing*
>
> > hugeTLB controller is an imperfect solution that still leaves a
> > discrepancy between memory.stat and memory.curent, I think that it
> > is reasonable to isolate the feature in this case.
> >
> > Suggested-by: Nhat Pham <nphamcs@gmail.com>
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>
> You should also CC linux-mm on such patches.
+roman.gushchin@linux.dev
CCing Roman's correct email.
>
> >
> > ---
> > include/linux/mmzone.h | 3 +++
> > mm/hugetlb.c | 4 ++++
> > mm/memcontrol.c | 11 +++++++++++
> > mm/vmstat.c | 3 +++
> > 4 files changed, 21 insertions(+)
> >
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 17506e4a2835..d3ba49a974b2 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -215,6 +215,9 @@ enum node_stat_item {
> > #ifdef CONFIG_NUMA_BALANCING
> > PGPROMOTE_SUCCESS, /* promote successfully */
> > PGPROMOTE_CANDIDATE, /* candidate pages to promote */
> > +#endif
> > +#ifdef CONFIG_HUGETLB_PAGE
> > + HUGETLB_B,
>
> Why '_B'?
>
> > #endif
> > /* PGDEMOTE_*: pages demoted */
> > PGDEMOTE_KSWAPD,
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 190fa05635f4..055bc91858e4 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1925,6 +1925,8 @@ void free_huge_folio(struct folio *folio)
> > pages_per_huge_page(h), folio);
> > hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
> > pages_per_huge_page(h), folio);
> > + if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
>
> I think we already have a couple of these checks and this patch adds a
> few more, perhaps we should add a helper at this point to improve
> readability? Maybe something like memcg_accounts_hugetlb()?
>
> > + lruvec_stat_mod_folio(folio, HUGETLB_B, -pages_per_huge_page(h));
> > mem_cgroup_uncharge(folio);
> > if (restore_reserve)
> > h->resv_huge_pages++;
> > @@ -3094,6 +3096,8 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
> > if (!memcg_charge_ret)
> > mem_cgroup_commit_charge(folio, memcg);
> > mem_cgroup_put(memcg);
> > + if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
> > + lruvec_stat_mod_folio(folio, HUGETLB_B, pages_per_huge_page(h));
> >
> > return folio;
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 7845c64a2c57..de5899eb8203 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -306,6 +306,9 @@ static const unsigned int memcg_node_stat_items[] = {
> > #endif
> > #ifdef CONFIG_NUMA_BALANCING
> > PGPROMOTE_SUCCESS,
> > +#endif
> > +#ifdef CONFIG_HUGETLB_PAGE
> > + HUGETLB_B,
> > #endif
> > PGDEMOTE_KSWAPD,
> > PGDEMOTE_DIRECT,
> > @@ -1327,6 +1330,9 @@ static const struct memory_stat memory_stats[] = {
> > #ifdef CONFIG_ZSWAP
> > { "zswap", MEMCG_ZSWAP_B },
> > { "zswapped", MEMCG_ZSWAPPED },
> > +#endif
> > +#ifdef CONFIG_HUGETLB_PAGE
> > + { "hugeTLB", HUGETLB_B },
>
> nit: I think we usually use lowercase letters when naming stats.
>
> > #endif
> > { "file_mapped", NR_FILE_MAPPED },
> > { "file_dirty", NR_FILE_DIRTY },
> > @@ -1441,6 +1447,11 @@ static void memcg_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
> > for (i = 0; i < ARRAY_SIZE(memory_stats); i++) {
> > u64 size;
> >
> > +#ifdef CONFIG_HUGETLB_PAGE
> > + if (unlikely(memory_stats[i].idx == HUGETLB_B) &&
> > + !(cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING))
> > + continue;
> > +#endif
> > size = memcg_page_state_output(memcg, memory_stats[i].idx);
> > seq_buf_printf(s, "%s %llu\n", memory_stats[i].name, size);
> >
> > diff --git a/mm/vmstat.c b/mm/vmstat.c
> > index b5a4cea423e1..466c40cffeb0 100644
> > --- a/mm/vmstat.c
> > +++ b/mm/vmstat.c
> > @@ -1269,6 +1269,9 @@ const char * const vmstat_text[] = {
> > #ifdef CONFIG_NUMA_BALANCING
> > "pgpromote_success",
> > "pgpromote_candidate",
> > +#endif
> > +#ifdef CONFIG_HUGETLB_PAGE
> > + "hugeTLB",
> > #endif
> > "pgdemote_kswapd",
> > "pgdemote_direct",
> > --
> > 2.43.5
> >
> >
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
2024-10-23 21:17 ` Yosry Ahmed
@ 2024-10-24 16:57 ` Roman Gushchin
0 siblings, 0 replies; 10+ messages in thread
From: Roman Gushchin @ 2024-10-24 16:57 UTC (permalink / raw)
To: Yosry Ahmed
Cc: Joshua Hahn, hannes, nphamcs, mhocko, roman.gushcin, shakeel.butt,
muchun.song, lnyng, akpm, cgroups
On Wed, Oct 23, 2024 at 02:17:27PM -0700, Yosry Ahmed wrote:
> On Wed, Oct 23, 2024 at 2:15 PM Yosry Ahmed <yosryahmed@google.com> wrote:
> >
> > Hi Joshua,
> >
> > On Wed, Oct 23, 2024 at 1:34 PM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> > >
> > > Changelog
> > > v2:
> > > * Enables the feature only if memcg accounts for hugeTLB usage
> > > * Moves the counter from memcg_stat_item to node_stat_item
> > > * Expands on motivation & justification in commitlog
> > > * Added Suggested-by: Nhat Pham
> >
> > Changelogs usually go at the end, after ---, not as part of the commit
> > log itself.
> >
> > >
> > > This patch introduces a new counter to memory.stat that tracks hugeTLB
> > > usage, only if hugeTLB accounting is done to memory.current. This
> > > feature is enabled the same way hugeTLB accounting is enabled, via
> > > the memory_hugetlb_accounting mount flag for cgroupsv2.
> > >
> > > 1. Why is this patch necessary?
> > > Currently, memcg hugeTLB accounting is an opt-in feature [1] that adds
> > > hugeTLB usage to memory.current. However, the metric is not reported in
> > > memory.stat. Given that users often interpret memory.stat as a breakdown
> > > of the value reported in memory.current, the disparity between the two
> > > reports can be confusing. This patch solves this problem by including
> > > the metric in memory.stat as well, but only if it is also reported in
> > > memory.current (it would also be confusing if the value was reported in
> > > memory.stat, but not in memory.current)
> > >
> > > Aside from the consistentcy between the two files, we also see benefits
> >
> > consistency*
> >
> > > in observability. Userspace might be interested in the hugeTLB footprint
> > > of cgroups for many reasons. For instance, system admins might want to
> > > verify that hugeTLB usage is distributed as expected across tasks: i.e.
> > > memory-intensive tasks are using more hugeTLB pages than tasks that
> > > don't consume a lot of memory, or is seen to fault frequently. Note that
> >
> > are* seen
> >
> > > this is separate from wanting to inspect the distribution for limiting
> > > purposes (in which case, hugeTLB controller makes more sense).
> > >
> > > 2. We already have a hugeTLB controller. Why not use that?
> > > It is true that hugeTLB tracks the exact value that we want. In fact, by
> > > enabling the hugeTLB controller, we get all of the observability
> > > benefits that I mentioned above, and users can check the total hugeTLB
> > > usage, verify if it is distributed as expected, etc.
> > >
> > > With this said, there are 2 problems:
> > > (a) They are still not reported in memory.stat, which means the
> > > disparity between the memcg reports are still there.
> > > (b) We cannot reasonably expect users to enable the hugeTLB controller
> > > just for the sake of hugeTLB usage reporting, especially since
> > > they don't have any use for hugeTLB usage enforcing [2].
> > >
> > > [1] https://lore.kernel.org/all/20231006184629.155543-1-nphamcs@gmail.com/
> > > [2] Of course, we can't make a new patch for every feature that can be
> > > duplicated. However, since the exsting solution of enabling the
> >
> > existing*
> >
> > > hugeTLB controller is an imperfect solution that still leaves a
> > > discrepancy between memory.stat and memory.curent, I think that it
> > > is reasonable to isolate the feature in this case.
> > >
> > > Suggested-by: Nhat Pham <nphamcs@gmail.com>
> > > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> >
> > You should also CC linux-mm on such patches.
>
> +roman.gushchin@linux.dev
>
> CCing Roman's correct email.
Funny enough I suggested a similar functionality back to 2017:
https://lkml.iu.edu/hypermail/linux/kernel/1711.1/04891.html ,
but it was rejected at that time.
So yeah, it still sounds like a good idea to me.
As I understand, there is a new version of this patchset pending,
I'll wait for it for the review.
Thank you for looping me in!
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
2024-10-23 21:15 ` Yosry Ahmed
2024-10-23 21:17 ` Yosry Ahmed
@ 2024-10-23 21:32 ` Joshua Hahn
2024-10-23 21:37 ` Yosry Ahmed
1 sibling, 1 reply; 10+ messages in thread
From: Joshua Hahn @ 2024-10-23 21:32 UTC (permalink / raw)
To: Yosry Ahmed
Cc: hannes, nphamcs, mhocko, roman.gushcin, shakeel.butt, muchun.song,
lnyng, akpm, cgroups
Hi Yosry, thank you for taking the time to review my patch. It seems
like I made a
lot of silly spelling / grammar / style mistakes in this patch, I'll
be more mindful of
these in the future.
On Wed, Oct 23, 2024 at 5:15 PM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> Hi Joshua,
> [...]
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 17506e4a2835..d3ba49a974b2 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -215,6 +215,9 @@ enum node_stat_item {
> > #ifdef CONFIG_NUMA_BALANCING
> > PGPROMOTE_SUCCESS, /* promote successfully */
> > PGPROMOTE_CANDIDATE, /* candidate pages to promote */
> > +#endif
> > +#ifdef CONFIG_HUGETLB_PAGE
> > + HUGETLB_B,
>
> Why '_B'?
I added _B because I am measuring the statistics in bytes (not pages).
IIRC some stat items use _B at the end to denote that the unit is in bytes,
so I put it at the end in the spirit of maintaining consistency. However, if
is not the case, I will remove it in the next version.
> [...]
> > #endif
> > /* PGDEMOTE_*: pages demoted */
> > PGDEMOTE_KSWAPD,
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 190fa05635f4..055bc91858e4 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1925,6 +1925,8 @@ void free_huge_folio(struct folio *folio)
> > pages_per_huge_page(h), folio);
> > hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
> > pages_per_huge_page(h), folio);
> > + if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
>
> I think we already have a couple of these checks and this patch adds a
> few more, perhaps we should add a helper at this point to improve
> readability? Maybe something like memcg_accounts_hugetlb()?
> [...]
As far as I can tell, these checks only come up in mem_cgroup_hugetlb_try_charge
and cgroup_show_options. Shakeel already requested a cleanup of the try charge
function in the v1 thread, so I think the best course of action is to
add the helper
function in this patch (series) and use that helper function in
another patch series to
clean up the remaining functions.
I'll be sure to add the other grammar / style changes that you
mentioned above in
v3 as well. Thank you again for your feedback!
Joshua
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
2024-10-23 21:32 ` Joshua Hahn
@ 2024-10-23 21:37 ` Yosry Ahmed
0 siblings, 0 replies; 10+ messages in thread
From: Yosry Ahmed @ 2024-10-23 21:37 UTC (permalink / raw)
To: Joshua Hahn
Cc: hannes, nphamcs, mhocko, shakeel.butt, muchun.song, lnyng, akpm,
cgroups, Linux-MM, Roman Gushchin
On Wed, Oct 23, 2024 at 2:32 PM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
>
> Hi Yosry, thank you for taking the time to review my patch. It seems
> like I made a
> lot of silly spelling / grammar / style mistakes in this patch, I'll
> be more mindful of
> these in the future.
>
> On Wed, Oct 23, 2024 at 5:15 PM Yosry Ahmed <yosryahmed@google.com> wrote:
> >
> > Hi Joshua,
> > [...]
> > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > > index 17506e4a2835..d3ba49a974b2 100644
> > > --- a/include/linux/mmzone.h
> > > +++ b/include/linux/mmzone.h
> > > @@ -215,6 +215,9 @@ enum node_stat_item {
> > > #ifdef CONFIG_NUMA_BALANCING
> > > PGPROMOTE_SUCCESS, /* promote successfully */
> > > PGPROMOTE_CANDIDATE, /* candidate pages to promote */
> > > +#endif
> > > +#ifdef CONFIG_HUGETLB_PAGE
> > > + HUGETLB_B,
> >
> > Why '_B'?
>
> I added _B because I am measuring the statistics in bytes (not pages).
> IIRC some stat items use _B at the end to denote that the unit is in bytes,
> so I put it at the end in the spirit of maintaining consistency. However, if
> is not the case, I will remove it in the next version.
I think ~all the memcg stats are exported in bytes, the ones that have
_B are also counted in bytes. I think in this case we are counting the
stats in pages.
See memcg_page_state_output_unit() and memcg_page_state_unit() for more details.
>
> > [...]
> > > #endif
> > > /* PGDEMOTE_*: pages demoted */
> > > PGDEMOTE_KSWAPD,
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 190fa05635f4..055bc91858e4 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -1925,6 +1925,8 @@ void free_huge_folio(struct folio *folio)
> > > pages_per_huge_page(h), folio);
> > > hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
> > > pages_per_huge_page(h), folio);
> > > + if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
> >
> > I think we already have a couple of these checks and this patch adds a
> > few more, perhaps we should add a helper at this point to improve
> > readability? Maybe something like memcg_accounts_hugetlb()?
> > [...]
>
> As far as I can tell, these checks only come up in mem_cgroup_hugetlb_try_charge
> and cgroup_show_options. Shakeel already requested a cleanup of the try charge
> function in the v1 thread, so I think the best course of action is to
> add the helper
> function in this patch (series) and use that helper function in
> another patch series to
> clean up the remaining functions.
Introducing the helper in this patch/series and using it in further
cleanups makes sense to me.
>
> I'll be sure to add the other grammar / style changes that you
> mentioned above in
> v3 as well. Thank you again for your feedback!
>
> Joshua
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
2024-10-23 20:34 [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg Joshua Hahn
2024-10-23 21:15 ` Yosry Ahmed
@ 2024-10-23 22:17 ` Shakeel Butt
2024-10-24 18:13 ` Joshua Hahn
2024-10-24 15:52 ` kernel test robot
2024-10-25 0:11 ` kernel test robot
3 siblings, 1 reply; 10+ messages in thread
From: Shakeel Butt @ 2024-10-23 22:17 UTC (permalink / raw)
To: Joshua Hahn
Cc: hannes, nphamcs, mhocko, roman.gushcin, muchun.song, lnyng, akpm,
cgroups
On Wed, Oct 23, 2024 at 01:34:33PM GMT, Joshua Hahn wrote:
> Changelog
> v2:
> * Enables the feature only if memcg accounts for hugeTLB usage
> * Moves the counter from memcg_stat_item to node_stat_item
> * Expands on motivation & justification in commitlog
> * Added Suggested-by: Nhat Pham
>
> This patch introduces a new counter to memory.stat that tracks hugeTLB
> usage, only if hugeTLB accounting is done to memory.current. This
> feature is enabled the same way hugeTLB accounting is enabled, via
> the memory_hugetlb_accounting mount flag for cgroupsv2.
>
> 1. Why is this patch necessary?
> Currently, memcg hugeTLB accounting is an opt-in feature [1] that adds
> hugeTLB usage to memory.current. However, the metric is not reported in
> memory.stat. Given that users often interpret memory.stat as a breakdown
> of the value reported in memory.current, the disparity between the two
> reports can be confusing. This patch solves this problem by including
> the metric in memory.stat as well, but only if it is also reported in
> memory.current (it would also be confusing if the value was reported in
> memory.stat, but not in memory.current)
>
> Aside from the consistentcy between the two files, we also see benefits
> in observability. Userspace might be interested in the hugeTLB footprint
> of cgroups for many reasons. For instance, system admins might want to
> verify that hugeTLB usage is distributed as expected across tasks: i.e.
> memory-intensive tasks are using more hugeTLB pages than tasks that
> don't consume a lot of memory, or is seen to fault frequently. Note that
> this is separate from wanting to inspect the distribution for limiting
> purposes (in which case, hugeTLB controller makes more sense).
>
> 2. We already have a hugeTLB controller. Why not use that?
> It is true that hugeTLB tracks the exact value that we want. In fact, by
> enabling the hugeTLB controller, we get all of the observability
> benefits that I mentioned above, and users can check the total hugeTLB
> usage, verify if it is distributed as expected, etc.
>
> With this said, there are 2 problems:
> (a) They are still not reported in memory.stat, which means the
> disparity between the memcg reports are still there.
> (b) We cannot reasonably expect users to enable the hugeTLB controller
> just for the sake of hugeTLB usage reporting, especially since
> they don't have any use for hugeTLB usage enforcing [2].
>
> [1] https://lore.kernel.org/all/20231006184629.155543-1-nphamcs@gmail.com/
> [2] Of course, we can't make a new patch for every feature that can be
> duplicated. However, since the exsting solution of enabling the
> hugeTLB controller is an imperfect solution that still leaves a
> discrepancy between memory.stat and memory.curent, I think that it
> is reasonable to isolate the feature in this case.
>
> Suggested-by: Nhat Pham <nphamcs@gmail.com>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>
> ---
> include/linux/mmzone.h | 3 +++
> mm/hugetlb.c | 4 ++++
> mm/memcontrol.c | 11 +++++++++++
> mm/vmstat.c | 3 +++
> 4 files changed, 21 insertions(+)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 17506e4a2835..d3ba49a974b2 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -215,6 +215,9 @@ enum node_stat_item {
> #ifdef CONFIG_NUMA_BALANCING
> PGPROMOTE_SUCCESS, /* promote successfully */
> PGPROMOTE_CANDIDATE, /* candidate pages to promote */
> +#endif
> +#ifdef CONFIG_HUGETLB_PAGE
> + HUGETLB_B,
As Yosry pointed out, this is in pages, not bytes. There is already
functionality to display this bin ytes for the readers of the memory
stats.
Also you will need to update Documentation/admin-guide/cgroup-v2.rst to
include the hugetlb stats.
> #endif
> /* PGDEMOTE_*: pages demoted */
> PGDEMOTE_KSWAPD,
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 190fa05635f4..055bc91858e4 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1925,6 +1925,8 @@ void free_huge_folio(struct folio *folio)
> pages_per_huge_page(h), folio);
> hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
> pages_per_huge_page(h), folio);
> + if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
> + lruvec_stat_mod_folio(folio, HUGETLB_B, -pages_per_huge_page(h));
Please note that by you are adding this stat not only in memcg but also
in global and per-node vmstat. This check will break those interfaces
when this mount option is not used. You only need the check at the
charging time. The uncharging and stats update functions will do the
right thing as they check memcg_data attached to the folio.
> mem_cgroup_uncharge(folio);
> if (restore_reserve)
> h->resv_huge_pages++;
> @@ -3094,6 +3096,8 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
> if (!memcg_charge_ret)
> mem_cgroup_commit_charge(folio, memcg);
> mem_cgroup_put(memcg);
> + if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
Same here.
> + lruvec_stat_mod_folio(folio, HUGETLB_B, pages_per_huge_page(h));
>
> return folio;
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
2024-10-23 22:17 ` Shakeel Butt
@ 2024-10-24 18:13 ` Joshua Hahn
0 siblings, 0 replies; 10+ messages in thread
From: Joshua Hahn @ 2024-10-24 18:13 UTC (permalink / raw)
To: Shakeel Butt
Cc: hannes, nphamcs, mhocko, roman.gushcin, muchun.song, lnyng, akpm,
cgroups
On Wed, Oct 23, 2024 at 6:17 PM Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> On Wed, Oct 23, 2024 at 01:34:33PM GMT, Joshua Hahn wrote:
> > include/linux/mmzone.h | 3 +++
> > mm/hugetlb.c | 4 ++++
> > mm/memcontrol.c | 11 +++++++++++
> > mm/vmstat.c | 3 +++
> > 4 files changed, 21 insertions(+)
> >
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 17506e4a2835..d3ba49a974b2 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -215,6 +215,9 @@ enum node_stat_item {
> > #ifdef CONFIG_NUMA_BALANCING
> > PGPROMOTE_SUCCESS, /* promote successfully */
> > PGPROMOTE_CANDIDATE, /* candidate pages to promote */
> > +#endif
> > +#ifdef CONFIG_HUGETLB_PAGE
> > + HUGETLB_B,
>
> As Yosry pointed out, this is in pages, not bytes. There is already
> functionality to display this bin ytes for the readers of the memory
> stats.
Ah I see. I misunderstood what the _B meant, I didn't realize that it's
used to denote what the units of accounting were, not for exporting. I just
checked the functions that Yosry mentioned, and it makes a lot more sense.
(I also should have known, the method that I'm using to do the accounting
is called "pages"_per_huge_page)
> Also you will need to update Documentation/admin-guide/cgroup-v2.rst to
> include the hugetlb stats.
Thank you for pointing this out, I'll update the doc to include hugetlb stats.
> > #endif
> > /* PGDEMOTE_*: pages demoted */
> > PGDEMOTE_KSWAPD,
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 190fa05635f4..055bc91858e4 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1925,6 +1925,8 @@ void free_huge_folio(struct folio *folio)
> > pages_per_huge_page(h), folio);
> > hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
> > pages_per_huge_page(h), folio);
> > + if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
> > + lruvec_stat_mod_folio(folio, HUGETLB_B, -pages_per_huge_page(h));
>
> Please note that by you are adding this stat not only in memcg but also
> in global and per-node vmstat. This check will break those interfaces
> when this mount option is not used. You only need the check at the
> charging time. The uncharging and stats update functions will do the
> right thing as they check memcg_data attached to the folio.
Sounds good. I'll go back to make sure which other parts of the code
I'm touching
with this patch. Thank you for your feedback!
Joshua
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
2024-10-23 20:34 [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg Joshua Hahn
2024-10-23 21:15 ` Yosry Ahmed
2024-10-23 22:17 ` Shakeel Butt
@ 2024-10-24 15:52 ` kernel test robot
2024-10-25 0:11 ` kernel test robot
3 siblings, 0 replies; 10+ messages in thread
From: kernel test robot @ 2024-10-24 15:52 UTC (permalink / raw)
To: Joshua Hahn, hannes
Cc: oe-kbuild-all, nphamcs, mhocko, roman.gushcin, shakeel.butt,
muchun.song, lnyng, akpm, cgroups
Hi Joshua,
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on linus/master v6.12-rc4 next-20241024]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Joshua-Hahn/memcg-hugetlb-Adding-hugeTLB-counters-to-memcg/20241024-043559
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20241023203433.1568323-1-joshua.hahnjy%40gmail.com
patch subject: [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
config: powerpc-ep88xc_defconfig (https://download.01.org/0day-ci/archive/20241024/202410242321.pttwmbpo-lkp@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241024/202410242321.pttwmbpo-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410242321.pttwmbpo-lkp@intel.com/
All errors (new ones prefixed by >>):
mm/hugetlb.c: In function 'free_huge_folio':
>> mm/hugetlb.c:1928:13: error: 'cgrp_dfl_root' undeclared (first use in this function)
1928 | if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
| ^~~~~~~~~~~~~
mm/hugetlb.c:1928:13: note: each undeclared identifier is reported only once for each function it appears in
>> mm/hugetlb.c:1928:35: error: 'CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING' undeclared (first use in this function)
1928 | if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/hugetlb.c: In function 'alloc_hugetlb_folio':
mm/hugetlb.c:3099:13: error: 'cgrp_dfl_root' undeclared (first use in this function)
3099 | if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
| ^~~~~~~~~~~~~
mm/hugetlb.c:3099:35: error: 'CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING' undeclared (first use in this function)
3099 | if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vim +/cgrp_dfl_root +1928 mm/hugetlb.c
1880
1881 void free_huge_folio(struct folio *folio)
1882 {
1883 /*
1884 * Can't pass hstate in here because it is called from the
1885 * generic mm code.
1886 */
1887 struct hstate *h = folio_hstate(folio);
1888 int nid = folio_nid(folio);
1889 struct hugepage_subpool *spool = hugetlb_folio_subpool(folio);
1890 bool restore_reserve;
1891 unsigned long flags;
1892
1893 VM_BUG_ON_FOLIO(folio_ref_count(folio), folio);
1894 VM_BUG_ON_FOLIO(folio_mapcount(folio), folio);
1895
1896 hugetlb_set_folio_subpool(folio, NULL);
1897 if (folio_test_anon(folio))
1898 __ClearPageAnonExclusive(&folio->page);
1899 folio->mapping = NULL;
1900 restore_reserve = folio_test_hugetlb_restore_reserve(folio);
1901 folio_clear_hugetlb_restore_reserve(folio);
1902
1903 /*
1904 * If HPageRestoreReserve was set on page, page allocation consumed a
1905 * reservation. If the page was associated with a subpool, there
1906 * would have been a page reserved in the subpool before allocation
1907 * via hugepage_subpool_get_pages(). Since we are 'restoring' the
1908 * reservation, do not call hugepage_subpool_put_pages() as this will
1909 * remove the reserved page from the subpool.
1910 */
1911 if (!restore_reserve) {
1912 /*
1913 * A return code of zero implies that the subpool will be
1914 * under its minimum size if the reservation is not restored
1915 * after page is free. Therefore, force restore_reserve
1916 * operation.
1917 */
1918 if (hugepage_subpool_put_pages(spool, 1) == 0)
1919 restore_reserve = true;
1920 }
1921
1922 spin_lock_irqsave(&hugetlb_lock, flags);
1923 folio_clear_hugetlb_migratable(folio);
1924 hugetlb_cgroup_uncharge_folio(hstate_index(h),
1925 pages_per_huge_page(h), folio);
1926 hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
1927 pages_per_huge_page(h), folio);
> 1928 if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
1929 lruvec_stat_mod_folio(folio, HUGETLB_B, -pages_per_huge_page(h));
1930 mem_cgroup_uncharge(folio);
1931 if (restore_reserve)
1932 h->resv_huge_pages++;
1933
1934 if (folio_test_hugetlb_temporary(folio)) {
1935 remove_hugetlb_folio(h, folio, false);
1936 spin_unlock_irqrestore(&hugetlb_lock, flags);
1937 update_and_free_hugetlb_folio(h, folio, true);
1938 } else if (h->surplus_huge_pages_node[nid]) {
1939 /* remove the page from active list */
1940 remove_hugetlb_folio(h, folio, true);
1941 spin_unlock_irqrestore(&hugetlb_lock, flags);
1942 update_and_free_hugetlb_folio(h, folio, true);
1943 } else {
1944 arch_clear_hugetlb_flags(folio);
1945 enqueue_hugetlb_folio(h, folio);
1946 spin_unlock_irqrestore(&hugetlb_lock, flags);
1947 }
1948 }
1949
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
2024-10-23 20:34 [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg Joshua Hahn
` (2 preceding siblings ...)
2024-10-24 15:52 ` kernel test robot
@ 2024-10-25 0:11 ` kernel test robot
3 siblings, 0 replies; 10+ messages in thread
From: kernel test robot @ 2024-10-25 0:11 UTC (permalink / raw)
To: Joshua Hahn, hannes
Cc: llvm, oe-kbuild-all, nphamcs, mhocko, roman.gushcin, shakeel.butt,
muchun.song, lnyng, akpm, cgroups
Hi Joshua,
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on linus/master v6.12-rc4 next-20241024]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Joshua-Hahn/memcg-hugetlb-Adding-hugeTLB-counters-to-memcg/20241024-043559
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20241023203433.1568323-1-joshua.hahnjy%40gmail.com
patch subject: [PATCH v2 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
config: i386-buildonly-randconfig-003-20241025 (https://download.01.org/0day-ci/archive/20241025/202410250704.0LVzYMzi-lkp@intel.com/config)
compiler: clang version 19.1.2 (https://github.com/llvm/llvm-project 7ba7d8e2f7b6445b60679da826210cdde29eaf8b)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241025/202410250704.0LVzYMzi-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410250704.0LVzYMzi-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from mm/hugetlb.c:8:
In file included from include/linux/mm.h:2213:
include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
518 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
| ~~~~~~~~~~~ ^ ~~~
In file included from mm/hugetlb.c:37:
include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
47 | __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
| ~~~~~~~~~~~ ^ ~~~
include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
49 | NR_ZONE_LRU_BASE + lru, nr_pages);
| ~~~~~~~~~~~~~~~~ ^ ~~~
>> mm/hugetlb.c:1928:6: error: use of undeclared identifier 'cgrp_dfl_root'
1928 | if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
| ^
>> mm/hugetlb.c:1928:28: error: use of undeclared identifier 'CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING'
1928 | if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
| ^
mm/hugetlb.c:3099:6: error: use of undeclared identifier 'cgrp_dfl_root'
3099 | if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
| ^
mm/hugetlb.c:3099:28: error: use of undeclared identifier 'CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING'
3099 | if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
| ^
3 warnings and 4 errors generated.
vim +/cgrp_dfl_root +1928 mm/hugetlb.c
1880
1881 void free_huge_folio(struct folio *folio)
1882 {
1883 /*
1884 * Can't pass hstate in here because it is called from the
1885 * generic mm code.
1886 */
1887 struct hstate *h = folio_hstate(folio);
1888 int nid = folio_nid(folio);
1889 struct hugepage_subpool *spool = hugetlb_folio_subpool(folio);
1890 bool restore_reserve;
1891 unsigned long flags;
1892
1893 VM_BUG_ON_FOLIO(folio_ref_count(folio), folio);
1894 VM_BUG_ON_FOLIO(folio_mapcount(folio), folio);
1895
1896 hugetlb_set_folio_subpool(folio, NULL);
1897 if (folio_test_anon(folio))
1898 __ClearPageAnonExclusive(&folio->page);
1899 folio->mapping = NULL;
1900 restore_reserve = folio_test_hugetlb_restore_reserve(folio);
1901 folio_clear_hugetlb_restore_reserve(folio);
1902
1903 /*
1904 * If HPageRestoreReserve was set on page, page allocation consumed a
1905 * reservation. If the page was associated with a subpool, there
1906 * would have been a page reserved in the subpool before allocation
1907 * via hugepage_subpool_get_pages(). Since we are 'restoring' the
1908 * reservation, do not call hugepage_subpool_put_pages() as this will
1909 * remove the reserved page from the subpool.
1910 */
1911 if (!restore_reserve) {
1912 /*
1913 * A return code of zero implies that the subpool will be
1914 * under its minimum size if the reservation is not restored
1915 * after page is free. Therefore, force restore_reserve
1916 * operation.
1917 */
1918 if (hugepage_subpool_put_pages(spool, 1) == 0)
1919 restore_reserve = true;
1920 }
1921
1922 spin_lock_irqsave(&hugetlb_lock, flags);
1923 folio_clear_hugetlb_migratable(folio);
1924 hugetlb_cgroup_uncharge_folio(hstate_index(h),
1925 pages_per_huge_page(h), folio);
1926 hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
1927 pages_per_huge_page(h), folio);
> 1928 if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
1929 lruvec_stat_mod_folio(folio, HUGETLB_B, -pages_per_huge_page(h));
1930 mem_cgroup_uncharge(folio);
1931 if (restore_reserve)
1932 h->resv_huge_pages++;
1933
1934 if (folio_test_hugetlb_temporary(folio)) {
1935 remove_hugetlb_folio(h, folio, false);
1936 spin_unlock_irqrestore(&hugetlb_lock, flags);
1937 update_and_free_hugetlb_folio(h, folio, true);
1938 } else if (h->surplus_huge_pages_node[nid]) {
1939 /* remove the page from active list */
1940 remove_hugetlb_folio(h, folio, true);
1941 spin_unlock_irqrestore(&hugetlb_lock, flags);
1942 update_and_free_hugetlb_folio(h, folio, true);
1943 } else {
1944 arch_clear_hugetlb_flags(folio);
1945 enqueue_hugetlb_folio(h, folio);
1946 spin_unlock_irqrestore(&hugetlb_lock, flags);
1947 }
1948 }
1949
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 10+ messages in thread