* [PATCH v2] mm: memcg: Don't periodically flush stats when memcg is disabled
@ 2024-01-26 21:19 T.J. Mercier
2024-01-26 21:28 ` Chris Li
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: T.J. Mercier @ 2024-01-26 21:19 UTC (permalink / raw)
To: tjmercier, Johannes Weiner, Michal Hocko, Roman Gushchin,
Shakeel Butt, Muchun Song, Andrew Morton
Cc: android-mm, Minchan Kim, cgroups, linux-mm, linux-kernel
The root memcg is onlined even when memcg is disabled. When it's onlined
a 2 second periodic stat flush is started, but no stat flushing is
required when memcg is disabled because there can be no child memcgs.
Most calls to flush memcg stats are avoided when memcg is disabled as a
result of the mem_cgroup_disabled check added in 7d7ef0a4686a
("mm: memcg: restore subtree stats flushing"), but the periodic flushing
started in mem_cgroup_css_online is not. Skip it.
Fixes: aa48e47e3906 ("memcg: infrastructure to flush memcg stats")
Reported-by: Minchan Kim <minchan@google.com>
Signed-off-by: T.J. Mercier <tjmercier@google.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
---
mm/memcontrol.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e4c8735e7c85..bad8f9dfc9ab 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5586,7 +5586,7 @@ static int mem_cgroup_css_online(struct cgroup_subsys_state *css)
if (alloc_shrinker_info(memcg))
goto offline_kmem;
- if (unlikely(mem_cgroup_is_root(memcg)))
+ if (unlikely(mem_cgroup_is_root(memcg)) && !mem_cgroup_disabled())
queue_delayed_work(system_unbound_wq, &stats_flush_dwork,
FLUSH_TIME);
lru_gen_online_memcg(memcg);
--
2.43.0.429.g432eaa2c6b-goog
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH v2] mm: memcg: Don't periodically flush stats when memcg is disabled 2024-01-26 21:19 [PATCH v2] mm: memcg: Don't periodically flush stats when memcg is disabled T.J. Mercier @ 2024-01-26 21:28 ` Chris Li 2024-01-29 20:32 ` Yosry Ahmed 2024-02-01 14:26 ` Michal Koutný 2 siblings, 0 replies; 6+ messages in thread From: Chris Li @ 2024-01-26 21:28 UTC (permalink / raw) To: T.J. Mercier Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, android-mm, Minchan Kim, cgroups, linux-mm, linux-kernel Hi T.J. Acked-by: Chris Li <chrisl@kernel.org> Chris On Fri, Jan 26, 2024 at 1:19 PM T.J. Mercier <tjmercier@google.com> wrote: > > The root memcg is onlined even when memcg is disabled. When it's onlined > a 2 second periodic stat flush is started, but no stat flushing is > required when memcg is disabled because there can be no child memcgs. > Most calls to flush memcg stats are avoided when memcg is disabled as a > result of the mem_cgroup_disabled check added in 7d7ef0a4686a > ("mm: memcg: restore subtree stats flushing"), but the periodic flushing > started in mem_cgroup_css_online is not. Skip it. > > Fixes: aa48e47e3906 ("memcg: infrastructure to flush memcg stats") > Reported-by: Minchan Kim <minchan@google.com> > Signed-off-by: T.J. Mercier <tjmercier@google.com> > Acked-by: Shakeel Butt <shakeelb@google.com> > Acked-by: Johannes Weiner <hannes@cmpxchg.org> > --- > mm/memcontrol.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index e4c8735e7c85..bad8f9dfc9ab 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -5586,7 +5586,7 @@ static int mem_cgroup_css_online(struct cgroup_subsys_state *css) > if (alloc_shrinker_info(memcg)) > goto offline_kmem; > > - if (unlikely(mem_cgroup_is_root(memcg))) > + if (unlikely(mem_cgroup_is_root(memcg)) && !mem_cgroup_disabled()) > queue_delayed_work(system_unbound_wq, &stats_flush_dwork, > FLUSH_TIME); > lru_gen_online_memcg(memcg); > -- > 2.43.0.429.g432eaa2c6b-goog > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] mm: memcg: Don't periodically flush stats when memcg is disabled 2024-01-26 21:19 [PATCH v2] mm: memcg: Don't periodically flush stats when memcg is disabled T.J. Mercier 2024-01-26 21:28 ` Chris Li @ 2024-01-29 20:32 ` Yosry Ahmed 2024-02-01 14:26 ` Michal Koutný 2 siblings, 0 replies; 6+ messages in thread From: Yosry Ahmed @ 2024-01-29 20:32 UTC (permalink / raw) To: T.J. Mercier Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, android-mm, Minchan Kim, cgroups, linux-mm, linux-kernel On Fri, Jan 26, 2024 at 1:19 PM T.J. Mercier <tjmercier@google.com> wrote: > > The root memcg is onlined even when memcg is disabled. When it's onlined > a 2 second periodic stat flush is started, but no stat flushing is > required when memcg is disabled because there can be no child memcgs. > Most calls to flush memcg stats are avoided when memcg is disabled as a > result of the mem_cgroup_disabled check added in 7d7ef0a4686a > ("mm: memcg: restore subtree stats flushing"), but the periodic flushing > started in mem_cgroup_css_online is not. Skip it. > > Fixes: aa48e47e3906 ("memcg: infrastructure to flush memcg stats") > Reported-by: Minchan Kim <minchan@google.com> > Signed-off-by: T.J. Mercier <tjmercier@google.com> > Acked-by: Shakeel Butt <shakeelb@google.com> > Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Yosry Ahmed <yosryahmed@google.com> Thanks! ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] mm: memcg: Don't periodically flush stats when memcg is disabled 2024-01-26 21:19 [PATCH v2] mm: memcg: Don't periodically flush stats when memcg is disabled T.J. Mercier 2024-01-26 21:28 ` Chris Li 2024-01-29 20:32 ` Yosry Ahmed @ 2024-02-01 14:26 ` Michal Koutný 2024-02-01 21:02 ` T.J. Mercier 2 siblings, 1 reply; 6+ messages in thread From: Michal Koutný @ 2024-02-01 14:26 UTC (permalink / raw) To: T.J. Mercier Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, android-mm, Minchan Kim, cgroups, linux-mm, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1673 bytes --] On Fri, Jan 26, 2024 at 09:19:25PM +0000, "T.J. Mercier" <tjmercier@google.com> wrote: > The root memcg is onlined even when memcg is disabled. When it's onlined > a 2 second periodic stat flush is started, but no stat flushing is > required when memcg is disabled because there can be no child memcgs. > Most calls to flush memcg stats are avoided when memcg is disabled as a > result of the mem_cgroup_disabled check added in 7d7ef0a4686a > ("mm: memcg: restore subtree stats flushing"), but the periodic flushing > started in mem_cgroup_css_online is not. Skip it. Have you tried --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -6099,6 +6099,9 @@ int __init cgroup_init(void) cgroup_unlock(); for_each_subsys(ss, ssid) { + if (!cgroup_ssid_enabled(ssid)) + continue; + if (ss->early_init) { struct cgroup_subsys_state *css = init_css_set.subsys[ss->id]; @@ -6118,9 +6121,6 @@ int __init cgroup_init(void) * disabled flag and cftype registration needs kmalloc, * both of which aren't available during early_init. */ - if (!cgroup_ssid_enabled(ssid)) - continue; - if (cgroup1_ssid_disabled(ssid)) pr_info("Disabling %s control group subsystem in v1 mounts\n", ss->legacy_name); ? I'm asking about a try because I'm not sure whether this does not blow up due to something missing. But I think disabled controllers would not need to be (root-)onlined at all. Thanks, Michal [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] mm: memcg: Don't periodically flush stats when memcg is disabled 2024-02-01 14:26 ` Michal Koutný @ 2024-02-01 21:02 ` T.J. Mercier 2024-02-02 22:39 ` Michal Koutný 0 siblings, 1 reply; 6+ messages in thread From: T.J. Mercier @ 2024-02-01 21:02 UTC (permalink / raw) To: Michal Koutný Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, android-mm, Minchan Kim, cgroups, linux-mm, linux-kernel On Thu, Feb 1, 2024 at 6:26 AM Michal Koutný <mkoutny@suse.com> wrote: > > On Fri, Jan 26, 2024 at 09:19:25PM +0000, "T.J. Mercier" <tjmercier@google.com> wrote: > > The root memcg is onlined even when memcg is disabled. When it's onlined > > a 2 second periodic stat flush is started, but no stat flushing is > > required when memcg is disabled because there can be no child memcgs. > > Most calls to flush memcg stats are avoided when memcg is disabled as a > > result of the mem_cgroup_disabled check added in 7d7ef0a4686a > > ("mm: memcg: restore subtree stats flushing"), but the periodic flushing > > started in mem_cgroup_css_online is not. Skip it. > > Have you tried > --- a/kernel/cgroup/cgroup.c > +++ b/kernel/cgroup/cgroup.c > @@ -6099,6 +6099,9 @@ int __init cgroup_init(void) > cgroup_unlock(); > > for_each_subsys(ss, ssid) { > + if (!cgroup_ssid_enabled(ssid)) > + continue; > + > if (ss->early_init) { > struct cgroup_subsys_state *css = > init_css_set.subsys[ss->id]; > @@ -6118,9 +6121,6 @@ int __init cgroup_init(void) > * disabled flag and cftype registration needs kmalloc, > * both of which aren't available during early_init. > */ > - if (!cgroup_ssid_enabled(ssid)) > - continue; > - > if (cgroup1_ssid_disabled(ssid)) > pr_info("Disabling %s control group subsystem in v1 mounts\n", > ss->legacy_name); > ? > I'm asking about a try because I'm not sure whether this does not blow > up due to something missing. But I think disabled controllers would not > need to be (root-)onlined at all. > > Thanks, > Michal Hi Michal, It does blow up, but not how I was expecting. There's a null pointer dereference inside find_css_set when trying to get a css pointer for the memory controller, I think because the allocation in cgroup_init_subsys is skipped: [ 9.591766] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 9.591909] #PF: supervisor read access in kernel mode [ 9.592023] #PF: error_code(0x0000) - not-present page [ 9.592138] PGD 0 P4D 0 [ 9.592199] Oops: 0000 [#1] PREEMPT SMP PTI [ 9.592289] CPU: 1 PID: 1 Comm: init Tainted: G E 6.7.0-mainline-maybe-dirty #1 [ 9.592490] Hardware name: emulation qemu-x86/qemu-x86, BIOS 2023.07-rc6-gb8315a3184b2-ab11347395 07/01/2023 [ 9.592731] RIP: 0010:find_css_set+0x5c3/0x7a0 [ 9.592850] Code: 20 69 cf b2 4e 8b 3c 33 48 c7 c7 1d 3b 41 b2 4c 89 fe e8 10 b0 f7 00 49 8b b4 24 a0 00 00 00 4e 8d 24 2b 49 81 c4 b0 fc ff ff <49> 8b 0f 4c 01 e9 48 c7 c7 c4 c0 46 b2 4c 89 e2 e8 e8 af f7 00 49 [ 9.593251] RSP: 0018:ffffb6218000bb90 EFLAGS: 00010087 [ 9.593370] RAX: 0000000000000021 RBX: ffff99181044a200 RCX: 4f5e789f089a0c00 [ 9.593554] RDX: ffffb6218000ba50 RSI: ffffffffb2448284 RDI: ffffffffb2c91950 [ 9.593735] RBP: ffffb6218000bc28 R08: 0000000000000fff R09: ffffffffb2c79950 [ 9.593909] R10: 0000000000002ffd R11: 0000000000000004 R12: ffff99181044a2d8 [ 9.594102] R13: 0000000000000428 R14: 0000000000000020 R15: 0000000000000000 [ 9.594291] FS: 00007f3d2f986fc8(0000) GS:ffff99182bd00000(0000) knlGS:0000000000000000 [ 9.594467] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9.594610] CR2: 0000000000000000 CR3: 00000001001d8001 CR4: 0000000000370eb0 [ 9.594818] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 9.595007] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 9.595178] Call Trace: [ 9.595237] <TASK> [ 9.595297] ? __die_body+0x5e/0xb0 [ 9.595392] ? __die+0x9e/0xb0 [ 9.595481] ? page_fault_oops+0x35f/0x3d0 [ 9.595574] ? do_user_addr_fault+0x6ab/0x780 [ 9.595690] ? prb_read_valid+0x28/0x60 [ 9.595782] ? exc_page_fault+0x83/0x220 [ 9.595874] ? asm_exc_page_fault+0x2b/0x30 [ 9.595966] ? find_css_set+0x5c3/0x7a0 [ 9.596059] cgroup_migrate_prepare_dst+0x75/0x2b0 [ 9.596194] cgroup_attach_task+0x293/0x450 [ 9.596305] ? cgroup_attach_task+0xb6/0x450 [ 9.596449] __cgroup_procs_write+0xef/0x1a0 [ 9.596589] cgroup_procs_write+0x16/0x30 [ 9.596733] cgroup_file_write+0x9d/0x260 [ 9.596840] kernfs_fop_write_iter+0x145/0x1e0 [ 9.596981] vfs_write+0x276/0x2e0 [ 9.597092] ksys_write+0x73/0xe0 [ 9.597198] __x64_sys_write+0x1a/0x30 [ 9.597303] do_syscall_64+0x5a/0x100 [ 9.597430] entry_SYSCALL_64_after_hwframe+0x6e/0x76 But there are also calls to mem_cgroup_css_from_folio that I think would cause a different null pointer deref even if we had the css but no root_mem_cgroup. There seems to be an assumption that we'll have a memcg to charge against even when the controller is disabled. To me it looks like that's to simplify the possible combinations of CONFIG_MEMCG and memcg being boot-time disabled or not. Best, T.J. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Re: [PATCH v2] mm: memcg: Don't periodically flush stats when memcg is disabled 2024-02-01 21:02 ` T.J. Mercier @ 2024-02-02 22:39 ` Michal Koutný 0 siblings, 0 replies; 6+ messages in thread From: Michal Koutný @ 2024-02-02 22:39 UTC (permalink / raw) To: T.J. Mercier Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, android-mm, Minchan Kim, cgroups, linux-mm, linux-kernel On Thu, Feb 01, 2024 at 01:02:04PM -0800, "T.J. Mercier" <tjmercier@google.com> wrote: > It does blow up, but not how I was expecting. There's a null pointer > dereference inside find_css_set when trying to get a css pointer for > the memory controller, I think because the allocation in > cgroup_init_subsys is skipped: Thanks for trying! I suspected it won't be easy. At the same time I suspected there must be a hook for your purpose -- after looking at cpuset, I was reminded of cgroup_subsys.bind callback. What about triggering periodic flush in that callback? (memcg doesn't implement it yet but cgroup_init() takes it into account.) It would take any dwork activation out of mem_cgroup_css_online() and it seems cleaner. (Ideally, the flush could be disabled again when memcg root is unmounted again. (That's impossible and practically unused but that's why consider callback approach cleaner. Of course, your original guard serves the purpose too.)) Regards, Michal ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-02-02 22:39 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-01-26 21:19 [PATCH v2] mm: memcg: Don't periodically flush stats when memcg is disabled T.J. Mercier 2024-01-26 21:28 ` Chris Li 2024-01-29 20:32 ` Yosry Ahmed 2024-02-01 14:26 ` Michal Koutný 2024-02-01 21:02 ` T.J. Mercier 2024-02-02 22:39 ` Michal Koutný
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).