linux-kernel.vger.kernel.org archive mirror
* [PATCH] memcg: Optimize exit to user space
@ 2025-08-13 14:57 Thomas Gleixner
  2025-08-13 15:45 ` Roman Gushchin
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Thomas Gleixner @ 2025-08-13 14:57 UTC (permalink / raw)
  To: linux-mm
  Cc: LKML, Peter Zijlstra, Johannes Weiner, Michal Hocko,
	Roman Gushchin, Shakeel Butt, Muchun Song, cgroups, Andrew Morton

memcg uses TIF_NOTIFY_RESUME to handle reclaiming on exit to user
space. TIF_NOTIFY_RESUME is a multiplexing TIF bit, which is utilized by
other entities as well.

This results in an unconditional mem_cgroup_handle_over_high() call for
every invocation of resume_user_mode_work(), which is a pointless
exercise as most of the time there is no reclaim work to do.

Especially since RSEQ is used by glibc, TIF_NOTIFY_RESUME is raised
quite frequently and the empty calls show up in exit path profiling.

Optimize this by doing a quick check of the reclaim condition before
invoking it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 include/linux/memcontrol.h |    8 +++++++-
 mm/memcontrol.c            |    4 ++--
 2 files changed, 9 insertions(+), 3 deletions(-)

--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -900,7 +900,13 @@ unsigned long mem_cgroup_get_zone_lru_si
 	return READ_ONCE(mz->lru_zone_size[zone_idx][lru]);
 }
 
-void mem_cgroup_handle_over_high(gfp_t gfp_mask);
+void __mem_cgroup_handle_over_high(gfp_t gfp_mask);
+
+static inline void mem_cgroup_handle_over_high(gfp_t gfp_mask)
+{
+	if (unlikely(current->memcg_nr_pages_over_high))
+		__mem_cgroup_handle_over_high(gfp_mask);
+}
 
 unsigned long mem_cgroup_get_max(struct mem_cgroup *memcg);
 
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2203,7 +2203,7 @@ static unsigned long calculate_high_dela
  * try_charge() (context permitting), as well as from the userland
  * return path where reclaim is always able to block.
  */
-void mem_cgroup_handle_over_high(gfp_t gfp_mask)
+void __mem_cgroup_handle_over_high(gfp_t gfp_mask)
 {
 	unsigned long penalty_jiffies;
 	unsigned long pflags;
@@ -2486,7 +2486,7 @@ static int try_charge_memcg(struct mem_c
 	if (current->memcg_nr_pages_over_high > MEMCG_CHARGE_BATCH &&
 	    !(current->flags & PF_MEMALLOC) &&
 	    gfpflags_allow_blocking(gfp_mask))
-		mem_cgroup_handle_over_high(gfp_mask);
+		__mem_cgroup_handle_over_high(gfp_mask);
 	return 0;
 }
 

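The pattern in the patch above, a cheap inline check in front of an out-of-line slow path, can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct task_sketch, handle_over_high_slow(), handle_over_high() and simulate_exits() are hypothetical names that do not exist in the kernel, and unlikely() is re-defined here on top of the GCC/Clang builtin __builtin_expect().

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's unlikely() hint (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for current->memcg_nr_pages_over_high. */
struct task_sketch {
	unsigned int nr_pages_over_high;
};

/* Counts how often the expensive path was entered. */
static unsigned int slow_path_calls;

/* Out-of-line slow path: pretend to reclaim, then clear the pending count. */
static void handle_over_high_slow(struct task_sketch *t)
{
	slow_path_calls++;
	t->nr_pages_over_high = 0;
}

/* Inline fast path: one load and one branch when there is nothing to do. */
static inline void handle_over_high(struct task_sketch *t)
{
	if (unlikely(t->nr_pages_over_high))
		handle_over_high_slow(t);
}

/*
 * Simulate 'exits' returns to user space. Every 'marked_every'-th exit
 * (starting with the first) has pages pending; 0 means never.
 * Returns how many times the slow path ran during this simulation.
 */
static unsigned int simulate_exits(unsigned int exits, unsigned int marked_every)
{
	struct task_sketch t = { 0 };
	unsigned int before = slow_path_calls;

	for (unsigned int i = 0; i < exits; i++) {
		if (marked_every && i % marked_every == 0)
			t.nr_pages_over_high = 1;
		handle_over_high(&t);
	}
	return slow_path_calls - before;
}
```

In this sketch, simulate_exits(1000, 100) enters the slow path 10 times; the other 990 exits pay only for the inline check rather than a function call.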

* Re: [PATCH] memcg: Optimize exit to user space
  2025-08-13 14:57 [PATCH] memcg: Optimize exit to user space Thomas Gleixner
@ 2025-08-13 15:45 ` Roman Gushchin
  2025-08-13 16:17 ` Johannes Weiner
  2025-08-13 17:19 ` Shakeel Butt
  2 siblings, 0 replies; 6+ messages in thread
From: Roman Gushchin @ 2025-08-13 15:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-mm, LKML, Peter Zijlstra, Johannes Weiner, Michal Hocko,
	Shakeel Butt, Muchun Song, cgroups, Andrew Morton

Thomas Gleixner <tglx@linutronix.de> writes:

> memcg uses TIF_NOTIFY_RESUME to handle reclaiming on exit to user
> space. TIF_NOTIFY_RESUME is a multiplexing TIF bit, which is utilized by
> other entities as well.
>
> This results in an unconditional mem_cgroup_handle_over_high() call for
> every invocation of resume_user_mode_work(), which is a pointless
> exercise as most of the time there is no reclaim work to do.
>
> Especially since RSEQ is used by glibc, TIF_NOTIFY_RESUME is raised
> quite frequently and the empty calls show up in exit path profiling.
>
> Optimize this by doing a quick check of the reclaim condition before
> invoking it.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Shakeel Butt <shakeel.butt@linux.dev>
> Cc: Muchun Song <muchun.song@linux.dev>
> Cc: Andrew Morton <akpm@linux-foundation.org>

Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>

Thanks!


* Re: [PATCH] memcg: Optimize exit to user space
  2025-08-13 14:57 [PATCH] memcg: Optimize exit to user space Thomas Gleixner
  2025-08-13 15:45 ` Roman Gushchin
@ 2025-08-13 16:17 ` Johannes Weiner
  2025-08-13 17:19 ` Shakeel Butt
  2 siblings, 0 replies; 6+ messages in thread
From: Johannes Weiner @ 2025-08-13 16:17 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-mm, LKML, Peter Zijlstra, Michal Hocko, Roman Gushchin,
	Shakeel Butt, Muchun Song, cgroups, Andrew Morton

On Wed, Aug 13, 2025 at 04:57:55PM +0200, Thomas Gleixner wrote:
> memcg uses TIF_NOTIFY_RESUME to handle reclaiming on exit to user
> space. TIF_NOTIFY_RESUME is a multiplexing TIF bit, which is utilized by
> other entities as well.
> 
> This results in an unconditional mem_cgroup_handle_over_high() call for
> every invocation of resume_user_mode_work(), which is a pointless
> exercise as most of the time there is no reclaim work to do.
> 
> Especially since RSEQ is used by glibc, TIF_NOTIFY_RESUME is raised
> quite frequently and the empty calls show up in exit path profiling.
> 
> Optimize this by doing a quick check of the reclaim condition before
> invoking it.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Shakeel Butt <shakeel.butt@linux.dev>
> Cc: Muchun Song <muchun.song@linux.dev>
> Cc: Andrew Morton <akpm@linux-foundation.org>

Nice!

Acked-by: Johannes Weiner <hannes@cmpxchg.org>


* Re: [PATCH] memcg: Optimize exit to user space
  2025-08-13 14:57 [PATCH] memcg: Optimize exit to user space Thomas Gleixner
  2025-08-13 15:45 ` Roman Gushchin
  2025-08-13 16:17 ` Johannes Weiner
@ 2025-08-13 17:19 ` Shakeel Butt
  2025-08-13 21:25   ` Thomas Gleixner
  2025-08-13 22:40   ` Andrew Morton
  2 siblings, 2 replies; 6+ messages in thread
From: Shakeel Butt @ 2025-08-13 17:19 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-mm, LKML, Peter Zijlstra, Johannes Weiner, Michal Hocko,
	Roman Gushchin, Muchun Song, cgroups, Andrew Morton

On Wed, Aug 13, 2025 at 04:57:55PM +0200, Thomas Gleixner wrote:
> memcg uses TIF_NOTIFY_RESUME to handle reclaiming on exit to user
> space. TIF_NOTIFY_RESUME is a multiplexing TIF bit, which is utilized by
> other entities as well.
> 
> This results in an unconditional mem_cgroup_handle_over_high() call for
> every invocation of resume_user_mode_work(), which is a pointless
> exercise as most of the time there is no reclaim work to do.
> 
> Especially since RSEQ is used by glibc, TIF_NOTIFY_RESUME is raised
> quite frequently and the empty calls show up in exit path profiling.
> 
> Optimize this by doing a quick check of the reclaim condition before
> invoking it.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Shakeel Butt <shakeel.butt@linux.dev>
> Cc: Muchun Song <muchun.song@linux.dev>
> Cc: Andrew Morton <akpm@linux-foundation.org>

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>

Since this is seen in profiling data and it is simple enough, I think it
is worth backporting to stable trees as well.

In the followup cleanup, we can remove the (!nr_pages) check inside
__mem_cgroup_handle_over_high() as well.



* Re: [PATCH] memcg: Optimize exit to user space
  2025-08-13 17:19 ` Shakeel Butt
@ 2025-08-13 21:25   ` Thomas Gleixner
  2025-08-13 22:40   ` Andrew Morton
  1 sibling, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2025-08-13 21:25 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: linux-mm, LKML, Peter Zijlstra, Johannes Weiner, Michal Hocko,
	Roman Gushchin, Muchun Song, cgroups, Andrew Morton

On Wed, Aug 13 2025 at 10:19, Shakeel Butt wrote:
> On Wed, Aug 13, 2025 at 04:57:55PM +0200, Thomas Gleixner wrote:
> Since this is seen in profiling data and it is simple enough, I think it
> is worth backporting to stable trees as well.

Your call.

> In the followup cleanup, we can remove the (!nr_pages) check inside
> __mem_cgroup_handle_over_high() as well.

Yes. I did not want to do that in one go, but that's an obvious follow
up.

Thanks,

        tglx


* Re: [PATCH] memcg: Optimize exit to user space
  2025-08-13 17:19 ` Shakeel Butt
  2025-08-13 21:25   ` Thomas Gleixner
@ 2025-08-13 22:40   ` Andrew Morton
  1 sibling, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2025-08-13 22:40 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Thomas Gleixner, linux-mm, LKML, Peter Zijlstra, Johannes Weiner,
	Michal Hocko, Roman Gushchin, Muchun Song, cgroups

On Wed, 13 Aug 2025 10:19:03 -0700 Shakeel Butt <shakeel.butt@linux.dev> wrote:

> On Wed, Aug 13, 2025 at 04:57:55PM +0200, Thomas Gleixner wrote:
> > memcg uses TIF_NOTIFY_RESUME to handle reclaiming on exit to user
> > space. TIF_NOTIFY_RESUME is a multiplexing TIF bit, which is utilized by
> > other entities as well.
> > 
> > This results in an unconditional mem_cgroup_handle_over_high() call for
> > every invocation of resume_user_mode_work(), which is a pointless
> > exercise as most of the time there is no reclaim work to do.
> > 
> > Especially since RSEQ is used by glibc, TIF_NOTIFY_RESUME is raised
> > quite frequently and the empty calls show up in exit path profiling.
> > 
> > Optimize this by doing a quick check of the reclaim condition before
> > invoking it.
> > 
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: Michal Hocko <mhocko@kernel.org>
> > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > Cc: Shakeel Butt <shakeel.butt@linux.dev>
> > Cc: Muchun Song <muchun.song@linux.dev>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> 
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> 
> Since this is seen in profiling data and it is simple enough, I think it
> is worth backporting to stable trees as well.

People will probably do this, but it's a big break of -stable rules.

If it is a regression fix (ie, has a Fixes:) and if it makes a big
difference (ie, comes with impressive quantitative testing results)
then maybe we could push it into -stable anyway...

> In the followup cleanup, we can remove the (!nr_pages) check inside
> __mem_cgroup_handle_over_high() as well.

yup, how about we do that now

--- a/mm/memcontrol.c~memcg-optimize-exit-to-user-space-fix
+++ a/mm/memcontrol.c
@@ -2213,9 +2213,6 @@ void __mem_cgroup_handle_over_high(gfp_t
 	struct mem_cgroup *memcg;
 	bool in_retry = false;
 
-	if (likely(!nr_pages))
-		return;
-
 	memcg = get_mem_cgroup_from_mm(current->mm);
 	current->memcg_nr_pages_over_high = 0;
 
_
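
Dropping the inner check is safe only if every caller already guarantees a non-zero count. Going by the hunks in this thread, that appears to hold: the inline wrapper tests the counter, and try_charge_memcg() only calls in when the counter exceeds MEMCG_CHARGE_BATCH. A minimal userspace sketch of that contract follows. All names are hypothetical (BATCH_SKETCH, pending_pages, reclaim_over_high(), on_exit_to_user(), on_charge()), and the charge path is heavily simplified compared to the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical threshold standing in for MEMCG_CHARGE_BATCH. */
#define BATCH_SKETCH 64U

/* Stand-in for current->memcg_nr_pages_over_high. */
static unsigned int pending_pages;
/* Counts slow-path invocations. */
static unsigned int reclaim_runs;

/* Slow path with no zero check of its own: non-zero is a precondition. */
static void reclaim_over_high(void)
{
	assert(pending_pages != 0);
	reclaim_runs++;
	pending_pages = 0;
}

/* Call site 1: exit-to-user wrapper, guards on "anything pending". */
static inline void on_exit_to_user(void)
{
	if (pending_pages)
		reclaim_over_high();
}

/* Call site 2: charge path, guards on "more than one batch pending". */
static void on_charge(unsigned int nr, bool can_block)
{
	pending_pages += nr;
	if (pending_pages > BATCH_SKETCH && can_block)
		reclaim_over_high();
}
```

Both guards imply a non-zero count, so the assertion in reclaim_over_high() never fires in this sketch; a new caller without such a guard would trip it.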


