* [PATCH v2 0/5] memcg: nmi-safe kmem charging
@ 2025-05-16 6:49 Shakeel Butt
2025-05-16 6:49 ` [PATCH 1/5] memcg: disable kmem charging in nmi for unsupported arch Shakeel Butt
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Shakeel Butt @ 2025-05-16 6:49 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf,
linux-mm, cgroups, linux-kernel, Meta kernel team
Users can attach their BPF programs at arbitrary execution points in
the kernel, and such BPF programs may run in nmi context. In addition,
these programs can trigger memcg charged kernel allocations in nmi
context. However, the memcg charging infra for kernel memory is not
equipped to handle nmi context on all architectures.
This series removes the hurdles to enable kmem charging in nmi context
for most archs. For archs without CONFIG_HAVE_NMI, this series is a
noop. For archs with NMI support that have
CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS, the previous work to make memcg
stats re-entrant is sufficient to allow kmem charging in nmi context.
For archs with NMI support but without
CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS, though with
ARCH_HAVE_NMI_SAFE_CMPXCHG, this series adds the infra to support kmem
charging in nmi context. Lastly, for archs with NMI support but with
neither CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS nor
ARCH_HAVE_NMI_SAFE_CMPXCHG, kmem charging in nmi context is not
supported at all.
Most widely used archs support CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS,
and for them this series is almost a noop (other than making
memcg_rstat_updated nmi safe).
Changes since v1:
- The main change was to explicitly differentiate the archs which have
sane NMI support from the others and make the series almost a noop
for the former. (Suggested by Vlastimil)
- This version very explicitly describes where kmem charging in nmi
context is supported and where it is not.
Shakeel Butt (5):
memcg: disable kmem charging in nmi for unsupported arch
memcg: nmi safe memcg stats for specific archs
memcg: add nmi-safe update for MEMCG_KMEM
memcg: nmi-safe slab stats updates
memcg: make memcg_rstat_updated nmi safe
include/linux/memcontrol.h | 21 ++++++
mm/memcontrol.c | 136 +++++++++++++++++++++++++++++++++----
2 files changed, 145 insertions(+), 12 deletions(-)
--
2.47.1
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/5] memcg: disable kmem charging in nmi for unsupported arch
2025-05-16 6:49 [PATCH v2 0/5] memcg: nmi-safe kmem charging Shakeel Butt
@ 2025-05-16 6:49 ` Shakeel Butt
2025-05-16 9:30 ` Vlastimil Babka
2025-05-16 6:49 ` [PATCH 2/5] memcg: nmi safe memcg stats for specific archs Shakeel Butt
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Shakeel Butt @ 2025-05-16 6:49 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf,
linux-mm, cgroups, linux-kernel, Meta kernel team
The memcg accounting and stats use this_cpu* and atomic* ops. There are
archs which define CONFIG_HAVE_NMI but define neither
CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS nor ARCH_HAVE_NMI_SAFE_CMPXCHG, so
memcg accounting in nmi context cannot be supported on such archs. Let's
just disable memcg accounting in nmi context for such archs.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
include/linux/memcontrol.h | 5 +++++
mm/memcontrol.c | 15 +++++++++++++++
2 files changed, 20 insertions(+)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index f7848f73f41c..53920528821f 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -62,6 +62,11 @@ struct mem_cgroup_reclaim_cookie {
#ifdef CONFIG_MEMCG
+#if defined(CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS) || \
+ !defined(CONFIG_HAVE_NMI) || defined(ARCH_HAVE_NMI_SAFE_CMPXCHG)
+#define MEMCG_SUPPORTS_NMI_CHARGING
+#endif
+
#define MEM_CGROUP_ID_SHIFT 16
struct mem_cgroup_id {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e17b698f6243..dface07f69bb 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2647,11 +2647,26 @@ static struct obj_cgroup *current_objcg_update(void)
return objcg;
}
+#ifdef MEMCG_SUPPORTS_NMI_CHARGING
+static inline bool nmi_charging_allowed(void)
+{
+ return true;
+}
+#else
+static inline bool nmi_charging_allowed(void)
+{
+ return false;
+}
+#endif
+
__always_inline struct obj_cgroup *current_obj_cgroup(void)
{
struct mem_cgroup *memcg;
struct obj_cgroup *objcg;
+ if (in_nmi() && !nmi_charging_allowed())
+ return NULL;
+
if (in_task()) {
memcg = current->active_memcg;
if (unlikely(memcg))
--
2.47.1
* [PATCH 2/5] memcg: nmi safe memcg stats for specific archs
2025-05-16 6:49 [PATCH v2 0/5] memcg: nmi-safe kmem charging Shakeel Butt
2025-05-16 6:49 ` [PATCH 1/5] memcg: disable kmem charging in nmi for unsupported arch Shakeel Butt
@ 2025-05-16 6:49 ` Shakeel Butt
2025-05-16 9:43 ` Vlastimil Babka
2025-05-16 6:49 ` [PATCH 3/5] memcg: add nmi-safe update for MEMCG_KMEM Shakeel Butt
` (2 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Shakeel Butt @ 2025-05-16 6:49 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf,
linux-mm, cgroups, linux-kernel, Meta kernel team
There are archs which have NMI support but do not support this_cpu_* ops
safely in nmi context; they do, however, support safe atomic ops in nmi
context. For such archs, let's add infra to use atomic ops for the memcg
stats which can be updated in nmi context.
At the moment, the memcg stats which get updated in the objcg charging
path are MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B.
Rather than adding support for all memcg stats to be nmi safe, this
patch just adds the infra to make these three stats nmi safe.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
include/linux/memcontrol.h | 20 ++++++++++++++--
mm/memcontrol.c | 49 ++++++++++++++++++++++++++++++++++++++
2 files changed, 67 insertions(+), 2 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 53920528821f..b10ae2388c27 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -62,9 +62,15 @@ struct mem_cgroup_reclaim_cookie {
#ifdef CONFIG_MEMCG
-#if defined(CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS) || \
- !defined(CONFIG_HAVE_NMI) || defined(ARCH_HAVE_NMI_SAFE_CMPXCHG)
+#if defined(CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS) || !defined(CONFIG_HAVE_NMI)
+
+#define MEMCG_SUPPORTS_NMI_CHARGING
+
+#elif defined(ARCH_HAVE_NMI_SAFE_CMPXCHG)
+
#define MEMCG_SUPPORTS_NMI_CHARGING
+#define MEMCG_NMI_NEED_ATOMIC
+
#endif
#define MEM_CGROUP_ID_SHIFT 16
@@ -118,6 +124,12 @@ struct mem_cgroup_per_node {
CACHELINE_PADDING(_pad2_);
unsigned long lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
struct mem_cgroup_reclaim_iter iter;
+
+#ifdef MEMCG_NMI_NEED_ATOMIC
+ /* slab stats for nmi context */
+ atomic_t slab_reclaimable;
+ atomic_t slab_unreclaimable;
+#endif
};
struct mem_cgroup_threshold {
@@ -241,6 +253,10 @@ struct mem_cgroup {
atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS];
atomic_long_t memory_events_local[MEMCG_NR_MEMORY_EVENTS];
+#ifdef MEMCG_NMI_NEED_ATOMIC
+ /* MEMCG_KMEM for nmi context */
+ atomic_t kmem_stat;
+#endif
/*
* Hint of reclaim pressure for socket memroy management. Note
* that this indicator should NOT be used in legacy cgroup mode
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index dface07f69bb..102fdec3f49e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3979,6 +3979,53 @@ static void mem_cgroup_stat_aggregate(struct aggregate_control *ac)
}
}
+#ifdef MEMCG_NMI_NEED_ATOMIC
+static void flush_nmi_stats(struct mem_cgroup *memcg, struct mem_cgroup *parent,
+ int cpu)
+{
+ int nid;
+
+ if (atomic_read(&memcg->kmem_stat)) {
+ int kmem = atomic_xchg(&memcg->kmem_stat, 0);
+ int index = memcg_stats_index(MEMCG_KMEM);
+
+ memcg->vmstats->state[index] += kmem;
+ if (parent)
+ parent->vmstats->state_pending[index] += kmem;
+ }
+
+ for_each_node_state(nid, N_MEMORY) {
+ struct mem_cgroup_per_node *pn = memcg->nodeinfo[nid];
+ struct lruvec_stats *lstats = pn->lruvec_stats;
+ struct lruvec_stats *plstats = NULL;
+
+ if (parent)
+ plstats = parent->nodeinfo[nid]->lruvec_stats;
+
+ if (atomic_read(&pn->slab_reclaimable)) {
+ int slab = atomic_xchg(&pn->slab_reclaimable, 0);
+ int index = memcg_stats_index(NR_SLAB_RECLAIMABLE_B);
+
+ lstats->state[index] += slab;
+ if (plstats)
+ plstats->state_pending[index] += slab;
+ }
+ if (atomic_read(&pn->slab_unreclaimable)) {
+ int slab = atomic_xchg(&pn->slab_unreclaimable, 0);
+ int index = memcg_stats_index(NR_SLAB_UNRECLAIMABLE_B);
+
+ lstats->state[index] += slab;
+ if (plstats)
+ plstats->state_pending[index] += slab;
+ }
+ }
+}
+#else
+static void flush_nmi_stats(struct mem_cgroup *memcg, struct mem_cgroup *parent,
+ int cpu)
+{}
+#endif
+
static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
{
struct mem_cgroup *memcg = mem_cgroup_from_css(css);
@@ -3987,6 +4034,8 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
struct aggregate_control ac;
int nid;
+ flush_nmi_stats(memcg, parent, cpu);
+
statc = per_cpu_ptr(memcg->vmstats_percpu, cpu);
ac = (struct aggregate_control) {
--
2.47.1
* [PATCH 3/5] memcg: add nmi-safe update for MEMCG_KMEM
2025-05-16 6:49 [PATCH v2 0/5] memcg: nmi-safe kmem charging Shakeel Butt
2025-05-16 6:49 ` [PATCH 1/5] memcg: disable kmem charging in nmi for unsupported arch Shakeel Butt
2025-05-16 6:49 ` [PATCH 2/5] memcg: nmi safe memcg stats for specific archs Shakeel Butt
@ 2025-05-16 6:49 ` Shakeel Butt
2025-05-16 9:43 ` Vlastimil Babka
2025-05-16 6:49 ` [PATCH 4/5] memcg: nmi-safe slab stats updates Shakeel Butt
2025-05-16 6:49 ` [PATCH 5/5] memcg: make memcg_rstat_updated nmi safe Shakeel Butt
4 siblings, 1 reply; 14+ messages in thread
From: Shakeel Butt @ 2025-05-16 6:49 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf,
linux-mm, cgroups, linux-kernel, Meta kernel team
The objcg based kmem charging and uncharging code paths need to update
MEMCG_KMEM appropriately. Let's add support to update MEMCG_KMEM in an
nmi-safe way for those code paths.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
mm/memcontrol.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 102fdec3f49e..899a31e6b087 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2729,6 +2729,23 @@ struct obj_cgroup *get_obj_cgroup_from_folio(struct folio *folio)
return objcg;
}
+#ifdef MEMCG_NMI_NEED_ATOMIC
+static inline void account_kmem_nmi_safe(struct mem_cgroup *memcg, int val)
+{
+ if (likely(!in_nmi())) {
+ mod_memcg_state(memcg, MEMCG_KMEM, val);
+ } else {
+ /* TODO: add to cgroup update tree once it is nmi-safe. */
+ atomic_add(val, &memcg->kmem_stat);
+ }
+}
+#else
+static inline void account_kmem_nmi_safe(struct mem_cgroup *memcg, int val)
+{
+ mod_memcg_state(memcg, MEMCG_KMEM, val);
+}
+#endif
+
/*
* obj_cgroup_uncharge_pages: uncharge a number of kernel pages from a objcg
* @objcg: object cgroup to uncharge
@@ -2741,7 +2758,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
memcg = get_mem_cgroup_from_objcg(objcg);
- mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages);
+ account_kmem_nmi_safe(memcg, -nr_pages);
memcg1_account_kmem(memcg, -nr_pages);
if (!mem_cgroup_is_root(memcg))
refill_stock(memcg, nr_pages);
@@ -2769,7 +2786,7 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
if (ret)
goto out;
- mod_memcg_state(memcg, MEMCG_KMEM, nr_pages);
+ account_kmem_nmi_safe(memcg, nr_pages);
memcg1_account_kmem(memcg, nr_pages);
out:
css_put(&memcg->css);
--
2.47.1
* [PATCH 4/5] memcg: nmi-safe slab stats updates
2025-05-16 6:49 [PATCH v2 0/5] memcg: nmi-safe kmem charging Shakeel Butt
` (2 preceding siblings ...)
2025-05-16 6:49 ` [PATCH 3/5] memcg: add nmi-safe update for MEMCG_KMEM Shakeel Butt
@ 2025-05-16 6:49 ` Shakeel Butt
2025-05-16 9:44 ` Vlastimil Babka
2025-05-16 6:49 ` [PATCH 5/5] memcg: make memcg_rstat_updated nmi safe Shakeel Butt
4 siblings, 1 reply; 14+ messages in thread
From: Shakeel Butt @ 2025-05-16 6:49 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf,
linux-mm, cgroups, linux-kernel, Meta kernel team
The objcg based kmem [un]charging can be called in nmi context and it
may need to update NR_SLAB_[UN]RECLAIMABLE_B stats. So, let's correctly
handle the updates of these stats in the nmi context.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
mm/memcontrol.c | 36 +++++++++++++++++++++++++++++++++---
1 file changed, 33 insertions(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 899a31e6b087..85519ce37f18 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2517,17 +2517,47 @@ static void commit_charge(struct folio *folio, struct mem_cgroup *memcg)
folio->memcg_data = (unsigned long)memcg;
}
+#ifdef MEMCG_NMI_NEED_ATOMIC
+static inline void account_slab_nmi_safe(struct mem_cgroup *memcg,
+ struct pglist_data *pgdat,
+ enum node_stat_item idx, int nr)
+{
+ struct lruvec *lruvec;
+
+ if (likely(!in_nmi())) {
+ lruvec = mem_cgroup_lruvec(memcg, pgdat);
+ mod_memcg_lruvec_state(lruvec, idx, nr);
+ } else {
+ struct mem_cgroup_per_node *pn = memcg->nodeinfo[pgdat->node_id];
+
+ /* TODO: add to cgroup update tree once it is nmi-safe. */
+ if (idx == NR_SLAB_RECLAIMABLE_B)
+ atomic_add(nr, &pn->slab_reclaimable);
+ else
+ atomic_add(nr, &pn->slab_unreclaimable);
+ }
+}
+#else
+static inline void account_slab_nmi_safe(struct mem_cgroup *memcg,
+ struct pglist_data *pgdat,
+ enum node_stat_item idx, int nr)
+{
+ struct lruvec *lruvec;
+
+ lruvec = mem_cgroup_lruvec(memcg, pgdat);
+ mod_memcg_lruvec_state(lruvec, idx, nr);
+}
+#endif
+
static inline void mod_objcg_mlstate(struct obj_cgroup *objcg,
struct pglist_data *pgdat,
enum node_stat_item idx, int nr)
{
struct mem_cgroup *memcg;
- struct lruvec *lruvec;
rcu_read_lock();
memcg = obj_cgroup_memcg(objcg);
- lruvec = mem_cgroup_lruvec(memcg, pgdat);
- mod_memcg_lruvec_state(lruvec, idx, nr);
+ account_slab_nmi_safe(memcg, pgdat, idx, nr);
rcu_read_unlock();
}
--
2.47.1
* [PATCH 5/5] memcg: make memcg_rstat_updated nmi safe
2025-05-16 6:49 [PATCH v2 0/5] memcg: nmi-safe kmem charging Shakeel Butt
` (3 preceding siblings ...)
2025-05-16 6:49 ` [PATCH 4/5] memcg: nmi-safe slab stats updates Shakeel Butt
@ 2025-05-16 6:49 ` Shakeel Butt
2025-05-16 9:45 ` Vlastimil Babka
4 siblings, 1 reply; 14+ messages in thread
From: Shakeel Butt @ 2025-05-16 6:49 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf,
linux-mm, cgroups, linux-kernel, Meta kernel team
memcg: convert stats_updates to atomic_t
Currently the kernel maintains memory related stats updates per-cgroup to
optimize stats flushing. The stats_updates is defined as atomic64_t
which is not nmi-safe on some archs. Actually we don't really need a 64bit
atomic as the max value stats_updates can reach should be less than
nr_cpus * MEMCG_CHARGE_BATCH. A normal atomic_t should suffice.
Also the function cgroup_rstat_updated() is still not nmi-safe, but there
is a parallel effort to make it nmi-safe, so until then let's ignore it in
nmi context.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
mm/memcontrol.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 85519ce37f18..06d5d5407b00 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -533,7 +533,7 @@ struct memcg_vmstats {
unsigned long events_pending[NR_MEMCG_EVENTS];
/* Stats updates since the last flush */
- atomic64_t stats_updates;
+ atomic_t stats_updates;
};
/*
@@ -559,7 +559,7 @@ static u64 flush_last_time;
static bool memcg_vmstats_needs_flush(struct memcg_vmstats *vmstats)
{
- return atomic64_read(&vmstats->stats_updates) >
+ return atomic_read(&vmstats->stats_updates) >
MEMCG_CHARGE_BATCH * num_online_cpus();
}
@@ -573,7 +573,9 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val,
if (!val)
return;
- cgroup_rstat_updated(memcg->css.cgroup, cpu);
+ /* TODO: add to cgroup update tree once it is nmi-safe. */
+ if (!in_nmi())
+ cgroup_rstat_updated(memcg->css.cgroup, cpu);
statc_pcpu = memcg->vmstats_percpu;
for (; statc_pcpu; statc_pcpu = statc->parent_pcpu) {
statc = this_cpu_ptr(statc_pcpu);
@@ -591,7 +593,7 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val,
continue;
stats_updates = this_cpu_xchg(statc_pcpu->stats_updates, 0);
- atomic64_add(stats_updates, &statc->vmstats->stats_updates);
+ atomic_add(stats_updates, &statc->vmstats->stats_updates);
}
}
@@ -599,7 +601,7 @@ static void __mem_cgroup_flush_stats(struct mem_cgroup *memcg, bool force)
{
bool needs_flush = memcg_vmstats_needs_flush(memcg->vmstats);
- trace_memcg_flush_stats(memcg, atomic64_read(&memcg->vmstats->stats_updates),
+ trace_memcg_flush_stats(memcg, atomic_read(&memcg->vmstats->stats_updates),
force, needs_flush);
if (!force && !needs_flush)
@@ -4132,8 +4134,8 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
}
WRITE_ONCE(statc->stats_updates, 0);
/* We are in a per-cpu loop here, only do the atomic write once */
- if (atomic64_read(&memcg->vmstats->stats_updates))
- atomic64_set(&memcg->vmstats->stats_updates, 0);
+ if (atomic_read(&memcg->vmstats->stats_updates))
+ atomic_set(&memcg->vmstats->stats_updates, 0);
}
static void mem_cgroup_fork(struct task_struct *task)
--
2.47.1
* Re: [PATCH 1/5] memcg: disable kmem charging in nmi for unsupported arch
2025-05-16 6:49 ` [PATCH 1/5] memcg: disable kmem charging in nmi for unsupported arch Shakeel Butt
@ 2025-05-16 9:30 ` Vlastimil Babka
2025-05-16 15:37 ` Shakeel Butt
0 siblings, 1 reply; 14+ messages in thread
From: Vlastimil Babka @ 2025-05-16 9:30 UTC (permalink / raw)
To: Shakeel Butt, Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Alexei Starovoitov, Sebastian Andrzej Siewior, Harry Yoo,
Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf, linux-mm,
cgroups, linux-kernel, Meta kernel team
On 5/16/25 08:49, Shakeel Butt wrote:
> The memcg accounting and stats use this_cpu* and atomic* ops. There are
> archs which define CONFIG_HAVE_NMI but define neither
> CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS nor ARCH_HAVE_NMI_SAFE_CMPXCHG, so
> memcg accounting in nmi context cannot be supported on such archs. Let's
> just disable memcg accounting in nmi context for such archs.
>
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> ---
> include/linux/memcontrol.h | 5 +++++
> mm/memcontrol.c | 15 +++++++++++++++
> 2 files changed, 20 insertions(+)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index f7848f73f41c..53920528821f 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -62,6 +62,11 @@ struct mem_cgroup_reclaim_cookie {
>
> #ifdef CONFIG_MEMCG
>
> +#if defined(CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS) || \
> + !defined(CONFIG_HAVE_NMI) || defined(ARCH_HAVE_NMI_SAFE_CMPXCHG)
> +#define MEMCG_SUPPORTS_NMI_CHARGING
> +#endif
> +
> #define MEM_CGROUP_ID_SHIFT 16
>
> struct mem_cgroup_id {
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e17b698f6243..dface07f69bb 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2647,11 +2647,26 @@ static struct obj_cgroup *current_objcg_update(void)
> return objcg;
> }
>
> +#ifdef MEMCG_SUPPORTS_NMI_CHARGING
> +static inline bool nmi_charging_allowed(void)
> +{
> + return true;
> +}
> +#else
> +static inline bool nmi_charging_allowed(void)
> +{
> + return false;
> +}
> +#endif
> +
> __always_inline struct obj_cgroup *current_obj_cgroup(void)
> {
> struct mem_cgroup *memcg;
> struct obj_cgroup *objcg;
>
> + if (in_nmi() && !nmi_charging_allowed())
Exchange the two as the latter is compile-time constant, so it can shortcut
the in_nmi() check away in all the good cases?
> + return NULL;
> +
> if (in_task()) {
> memcg = current->active_memcg;
> if (unlikely(memcg))
* Re: [PATCH 2/5] memcg: nmi safe memcg stats for specific archs
2025-05-16 6:49 ` [PATCH 2/5] memcg: nmi safe memcg stats for specific archs Shakeel Butt
@ 2025-05-16 9:43 ` Vlastimil Babka
0 siblings, 0 replies; 14+ messages in thread
From: Vlastimil Babka @ 2025-05-16 9:43 UTC (permalink / raw)
To: Shakeel Butt, Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Alexei Starovoitov, Sebastian Andrzej Siewior, Harry Yoo,
Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf, linux-mm,
cgroups, linux-kernel, Meta kernel team
On 5/16/25 08:49, Shakeel Butt wrote:
> There are archs which have NMI support but do not support this_cpu_* ops
> safely in nmi context; they do, however, support safe atomic ops in nmi
> context. For such archs, let's add infra to use atomic ops for the memcg
> stats which can be updated in nmi context.
>
> At the moment, the memcg stats which get updated in the objcg charging
> path are MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B.
> Rather than adding support for all memcg stats to be nmi safe, this
> patch just adds the infra to make these three stats nmi safe.
>
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
* Re: [PATCH 3/5] memcg: add nmi-safe update for MEMCG_KMEM
2025-05-16 6:49 ` [PATCH 3/5] memcg: add nmi-safe update for MEMCG_KMEM Shakeel Butt
@ 2025-05-16 9:43 ` Vlastimil Babka
0 siblings, 0 replies; 14+ messages in thread
From: Vlastimil Babka @ 2025-05-16 9:43 UTC (permalink / raw)
To: Shakeel Butt, Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Alexei Starovoitov, Sebastian Andrzej Siewior, Harry Yoo,
Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf, linux-mm,
cgroups, linux-kernel, Meta kernel team
On 5/16/25 08:49, Shakeel Butt wrote:
> The objcg based kmem charging and uncharging code paths need to update
> MEMCG_KMEM appropriately. Let's add support to update MEMCG_KMEM in an
> nmi-safe way for those code paths.
>
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
* Re: [PATCH 4/5] memcg: nmi-safe slab stats updates
2025-05-16 6:49 ` [PATCH 4/5] memcg: nmi-safe slab stats updates Shakeel Butt
@ 2025-05-16 9:44 ` Vlastimil Babka
0 siblings, 0 replies; 14+ messages in thread
From: Vlastimil Babka @ 2025-05-16 9:44 UTC (permalink / raw)
To: Shakeel Butt, Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Alexei Starovoitov, Sebastian Andrzej Siewior, Harry Yoo,
Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf, linux-mm,
cgroups, linux-kernel, Meta kernel team
On 5/16/25 08:49, Shakeel Butt wrote:
> The objcg based kmem [un]charging can be called in nmi context and it
> may need to update NR_SLAB_[UN]RECLAIMABLE_B stats. So, let's correctly
> handle the updates of these stats in the nmi context.
>
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
* Re: [PATCH 5/5] memcg: make memcg_rstat_updated nmi safe
2025-05-16 6:49 ` [PATCH 5/5] memcg: make memcg_rstat_updated nmi safe Shakeel Butt
@ 2025-05-16 9:45 ` Vlastimil Babka
2025-05-16 15:34 ` Shakeel Butt
0 siblings, 1 reply; 14+ messages in thread
From: Vlastimil Babka @ 2025-05-16 9:45 UTC (permalink / raw)
To: Shakeel Butt, Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Alexei Starovoitov, Sebastian Andrzej Siewior, Harry Yoo,
Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf, linux-mm,
cgroups, linux-kernel, Meta kernel team
On 5/16/25 08:49, Shakeel Butt wrote:
> memcg: convert stats_updates to atomic_t
You have two subjects, I guess delete the second one?
> Currently the kernel maintains memory related stats updates per-cgroup to
> optimize stats flushing. The stats_updates is defined as atomic64_t
> which is not nmi-safe on some archs. Actually we don't really need a 64bit
> atomic as the max value stats_updates can reach should be less than
> nr_cpus * MEMCG_CHARGE_BATCH. A normal atomic_t should suffice.
>
> Also the function cgroup_rstat_updated() is still not nmi-safe, but there
> is a parallel effort to make it nmi-safe, so until then let's ignore it in
> nmi context.
>
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
* Re: [PATCH 5/5] memcg: make memcg_rstat_updated nmi safe
2025-05-16 9:45 ` Vlastimil Babka
@ 2025-05-16 15:34 ` Shakeel Butt
0 siblings, 0 replies; 14+ messages in thread
From: Shakeel Butt @ 2025-05-16 15:34 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
Muchun Song, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf,
linux-mm, cgroups, linux-kernel, Meta kernel team
On Fri, May 16, 2025 at 11:45:39AM +0200, Vlastimil Babka wrote:
> On 5/16/25 08:49, Shakeel Butt wrote:
> > memcg: convert stats_updates to atomic_t
>
> You have two subjects, I guess delete the second one?
Oops I squashed the patch at the very end and forgot to fix this.
>
> > Currently the kernel maintains memory related stats updates per-cgroup to
> > optimize stats flushing. The stats_updates is defined as atomic64_t
> > which is not nmi-safe on some archs. Actually we don't really need a 64bit
> > atomic as the max value stats_updates can reach should be less than
> > nr_cpus * MEMCG_CHARGE_BATCH. A normal atomic_t should suffice.
> >
> > Also the function cgroup_rstat_updated() is still not nmi-safe, but there
> > is a parallel effort to make it nmi-safe, so until then let's ignore it in
> > nmi context.
> >
> > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
Thanks a lot.
* Re: [PATCH 1/5] memcg: disable kmem charging in nmi for unsupported arch
2025-05-16 9:30 ` Vlastimil Babka
@ 2025-05-16 15:37 ` Shakeel Butt
2025-05-16 18:20 ` Shakeel Butt
0 siblings, 1 reply; 14+ messages in thread
From: Shakeel Butt @ 2025-05-16 15:37 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
Muchun Song, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf,
linux-mm, cgroups, linux-kernel, Meta kernel team
On Fri, May 16, 2025 at 11:30:17AM +0200, Vlastimil Babka wrote:
> On 5/16/25 08:49, Shakeel Butt wrote:
> > The memcg accounting and stats use this_cpu* and atomic* ops. There are
> > archs which define CONFIG_HAVE_NMI but define neither
> > CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS nor ARCH_HAVE_NMI_SAFE_CMPXCHG, so
> > memcg accounting in nmi context cannot be supported on such archs. Let's
> > just disable memcg accounting in nmi context for such archs.
> >
> > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > ---
> > include/linux/memcontrol.h | 5 +++++
> > mm/memcontrol.c | 15 +++++++++++++++
> > 2 files changed, 20 insertions(+)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index f7848f73f41c..53920528821f 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -62,6 +62,11 @@ struct mem_cgroup_reclaim_cookie {
> >
> > #ifdef CONFIG_MEMCG
> >
> > +#if defined(CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS) || \
> > + !defined(CONFIG_HAVE_NMI) || defined(ARCH_HAVE_NMI_SAFE_CMPXCHG)
> > +#define MEMCG_SUPPORTS_NMI_CHARGING
> > +#endif
> > +
> > #define MEM_CGROUP_ID_SHIFT 16
> >
> > struct mem_cgroup_id {
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index e17b698f6243..dface07f69bb 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2647,11 +2647,26 @@ static struct obj_cgroup *current_objcg_update(void)
> > return objcg;
> > }
> >
> > +#ifdef MEMCG_SUPPORTS_NMI_CHARGING
> > +static inline bool nmi_charging_allowed(void)
> > +{
> > + return true;
> > +}
> > +#else
> > +static inline bool nmi_charging_allowed(void)
> > +{
> > + return false;
> > +}
> > +#endif
> > +
> > __always_inline struct obj_cgroup *current_obj_cgroup(void)
> > {
> > struct mem_cgroup *memcg;
> > struct obj_cgroup *objcg;
> >
> > + if (in_nmi() && !nmi_charging_allowed())
>
> Exchange the two as the latter is compile-time constant, so it can shortcut
> the in_nmi() check away in all the good cases?
>
Oh, I thought the compiler would figure that out, but now that I think
about it, it can only do so if the first condition does not have any
side-effects, and though in_nmi() does not, I am not sure the compiler
can extract that information.
I will fix this and make sure the compiler is doing the right thing.
* Re: [PATCH 1/5] memcg: disable kmem charging in nmi for unsupported arch
2025-05-16 15:37 ` Shakeel Butt
@ 2025-05-16 18:20 ` Shakeel Butt
0 siblings, 0 replies; 14+ messages in thread
From: Shakeel Butt @ 2025-05-16 18:20 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
Muchun Song, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, bpf,
linux-mm, cgroups, linux-kernel, Meta kernel team
On Fri, May 16, 2025 at 08:37:23AM -0700, Shakeel Butt wrote:
> On Fri, May 16, 2025 at 11:30:17AM +0200, Vlastimil Babka wrote:
> > On 5/16/25 08:49, Shakeel Butt wrote:
> > > The memcg accounting and stats use this_cpu* and atomic* ops. There are
> > > archs which define CONFIG_HAVE_NMI but define neither
> > > CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS nor ARCH_HAVE_NMI_SAFE_CMPXCHG, so
> > > memcg accounting in nmi context cannot be supported on such archs. Let's
> > > just disable memcg accounting in nmi context for such archs.
> > >
> > > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > > ---
> > > include/linux/memcontrol.h | 5 +++++
> > > mm/memcontrol.c | 15 +++++++++++++++
> > > 2 files changed, 20 insertions(+)
> > >
> > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > > index f7848f73f41c..53920528821f 100644
> > > --- a/include/linux/memcontrol.h
> > > +++ b/include/linux/memcontrol.h
> > > @@ -62,6 +62,11 @@ struct mem_cgroup_reclaim_cookie {
> > >
> > > #ifdef CONFIG_MEMCG
> > >
> > > +#if defined(CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS) || \
> > > + !defined(CONFIG_HAVE_NMI) || defined(ARCH_HAVE_NMI_SAFE_CMPXCHG)
> > > +#define MEMCG_SUPPORTS_NMI_CHARGING
> > > +#endif
> > > +
> > > #define MEM_CGROUP_ID_SHIFT 16
> > >
> > > struct mem_cgroup_id {
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index e17b698f6243..dface07f69bb 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -2647,11 +2647,26 @@ static struct obj_cgroup *current_objcg_update(void)
> > > return objcg;
> > > }
> > >
> > > +#ifdef MEMCG_SUPPORTS_NMI_CHARGING
> > > +static inline bool nmi_charging_allowed(void)
> > > +{
> > > + return true;
> > > +}
> > > +#else
> > > +static inline bool nmi_charging_allowed(void)
> > > +{
> > > + return false;
> > > +}
> > > +#endif
> > > +
> > > __always_inline struct obj_cgroup *current_obj_cgroup(void)
> > > {
> > > struct mem_cgroup *memcg;
> > > struct obj_cgroup *objcg;
> > >
> > > + if (in_nmi() && !nmi_charging_allowed())
> >
> > Exchange the two as the latter is compile-time constant, so it can shortcut
> > the in_nmi() check away in all the good cases?
> >
>
> Oh I thought compiler would figure that out but now that I think about
> it, it can only do so if the first condition does not have any
> side-effects and though in_nmi() does not, I am not sure if compiler can
> extract that information.
>
> I will fix this and make sure that compiler is doing the right thing.
So, gcc 11.5 generates the same code irrespective of whether in_nmi()
is checked first or second, i.e. it avoids the in_nmi() check altogether
on x86_64. I will still rearrange the checks so as not to leave this
optimization to the compiler.