* [PATCH v4 0/5] memcg: nmi-safe kmem charging
@ 2025-05-19 6:31 Shakeel Butt
2025-05-19 6:31 ` [PATCH v4 1/5] memcg: disable kmem charging in nmi for unsupported arch Shakeel Butt
` (5 more replies)
0 siblings, 6 replies; 9+ messages in thread
From: Shakeel Butt @ 2025-05-19 6:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers,
Tejun Heo, bpf, linux-mm, cgroups, linux-kernel, Meta kernel team
Users can attach their BPF programs at arbitrary execution points in
the kernel, and such BPF programs may run in nmi context. In addition,
these programs can trigger memcg-charged kernel allocations in nmi
context. However, the memcg charging infrastructure for kernel memory
is not equipped to handle nmi context on all architectures.
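As a minimal illustration (the program and map names below are
hypothetical), a BPF program attached to a perf event can fire in nmi
context, and updating a non-preallocated hash map with a new key may
allocate a memcg-charged map element:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* non-preallocated, so inserting a new key allocates at update time */
struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(map_flags, BPF_F_NO_PREALLOC);
	__uint(max_entries, 10240);
	__type(key, __u64);
	__type(value, __u64);
} counts SEC(".maps");

SEC("perf_event")
int count_samples(void *ctx)
{
	__u64 key = bpf_get_current_pid_tgid();
	__u64 one = 1;

	/* inserting a new key is a kernel allocation that memcg may
	 * charge, potentially from nmi context */
	bpf_map_update_elem(&counts, &key, &one, BPF_ANY);
	return 0;
}

char _license[] SEC("license") = "GPL";

Whether the allocation actually happens in nmi context depends on the
attachment point and the map type, but the charging path has to be
prepared for it.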
This series removes the hurdles to enable kmem charging in nmi context
for most archs. For archs without CONFIG_HAVE_NMI, this series is a
noop. For archs with NMI support that have
CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS, the previous work to make memcg
stats re-entrant is sufficient to allow kmem charging in nmi context.
For archs with NMI support but without
CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS and with
ARCH_HAVE_NMI_SAFE_CMPXCHG, this series adds the infrastructure to
support kmem charging in nmi context. Lastly, for archs with NMI
support but without either CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS or
ARCH_HAVE_NMI_SAFE_CMPXCHG, kmem charging in nmi context is not
supported at all.
The most widely used archs support
CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS, so this series should be almost
a noop (other than making memcg_rstat_updated nmi safe) for them.
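In short (config names abbreviated):
  !HAVE_NMI                                      -> noop
  HAVE_NMI && NMI_SAFE_THIS_CPU_OPS              -> already covered by
                                                    re-entrant memcg stats
  HAVE_NMI && !NMI_SAFE_THIS_CPU_OPS
           && NMI_SAFE_CMPXCHG                   -> new infra in this series
  HAVE_NMI && neither of the above               -> kmem charging disabled
                                                    in nmi context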
Changes since v3:
- Use internal config symbols for nmi unsafe configs as suggested by
Johannes.
Changes since v2:
- Rearrange in_nmi() check as suggested by Vlastimil
- Fix commit message of patch 5 as suggested by Vlastimil
Changes since v1:
- The main change was to explicitly differentiate archs which have
sane NMI support from others and make the series almost a noop for
such archs. (Suggested by Vlastimil)
- This version very explicitly describes where kmem charging in nmi
context is supported and where it is not.
Shakeel Butt (5):
memcg: disable kmem charging in nmi for unsupported arch
memcg: nmi safe memcg stats for specific archs
memcg: add nmi-safe update for MEMCG_KMEM
memcg: nmi-safe slab stats updates
memcg: make memcg_rstat_updated nmi safe
include/linux/memcontrol.h | 21 ++++++
mm/memcontrol.c | 136 +++++++++++++++++++++++++++++++++----
2 files changed, 145 insertions(+), 12 deletions(-)
--
2.47.1
* [PATCH v4 1/5] memcg: disable kmem charging in nmi for unsupported arch
2025-05-19 6:31 [PATCH v4 0/5] memcg: nmi-safe kmem charging Shakeel Butt
@ 2025-05-19 6:31 ` Shakeel Butt
2025-05-19 6:46 ` Vlastimil Babka
2025-05-19 6:31 ` [PATCH v4 2/5] memcg: nmi safe memcg stats for specific archs Shakeel Butt
` (4 subsequent siblings)
5 siblings, 1 reply; 9+ messages in thread
From: Shakeel Butt @ 2025-05-19 6:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers,
Tejun Heo, bpf, linux-mm, cgroups, linux-kernel, Meta kernel team
The memcg accounting and stats code uses this_cpu* and atomic* ops.
There are archs which define CONFIG_HAVE_NMI but define neither
CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS nor ARCH_HAVE_NMI_SAFE_CMPXCHG,
so memcg accounting in nmi context is not possible to support on such
archs. Let's just disable memcg accounting in nmi context for such
archs.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
init/Kconfig | 7 +++++++
mm/memcontrol.c | 3 +++
2 files changed, 10 insertions(+)
diff --git a/init/Kconfig b/init/Kconfig
index 4cdd1049283c..a2aa49cfb8bd 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1006,6 +1006,13 @@ config MEMCG
help
Provides control over the memory footprint of tasks in a cgroup.
+config MEMCG_NMI_UNSAFE
+ bool
+ depends on MEMCG
+ depends on HAVE_NMI
+ depends on !ARCH_HAS_NMI_SAFE_THIS_CPU_OPS && !ARCH_HAVE_NMI_SAFE_CMPXCHG
+ default y
+
config MEMCG_V1
bool "Legacy cgroup v1 memory controller"
depends on MEMCG
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e17b698f6243..532e2c06ea60 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2652,6 +2652,9 @@ __always_inline struct obj_cgroup *current_obj_cgroup(void)
struct mem_cgroup *memcg;
struct obj_cgroup *objcg;
+ if (IS_ENABLED(CONFIG_MEMCG_NMI_UNSAFE) && in_nmi())
+ return NULL;
+
if (in_task()) {
memcg = current->active_memcg;
if (unlikely(memcg))
--
2.47.1
* [PATCH v4 2/5] memcg: nmi safe memcg stats for specific archs
2025-05-19 6:31 [PATCH v4 0/5] memcg: nmi-safe kmem charging Shakeel Butt
2025-05-19 6:31 ` [PATCH v4 1/5] memcg: disable kmem charging in nmi for unsupported arch Shakeel Butt
@ 2025-05-19 6:31 ` Shakeel Butt
2025-05-19 6:31 ` [PATCH v4 3/5] memcg: add nmi-safe update for MEMCG_KMEM Shakeel Butt
` (3 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2025-05-19 6:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers,
Tejun Heo, bpf, linux-mm, cgroups, linux-kernel, Meta kernel team
There are archs which have NMI support but do not support this_cpu_*
ops safely in nmi context; they do, however, support safe atomic ops
in nmi context. For such archs, let's add the infrastructure to use
atomic ops for the memcg stats which can be updated in nmi context.
At the moment, the memcg stats which get updated in the objcg charging
path are MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B.
Rather than making all memcg stats nmi safe, this patch just adds the
infrastructure to make these three stats nmi safe.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
---
include/linux/memcontrol.h | 10 ++++++++
init/Kconfig | 7 ++++++
mm/memcontrol.c | 49 ++++++++++++++++++++++++++++++++++++++
3 files changed, 66 insertions(+)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index f7848f73f41c..87b6688f124a 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -113,6 +113,12 @@ struct mem_cgroup_per_node {
CACHELINE_PADDING(_pad2_);
unsigned long lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
struct mem_cgroup_reclaim_iter iter;
+
+#ifdef CONFIG_MEMCG_NMI_SAFETY_REQUIRES_ATOMIC
+ /* slab stats for nmi context */
+ atomic_t slab_reclaimable;
+ atomic_t slab_unreclaimable;
+#endif
};
struct mem_cgroup_threshold {
@@ -236,6 +242,10 @@ struct mem_cgroup {
atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS];
atomic_long_t memory_events_local[MEMCG_NR_MEMORY_EVENTS];
+#ifdef CONFIG_MEMCG_NMI_SAFETY_REQUIRES_ATOMIC
+ /* MEMCG_KMEM for nmi context */
+ atomic_t kmem_stat;
+#endif
/*
* Hint of reclaim pressure for socket memroy management. Note
* that this indicator should NOT be used in legacy cgroup mode
diff --git a/init/Kconfig b/init/Kconfig
index a2aa49cfb8bd..e9f47baa34e0 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1013,6 +1013,13 @@ config MEMCG_NMI_UNSAFE
depends on !ARCH_HAS_NMI_SAFE_THIS_CPU_OPS && !ARCH_HAVE_NMI_SAFE_CMPXCHG
default y
+config MEMCG_NMI_SAFETY_REQUIRES_ATOMIC
+ bool
+ depends on MEMCG
+ depends on HAVE_NMI
+ depends on !ARCH_HAS_NMI_SAFE_THIS_CPU_OPS && ARCH_HAVE_NMI_SAFE_CMPXCHG
+ default y
+
config MEMCG_V1
bool "Legacy cgroup v1 memory controller"
depends on MEMCG
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 532e2c06ea60..e3f0b298308a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3967,6 +3967,53 @@ static void mem_cgroup_stat_aggregate(struct aggregate_control *ac)
}
}
+#ifdef CONFIG_MEMCG_NMI_SAFETY_REQUIRES_ATOMIC
+static void flush_nmi_stats(struct mem_cgroup *memcg, struct mem_cgroup *parent,
+ int cpu)
+{
+ int nid;
+
+ if (atomic_read(&memcg->kmem_stat)) {
+ int kmem = atomic_xchg(&memcg->kmem_stat, 0);
+ int index = memcg_stats_index(MEMCG_KMEM);
+
+ memcg->vmstats->state[index] += kmem;
+ if (parent)
+ parent->vmstats->state_pending[index] += kmem;
+ }
+
+ for_each_node_state(nid, N_MEMORY) {
+ struct mem_cgroup_per_node *pn = memcg->nodeinfo[nid];
+ struct lruvec_stats *lstats = pn->lruvec_stats;
+ struct lruvec_stats *plstats = NULL;
+
+ if (parent)
+ plstats = parent->nodeinfo[nid]->lruvec_stats;
+
+ if (atomic_read(&pn->slab_reclaimable)) {
+ int slab = atomic_xchg(&pn->slab_reclaimable, 0);
+ int index = memcg_stats_index(NR_SLAB_RECLAIMABLE_B);
+
+ lstats->state[index] += slab;
+ if (plstats)
+ plstats->state_pending[index] += slab;
+ }
+ if (atomic_read(&pn->slab_unreclaimable)) {
+ int slab = atomic_xchg(&pn->slab_unreclaimable, 0);
+ int index = memcg_stats_index(NR_SLAB_UNRECLAIMABLE_B);
+
+ lstats->state[index] += slab;
+ if (plstats)
+ plstats->state_pending[index] += slab;
+ }
+ }
+}
+#else
+static void flush_nmi_stats(struct mem_cgroup *memcg, struct mem_cgroup *parent,
+ int cpu)
+{}
+#endif
+
static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
{
struct mem_cgroup *memcg = mem_cgroup_from_css(css);
@@ -3975,6 +4022,8 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
struct aggregate_control ac;
int nid;
+ flush_nmi_stats(memcg, parent, cpu);
+
statc = per_cpu_ptr(memcg->vmstats_percpu, cpu);
ac = (struct aggregate_control) {
--
2.47.1
* [PATCH v4 3/5] memcg: add nmi-safe update for MEMCG_KMEM
2025-05-19 6:31 [PATCH v4 0/5] memcg: nmi-safe kmem charging Shakeel Butt
2025-05-19 6:31 ` [PATCH v4 1/5] memcg: disable kmem charging in nmi for unsupported arch Shakeel Butt
2025-05-19 6:31 ` [PATCH v4 2/5] memcg: nmi safe memcg stats for specific archs Shakeel Butt
@ 2025-05-19 6:31 ` Shakeel Butt
2025-05-19 6:31 ` [PATCH v4 4/5] memcg: nmi-safe slab stats updates Shakeel Butt
` (2 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2025-05-19 6:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers,
Tejun Heo, bpf, linux-mm, cgroups, linux-kernel, Meta kernel team
The objcg based kmem charging and uncharging code paths need to update
MEMCG_KMEM appropriately. Let's add support to update MEMCG_KMEM in an
nmi-safe way for those code paths.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/memcontrol.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e3f0b298308a..62d7997640db 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2717,6 +2717,23 @@ struct obj_cgroup *get_obj_cgroup_from_folio(struct folio *folio)
return objcg;
}
+#ifdef CONFIG_MEMCG_NMI_SAFETY_REQUIRES_ATOMIC
+static inline void account_kmem_nmi_safe(struct mem_cgroup *memcg, int val)
+{
+ if (likely(!in_nmi())) {
+ mod_memcg_state(memcg, MEMCG_KMEM, val);
+ } else {
+ /* TODO: add to cgroup update tree once it is nmi-safe. */
+ atomic_add(val, &memcg->kmem_stat);
+ }
+}
+#else
+static inline void account_kmem_nmi_safe(struct mem_cgroup *memcg, int val)
+{
+ mod_memcg_state(memcg, MEMCG_KMEM, val);
+}
+#endif
+
/*
* obj_cgroup_uncharge_pages: uncharge a number of kernel pages from a objcg
* @objcg: object cgroup to uncharge
@@ -2729,7 +2746,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
memcg = get_mem_cgroup_from_objcg(objcg);
- mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages);
+ account_kmem_nmi_safe(memcg, -nr_pages);
memcg1_account_kmem(memcg, -nr_pages);
if (!mem_cgroup_is_root(memcg))
refill_stock(memcg, nr_pages);
@@ -2757,7 +2774,7 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
if (ret)
goto out;
- mod_memcg_state(memcg, MEMCG_KMEM, nr_pages);
+ account_kmem_nmi_safe(memcg, nr_pages);
memcg1_account_kmem(memcg, nr_pages);
out:
css_put(&memcg->css);
--
2.47.1
* [PATCH v4 4/5] memcg: nmi-safe slab stats updates
2025-05-19 6:31 [PATCH v4 0/5] memcg: nmi-safe kmem charging Shakeel Butt
` (2 preceding siblings ...)
2025-05-19 6:31 ` [PATCH v4 3/5] memcg: add nmi-safe update for MEMCG_KMEM Shakeel Butt
@ 2025-05-19 6:31 ` Shakeel Butt
2025-05-19 6:31 ` [PATCH v4 5/5] memcg: make memcg_rstat_updated nmi safe Shakeel Butt
2025-06-02 14:45 ` [PATCH v4 0/5] memcg: nmi-safe kmem charging Michal Koutný
5 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2025-05-19 6:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers,
Tejun Heo, bpf, linux-mm, cgroups, linux-kernel, Meta kernel team
The objcg based kmem [un]charging can be called in nmi context and may
need to update the NR_SLAB_[UN]RECLAIMABLE_B stats. So, let's
correctly handle the updates of these stats in nmi context.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/memcontrol.c | 36 +++++++++++++++++++++++++++++++++---
1 file changed, 33 insertions(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 62d7997640db..f1a46c29dde8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2517,17 +2517,47 @@ static void commit_charge(struct folio *folio, struct mem_cgroup *memcg)
folio->memcg_data = (unsigned long)memcg;
}
+#ifdef CONFIG_MEMCG_NMI_SAFETY_REQUIRES_ATOMIC
+static inline void account_slab_nmi_safe(struct mem_cgroup *memcg,
+ struct pglist_data *pgdat,
+ enum node_stat_item idx, int nr)
+{
+ struct lruvec *lruvec;
+
+ if (likely(!in_nmi())) {
+ lruvec = mem_cgroup_lruvec(memcg, pgdat);
+ mod_memcg_lruvec_state(lruvec, idx, nr);
+ } else {
+ struct mem_cgroup_per_node *pn = memcg->nodeinfo[pgdat->node_id];
+
+ /* TODO: add to cgroup update tree once it is nmi-safe. */
+ if (idx == NR_SLAB_RECLAIMABLE_B)
+ atomic_add(nr, &pn->slab_reclaimable);
+ else
+ atomic_add(nr, &pn->slab_unreclaimable);
+ }
+}
+#else
+static inline void account_slab_nmi_safe(struct mem_cgroup *memcg,
+ struct pglist_data *pgdat,
+ enum node_stat_item idx, int nr)
+{
+ struct lruvec *lruvec;
+
+ lruvec = mem_cgroup_lruvec(memcg, pgdat);
+ mod_memcg_lruvec_state(lruvec, idx, nr);
+}
+#endif
+
static inline void mod_objcg_mlstate(struct obj_cgroup *objcg,
struct pglist_data *pgdat,
enum node_stat_item idx, int nr)
{
struct mem_cgroup *memcg;
- struct lruvec *lruvec;
rcu_read_lock();
memcg = obj_cgroup_memcg(objcg);
- lruvec = mem_cgroup_lruvec(memcg, pgdat);
- mod_memcg_lruvec_state(lruvec, idx, nr);
+ account_slab_nmi_safe(memcg, pgdat, idx, nr);
rcu_read_unlock();
}
--
2.47.1
* [PATCH v4 5/5] memcg: make memcg_rstat_updated nmi safe
2025-05-19 6:31 [PATCH v4 0/5] memcg: nmi-safe kmem charging Shakeel Butt
` (3 preceding siblings ...)
2025-05-19 6:31 ` [PATCH v4 4/5] memcg: nmi-safe slab stats updates Shakeel Butt
@ 2025-05-19 6:31 ` Shakeel Butt
2025-06-02 14:45 ` [PATCH v4 0/5] memcg: nmi-safe kmem charging Michal Koutný
5 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2025-05-19 6:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
Harry Yoo, Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers,
Tejun Heo, bpf, linux-mm, cgroups, linux-kernel, Meta kernel team
Currently the kernel maintains memory related stats updates per-cgroup
to optimize stats flushing. stats_updates is defined as atomic64_t,
which is not nmi-safe on some archs. We don't actually need a 64-bit
atomic here: the max value stats_updates can reach is less than
nr_cpus * MEMCG_CHARGE_BATCH, so a normal atomic_t suffices.
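For example, assuming MEMCG_CHARGE_BATCH of 64, even a hypothetical
8192-CPU machine bounds stats_updates at roughly 64 * 8192 = 524288,
comfortably within the range of a 32-bit atomic_t.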
Also, the function cgroup_rstat_updated() is still not nmi-safe, but
there is a parallel effort to make it nmi-safe, so until then let's
just skip it in nmi context.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/memcontrol.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f1a46c29dde8..59d969283cc0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -533,7 +533,7 @@ struct memcg_vmstats {
unsigned long events_pending[NR_MEMCG_EVENTS];
/* Stats updates since the last flush */
- atomic64_t stats_updates;
+ atomic_t stats_updates;
};
/*
@@ -559,7 +559,7 @@ static u64 flush_last_time;
static bool memcg_vmstats_needs_flush(struct memcg_vmstats *vmstats)
{
- return atomic64_read(&vmstats->stats_updates) >
+ return atomic_read(&vmstats->stats_updates) >
MEMCG_CHARGE_BATCH * num_online_cpus();
}
@@ -573,7 +573,9 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val,
if (!val)
return;
- cgroup_rstat_updated(memcg->css.cgroup, cpu);
+ /* TODO: add to cgroup update tree once it is nmi-safe. */
+ if (!in_nmi())
+ cgroup_rstat_updated(memcg->css.cgroup, cpu);
statc_pcpu = memcg->vmstats_percpu;
for (; statc_pcpu; statc_pcpu = statc->parent_pcpu) {
statc = this_cpu_ptr(statc_pcpu);
@@ -591,7 +593,7 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val,
continue;
stats_updates = this_cpu_xchg(statc_pcpu->stats_updates, 0);
- atomic64_add(stats_updates, &statc->vmstats->stats_updates);
+ atomic_add(stats_updates, &statc->vmstats->stats_updates);
}
}
@@ -599,7 +601,7 @@ static void __mem_cgroup_flush_stats(struct mem_cgroup *memcg, bool force)
{
bool needs_flush = memcg_vmstats_needs_flush(memcg->vmstats);
- trace_memcg_flush_stats(memcg, atomic64_read(&memcg->vmstats->stats_updates),
+ trace_memcg_flush_stats(memcg, atomic_read(&memcg->vmstats->stats_updates),
force, needs_flush);
if (!force && !needs_flush)
@@ -4120,8 +4122,8 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
}
WRITE_ONCE(statc->stats_updates, 0);
/* We are in a per-cpu loop here, only do the atomic write once */
- if (atomic64_read(&memcg->vmstats->stats_updates))
- atomic64_set(&memcg->vmstats->stats_updates, 0);
+ if (atomic_read(&memcg->vmstats->stats_updates))
+ atomic_set(&memcg->vmstats->stats_updates, 0);
}
static void mem_cgroup_fork(struct task_struct *task)
--
2.47.1
* Re: [PATCH v4 1/5] memcg: disable kmem charging in nmi for unsupported arch
2025-05-19 6:31 ` [PATCH v4 1/5] memcg: disable kmem charging in nmi for unsupported arch Shakeel Butt
@ 2025-05-19 6:46 ` Vlastimil Babka
0 siblings, 0 replies; 9+ messages in thread
From: Vlastimil Babka @ 2025-05-19 6:46 UTC (permalink / raw)
To: Shakeel Butt, Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Alexei Starovoitov, Sebastian Andrzej Siewior, Harry Yoo,
Yosry Ahmed, Peter Zijlstra, Mathieu Desnoyers, Tejun Heo, bpf,
linux-mm, cgroups, linux-kernel, Meta kernel team
On 5/19/25 08:31, Shakeel Butt wrote:
> The memcg accounting and stats code uses this_cpu* and atomic* ops.
> There are archs which define CONFIG_HAVE_NMI but define neither
> CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS nor ARCH_HAVE_NMI_SAFE_CMPXCHG,
> so memcg accounting in nmi context is not possible to support on such
> archs. Let's just disable memcg accounting in nmi context for such
> archs.
>
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> init/Kconfig | 7 +++++++
> mm/memcontrol.c | 3 +++
> 2 files changed, 10 insertions(+)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 4cdd1049283c..a2aa49cfb8bd 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1006,6 +1006,13 @@ config MEMCG
> help
> Provides control over the memory footprint of tasks in a cgroup.
>
> +config MEMCG_NMI_UNSAFE
> + bool
> + depends on MEMCG
> + depends on HAVE_NMI
> + depends on !ARCH_HAS_NMI_SAFE_THIS_CPU_OPS && !ARCH_HAVE_NMI_SAFE_CMPXCHG
> + default y
> +
> config MEMCG_V1
> bool "Legacy cgroup v1 memory controller"
> depends on MEMCG
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e17b698f6243..532e2c06ea60 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2652,6 +2652,9 @@ __always_inline struct obj_cgroup *current_obj_cgroup(void)
> struct mem_cgroup *memcg;
> struct obj_cgroup *objcg;
>
> + if (IS_ENABLED(CONFIG_MEMCG_NMI_UNSAFE) && in_nmi())
> + return NULL;
> +
> if (in_task()) {
> memcg = current->active_memcg;
> if (unlikely(memcg))
* Re: [PATCH v4 0/5] memcg: nmi-safe kmem charging
2025-05-19 6:31 [PATCH v4 0/5] memcg: nmi-safe kmem charging Shakeel Butt
` (4 preceding siblings ...)
2025-05-19 6:31 ` [PATCH v4 5/5] memcg: make memcg_rstat_updated nmi safe Shakeel Butt
@ 2025-06-02 14:45 ` Michal Koutný
2025-06-02 14:54 ` Shakeel Butt
5 siblings, 1 reply; 9+ messages in thread
From: Michal Koutný @ 2025-06-02 14:45 UTC (permalink / raw)
To: Shakeel Butt
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
Muchun Song, Vlastimil Babka, Alexei Starovoitov,
Sebastian Andrzej Siewior, Harry Yoo, Yosry Ahmed, Peter Zijlstra,
Mathieu Desnoyers, Tejun Heo, bpf, linux-mm, cgroups,
linux-kernel, Meta kernel team
Hello Shakeel.
On Sun, May 18, 2025 at 11:31:37PM -0700, Shakeel Butt <shakeel.butt@linux.dev> wrote:
> Users can attach their BPF programs at arbitrary execution points in
> the kernel, and such BPF programs may run in nmi context. In addition,
> these programs can trigger memcg-charged kernel allocations in nmi
> context. However, the memcg charging infrastructure for kernel memory
> is not equipped to handle nmi context on all architectures.
How much memory does this refer to? Is it unbound (in particular for
non-privileged eBPF)? Or is it rather negligible? (I assume the former
to make the series worth it.)
Thanks,
Michal
* Re: [PATCH v4 0/5] memcg: nmi-safe kmem charging
2025-06-02 14:45 ` [PATCH v4 0/5] memcg: nmi-safe kmem charging Michal Koutný
@ 2025-06-02 14:54 ` Shakeel Butt
0 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2025-06-02 14:54 UTC (permalink / raw)
To: Michal Koutný
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
Muchun Song, Vlastimil Babka, Alexei Starovoitov,
Sebastian Andrzej Siewior, Harry Yoo, Yosry Ahmed, Peter Zijlstra,
Mathieu Desnoyers, Tejun Heo, bpf, linux-mm, cgroups,
linux-kernel, Meta kernel team
On Mon, Jun 02, 2025 at 04:45:31PM +0200, Michal Koutný wrote:
> Hello Shakeel.
>
> On Sun, May 18, 2025 at 11:31:37PM -0700, Shakeel Butt <shakeel.butt@linux.dev> wrote:
> > Users can attach their BPF programs at arbitrary execution points in
> > the kernel, and such BPF programs may run in nmi context. In addition,
> > these programs can trigger memcg-charged kernel allocations in nmi
> > context. However, the memcg charging infrastructure for kernel memory
> > is not equipped to handle nmi context on all architectures.
>
> How much memory does this refer to? Is it unbound (in particular for
> non-privileged eBPF)? Or is it rather negligible? (I assume the former
> to make the series worth it.)
>
It depends on the BPF program and thus can be arbitrarily large. So,
irrespective of whether the BPF program is privileged or not, it can
allocate a large amount of memory to maintain monitoring, debugging,
or other use-case specific information.