* [RFC PATCH v3 0/9] pids controller events rework and migration charging
@ 2024-04-05 17:05 Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 1/9] cgroup/pids: Remove superfluous zeroing Michal Koutný
` (8 more replies)
0 siblings, 9 replies; 21+ messages in thread
From: Michal Koutný @ 2024-04-05 17:05 UTC (permalink / raw)
To: cgroups, linux-doc, linux-kernel, linux-kselftest
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
The series consists of two parts:
- pids.events rework (originally v2, patches 1-6,
- migration charging, patches 7-9.
The changes are independent in principle, I stacked them for (my)
convenience and because they both deserve RFC:
1) Changed semantics of v2 pids.events
- similar change was proposed for memory.swap.events:max [1]
2) Migration charging is obsolete concept
How are the new events supposed to be useful?
- pids.events.local:max
- tells that cgroup's limit is hit (too tight?)
- pids.events.local:max.imposed
- tells that cgroup's workload was restricted (generalization of
'cgroup: fork rejected by pids controller in %s' message)
- pids.events:*
- "only" directs top-down search to cgroups of interest
The migration charging is motivated by apparenty surprising
pids.current > pids.max
because supervised processes are forked in supervisor's cgroup (more
details in commit cgroup/pids: Enforce pids.max on task migrations too)
Changes from v2 (https://lore.kernel.org/r/20200205134426.10570-1-mkoutny@suse.com)
- implemented pids.events.local (Tejun)
- added migration charging
[1] https://lore.kernel.org/r/20230202155626.1829121-1-hannes@cmpxchg.org/
Michal Koutný (9):
cgroup/pids: Remove superfluous zeroing
cgroup/pids: Separate semantics of pids.events related to pids.max
cgroup/pids: Make event counters hierarchical
cgroup/pids: Add pids.events.local
selftests: cgroup: Lexicographic order in Makefile
selftests: cgroup: Add basic tests for pids controller
cgroup/pids: Replace uncharge/charge pair with a single function
cgroup/pids: Enforce pids.max on task migrations
selftests: cgroup: Add tests pids controller
Documentation/admin-guide/cgroup-v1/pids.rst | 3 +-
Documentation/admin-guide/cgroup-v2.rst | 22 +-
include/linux/cgroup-defs.h | 7 +-
kernel/cgroup/cgroup.c | 16 +-
kernel/cgroup/pids.c | 206 +++++++++----
tools/testing/selftests/cgroup/Makefile | 25 +-
tools/testing/selftests/cgroup/test_pids.c | 302 +++++++++++++++++++
7 files changed, 514 insertions(+), 67 deletions(-)
create mode 100644 tools/testing/selftests/cgroup/test_pids.c
base-commit: 026e680b0a08a62b1d948e5a8ca78700bfac0e6e
--
2.44.0
^ permalink raw reply [flat|nested] 21+ messages in thread
* [RFC PATCH v3 1/9] cgroup/pids: Remove superfluous zeroing
2024-04-05 17:05 [RFC PATCH v3 0/9] pids controller events rework and migration charging Michal Koutný
@ 2024-04-05 17:05 ` Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 2/9] cgroup/pids: Separate semantics of pids.events related to pids.max Michal Koutný
` (7 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Michal Koutný @ 2024-04-05 17:05 UTC (permalink / raw)
To: cgroups, linux-doc, linux-kernel, linux-kselftest
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
Atomic counters are in kzalloc'd struct. They are zeroed already and
atomic64_t does not need special initialization
(cf kernel/trace/trace_clock.c:trace_counter).
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
kernel/cgroup/pids.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index 7695e60bcb40..0e5ec7d59b4d 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -75,9 +75,7 @@ pids_css_alloc(struct cgroup_subsys_state *parent)
if (!pids)
return ERR_PTR(-ENOMEM);
- atomic64_set(&pids->counter, 0);
atomic64_set(&pids->limit, PIDS_MAX);
- atomic64_set(&pids->events_limit, 0);
return &pids->css;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [RFC PATCH v3 2/9] cgroup/pids: Separate semantics of pids.events related to pids.max
2024-04-05 17:05 [RFC PATCH v3 0/9] pids controller events rework and migration charging Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 1/9] cgroup/pids: Remove superfluous zeroing Michal Koutný
@ 2024-04-05 17:05 ` Michal Koutný
2024-04-08 17:55 ` Tejun Heo
2024-04-05 17:05 ` [RFC PATCH v3 3/9] cgroup/pids: Make event counters hierarchical Michal Koutný
` (6 subsequent siblings)
8 siblings, 1 reply; 21+ messages in thread
From: Michal Koutný @ 2024-04-05 17:05 UTC (permalink / raw)
To: cgroups, linux-doc, linux-kernel, linux-kselftest
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
Currently, when pids.max limit is breached in the hierarchy, the event
is counted and reported in the cgroup where the forking task resides.
This decouples the limit and the notification caused by the limit making
it hard to detect when the actual limit was effected.
Let's introduce new events:
max
The number of times the limit of the cgroup was hit.
max.imposed
The number of times fork failed in the cgroup because of self
or ancestor limit.
Since it changes semantics of the original "max" event, we introduce
this change only in the v2 API of the controller.
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
Documentation/admin-guide/cgroup-v1/pids.rst | 3 +-
Documentation/admin-guide/cgroup-v2.rst | 12 ++++
kernel/cgroup/pids.c | 71 ++++++++++++++++----
3 files changed, 73 insertions(+), 13 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v1/pids.rst b/Documentation/admin-guide/cgroup-v1/pids.rst
index 6acebd9e72c8..0f9f9a7b1f6c 100644
--- a/Documentation/admin-guide/cgroup-v1/pids.rst
+++ b/Documentation/admin-guide/cgroup-v1/pids.rst
@@ -36,7 +36,8 @@ superset of parent/child/pids.current.
The pids.events file contains event counters:
- - max: Number of times fork failed because limit was hit.
+ - max: Number of times fork failed in the cgroup because limit was hit in
+ self or ancestors.
Example
-------
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 17e6e9565156..4f04538d688c 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2186,6 +2186,18 @@ PID Interface Files
The number of processes currently in the cgroup and its
descendants.
+ pids.events
+ A read-only flat-keyed file which exists on non-root cgroups. Unless
+ specified otherwise, a value change in this file generates a file modified
+ event. The following entries are defined.
+
+ max
+ The number of times the limit of the cgroup was hit.
+
+ max.imposed
+ The number of times fork failed in the cgroup because of self
+ or ancestor limit.
+
Organisational operations are not blocked by cgroup policies, so it is
possible to have pids.current > pids.max. This can be done by either
setting the limit to be smaller than pids.current, or attaching enough
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index 0e5ec7d59b4d..471562609eef 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -38,6 +38,14 @@
#define PIDS_MAX (PID_MAX_LIMIT + 1ULL)
#define PIDS_MAX_STR "max"
+enum pidcg_event {
+ /* Fork failed in subtree because this pids_cgroup limit was hit. */
+ PIDCG_MAX,
+ /* Fork failed in this pids_cgroup because ancestor limit was hit. */
+ PIDCG_MAX_IMPOSED,
+ NR_PIDCG_EVENTS,
+};
+
struct pids_cgroup {
struct cgroup_subsys_state css;
@@ -52,8 +60,7 @@ struct pids_cgroup {
/* Handle for "pids.events" */
struct cgroup_file events_file;
- /* Number of times fork failed because limit was hit. */
- atomic64_t events_limit;
+ atomic64_t events[NR_PIDCG_EVENTS];
};
static struct pids_cgroup *css_pids(struct cgroup_subsys_state *css)
@@ -148,12 +155,13 @@ static void pids_charge(struct pids_cgroup *pids, int num)
* pids_try_charge - hierarchically try to charge the pid count
* @pids: the pid cgroup state
* @num: the number of pids to charge
+ * @fail: storage of pid cgroup causing the fail
*
* This function follows the set limit. It will fail if the charge would cause
* the new value to exceed the hierarchical limit. Returns 0 if the charge
* succeeded, otherwise -EAGAIN.
*/
-static int pids_try_charge(struct pids_cgroup *pids, int num)
+static int pids_try_charge(struct pids_cgroup *pids, int num, struct pids_cgroup **fail)
{
struct pids_cgroup *p, *q;
@@ -166,9 +174,10 @@ static int pids_try_charge(struct pids_cgroup *pids, int num)
* p->limit is %PIDS_MAX then we know that this test will never
* fail.
*/
- if (new > limit)
+ if (new > limit) {
+ *fail = p;
goto revert;
-
+ }
/*
* Not technically accurate if we go over limit somewhere up
* the hierarchy, but that's tolerable for the watermark.
@@ -236,7 +245,7 @@ static void pids_cancel_attach(struct cgroup_taskset *tset)
static int pids_can_fork(struct task_struct *task, struct css_set *cset)
{
struct cgroup_subsys_state *css;
- struct pids_cgroup *pids;
+ struct pids_cgroup *pids, *pids_over_limit;
int err;
if (cset)
@@ -244,15 +253,23 @@ static int pids_can_fork(struct task_struct *task, struct css_set *cset)
else
css = task_css_check(current, pids_cgrp_id, true);
pids = css_pids(css);
- err = pids_try_charge(pids, 1);
+ err = pids_try_charge(pids, 1, &pids_over_limit);
if (err) {
- /* Only log the first time events_limit is incremented. */
- if (atomic64_inc_return(&pids->events_limit) == 1) {
+ /* compatibility on v1 where events were notified in leaves. */
+ if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
+ pids_over_limit = pids;
+
+ /* Only log the first time limit is hit. */
+ if (atomic64_inc_return(&pids->events[PIDCG_MAX_IMPOSED]) == 1) {
pr_info("cgroup: fork rejected by pids controller in ");
- pr_cont_cgroup_path(css->cgroup);
+ pr_cont_cgroup_path(pids->css.cgroup);
pr_cont("\n");
}
+ atomic64_inc(&pids_over_limit->events[PIDCG_MAX]);
+
cgroup_file_notify(&pids->events_file);
+ if (pids_over_limit != pids)
+ cgroup_file_notify(&pids_over_limit->events_file);
}
return err;
}
@@ -341,7 +358,16 @@ static int pids_events_show(struct seq_file *sf, void *v)
{
struct pids_cgroup *pids = css_pids(seq_css(sf));
- seq_printf(sf, "max %lld\n", (s64)atomic64_read(&pids->events_limit));
+ seq_printf(sf, "max %lld\n", (s64)atomic64_read(&pids->events[PIDCG_MAX]));
+ seq_printf(sf, "max.imposed %lld\n", (s64)atomic64_read(&pids->events[PIDCG_MAX_IMPOSED]));
+ return 0;
+}
+
+static int pids_events_show_legacy(struct seq_file *sf, void *v)
+{
+ struct pids_cgroup *pids = css_pids(seq_css(sf));
+
+ seq_printf(sf, "max%lld\n", (s64)atomic64_read(&pids->events[PIDCG_MAX_IMPOSED]));
return 0;
}
@@ -371,6 +397,27 @@ static struct cftype pids_files[] = {
{ } /* terminate */
};
+static struct cftype pids_files_legacy[] = {
+ {
+ .name = "max",
+ .write = pids_max_write,
+ .seq_show = pids_max_show,
+ .flags = CFTYPE_NOT_ON_ROOT,
+ },
+ {
+ .name = "current",
+ .read_s64 = pids_current_read,
+ .flags = CFTYPE_NOT_ON_ROOT,
+ },
+ {
+ .name = "events",
+ .seq_show = pids_events_show_legacy,
+ .file_offset = offsetof(struct pids_cgroup, events_file),
+ .flags = CFTYPE_NOT_ON_ROOT,
+ },
+ { } /* terminate */
+};
+
struct cgroup_subsys pids_cgrp_subsys = {
.css_alloc = pids_css_alloc,
.css_free = pids_css_free,
@@ -379,7 +426,7 @@ struct cgroup_subsys pids_cgrp_subsys = {
.can_fork = pids_can_fork,
.cancel_fork = pids_cancel_fork,
.release = pids_release,
- .legacy_cftypes = pids_files,
.dfl_cftypes = pids_files,
+ .legacy_cftypes = pids_files_legacy,
.threaded = true,
};
--
2.44.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [RFC PATCH v3 3/9] cgroup/pids: Make event counters hierarchical
2024-04-05 17:05 [RFC PATCH v3 0/9] pids controller events rework and migration charging Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 1/9] cgroup/pids: Remove superfluous zeroing Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 2/9] cgroup/pids: Separate semantics of pids.events related to pids.max Michal Koutný
@ 2024-04-05 17:05 ` Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 4/9] cgroup/pids: Add pids.events.local Michal Koutný
` (5 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Michal Koutný @ 2024-04-05 17:05 UTC (permalink / raw)
To: cgroups, linux-doc, linux-kernel, linux-kselftest
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
The pids.events file should honor the hierarchy, so make the events
propagate from their origin up to the root on the unified hierarchy. The
v1 behavior remains non-hierarchical.
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
Documentation/admin-guide/cgroup-v2.rst | 4 ++-
kernel/cgroup/pids.c | 46 ++++++++++++++++---------
2 files changed, 33 insertions(+), 17 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 4f04538d688c..5d4c505cae06 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2189,7 +2189,9 @@ PID Interface Files
pids.events
A read-only flat-keyed file which exists on non-root cgroups. Unless
specified otherwise, a value change in this file generates a file modified
- event. The following entries are defined.
+ event. Fields in this file are hierarchical and the file modified event
+ can be generated due to an event down the hierarchy. The following
+ entries are defined.
max
The number of times the limit of the cgroup was hit.
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index 471562609eef..76c0a97e42da 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -238,6 +238,34 @@ static void pids_cancel_attach(struct cgroup_taskset *tset)
}
}
+static void pids_event(struct pids_cgroup *pids_forking,
+ struct pids_cgroup *pids_over_limit)
+{
+ struct pids_cgroup *p = pids_forking;
+ bool limit = false;
+
+ /* Only log the first time limit is hit. */
+ if (atomic64_inc_return(&p->events[PIDCG_MAX_IMPOSED]) == 1) {
+ pr_info("cgroup: fork rejected by pids controller in ");
+ pr_cont_cgroup_path(p->css.cgroup);
+ pr_cont("\n");
+ }
+ /* Events are only notified in pids_forking on v1 */
+ if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
+ return;
+
+ for (; parent_pids(p); p = parent_pids(p)) {
+ atomic64_inc(&p->events[PIDCG_MAX_IMPOSED]);
+
+ if (p == pids_over_limit)
+ limit = true;
+ if (limit)
+ atomic64_inc(&p->events[PIDCG_MAX]);
+
+ cgroup_file_notify(&p->events_file);
+ }
+}
+
/*
* task_css_check(true) in pids_can_fork() and pids_cancel_fork() relies
* on cgroup_threadgroup_change_begin() held by the copy_process().
@@ -254,23 +282,9 @@ static int pids_can_fork(struct task_struct *task, struct css_set *cset)
css = task_css_check(current, pids_cgrp_id, true);
pids = css_pids(css);
err = pids_try_charge(pids, 1, &pids_over_limit);
- if (err) {
- /* compatibility on v1 where events were notified in leaves. */
- if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
- pids_over_limit = pids;
-
- /* Only log the first time limit is hit. */
- if (atomic64_inc_return(&pids->events[PIDCG_MAX_IMPOSED]) == 1) {
- pr_info("cgroup: fork rejected by pids controller in ");
- pr_cont_cgroup_path(pids->css.cgroup);
- pr_cont("\n");
- }
- atomic64_inc(&pids_over_limit->events[PIDCG_MAX]);
+ if (err)
+ pids_event(pids, pids_over_limit);
- cgroup_file_notify(&pids->events_file);
- if (pids_over_limit != pids)
- cgroup_file_notify(&pids_over_limit->events_file);
- }
return err;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [RFC PATCH v3 4/9] cgroup/pids: Add pids.events.local
2024-04-05 17:05 [RFC PATCH v3 0/9] pids controller events rework and migration charging Michal Koutný
` (2 preceding siblings ...)
2024-04-05 17:05 ` [RFC PATCH v3 3/9] cgroup/pids: Make event counters hierarchical Michal Koutný
@ 2024-04-05 17:05 ` Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 5/9] selftests: cgroup: Lexicographic order in Makefile Michal Koutný
` (4 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Michal Koutný @ 2024-04-05 17:05 UTC (permalink / raw)
To: cgroups, linux-doc, linux-kernel, linux-kselftest
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
Hierarchical counting of events is not practical for watching when a
particular pids.max is being hit. Therefore introduce .local flavor of
events file (akin to memory controller) that collects only events
relevant to given cgroup.
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
kernel/cgroup/pids.c | 41 +++++++++++++++++++++++++++++++++--------
1 file changed, 33 insertions(+), 8 deletions(-)
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index 76c0a97e42da..f5f81274658e 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -57,10 +57,12 @@ struct pids_cgroup {
atomic64_t limit;
int64_t watermark;
- /* Handle for "pids.events" */
+ /* Handles for pids.events[.local] */
struct cgroup_file events_file;
+ struct cgroup_file events_local_file;
atomic64_t events[NR_PIDCG_EVENTS];
+ atomic64_t events_local[NR_PIDCG_EVENTS];
};
static struct pids_cgroup *css_pids(struct cgroup_subsys_state *css)
@@ -245,11 +247,12 @@ static void pids_event(struct pids_cgroup *pids_forking,
bool limit = false;
/* Only log the first time limit is hit. */
- if (atomic64_inc_return(&p->events[PIDCG_MAX_IMPOSED]) == 1) {
+ if (atomic64_inc_return(&p->events_local[PIDCG_MAX_IMPOSED]) == 1) {
pr_info("cgroup: fork rejected by pids controller in ");
pr_cont_cgroup_path(p->css.cgroup);
pr_cont("\n");
}
+ cgroup_file_notify(&p->events_local_file);
/* Events are only notified in pids_forking on v1 */
if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
return;
@@ -257,8 +260,11 @@ static void pids_event(struct pids_cgroup *pids_forking,
for (; parent_pids(p); p = parent_pids(p)) {
atomic64_inc(&p->events[PIDCG_MAX_IMPOSED]);
- if (p == pids_over_limit)
+ if (p == pids_over_limit) {
limit = true;
+ atomic64_inc(&p->events_local[PIDCG_MAX]);
+ cgroup_file_notify(&p->events_local_file);
+ }
if (limit)
atomic64_inc(&p->events[PIDCG_MAX]);
@@ -368,12 +374,25 @@ static s64 pids_peak_read(struct cgroup_subsys_state *css,
return READ_ONCE(pids->watermark);
}
-static int pids_events_show(struct seq_file *sf, void *v)
+static int __pids_events_show(struct seq_file *sf, bool local)
{
struct pids_cgroup *pids = css_pids(seq_css(sf));
+ atomic64_t *events = local ? pids->events_local : pids->events;
+
+ seq_printf(sf, "max %lld\n", (s64)atomic64_read(&events[PIDCG_MAX]));
+ seq_printf(sf, "max.imposed %lld\n", (s64)atomic64_read(&events[PIDCG_MAX_IMPOSED]));
+ return 0;
+}
+
+static int pids_events_show(struct seq_file *sf, void *v)
+{
+ __pids_events_show(sf, false);
+ return 0;
+}
- seq_printf(sf, "max %lld\n", (s64)atomic64_read(&pids->events[PIDCG_MAX]));
- seq_printf(sf, "max.imposed %lld\n", (s64)atomic64_read(&pids->events[PIDCG_MAX_IMPOSED]));
+static int pids_events_local_show(struct seq_file *sf, void *v)
+{
+ __pids_events_show(sf, true);
return 0;
}
@@ -381,7 +400,7 @@ static int pids_events_show_legacy(struct seq_file *sf, void *v)
{
struct pids_cgroup *pids = css_pids(seq_css(sf));
- seq_printf(sf, "max%lld\n", (s64)atomic64_read(&pids->events[PIDCG_MAX_IMPOSED]));
+ seq_printf(sf, "max%lld\n", (s64)atomic64_read(&pids->events_local[PIDCG_MAX_IMPOSED]));
return 0;
}
@@ -408,6 +427,12 @@ static struct cftype pids_files[] = {
.file_offset = offsetof(struct pids_cgroup, events_file),
.flags = CFTYPE_NOT_ON_ROOT,
},
+ {
+ .name = "events.local",
+ .seq_show = pids_events_local_show,
+ .file_offset = offsetof(struct pids_cgroup, events_file),
+ .flags = CFTYPE_NOT_ON_ROOT,
+ },
{ } /* terminate */
};
@@ -426,7 +451,7 @@ static struct cftype pids_files_legacy[] = {
{
.name = "events",
.seq_show = pids_events_show_legacy,
- .file_offset = offsetof(struct pids_cgroup, events_file),
+ .file_offset = offsetof(struct pids_cgroup, events_local_file),
.flags = CFTYPE_NOT_ON_ROOT,
},
{ } /* terminate */
--
2.44.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [RFC PATCH v3 5/9] selftests: cgroup: Lexicographic order in Makefile
2024-04-05 17:05 [RFC PATCH v3 0/9] pids controller events rework and migration charging Michal Koutný
` (3 preceding siblings ...)
2024-04-05 17:05 ` [RFC PATCH v3 4/9] cgroup/pids: Add pids.events.local Michal Koutný
@ 2024-04-05 17:05 ` Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller Michal Koutný
` (3 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Michal Koutný @ 2024-04-05 17:05 UTC (permalink / raw)
To: cgroups, linux-doc, linux-kernel, linux-kselftest
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
This will reduce number of conflicts when modifying the lists.
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
tools/testing/selftests/cgroup/Makefile | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/tools/testing/selftests/cgroup/Makefile b/tools/testing/selftests/cgroup/Makefile
index 00b441928909..f3e1ef69e88d 100644
--- a/tools/testing/selftests/cgroup/Makefile
+++ b/tools/testing/selftests/cgroup/Makefile
@@ -6,26 +6,27 @@ all: ${HELPER_PROGS}
TEST_FILES := with_stress.sh
TEST_PROGS := test_stress.sh test_cpuset_prs.sh
TEST_GEN_FILES := wait_inotify
-TEST_GEN_PROGS = test_memcontrol
-TEST_GEN_PROGS += test_kmem
-TEST_GEN_PROGS += test_core
-TEST_GEN_PROGS += test_freezer
-TEST_GEN_PROGS += test_kill
+# Keep the lists lexicographically sorted
+TEST_GEN_PROGS = test_core
TEST_GEN_PROGS += test_cpu
TEST_GEN_PROGS += test_cpuset
-TEST_GEN_PROGS += test_zswap
+TEST_GEN_PROGS += test_freezer
TEST_GEN_PROGS += test_hugetlb_memcg
+TEST_GEN_PROGS += test_kill
+TEST_GEN_PROGS += test_kmem
+TEST_GEN_PROGS += test_memcontrol
+TEST_GEN_PROGS += test_zswap
LOCAL_HDRS += $(selfdir)/clone3/clone3_selftests.h $(selfdir)/pidfd/pidfd.h
include ../lib.mk
-$(OUTPUT)/test_memcontrol: cgroup_util.c
-$(OUTPUT)/test_kmem: cgroup_util.c
$(OUTPUT)/test_core: cgroup_util.c
-$(OUTPUT)/test_freezer: cgroup_util.c
-$(OUTPUT)/test_kill: cgroup_util.c
$(OUTPUT)/test_cpu: cgroup_util.c
$(OUTPUT)/test_cpuset: cgroup_util.c
-$(OUTPUT)/test_zswap: cgroup_util.c
+$(OUTPUT)/test_freezer: cgroup_util.c
$(OUTPUT)/test_hugetlb_memcg: cgroup_util.c
+$(OUTPUT)/test_kill: cgroup_util.c
+$(OUTPUT)/test_kmem: cgroup_util.c
+$(OUTPUT)/test_memcontrol: cgroup_util.c
+$(OUTPUT)/test_zswap: cgroup_util.c
--
2.44.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller
2024-04-05 17:05 [RFC PATCH v3 0/9] pids controller events rework and migration charging Michal Koutný
` (4 preceding siblings ...)
2024-04-05 17:05 ` [RFC PATCH v3 5/9] selftests: cgroup: Lexicographic order in Makefile Michal Koutný
@ 2024-04-05 17:05 ` Michal Koutný
2024-04-06 21:37 ` Muhammad Usama Anjum
2024-04-05 17:05 ` [RFC PATCH v3 7/9] cgroup/pids: Replace uncharge/charge pair with a single function Michal Koutný
` (2 subsequent siblings)
8 siblings, 1 reply; 21+ messages in thread
From: Michal Koutný @ 2024-04-05 17:05 UTC (permalink / raw)
To: cgroups, linux-doc, linux-kernel, linux-kselftest
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
This commit adds (and wires in) new test program for checking basic pids
controller functionality -- restricting tasks in a cgroup and correct
event counting.
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
tools/testing/selftests/cgroup/Makefile | 2 +
tools/testing/selftests/cgroup/test_pids.c | 187 +++++++++++++++++++++
2 files changed, 189 insertions(+)
create mode 100644 tools/testing/selftests/cgroup/test_pids.c
diff --git a/tools/testing/selftests/cgroup/Makefile b/tools/testing/selftests/cgroup/Makefile
index f3e1ef69e88d..f5f0886a2c4a 100644
--- a/tools/testing/selftests/cgroup/Makefile
+++ b/tools/testing/selftests/cgroup/Makefile
@@ -15,6 +15,7 @@ TEST_GEN_PROGS += test_hugetlb_memcg
TEST_GEN_PROGS += test_kill
TEST_GEN_PROGS += test_kmem
TEST_GEN_PROGS += test_memcontrol
+TEST_GEN_PROGS += test_pids
TEST_GEN_PROGS += test_zswap
LOCAL_HDRS += $(selfdir)/clone3/clone3_selftests.h $(selfdir)/pidfd/pidfd.h
@@ -29,4 +30,5 @@ $(OUTPUT)/test_hugetlb_memcg: cgroup_util.c
$(OUTPUT)/test_kill: cgroup_util.c
$(OUTPUT)/test_kmem: cgroup_util.c
$(OUTPUT)/test_memcontrol: cgroup_util.c
+$(OUTPUT)/test_pids: cgroup_util.c
$(OUTPUT)/test_zswap: cgroup_util.c
diff --git a/tools/testing/selftests/cgroup/test_pids.c b/tools/testing/selftests/cgroup/test_pids.c
new file mode 100644
index 000000000000..c1c3a3965624
--- /dev/null
+++ b/tools/testing/selftests/cgroup/test_pids.c
@@ -0,0 +1,187 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+
+#include <errno.h>
+#include <linux/limits.h>
+#include <signal.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "../kselftest.h"
+#include "cgroup_util.h"
+
+static int run_success(const char *cgroup, void *arg)
+{
+ return 0;
+}
+
+static int run_pause(const char *cgroup, void *arg)
+{
+ return pause();
+}
+
+/*
+ * This test checks that pids.max prevents forking new children above the
+ * specified limit in the cgroup.
+ */
+static int test_pids_max(const char *root)
+{
+ int ret = KSFT_FAIL;
+ char *cg_pids;
+ int pid;
+
+
+ cg_pids = cg_name(root, "pids_test");
+ if (!cg_pids)
+ goto cleanup;
+
+ if (cg_create(cg_pids))
+ goto cleanup;
+
+ if (cg_read_strcmp(cg_pids, "pids.max", "max\n"))
+ goto cleanup;
+
+ if (cg_write(cg_pids, "pids.max", "2"))
+ goto cleanup;
+
+ if (cg_enter_current(cg_pids))
+ goto cleanup;
+
+ pid = cg_run_nowait(cg_pids, run_pause, NULL);
+ if (pid < 0)
+ goto cleanup;
+
+ if (cg_run_nowait(cg_pids, run_success, NULL) != -1 || errno != EAGAIN)
+ goto cleanup;
+
+ if (kill(pid, SIGINT))
+ goto cleanup;
+
+ ret = KSFT_PASS;
+
+cleanup:
+ cg_enter_current(root);
+ cg_destroy(cg_pids);
+ free(cg_pids);
+
+ return ret;
+}
+
+/*
+ * This test checks that pids.max prevents forking new children above the
+ * specified limit in the cgroup.
+ */
+static int test_pids_events(const char *root)
+{
+ int ret = KSFT_FAIL;
+ char *cg_parent = NULL, *cg_child = NULL;
+ int pid;
+
+
+ cg_parent = cg_name(root, "pids_parent");
+ cg_child = cg_name(cg_parent, "pids_child");
+ if (!cg_parent || !cg_child)
+ goto cleanup;
+
+ if (cg_create(cg_parent))
+ goto cleanup;
+ if (cg_write(cg_parent, "cgroup.subtree_control", "+pids"))
+ goto cleanup;
+ if (cg_create(cg_child))
+ goto cleanup;
+
+ if (cg_write(cg_parent, "pids.max", "2"))
+ goto cleanup;
+
+ if (cg_read_strcmp(cg_child, "pids.max", "max\n"))
+ goto cleanup;
+
+ if (cg_enter_current(cg_child))
+ goto cleanup;
+
+ pid = cg_run_nowait(cg_child, run_pause, NULL);
+ if (pid < 0)
+ goto cleanup;
+
+ if (cg_run_nowait(cg_child, run_success, NULL) != -1 || errno != EAGAIN)
+ goto cleanup;
+
+ if (kill(pid, SIGINT))
+ goto cleanup;
+
+
+ if (cg_read_key_long(cg_child, "pids.events", "max ") != 0)
+ goto cleanup;
+ if (cg_read_key_long(cg_child, "pids.events", "max.imposed ") != 1)
+ goto cleanup;
+
+ if (cg_read_key_long(cg_parent, "pids.events", "max ") != 1)
+ goto cleanup;
+ if (cg_read_key_long(cg_parent, "pids.events", "max.imposed ") != 1)
+ goto cleanup;
+
+
+ ret = KSFT_PASS;
+
+cleanup:
+ cg_enter_current(root);
+ if (cg_child)
+ cg_destroy(cg_child);
+ if (cg_parent)
+ cg_destroy(cg_parent);
+ free(cg_child);
+ free(cg_parent);
+
+ return ret;
+}
+
+
+
+#define T(x) { x, #x }
+struct pids_test {
+ int (*fn)(const char *root);
+ const char *name;
+} tests[] = {
+ T(test_pids_max),
+ T(test_pids_events),
+};
+#undef T
+
+int main(int argc, char **argv)
+{
+ char root[PATH_MAX];
+ int i, ret = EXIT_SUCCESS;
+
+ if (cg_find_unified_root(root, sizeof(root)))
+ ksft_exit_skip("cgroup v2 isn't mounted\n");
+
+ /*
+ * Check that pids controller is available:
+ * pids is listed in cgroup.controllers
+ */
+ if (cg_read_strstr(root, "cgroup.controllers", "pids"))
+ ksft_exit_skip("pids controller isn't available\n");
+
+ if (cg_read_strstr(root, "cgroup.subtree_control", "pids"))
+ if (cg_write(root, "cgroup.subtree_control", "+pids"))
+ ksft_exit_skip("Failed to set pids controller\n");
+
+ for (i = 0; i < ARRAY_SIZE(tests); i++) {
+ switch (tests[i].fn(root)) {
+ case KSFT_PASS:
+ ksft_test_result_pass("%s\n", tests[i].name);
+ break;
+ case KSFT_SKIP:
+ ksft_test_result_skip("%s\n", tests[i].name);
+ break;
+ default:
+ ret = EXIT_FAILURE;
+ ksft_test_result_fail("%s\n", tests[i].name);
+ break;
+ }
+ }
+
+ return ret;
+}
--
2.44.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [RFC PATCH v3 7/9] cgroup/pids: Replace uncharge/charge pair with a single function
2024-04-05 17:05 [RFC PATCH v3 0/9] pids controller events rework and migration charging Michal Koutný
` (5 preceding siblings ...)
2024-04-05 17:05 ` [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller Michal Koutný
@ 2024-04-05 17:05 ` Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 8/9] cgroup/pids: Enforce pids.max on task migrations Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 9/9] selftests: cgroup: Add tests pids controller Michal Koutný
8 siblings, 0 replies; 21+ messages in thread
From: Michal Koutný @ 2024-04-05 17:05 UTC (permalink / raw)
To: cgroups, linux-doc, linux-kernel, linux-kselftest
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
No functional change intended. This rework reduces modifications of pids
counters only to a minimal subtree of uncharged/charged cgroups.
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
kernel/cgroup/pids.c | 80 ++++++++++++++++++++++++++------------------
1 file changed, 47 insertions(+), 33 deletions(-)
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index f5f81274658e..9df8a209a6e2 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -133,41 +133,23 @@ static void pids_uncharge(struct pids_cgroup *pids, int num)
pids_cancel(p, num);
}
-/**
- * pids_charge - hierarchically charge the pid count
- * @pids: the pid cgroup state
- * @num: the number of pids to charge
- *
- * This function does *not* follow the pid limit set. It cannot fail and the new
- * pid count may exceed the limit. This is only used for reverting failed
- * attaches, where there is no other way out than violating the limit.
- */
-static void pids_charge(struct pids_cgroup *pids, int num)
-{
- struct pids_cgroup *p;
-
- for (p = pids; parent_pids(p); p = parent_pids(p)) {
- int64_t new = atomic64_add_return(num, &p->counter);
-
- pids_update_watermark(p, new);
- }
-}
-
/**
* pids_try_charge - hierarchically try to charge the pid count
* @pids: the pid cgroup state
* @num: the number of pids to charge
+ * @root: charge only under this root (NULL is global root)
* @fail: storage of pid cgroup causing the fail
*
* This function follows the set limit. It will fail if the charge would cause
- * the new value to exceed the hierarchical limit. Returns 0 if the charge
- * succeeded, otherwise -EAGAIN.
+ * the new value to exceed the hierarchical limit and fail is set. Returns 0 if
+ * no limit was hit, otherwise -EAGAIN.
*/
-static int pids_try_charge(struct pids_cgroup *pids, int num, struct pids_cgroup **fail)
+static int pids_try_charge(struct pids_cgroup *pids, int num, struct pids_cgroup *root, struct pids_cgroup **fail)
{
struct pids_cgroup *p, *q;
+ int ret = 0;
- for (p = pids; parent_pids(p); p = parent_pids(p)) {
+ for (p = pids; parent_pids(p) && p != root; p = parent_pids(p)) {
int64_t new = atomic64_add_return(num, &p->counter);
int64_t limit = atomic64_read(&p->limit);
@@ -177,8 +159,11 @@ static int pids_try_charge(struct pids_cgroup *pids, int num, struct pids_cgroup
* fail.
*/
if (new > limit) {
- *fail = p;
- goto revert;
+ ret = -EAGAIN;
+ if (fail) {
+ *fail = p;
+ goto revert;
+ }
}
/*
* Not technically accurate if we go over limit somewhere up
@@ -187,14 +172,45 @@ static int pids_try_charge(struct pids_cgroup *pids, int num, struct pids_cgroup
pids_update_watermark(p, new);
}
- return 0;
+ return ret;
revert:
for (q = pids; q != p; q = parent_pids(q))
pids_cancel(q, num);
pids_cancel(p, num);
- return -EAGAIN;
+ return ret;
+}
+
+/**
+ * pids_tranfer_charge - charge/uncharge in subtree betwee src and dst
+ * @src: pid cgroup state to uncharge
+ * @dst: pid cgroup state to charge
+ * @num: the number of pids to transfer
+ *
+ * The function updates charged pids in subtree whose root is the closest
+ * common ancestor of @src and @dst. This root and its ancestors are not
+ * modified (their limits are not enacted).
+ *
+ * Returns 0 if no limit was hit, -EAGAIN if a limit on path [@dst, @comm) was
+ * hit (charges are transferred despite the limit).
+ */
+static int pids_tranfer_charge(struct pids_cgroup *src, struct pids_cgroup *dst, int num)
+{
+ struct pids_cgroup *p, *comm = src;
+ int ret;
+
+ /* for stable cgroup tree */
+ lockdep_assert_held(&cgroup_mutex);
+
+ while (!cgroup_is_descendant(dst->css.cgroup, comm->css.cgroup))
+ comm = parent_pids(comm);
+
+ ret = pids_try_charge(dst, num, comm, NULL);
+
+ for (p = src; p != comm; p = parent_pids(p))
+ pids_cancel(p, num);
+ return ret;
}
static int pids_can_attach(struct cgroup_taskset *tset)
@@ -215,8 +231,7 @@ static int pids_can_attach(struct cgroup_taskset *tset)
old_css = task_css(task, pids_cgrp_id);
old_pids = css_pids(old_css);
- pids_charge(pids, 1);
- pids_uncharge(old_pids, 1);
+ (void) pids_tranfer_charge(old_pids, pids, 1);
}
return 0;
@@ -235,8 +250,7 @@ static void pids_cancel_attach(struct cgroup_taskset *tset)
old_css = task_css(task, pids_cgrp_id);
old_pids = css_pids(old_css);
- pids_charge(old_pids, 1);
- pids_uncharge(pids, 1);
+ (void) pids_tranfer_charge(pids, old_pids, 1);
}
}
@@ -287,7 +301,7 @@ static int pids_can_fork(struct task_struct *task, struct css_set *cset)
else
css = task_css_check(current, pids_cgrp_id, true);
pids = css_pids(css);
- err = pids_try_charge(pids, 1, &pids_over_limit);
+ err = pids_try_charge(pids, 1, NULL, &pids_over_limit);
if (err)
pids_event(pids, pids_over_limit);
--
2.44.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [RFC PATCH v3 8/9] cgroup/pids: Enforce pids.max on task migrations
2024-04-05 17:05 [RFC PATCH v3 0/9] pids controller events rework and migration charging Michal Koutný
` (6 preceding siblings ...)
2024-04-05 17:05 ` [RFC PATCH v3 7/9] cgroup/pids: Replace uncharge/charge pair with a single function Michal Koutný
@ 2024-04-05 17:05 ` Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 9/9] selftests: cgroup: Add tests pids controller Michal Koutný
8 siblings, 0 replies; 21+ messages in thread
From: Michal Koutný @ 2024-04-05 17:05 UTC (permalink / raw)
To: cgroups, linux-doc, linux-kernel, linux-kselftest
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
While pids controller is designed with only forks in mind, it leads to
situations where limit is apparently ineffective.
A manager daemon is in /src and it spawns tasks into /dst. The
administrator sets up a limit dst/pids.max while src/pids.max is
unlimited. The manager daemon can spawn more than dst/pids.max tasks
because they get into their target cgroup via migration (or
CLONE_INTO_CGROUP).
For this (migration) to work both src and dst must be in the same
resource domain so the manager daemon does not honor the limit which is
under its control anyway and no excessive resource consumption happens.
dst/pids.current > dst/pids.max may come as a surprise when the
spawning mechanism is opaque to the administrator of dst/pids.max.
Change the behavior of pids controller to take into account limits of
target cgroup upon migration (but only below common ancestor src and
dst, pids.current of common ancestor and above is not affected by
migration, so deliberatly ignore pre-existing pids.current > pids.max).
This change of behavior is hidden behind cgroup2 mount option and
the default is unchanged, pids.max won't affect migrations.
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
Documentation/admin-guide/cgroup-v2.rst | 8 +++++++-
include/linux/cgroup-defs.h | 7 ++++++-
kernel/cgroup/cgroup.c | 16 +++++++++++++++-
kernel/cgroup/pids.c | 8 ++++++--
4 files changed, 34 insertions(+), 5 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 5d4c505cae06..d7e721aed584 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -239,6 +239,11 @@ cgroup v2 currently supports the following mount options.
will not be tracked by the memory controller (even if cgroup
v2 is remounted later on).
+ pids_miglimit
+ Apply pids.max limit also when migrating tasks between cgroups. Only
+ new destination limit are taken into account, i.e. if subtree has
+ pids.current > pids.max, migration within that subtree is allowed.
+
Organizing Processes and Threads
--------------------------------
@@ -2204,7 +2209,8 @@ Organisational operations are not blocked by cgroup policies, so it is
possible to have pids.current > pids.max. This can be done by either
setting the limit to be smaller than pids.current, or attaching enough
processes to the cgroup such that pids.current is larger than
-pids.max. However, it is not possible to violate a cgroup PID policy
+pids.max (unless pids_miglimit mount options is given).
+However, it is not possible to violate a cgroup PID policy
through fork() or clone(). These will return -EAGAIN if the creation
of a new process would cause a cgroup policy to be violated.
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index ea48c861cd36..a99db24b5496 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -119,7 +119,12 @@ enum {
/*
* Enable hugetlb accounting for the memory controller.
*/
- CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING = (1 << 19),
+ CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING = (1 << 19),
+
+ /*
+ * Enforce pids limit upon task migration
+ */
+ CGRP_ROOT_PIDS_MIGRATION_LIMIT = (1 << 20),
};
/* cftype->flags */
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index a66c088c851c..9aa6428c84c1 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1922,6 +1922,7 @@ enum cgroup2_param {
Opt_memory_localevents,
Opt_memory_recursiveprot,
Opt_memory_hugetlb_accounting,
+ Opt_pids_miglimit,
nr__cgroup2_params
};
@@ -1931,6 +1932,7 @@ static const struct fs_parameter_spec cgroup2_fs_parameters[] = {
fsparam_flag("memory_localevents", Opt_memory_localevents),
fsparam_flag("memory_recursiveprot", Opt_memory_recursiveprot),
fsparam_flag("memory_hugetlb_accounting", Opt_memory_hugetlb_accounting),
+ fsparam_flag("pids_miglimit", Opt_pids_miglimit),
{}
};
@@ -1960,6 +1962,9 @@ static int cgroup2_parse_param(struct fs_context *fc, struct fs_parameter *param
case Opt_memory_hugetlb_accounting:
ctx->flags |= CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING;
return 0;
+ case Opt_pids_miglimit:
+ ctx->flags |= CGRP_ROOT_PIDS_MIGRATION_LIMIT;
+ return 0;
}
return -EINVAL;
}
@@ -1989,6 +1994,12 @@ static void apply_cgroup_root_flags(unsigned int root_flags)
cgrp_dfl_root.flags |= CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING;
else
cgrp_dfl_root.flags &= ~CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING;
+
+ if (root_flags & CGRP_ROOT_PIDS_MIGRATION_LIMIT)
+ cgrp_dfl_root.flags |= CGRP_ROOT_PIDS_MIGRATION_LIMIT;
+ else
+ cgrp_dfl_root.flags &= ~CGRP_ROOT_PIDS_MIGRATION_LIMIT;
+
}
}
@@ -2004,6 +2015,8 @@ static int cgroup_show_options(struct seq_file *seq, struct kernfs_root *kf_root
seq_puts(seq, ",memory_recursiveprot");
if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING)
seq_puts(seq, ",memory_hugetlb_accounting");
+ if (cgrp_dfl_root.flags & CGRP_ROOT_PIDS_MIGRATION_LIMIT)
+ seq_puts(seq, ",pids_miglimit");
return 0;
}
@@ -7061,7 +7074,8 @@ static ssize_t features_show(struct kobject *kobj, struct kobj_attribute *attr,
"favordynmods\n"
"memory_localevents\n"
"memory_recursiveprot\n"
- "memory_hugetlb_accounting\n");
+ "memory_hugetlb_accounting\n"
+ "pids_miglimit\n");
}
static struct kobj_attribute cgroup_features_attr = __ATTR_RO(features);
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index 9df8a209a6e2..4683629b8168 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -217,6 +217,7 @@ static int pids_can_attach(struct cgroup_taskset *tset)
{
struct task_struct *task;
struct cgroup_subsys_state *dst_css;
+ int err, ret = 0;
cgroup_taskset_for_each(task, dst_css, tset) {
struct pids_cgroup *pids = css_pids(dst_css);
@@ -231,10 +232,13 @@ static int pids_can_attach(struct cgroup_taskset *tset)
old_css = task_css(task, pids_cgrp_id);
old_pids = css_pids(old_css);
- (void) pids_tranfer_charge(old_pids, pids, 1);
+ err = pids_tranfer_charge(old_pids, pids, 1);
+
+ if (!ret && (cgrp_dfl_root.flags & CGRP_ROOT_PIDS_MIGRATION_LIMIT))
+ ret = err;
}
- return 0;
+ return ret;
}
static void pids_cancel_attach(struct cgroup_taskset *tset)
--
2.44.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [RFC PATCH v3 9/9] selftests: cgroup: Add tests pids controller
2024-04-05 17:05 [RFC PATCH v3 0/9] pids controller events rework and migration charging Michal Koutný
` (7 preceding siblings ...)
2024-04-05 17:05 ` [RFC PATCH v3 8/9] cgroup/pids: Enforce pids.max on task migrations Michal Koutný
@ 2024-04-05 17:05 ` Michal Koutný
8 siblings, 0 replies; 21+ messages in thread
From: Michal Koutný @ 2024-04-05 17:05 UTC (permalink / raw)
To: cgroups, linux-doc, linux-kernel, linux-kselftest
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
This adds a couple of tests to check enforcing of limits in pids
controller upon migration. When the new option does not exist, the test
is skipped.
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
tools/testing/selftests/cgroup/test_pids.c | 117 ++++++++++++++++++++-
1 file changed, 116 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_pids.c b/tools/testing/selftests/cgroup/test_pids.c
index c1c3a3965624..a3ad5a495f59 100644
--- a/tools/testing/selftests/cgroup/test_pids.c
+++ b/tools/testing/selftests/cgroup/test_pids.c
@@ -12,6 +12,8 @@
#include "../kselftest.h"
#include "cgroup_util.h"
+static bool has_miglimit;
+
static int run_success(const char *cgroup, void *arg)
{
return 0;
@@ -69,6 +71,112 @@ static int test_pids_max(const char *root)
return ret;
}
+/*
+ * This test checks that pids.max prevents migrating tasks over limit into the
+ * cgroup.
+ */
+static int test_pids_max_migration(const char *root)
+{
+ int ret = KSFT_FAIL;
+ char *cg_pids;
+ int pid;
+
+ if (!has_miglimit)
+ return KSFT_SKIP;
+
+ cg_pids = cg_name(root, "pids_test");
+ if (!cg_pids)
+ goto cleanup;
+
+ if (cg_create(cg_pids))
+ goto cleanup;
+
+ if (cg_write(cg_pids, "pids.max", "1"))
+ goto cleanup;
+
+ pid = cg_run_nowait(cg_pids, run_pause, NULL);
+ if (pid < 0)
+ goto cleanup;
+
+ if (cg_enter_current(cg_pids) >= 0)
+ goto cleanup;
+
+ if (kill(pid, SIGINT))
+ goto cleanup;
+
+ ret = KSFT_PASS;
+
+cleanup:
+ cg_enter_current(root);
+ cg_destroy(cg_pids);
+ free(cg_pids);
+
+ return ret;
+}
+
+/*
+ * This test checks that pids.max does not prevent migrating existing tasks
+ * inside subtree.
+ */
+static int test_pids_max_overcommit(const char *root)
+{
+ int ret = KSFT_FAIL;
+ char *cg_parent = NULL, *cg_src = NULL, *cg_dst = NULL;
+ int pid;
+
+ if (!has_miglimit)
+ return KSFT_SKIP;
+
+ cg_parent = cg_name(root, "pids_test");
+ if (!cg_parent)
+ goto cleanup;
+ cg_src = cg_name(cg_parent, "src");
+ if (!cg_src)
+ goto cleanup;
+ cg_dst = cg_name(cg_parent, "dst");
+ if (!cg_dst)
+ goto cleanup;
+
+ if (cg_create(cg_parent))
+ goto cleanup;
+ if (cg_write(cg_parent, "cgroup.subtree_control", "+pids"))
+ goto cleanup;
+ if (cg_create(cg_src))
+ goto cleanup;
+ if (cg_create(cg_dst))
+ goto cleanup;
+
+ if (cg_enter_current(cg_src) < 0)
+ goto cleanup;
+
+ pid = cg_run_nowait(cg_src, run_pause, NULL);
+ if (pid < 0)
+ goto cleanup;
+
+ if (cg_write(cg_parent, "pids.max", "1"))
+ goto cleanup;
+
+ if (cg_enter(cg_dst, pid) < 0)
+ goto cleanup;
+
+ if (kill(pid, SIGINT))
+ goto cleanup;
+
+ ret = KSFT_PASS;
+
+cleanup:
+ cg_enter_current(root);
+ cg_destroy(cg_dst);
+ cg_destroy(cg_src);
+ cg_destroy(cg_parent);
+ free(cg_dst);
+ free(cg_src);
+ free(cg_parent);
+
+ return ret;
+}
+
+
/*
* This test checks that pids.max prevents forking new children above the
* specified limit in the cgroup.
@@ -145,6 +253,8 @@ struct pids_test {
const char *name;
} tests[] = {
T(test_pids_max),
+ T(test_pids_max_migration),
+ T(test_pids_max_overcommit),
T(test_pids_events),
};
#undef T
@@ -152,7 +262,7 @@ struct pids_test {
int main(int argc, char **argv)
{
char root[PATH_MAX];
- int i, ret = EXIT_SUCCESS;
+ int i, proc_status, ret = EXIT_SUCCESS;
if (cg_find_unified_root(root, sizeof(root)))
ksft_exit_skip("cgroup v2 isn't mounted\n");
@@ -168,6 +278,11 @@ int main(int argc, char **argv)
if (cg_write(root, "cgroup.subtree_control", "+pids"))
ksft_exit_skip("Failed to set pids controller\n");
+ proc_status = proc_mount_contains("pids_miglimit");
+ if (proc_status < 0)
+ ksft_exit_skip("Failed to query cgroup mount option\n");
+ has_miglimit = proc_status;
+
for (i = 0; i < ARRAY_SIZE(tests); i++) {
switch (tests[i].fn(root)) {
case KSFT_PASS:
--
2.44.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller
2024-04-05 17:05 ` [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller Michal Koutný
@ 2024-04-06 21:37 ` Muhammad Usama Anjum
2024-04-08 11:29 ` Michal Koutný
0 siblings, 1 reply; 21+ messages in thread
From: Muhammad Usama Anjum @ 2024-04-06 21:37 UTC (permalink / raw)
To: Michal Koutný, cgroups, linux-doc, linux-kernel,
linux-kselftest
Cc: Muhammad Usama Anjum, Tejun Heo, Zefan Li, Johannes Weiner,
Jonathan Corbet, Shuah Khan
On 4/5/24 10:05 PM, Michal Koutný wrote:
> This commit adds (and wires in) new test program for checking basic pids
> controller functionality -- restricting tasks in a cgroup and correct
> event counting.
>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
> tools/testing/selftests/cgroup/Makefile | 2 +
> tools/testing/selftests/cgroup/test_pids.c | 187 +++++++++++++++++++++
Please create/add test_pid to .gitignore file.
> 2 files changed, 189 insertions(+)
> create mode 100644 tools/testing/selftests/cgroup/test_pids.c
>
> diff --git a/tools/testing/selftests/cgroup/Makefile b/tools/testing/selftests/cgroup/Makefile
> index f3e1ef69e88d..f5f0886a2c4a 100644
> --- a/tools/testing/selftests/cgroup/Makefile
> +++ b/tools/testing/selftests/cgroup/Makefile
> @@ -15,6 +15,7 @@ TEST_GEN_PROGS += test_hugetlb_memcg
> TEST_GEN_PROGS += test_kill
> TEST_GEN_PROGS += test_kmem
> TEST_GEN_PROGS += test_memcontrol
> +TEST_GEN_PROGS += test_pids
> TEST_GEN_PROGS += test_zswap
>
> LOCAL_HDRS += $(selfdir)/clone3/clone3_selftests.h $(selfdir)/pidfd/pidfd.h
> @@ -29,4 +30,5 @@ $(OUTPUT)/test_hugetlb_memcg: cgroup_util.c
> $(OUTPUT)/test_kill: cgroup_util.c
> $(OUTPUT)/test_kmem: cgroup_util.c
> $(OUTPUT)/test_memcontrol: cgroup_util.c
> +$(OUTPUT)/test_pids: cgroup_util.c
> $(OUTPUT)/test_zswap: cgroup_util.c
> diff --git a/tools/testing/selftests/cgroup/test_pids.c b/tools/testing/selftests/cgroup/test_pids.c
> new file mode 100644
> index 000000000000..c1c3a3965624
> --- /dev/null
> +++ b/tools/testing/selftests/cgroup/test_pids.c
> @@ -0,0 +1,187 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#define _GNU_SOURCE
> +
> +#include <errno.h>
> +#include <linux/limits.h>
> +#include <signal.h>
> +#include <string.h>
> +#include <sys/stat.h>
> +#include <sys/types.h>
> +#include <unistd.h>
> +
> +#include "../kselftest.h"
> +#include "cgroup_util.h"
> +
> +static int run_success(const char *cgroup, void *arg)
> +{
> + return 0;
> +}
> +
> +static int run_pause(const char *cgroup, void *arg)
> +{
> + return pause();
> +}
> +
> +/*
> + * This test checks that pids.max prevents forking new children above the
> + * specified limit in the cgroup.
> + */
> +static int test_pids_max(const char *root)
> +{
> + int ret = KSFT_FAIL;
> + char *cg_pids;
> + int pid;
> +
> +
Please remove extra line.
> + cg_pids = cg_name(root, "pids_test");
> + if (!cg_pids)
> + goto cleanup;
> +
> + if (cg_create(cg_pids))
> + goto cleanup;
> +
> + if (cg_read_strcmp(cg_pids, "pids.max", "max\n"))
> + goto cleanup;
> +
> + if (cg_write(cg_pids, "pids.max", "2"))
> + goto cleanup;
> +
> + if (cg_enter_current(cg_pids))
> + goto cleanup;
> +
> + pid = cg_run_nowait(cg_pids, run_pause, NULL);
> + if (pid < 0)
> + goto cleanup;
> +
> + if (cg_run_nowait(cg_pids, run_success, NULL) != -1 || errno != EAGAIN)
> + goto cleanup;
> +
> + if (kill(pid, SIGINT))
> + goto cleanup;
> +
> + ret = KSFT_PASS;
> +
> +cleanup:
> + cg_enter_current(root);
> + cg_destroy(cg_pids);
> + free(cg_pids);
> +
> + return ret;
> +}
> +
> +/*
> + * This test checks that pids.max prevents forking new children above the
> + * specified limit in the cgroup.
> + */
> +static int test_pids_events(const char *root)
> +{
> + int ret = KSFT_FAIL;
> + char *cg_parent = NULL, *cg_child = NULL;
> + int pid;
> +
> +
> + cg_parent = cg_name(root, "pids_parent");
> + cg_child = cg_name(cg_parent, "pids_child");
> + if (!cg_parent || !cg_child)
> + goto cleanup;
> +
> + if (cg_create(cg_parent))
> + goto cleanup;
> + if (cg_write(cg_parent, "cgroup.subtree_control", "+pids"))
> + goto cleanup;
> + if (cg_create(cg_child))
> + goto cleanup;
> +
> + if (cg_write(cg_parent, "pids.max", "2"))
> + goto cleanup;
> +
> + if (cg_read_strcmp(cg_child, "pids.max", "max\n"))
> + goto cleanup;
> +
> + if (cg_enter_current(cg_child))
> + goto cleanup;
> +
> + pid = cg_run_nowait(cg_child, run_pause, NULL);
> + if (pid < 0)
> + goto cleanup;
> +
> + if (cg_run_nowait(cg_child, run_success, NULL) != -1 || errno != EAGAIN)
> + goto cleanup;
> +
> + if (kill(pid, SIGINT))
> + goto cleanup;
> +
> +
Remove extra line.
> + if (cg_read_key_long(cg_child, "pids.events", "max ") != 0)
> + goto cleanup;
> + if (cg_read_key_long(cg_child, "pids.events", "max.imposed ") != 1)
> + goto cleanup;
> +
> + if (cg_read_key_long(cg_parent, "pids.events", "max ") != 1)
> + goto cleanup;
> + if (cg_read_key_long(cg_parent, "pids.events", "max.imposed ") != 1)
> + goto cleanup;
> +
> +
> + ret = KSFT_PASS;
> +
> +cleanup:
> + cg_enter_current(root);
> + if (cg_child)
> + cg_destroy(cg_child);
> + if (cg_parent)
> + cg_destroy(cg_parent);
> + free(cg_child);
> + free(cg_parent);
> +
> + return ret;
> +}
> +
> +
> +
> +#define T(x) { x, #x }
> +struct pids_test {
> + int (*fn)(const char *root);
> + const char *name;
> +} tests[] = {
> + T(test_pids_max),
> + T(test_pids_events),
> +};
> +#undef T
> +
> +int main(int argc, char **argv)
> +{
> + char root[PATH_MAX];
> + int i, ret = EXIT_SUCCESS;
The
ksft_print_header();
ksft_set_plan(total_number_of_tests);
are missing. Please use all of the ksft APIs to make the test TAP compliant.
> +
> + if (cg_find_unified_root(root, sizeof(root)))
> + ksft_exit_skip("cgroup v2 isn't mounted\n");
> +
> + /*
> + * Check that pids controller is available:
> + * pids is listed in cgroup.controllers
> + */
> + if (cg_read_strstr(root, "cgroup.controllers", "pids"))
> + ksft_exit_skip("pids controller isn't available\n");
> +
> + if (cg_read_strstr(root, "cgroup.subtree_control", "pids"))
> + if (cg_write(root, "cgroup.subtree_control", "+pids"))
> + ksft_exit_skip("Failed to set pids controller\n");
> +
> + for (i = 0; i < ARRAY_SIZE(tests); i++) {
> + switch (tests[i].fn(root)) {
> + case KSFT_PASS:
> + ksft_test_result_pass("%s\n", tests[i].name);
> + break;
> + case KSFT_SKIP:
> + ksft_test_result_skip("%s\n", tests[i].name);
> + break;
> + default:
> + ret = EXIT_FAILURE;
> + ksft_test_result_fail("%s\n", tests[i].name);
> + break;
Use ksft_test_result_report() instead of swith-case here.
> + }
> + }
> +
> + return ret;
> +}
--
BR,
Muhammad Usama Anjum
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Re: [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller
2024-04-06 21:37 ` Muhammad Usama Anjum
@ 2024-04-08 11:29 ` Michal Koutný
2024-04-08 11:53 ` Muhammad Usama Anjum
0 siblings, 1 reply; 21+ messages in thread
From: Michal Koutný @ 2024-04-08 11:29 UTC (permalink / raw)
To: Muhammad Usama Anjum
Cc: cgroups, linux-doc, linux-kernel, linux-kselftest, Tejun Heo,
Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
[-- Attachment #1: Type: text/plain, Size: 938 bytes --]
On Sun, Apr 07, 2024 at 02:37:44AM +0500, Muhammad Usama Anjum <usama.anjum@collabora.com> wrote:
> The
> ksft_print_header();
> ksft_set_plan(total_number_of_tests);
> are missing. Please use all of the ksft APIs to make the test TAP compliant.
Will do.
> > + for (i = 0; i < ARRAY_SIZE(tests); i++) {
> > + switch (tests[i].fn(root)) {
> > + case KSFT_PASS:
> > + ksft_test_result_pass("%s\n", tests[i].name);
> > + break;
> > + case KSFT_SKIP:
> > + ksft_test_result_skip("%s\n", tests[i].name);
> > + break;
> > + default:
> > + ret = EXIT_FAILURE;
> > + ksft_test_result_fail("%s\n", tests[i].name);
> > + break;
> Use ksft_test_result_report() instead of swith-case here.
Do you mean ksft_test_result()? That one cannot distinguish the
KSFT_SKIP case.
Or ksft_test_result_code(tests[i].fn(root), tests[i].name)?
Would the existing ksft_test_resul_*() calls inside switch-case still
TAP-work?
Thanks,
Michal
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller
2024-04-08 11:29 ` Michal Koutný
@ 2024-04-08 11:53 ` Muhammad Usama Anjum
2024-04-08 12:01 ` Michal Koutný
0 siblings, 1 reply; 21+ messages in thread
From: Muhammad Usama Anjum @ 2024-04-08 11:53 UTC (permalink / raw)
To: Michal Koutný
Cc: Muhammad Usama Anjum, cgroups, linux-doc, linux-kernel,
linux-kselftest, Tejun Heo, Zefan Li, Johannes Weiner,
Jonathan Corbet, Shuah Khan
On 4/8/24 4:29 PM, Michal Koutný wrote:
> On Sun, Apr 07, 2024 at 02:37:44AM +0500, Muhammad Usama Anjum <usama.anjum@collabora.com> wrote:
>> The
>> ksft_print_header();
>> ksft_set_plan(total_number_of_tests);
>> are missing. Please use all of the ksft APIs to make the test TAP compliant.
>
> Will do.
>
>>> + for (i = 0; i < ARRAY_SIZE(tests); i++) {
>>> + switch (tests[i].fn(root)) {
>>> + case KSFT_PASS:
>>> + ksft_test_result_pass("%s\n", tests[i].name);
>>> + break;
>>> + case KSFT_SKIP:
>>> + ksft_test_result_skip("%s\n", tests[i].name);
>>> + break;
>>> + default:
>>> + ret = EXIT_FAILURE;
>>> + ksft_test_result_fail("%s\n", tests[i].name);
>>> + break;
>> Use ksft_test_result_report() instead of swith-case here.
>
> Do you mean ksft_test_result()? That one cannot distinguish the
> KSFT_SKIP case.
> Or ksft_test_result_code(tests[i].fn(root), tests[i].name)?
No, this doesn't seem useful here.
>
> Would the existing ksft_test_resul_*() calls inside switch-case still
> TAP-work?
This part of your switch-case are correct. It just that by using
ksft_test_result_report you can achieve the same thing. It has has SKIP
support.
ksft_test_result_report(tests[i].fn(root), tests[i].name)
>
> Thanks,
> Michal
--
BR,
Muhammad Usama Anjum
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Re: [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller
2024-04-08 11:53 ` Muhammad Usama Anjum
@ 2024-04-08 12:01 ` Michal Koutný
2024-04-08 12:04 ` Muhammad Usama Anjum
0 siblings, 1 reply; 21+ messages in thread
From: Michal Koutný @ 2024-04-08 12:01 UTC (permalink / raw)
To: Muhammad Usama Anjum
Cc: cgroups, linux-doc, linux-kernel, linux-kselftest, Tejun Heo,
Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
[-- Attachment #1: Type: text/plain, Size: 302 bytes --]
On Mon, Apr 08, 2024 at 04:53:11PM +0500, Muhammad Usama Anjum <usama.anjum@collabora.com> wrote:
> ksft_test_result_report(tests[i].fn(root), tests[i].name)
$ git grep ksft_test_result_report v6.9-rc3 --
(empty result)
I can't find that helper. Is that in some devel repositories?
Michal
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller
2024-04-08 12:01 ` Michal Koutný
@ 2024-04-08 12:04 ` Muhammad Usama Anjum
2024-04-09 0:12 ` Waiman Long
0 siblings, 1 reply; 21+ messages in thread
From: Muhammad Usama Anjum @ 2024-04-08 12:04 UTC (permalink / raw)
To: Michal Koutný
Cc: Muhammad Usama Anjum, cgroups, linux-doc, linux-kernel,
linux-kselftest, Tejun Heo, Zefan Li, Johannes Weiner,
Jonathan Corbet, Shuah Khan
On 4/8/24 5:01 PM, Michal Koutný wrote:
> On Mon, Apr 08, 2024 at 04:53:11PM +0500, Muhammad Usama Anjum <usama.anjum@collabora.com> wrote:
>> ksft_test_result_report(tests[i].fn(root), tests[i].name)
>
> $ git grep ksft_test_result_report v6.9-rc3 --
> (empty result)
>
> I can't find that helper. Is that in some devel repositories?
Sorry, I always do development on next. So it has been added recently. Try
searching it on next:
git grep ksft_test_result_report next-20240404 --
>
> Michal
--
BR,
Muhammad Usama Anjum
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v3 2/9] cgroup/pids: Separate semantics of pids.events related to pids.max
2024-04-05 17:05 ` [RFC PATCH v3 2/9] cgroup/pids: Separate semantics of pids.events related to pids.max Michal Koutný
@ 2024-04-08 17:55 ` Tejun Heo
2024-04-09 16:02 ` Johannes Weiner
2024-04-12 14:23 ` Michal Koutný
0 siblings, 2 replies; 21+ messages in thread
From: Tejun Heo @ 2024-04-08 17:55 UTC (permalink / raw)
To: Michal Koutný
Cc: cgroups, linux-doc, linux-kernel, linux-kselftest, Zefan Li,
Johannes Weiner, Jonathan Corbet, Shuah Khan
Hello,
On Fri, Apr 05, 2024 at 07:05:41PM +0200, Michal Koutný wrote:
> Currently, when pids.max limit is breached in the hierarchy, the event
> is counted and reported in the cgroup where the forking task resides.
>
> This decouples the limit and the notification caused by the limit making
> it hard to detect when the actual limit was effected.
>
> Let's introduce new events:
> max
> The number of times the limit of the cgroup was hit.
>
> max.imposed
> The number of times fork failed in the cgroup because of self
> or ancestor limit.
The whole series make sense to me. I'm not sure about max.imposed field
name. Maybe a name which clearly signfies rejection of forks would be
clearer? Johannes, what do you think?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller
2024-04-08 12:04 ` Muhammad Usama Anjum
@ 2024-04-09 0:12 ` Waiman Long
2024-04-09 13:00 ` Muhammad Usama Anjum
0 siblings, 1 reply; 21+ messages in thread
From: Waiman Long @ 2024-04-09 0:12 UTC (permalink / raw)
To: Muhammad Usama Anjum, Michal Koutný
Cc: cgroups, linux-doc, linux-kernel, linux-kselftest, Tejun Heo,
Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
On 4/8/24 08:04, Muhammad Usama Anjum wrote:
> On 4/8/24 5:01 PM, Michal Koutný wrote:
>> On Mon, Apr 08, 2024 at 04:53:11PM +0500, Muhammad Usama Anjum <usama.anjum@collabora.com> wrote:
>>> ksft_test_result_report(tests[i].fn(root), tests[i].name)
>> $ git grep ksft_test_result_report v6.9-rc3 --
>> (empty result)
>>
>> I can't find that helper. Is that in some devel repositories?
> Sorry, I always do development on next. So it has been added recently. Try
> searching it on next:
>
> git grep ksft_test_result_report next-20240404 --
I don't believe it is a good idea to make this patch having a dependency
on another set of patches in -next because the test won't run in a
non-next environment. We can always have additional patches later on to
modify the tests to use the newly available APIs.
Cheers,
Longman
>
>> Michal
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller
2024-04-09 0:12 ` Waiman Long
@ 2024-04-09 13:00 ` Muhammad Usama Anjum
0 siblings, 0 replies; 21+ messages in thread
From: Muhammad Usama Anjum @ 2024-04-09 13:00 UTC (permalink / raw)
To: Waiman Long, Michal Koutný
Cc: Muhammad Usama Anjum, cgroups, linux-doc, linux-kernel,
linux-kselftest, Tejun Heo, Zefan Li, Johannes Weiner,
Jonathan Corbet, Shuah Khan
On 4/9/24 5:12 AM, Waiman Long wrote:
>
> On 4/8/24 08:04, Muhammad Usama Anjum wrote:
>> On 4/8/24 5:01 PM, Michal Koutný wrote:
>>> On Mon, Apr 08, 2024 at 04:53:11PM +0500, Muhammad Usama Anjum
>>> <usama.anjum@collabora.com> wrote:
>>>> ksft_test_result_report(tests[i].fn(root), tests[i].name)
>>> $ git grep ksft_test_result_report v6.9-rc3 --
>>> (empty result)
>>>
>>> I can't find that helper. Is that in some devel repositories?
>> Sorry, I always do development on next. So it has been added recently. Try
>> searching it on next:
>>
>> git grep ksft_test_result_report next-20240404 --
>
> I don't believe it is a good idea to make this patch having a dependency on
> another set of patches in -next because the test won't run in a non-next
> environment. We can always have additional patches later on to modify the
> tests to use the newly available APIs.
Sure, it is okay with me.
>
> Cheers,
> Longman
>
>>
>>> Michal
>
>
--
BR,
Muhammad Usama Anjum
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v3 2/9] cgroup/pids: Separate semantics of pids.events related to pids.max
2024-04-08 17:55 ` Tejun Heo
@ 2024-04-09 16:02 ` Johannes Weiner
2024-04-12 14:23 ` Michal Koutný
1 sibling, 0 replies; 21+ messages in thread
From: Johannes Weiner @ 2024-04-09 16:02 UTC (permalink / raw)
To: Tejun Heo
Cc: Michal Koutný, cgroups, linux-doc, linux-kernel,
linux-kselftest, Zefan Li, Jonathan Corbet, Shuah Khan
On Mon, Apr 08, 2024 at 07:55:38AM -1000, Tejun Heo wrote:
> Hello,
>
> On Fri, Apr 05, 2024 at 07:05:41PM +0200, Michal Koutný wrote:
> > Currently, when pids.max limit is breached in the hierarchy, the event
> > is counted and reported in the cgroup where the forking task resides.
> >
> > This decouples the limit and the notification caused by the limit making
> > it hard to detect when the actual limit was effected.
> >
> > Let's introduce new events:
> > max
> > The number of times the limit of the cgroup was hit.
> >
> > max.imposed
> > The number of times fork failed in the cgroup because of self
> > or ancestor limit.
>
> The whole series make sense to me. I'm not sure about max.imposed field
> name. Maybe a name which clearly signfies rejection of forks would be
> clearer? Johannes, what do you think?
The max event at the level where the limit is set (and up, for
hierarchical accounting) makes sense to me.
max.imposed is conceptually not entirely unprecedented, but something
we've tried to avoid. Usually the idea is that events correspond to
specific cgroup limitations at that level. Failures due to constraints
higher up could be from anything, including system-level shortages.
IOW, events are supposed to be more about "how many times did this
limit here trigger", and less about "how many times did something
happen to the tasks local to this group".
It's a bit arbitrary and not perfectly followed everywhere, but I
think there is value in trying to maintain that distinction, so that
somebody looking at those files doesn't have to rack their brains or
look up every counter in the docs to figure out what it's tracking.
It's at least true for the misc controller, and for most of memcg -
with the weird exception of the swap.max events which we've tried to
fix before...
For "things that are happening to the tasks in this group", would it
make more sense to have an e.g. pids.stat::forkfail instead?
(Or just not have that event at all? I'm not sure if it's actually
needed or whether you kept it only to maintain some form of the
information that is currently provided by the pr_info()).
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Re: [RFC PATCH v3 2/9] cgroup/pids: Separate semantics of pids.events related to pids.max
2024-04-08 17:55 ` Tejun Heo
2024-04-09 16:02 ` Johannes Weiner
@ 2024-04-12 14:23 ` Michal Koutný
2024-04-12 17:04 ` Tejun Heo
1 sibling, 1 reply; 21+ messages in thread
From: Michal Koutný @ 2024-04-12 14:23 UTC (permalink / raw)
To: Tejun Heo
Cc: cgroups, linux-doc, linux-kernel, linux-kselftest, Zefan Li,
Johannes Weiner, Jonathan Corbet, Shuah Khan
[-- Attachment #1: Type: text/plain, Size: 220 bytes --]
On Mon, Apr 08, 2024 at 07:55:38AM -1000, Tejun Heo <tj@kernel.org> wrote:
> The whole series make sense to me.
Including the migration charging?
(Asking whether I should keep it stacked in v4 posting.)
Thanks,
Michal
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Re: [RFC PATCH v3 2/9] cgroup/pids: Separate semantics of pids.events related to pids.max
2024-04-12 14:23 ` Michal Koutný
@ 2024-04-12 17:04 ` Tejun Heo
0 siblings, 0 replies; 21+ messages in thread
From: Tejun Heo @ 2024-04-12 17:04 UTC (permalink / raw)
To: Michal Koutný
Cc: cgroups, linux-doc, linux-kernel, linux-kselftest, Zefan Li,
Johannes Weiner, Jonathan Corbet, Shuah Khan
On Fri, Apr 12, 2024 at 04:23:24PM +0200, Michal Koutný wrote:
> On Mon, Apr 08, 2024 at 07:55:38AM -1000, Tejun Heo <tj@kernel.org> wrote:
> > The whole series make sense to me.
>
> Including the migration charging?
> (Asking whether I should keep it stacked in v4 posting.)
Oh, let's separate that part out. I'm not sure about that. The problem with
can_attach failures is that they're really opaque and the more we do it the
less we'll be able to tell where the failures are coming from, so I'm not
very enthusiastic about them.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2024-04-12 17:04 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-05 17:05 [RFC PATCH v3 0/9] pids controller events rework and migration charging Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 1/9] cgroup/pids: Remove superfluous zeroing Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 2/9] cgroup/pids: Separate semantics of pids.events related to pids.max Michal Koutný
2024-04-08 17:55 ` Tejun Heo
2024-04-09 16:02 ` Johannes Weiner
2024-04-12 14:23 ` Michal Koutný
2024-04-12 17:04 ` Tejun Heo
2024-04-05 17:05 ` [RFC PATCH v3 3/9] cgroup/pids: Make event counters hierarchical Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 4/9] cgroup/pids: Add pids.events.local Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 5/9] selftests: cgroup: Lexicographic order in Makefile Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 6/9] selftests: cgroup: Add basic tests for pids controller Michal Koutný
2024-04-06 21:37 ` Muhammad Usama Anjum
2024-04-08 11:29 ` Michal Koutný
2024-04-08 11:53 ` Muhammad Usama Anjum
2024-04-08 12:01 ` Michal Koutný
2024-04-08 12:04 ` Muhammad Usama Anjum
2024-04-09 0:12 ` Waiman Long
2024-04-09 13:00 ` Muhammad Usama Anjum
2024-04-05 17:05 ` [RFC PATCH v3 7/9] cgroup/pids: Replace uncharge/charge pair with a single function Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 8/9] cgroup/pids: Enforce pids.max on task migrations Michal Koutný
2024-04-05 17:05 ` [RFC PATCH v3 9/9] selftests: cgroup: Add tests pids controller Michal Koutný
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).