From: "Michal Koutný" <mkoutny@suse.com>
To: cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>, Zefan Li <lizefan.x@bytedance.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Jonathan Corbet <corbet@lwn.net>, Shuah Khan <shuah@kernel.org>,
Muhammad Usama Anjum <usama.anjum@collabora.com>
Subject: [PATCH v5 2/5] cgroup/pids: Make event counters hierarchical
Date: Tue, 21 May 2024 11:21:27 +0200
Message-ID: <20240521092130.7883-3-mkoutny@suse.com>
In-Reply-To: <20240521092130.7883-1-mkoutny@suse.com>
The pids.events file should honor the hierarchy, so make the events
propagate from their origin up to the root on the unified hierarchy. The
legacy behavior remains non-hierarchical.
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
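For readers who want to check the propagation rule outside the kernel, here
is a minimal userspace sketch of the walk that pids_event() in the diff
below performs. It is a simplified model, not the kernel code: struct
pids_node, model_pids_event() and the local_only flag are hypothetical names
invented for this illustration, plain integers stand in for the atomic64_t
counters, and logging and cgroup_file_notify() are left out. The local_only
flag stands in for the "legacy hierarchy or pids_localevents" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pids_cgroup (illustration only). */
struct pids_node {
	struct pids_node *parent;	/* NULL for the root */
	uint64_t forkfail;		/* models events[PIDCG_FORKFAIL] */
	uint64_t max;			/* models events[PIDCG_MAX] */
};

/*
 * Walk from the forking cgroup towards the root, stopping before the
 * root itself, as the loop in the patch does.
 *
 * forking:    cgroup in which fork() was rejected
 * over_limit: forking itself or the ancestor whose pids.max was hit
 * local_only: assumption, models "v1 hierarchy or pids_localevents"
 */
void model_pids_event(struct pids_node *forking,
		      struct pids_node *over_limit, bool local_only)
{
	bool limit = false;

	assert(forking != NULL && over_limit != NULL);

	for (struct pids_node *p = forking; p->parent; p = p->parent) {
		p->forkfail++;

		/* Non-hierarchical mode: count the origin only. */
		if (local_only)
			break;

		/* From the over-limit cgroup upwards, count a max event. */
		if (p == over_limit)
			limit = true;
		if (limit)
			p->max++;
	}
}
```

With a chain root <- a <- b <- c, a rejected fork in c caused by b's limit
gives, in hierarchical mode, forkfail = 1 in c, b and a, and max = 1 in b
and a only. The root is never touched.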
Documentation/admin-guide/cgroup-v2.rst | 9 +++--
kernel/cgroup/pids.c | 46 ++++++++++++++++---------
2 files changed, 36 insertions(+), 19 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 945ff743a3c9..0b5f77104e8b 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -240,8 +240,11 @@ cgroup v2 currently supports the following mount options.
v2 is remounted later on).
pids_localevents
- Represent fork failures inside cgroup's pids.events:max (v1 behavior),
- not its limit being hit (v2 behavior).
+ The option restores v1-like behavior of pids.events:max, that is only
+ local (inside cgroup proper) fork failures are counted. Without this
+ option pids.events:max represents any pids.max enforcement across
+ cgroup's subtree.
+
Organizing Processes and Threads
@@ -2205,7 +2208,7 @@ PID Interface Files
modified event. The following entries are defined.
max
- The number of times the cgroup's number of processes hit the
+ The number of times the cgroup's total number of processes hit the pids.max
limit (see also pids_localevents).
Organisational operations are not blocked by cgroup policies, so it is
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index a557f5c8300b..c09b744d548c 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -238,6 +238,34 @@ static void pids_cancel_attach(struct cgroup_taskset *tset)
}
}
+static void pids_event(struct pids_cgroup *pids_forking,
+ struct pids_cgroup *pids_over_limit)
+{
+ struct pids_cgroup *p = pids_forking;
+ bool limit = false;
+
+ for (; parent_pids(p); p = parent_pids(p)) {
+ /* Only log the first time limit is hit. */
+ if (atomic64_inc_return(&p->events[PIDCG_FORKFAIL]) == 1) {
+ pr_info("cgroup: fork rejected by pids controller in ");
+ pr_cont_cgroup_path(p->css.cgroup);
+ pr_cont("\n");
+ }
+ cgroup_file_notify(&p->events_file);
+
+ if (!cgroup_subsys_on_dfl(pids_cgrp_subsys) ||
+ cgrp_dfl_root.flags & CGRP_ROOT_PIDS_LOCAL_EVENTS)
+ break;
+
+ if (p == pids_over_limit)
+ limit = true;
+ if (limit)
+ atomic64_inc(&p->events[PIDCG_MAX]);
+
+ cgroup_file_notify(&p->events_file);
+ }
+}
+
/*
* task_css_check(true) in pids_can_fork() and pids_cancel_fork() relies
* on cgroup_threadgroup_change_begin() held by the copy_process().
@@ -254,23 +282,9 @@ static int pids_can_fork(struct task_struct *task, struct css_set *cset)
css = task_css_check(current, pids_cgrp_id, true);
pids = css_pids(css);
err = pids_try_charge(pids, 1, &pids_over_limit);
- if (err) {
- /* compatibility on v1 where events were notified in leaves. */
- if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
- pids_over_limit = pids;
-
- /* Only log the first time limit is hit. */
- if (atomic64_inc_return(&pids->events[PIDCG_FORKFAIL]) == 1) {
- pr_info("cgroup: fork rejected by pids controller in ");
- pr_cont_cgroup_path(pids->css.cgroup);
- pr_cont("\n");
- }
- atomic64_inc(&pids_over_limit->events[PIDCG_MAX]);
+ if (err)
+ pids_event(pids, pids_over_limit);
- cgroup_file_notify(&pids->events_file);
- if (pids_over_limit != pids)
- cgroup_file_notify(&pids_over_limit->events_file);
- }
return err;
}
--
2.44.0