From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Aleksa Sarai <cyphar-gVpy/LI/lHzQT0dZR+AlfA@public.gmane.org>
Cc: lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org,
mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org,
richard-/L3Ra7n9ekc@public.gmane.org,
fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH v10 4/4] cgroups: implement the PIDs subsystem
Date: Wed, 22 Apr 2015 12:29:54 -0400 [thread overview]
Message-ID: <20150422162954.GF10738@htj.duckdns.org> (raw)
In-Reply-To: <1429446154-10660-5-git-send-email-cyphar-gVpy/LI/lHzQT0dZR+AlfA@public.gmane.org>
> @@ -0,0 +1,368 @@
> +/*
> + * Process number limiting controller for cgroups.
> + *
> + * Used to allow a cgroup hierarchy to stop any new processes
> + * from fork()ing after a certain limit is reached.
> + *
> + * Since it is trivial to hit the task limit without hitting
> + * any kmemcg limits in place, PIDs are a fundamental resource.
> + * As such, PID exhaustion must be preventable in the scope of
> + * a cgroup hierarchy by allowing resource limiting of the
> + * number of tasks in a cgroup.
> + *
> + * In order to use the `pids` controller, set the maximum number
> + * of tasks in pids.max (this is not available in the root cgroup
> + * for obvious reasons). The number of processes currently
> + * in the cgroup is given by pids.current. Organisational operations
> + * are not blocked by cgroup policies, so it is possible to have
> + * pids.current > pids.max. However, fork()s will still not work.
> + *
> + * To set a cgroup to have no limit, set pids.max to "max". fork()
> + * will return -EBUSY if forking would cause a cgroup policy to be
> + * violated.
> + *
> + * pids.current tracks all child cgroup hierarchies, so
> + * parent/pids.current is a superset of parent/child/pids.current.
> + *
> + * Copyright (C) 2015 Aleksa Sarai <cyphar-gVpy/LI/lHzQT0dZR+AlfA@public.gmane.org>
The above text looks wrapped too narrow.
> +struct pids_cgroup {
> + struct cgroup_subsys_state css;
> +
> + /*
> + * Use 64-bit types so that we can safely represent "max" as
> + * (PID_MAX_LIMIT + 1).
^^^^^^^^^^^^^^^^^
...
> +static struct cgroup_subsys_state *
> +pids_css_alloc(struct cgroup_subsys_state *parent)
> +{
> + struct pids_cgroup *pids;
> +
> + pids = kzalloc(sizeof(struct pids_cgroup), GFP_KERNEL);
> + if (!pids)
> + return ERR_PTR(-ENOMEM);
> +
> + pids->limit = PIDS_MAX;
^^^^^^^^^
> + atomic64_set(&pids->counter, 0);
> + return &pids->css;
> +}
...
> +static void pids_detach(struct cgroup_subsys_state *old_css,
> + struct task_struct *task)
> +{
> + struct pids_cgroup *old_pids = css_pids(old_css);
> +
> + pids_uncharge(old_pids, 1);
> +}
You can do the above as a part of can/cancel.
> +static int pids_can_fork(struct task_struct *task, void **private)
Maybe @priv_p or something which signifies it's of different type from
others?
> +{
...
> + rcu_read_lock();
> + css = task_css(current, pids_cgrp_id);
> + if (!css_tryget_online(css)) {
> + retval = -EBUSY;
> + goto err_rcu_unlock;
> + }
> + rcu_read_unlock();
Hmmm... so, the above is guaranteed to succeed in finite amount of
time (the race window is actually very narrow) and it'd be silly to
fail fork because a task was being moved across cgroups.
I think it'd be a good idea to implement task_get_css() which loops
and returns the current css for the requested subsystem with reference
count bumped and it can use css_tryget() too. Holding a ref doesn't
prevent css from dying anyway, so it doesn't make any difference.
> +static void pids_fork(struct task_struct *task, void *private)
> +{
...
> + rcu_read_lock();
> + css = task_css(task, pids_cgrp_id);
> + css_get(css);
Why is this safe? What guarantees that css's ref isn't already zero
at this point?
> + rcu_read_unlock();
> +
> + pids = css_pids(css);
> +
> + /*
> + * The association has changed, we have to revert and reapply the
> + * charge/uncharge on the wrong hierarchy to the current one. Since
> + * the association can only change due to an organisation event, its
> + * okay for us to ignore the limit in this case.
> + */
> + if (pids != old_pids) {
> + pids_uncharge(old_pids, 1);
> + pids_charge(pids, 1);
> + }
> +
> + css_put(css);
> + css_put(old_css);
> +}
...
> +static ssize_t pids_max_write(struct kernfs_open_file *of, char *buf,
> + size_t nbytes, loff_t off)
> +{
> + struct cgroup_subsys_state *css = of_css(of);
> + struct pids_cgroup *pids = css_pids(css);
> + int64_t limit;
> + int err;
> +
> + buf = strstrip(buf);
> + if (!strcmp(buf, PIDS_MAX_STR)) {
> + limit = PIDS_MAX;
> + goto set_limit;
> + }
> +
> + err = kstrtoll(buf, 0, &limit);
> + if (err)
> + return err;
> +
> + /* We use INT_MAX as the maximum value of pid_t. */
> + if (limit < 0 || limit > INT_MAX)
This is kinda weird if we're using PIDS_MAX for max as it may end up
showing "max" after some larger number is written to the file.
Thanks.
--
tejun
next prev parent reply other threads:[~2015-04-22 16:29 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-19 12:22 [PATCH v10 0/4] cgroups: add pids subsystem Aleksa Sarai
[not found] ` <1429446154-10660-1-git-send-email-cyphar-gVpy/LI/lHzQT0dZR+AlfA@public.gmane.org>
2015-04-19 12:22 ` [PATCH v10 1/4] cgroups: use bitmask to filter for_each_subsys Aleksa Sarai
2015-04-22 15:25 ` Tejun Heo
2015-04-22 15:42 ` Peter Zijlstra
2015-04-22 16:02 ` Tejun Heo
2015-04-26 16:05 ` Aleksa Sarai
[not found] ` <CAOviyagvVTKjRp4wxyNo4Oke-w=Wbda+UyHAN-ih9yFX-8jqmw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-26 16:09 ` Tejun Heo
[not found] ` <20150426160909.GC1499-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
2015-05-13 5:44 ` Aleksa Sarai
[not found] ` <CAOviyah2kjtaRBFRcd3cAb1DeYX11Ks_KVfwXPMXK5BENSqOiA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-13 13:50 ` Tejun Heo
2015-04-22 15:30 ` Tejun Heo
2015-04-19 12:22 ` [PATCH v10 3/4] cgroups: allow a cgroup subsystem to reject a fork Aleksa Sarai
[not found] ` <1429446154-10660-4-git-send-email-cyphar-gVpy/LI/lHzQT0dZR+AlfA@public.gmane.org>
2015-04-22 15:54 ` Tejun Heo
2015-04-24 13:59 ` Aleksa Sarai
2015-04-24 15:48 ` Tejun Heo
2015-05-14 10:57 ` Aleksa Sarai
2015-05-14 15:08 ` Tejun Heo
2015-04-19 12:22 ` [PATCH v10 4/4] cgroups: implement the PIDs subsystem Aleksa Sarai
[not found] ` <1429446154-10660-5-git-send-email-cyphar-gVpy/LI/lHzQT0dZR+AlfA@public.gmane.org>
2015-04-22 16:29 ` Tejun Heo [this message]
[not found] ` <20150422162954.GF10738-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
2015-04-23 0:43 ` Aleksa Sarai
[not found] ` <CAOviyagHJhxD8E+CeEdy399ARPaNyyiSJSJjByK=5ALN5jxbJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-24 15:36 ` Tejun Heo
[not found] ` <20150424153657.GC24029-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
2015-05-13 17:04 ` Aleksa Sarai
[not found] ` <CAOviyah9yyXJG0eUeizpb5ZnzNUAKfxXGkXJw-5sNw7sQOAEuA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-13 17:29 ` Tejun Heo
2015-05-13 17:44 ` Aleksa Sarai
[not found] ` <CAOviyaimfHD=Jg1+FEwyapwO=zdaL0bvq8YDA1Cnq90YbfGENQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-13 17:47 ` Tejun Heo
[not found] ` <20150513174707.GA11388-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
2015-05-16 3:59 ` Aleksa Sarai
[not found] ` <CAOviyaiJnxmiRPEGFutEXiV-D5XwJUhwRJ1w8P3Yn2QLGO9bbw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-18 1:24 ` Tejun Heo
2015-04-24 14:24 ` Aleksa Sarai
2015-04-19 12:22 ` [PATCH v10 2/4] cgroups: replace explicit ss_mask checking with for_each_subsys_which Aleksa Sarai
[not found] ` <1429446154-10660-3-git-send-email-cyphar-gVpy/LI/lHzQT0dZR+AlfA@public.gmane.org>
2015-04-22 15:31 ` Tejun Heo
-- strict thread matches above, loose matches on Subject: below --
2015-04-24 14:07 [PATCH v10 4/4] cgroups: implement the PIDs subsystem Aleksa Sarai
2015-04-24 15:26 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150422162954.GF10738@htj.duckdns.org \
--to=tj-dgejt+ai2ygdnm+yrofe0a@public.gmane.org \
--cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=cyphar-gVpy/LI/lHzQT0dZR+AlfA@public.gmane.org \
--cc=fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org \
--cc=mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
--cc=richard-/L3Ra7n9ekc@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox