From: Ingo Molnar <mingo@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <htejun@gmail.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")
Date: Mon, 2 Dec 2013 16:37:46 +0100 [thread overview]
Message-ID: <20131202153746.GA27781@gmail.com> (raw)
In-Reply-To: <20131202152956.GI16796@laptop.programming.kicks-ass.net>
* Peter Zijlstra <peterz@infradead.org> wrote:
> Subject: sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")
>
> PF_NO_SETAFFINITY, which crudely turns off affinity control and cgroups
> access to tasks, and which is in use by every workqueue thread in Linux (!),
> is conceptually wrong on many levels:
>
> - We should strive to never consciously place artificial limitations on
> kernel functionality; our main reason to place limitations should be
> correctness.
>
> There are cases where limiting affinity is justified: for example the
> case of single cpu workqueue threads, which are special for their
> limited concurrency, esp. when coupled with per-cpu resources --
> allowing such threads to run on other cpus is a correctness violation
> and can crash the kernel.
>
> - But using it outside this case is overly broad; it dis-allows usage
> that is functionally fine and in some cases desired.
>
> In particular; tj argues ( http://lkml.kernel.org/r/20131128143848.GD3925@htj.dyndns.org )
>
> "That's just inviting people to do weirdest things and then
> reporting things like "crypt jobs on some of our 500 machines end up
> stuck on a single cpu once in a while" which will eventually be
> tracked down to some weird shell script setting affinity on workers
> doing something else."
>
> While that is exactly what HPC/RT people _want_ in order to avoid
> disturbing the other CPUs with said crypt work.
>
> - Furthermore, the above example is also wrong in that you should not
> protect root from itself; there's plenty root can do to shoot his
> own foot off, let alone shoot his head off.
>
> Setting affinities is limited to root, and if root messes up the
> system he can keep the pieces. But limiting in an overly broad
> fashion stifles the functionality of the system.
>
> - Lastly; the flag actually blocks entry into cgroups as well as
> sched_setaffinity(), so the name is misleading at best.
>
> The right fix is to only set PF_THREAD_BOUND on per-cpu workers:
>
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
>
> set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>
> - /* prevent userland from meddling with cpumask of workqueue workers */
> - worker->task->flags |= PF_NO_SETAFFINITY;
> -
> /*
> * The caller is responsible for ensuring %POOL_DISASSOCIATED
> * remains stable across this function. See the comments above the
> * flag definition for details.
> */
> - if (pool->flags & POOL_DISASSOCIATED)
> + if (pool->flags & POOL_DISASSOCIATED) {
> worker->flags |= WORKER_UNBOUND;
> + } else {
> + /*
> + * Prevent userland from meddling with cpumask of workqueue
> + * workers:
> + */
> + worker->task->flags |= PF_THREAD_BOUND;
> + }
>
> Therefore revert the patch and add an annoying but non-destructive warning
> check against abuse.
Hm, I missed these problems with the approach, but I think you are
right.
Tejun, I suspect you concur with Peter's analysis, can I also add
Peter's workqueue.c fixlet above to workqueue.c to this patch plus
your Acked-by, so that the two changes are together?
Thanks,
Ingo
next prev parent reply other threads:[~2013-12-02 15:37 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-02 15:29 [PATCH] sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY") Peter Zijlstra
2013-12-02 15:37 ` Ingo Molnar [this message]
2013-12-02 16:50 ` Tejun Heo
2013-12-02 17:28 ` Ingo Molnar
2013-12-02 19:17 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131202153746.GA27781@gmail.com \
--to=mingo@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=htejun@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox