Re: [PATCH] sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Ingo Molnar <mingo@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <htejun@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")
Date: Mon, 2 Dec 2013 16:37:46 +0100	[thread overview]
Message-ID: <20131202153746.GA27781@gmail.com> (raw)
In-Reply-To: <20131202152956.GI16796@laptop.programming.kicks-ass.net>


* Peter Zijlstra <peterz@infradead.org> wrote:

> Subject: sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")
> 
> PF_NO_SETAFFINITY, which crudely turns off affinity control and cgroups
> access to tasks, and which is in use by every workqueue thread in Linux (!),
> is conceptually wrong on many levels:
> 
>  - We should strive to never consciously place artificial limitations on
>    kernel functionality; our main reason to place limitations should be
>    correctness.
> 
>    There are cases where limiting affinity is justified: for example the
>    case of single cpu workqueue threads, which are special for their
>    limited concurrency, esp. when coupled with per-cpu resources --
>    allowing such threads to run on other cpus is a correctness violation
>    and can crash the kernel.
> 
>  - But using it outside this case is overly broad; it dis-allows usage
>    that is functionally fine and in some cases desired.
> 
>    In particular; tj argues ( http://lkml.kernel.org/r/20131128143848.GD3925@htj.dyndns.org )
> 
>    "That's just inviting people to do weirdest things and then
>    reporting things like "crypt jobs on some of our 500 machines end up
>    stuck on a single cpu once in a while" which will eventually be
>    tracked down to some weird shell script setting affinity on workers
>    doing something else."
> 
>    While that is exactly what HPC/RT people _want_ in order to avoid
>    disturbing the other CPUs with said crypt work.
> 
>  - Furthermore, the above example is also wrong in that you should not
>    protect root from itself; there's plenty root can do to shoot his
>    own foot off, let alone shoot his head off.
> 
>    Setting affinities is limited to root, and if root messes up the
>    system he can keep the pieces. But limiting in an overly broad
>    fashion stifles the functionality of the system.
> 
>  - Lastly; the flag actually blocks entry into cgroups as well as
>    sched_setaffinity(), so the name is misleading at best.
> 
> The right fix is to only set PF_THREAD_BOUND on per-cpu workers:
> 
>  --- a/kernel/workqueue.c
>  +++ b/kernel/workqueue.c
> 
>          set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> 
>  -       /* prevent userland from meddling with cpumask of workqueue workers */
>  -       worker->task->flags |= PF_NO_SETAFFINITY;
>  -
>          /*
>           * The caller is responsible for ensuring %POOL_DISASSOCIATED
>           * remains stable across this function.  See the comments above the
>           * flag definition for details.
>           */
>  -       if (pool->flags & POOL_DISASSOCIATED)
>  +       if (pool->flags & POOL_DISASSOCIATED) {
>                  worker->flags |= WORKER_UNBOUND;
>  +       } else {
>  +               /*
>  +                * Prevent userland from meddling with cpumask of workqueue
>  +                * workers:
>  +                */
>  +               worker->task->flags |= PF_THREAD_BOUND;
>  +       }
> 
> Therefore revert the patch and add an annoying but non-destructive warning
> check against abuse.

Hm, I missed these problems with the approach, but I think you are 
right.

Tejun, I suspect you concur with Peter's analysis, can I also add 
Peter's workqueue.c fixlet above to workqueue.c to this patch plus 
your Acked-by, so that the two changes are together?

Thanks,

	Ingo

next prev parent reply	other threads:[~2013-12-02 15:37 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-02 15:29 [PATCH] sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY") Peter Zijlstra
2013-12-02 15:37 ` Ingo Molnar [this message]
2013-12-02 16:50   ` Tejun Heo
2013-12-02 17:28     ` Ingo Molnar
2013-12-02 19:17       ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131202153746.GA27781@gmail.com \
    --to=mingo@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=htejun@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox