public inbox for linux-kernel@vger.kernel.org
* [PATCH] sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")
From: Peter Zijlstra @ 2013-12-02 15:29 UTC (permalink / raw)
  To: mingo
  Cc: Tejun Heo, Linus Torvalds, Thomas Gleixner, linux-kernel,
	Andrew Morton

Subject: sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")

PF_NO_SETAFFINITY, which crudely turns off affinity control and cgroups
access to tasks, and which is in use by every workqueue thread in Linux (!),
is conceptually wrong on many levels:

 - We should strive to never consciously place artificial limitations on
   kernel functionality; our main reason to place limitations should be
   correctness.

   There are cases where limiting affinity is justified: for example the
   case of single cpu workqueue threads, which are special for their
   limited concurrency, esp. when coupled with per-cpu resources --
   allowing such threads to run on other cpus is a correctness violation
   and can crash the kernel.

 - But using it outside this case is overly broad; it dis-allows usage
   that is functionally fine and in some cases desired.

   In particular; tj argues ( http://lkml.kernel.org/r/20131128143848.GD3925@htj.dyndns.org )

   "That's just inviting people to do weirdest things and then
   reporting things like "crypt jobs on some of our 500 machines end up
   stuck on a single cpu once in a while" which will eventually be
   tracked down to some weird shell script setting affinity on workers
   doing something else."

   While that is exactly what HPC/RT people _want_ in order to avoid
   disturbing the other CPUs with said crypt work.

 - Furthermore, the above example is also wrong in that you should not
   protect root from itself; there's plenty root can do to shoot his
   own foot off, let alone shoot his head off.

   Setting affinities is limited to root, and if root messes up the
   system he can keep the pieces. But limiting in an overly broad
   fashion stifles the functionality of the system.

 - Lastly; the flag actually blocks entry into cgroups as well as
   sched_setaffinity(), so the name is misleading at best.

The right fix is to only set PF_THREAD_BOUND on per-cpu workers:

 --- a/kernel/workqueue.c
 +++ b/kernel/workqueue.c

         set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);

 -       /* prevent userland from meddling with cpumask of workqueue workers */
 -       worker->task->flags |= PF_NO_SETAFFINITY;
 -
         /*
          * The caller is responsible for ensuring %POOL_DISASSOCIATED
          * remains stable across this function.  See the comments above the
          * flag definition for details.
          */
 -       if (pool->flags & POOL_DISASSOCIATED)
 +       if (pool->flags & POOL_DISASSOCIATED) {
                 worker->flags |= WORKER_UNBOUND;
 +       } else {
 +               /*
 +                * Prevent userland from meddling with cpumask of workqueue
 +                * workers:
 +                */
 +               worker->task->flags |= PF_THREAD_BOUND;
 +       }

Therefore revert the patch and add an annoying but non-destructive warning
check against abuse.

Cc: Tejun Heo <htejun@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
---
 include/linux/sched.h |  2 +-
 kernel/cgroup.c       |  4 ++--
 kernel/cpuset.c       |  2 +-
 kernel/kthread.c      |  2 +-
 kernel/reboot.c       |  2 +-
 kernel/sched/core.c   | 18 +++++++++++++++++-
 kernel/workqueue.c    |  4 ++--
 7 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 768b037dfacb..ecf29700f4db 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1693,7 +1693,7 @@ extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut,
 #define PF_SWAPWRITE	0x00800000	/* Allowed to write to swap */
 #define PF_SPREAD_PAGE	0x01000000	/* Spread page cache over cpuset */
 #define PF_SPREAD_SLAB	0x02000000	/* Spread some slab caches over cpuset */
-#define PF_NO_SETAFFINITY 0x04000000	/* Userland is not allowed to meddle with cpus_allowed */
+#define PF_THREAD_BOUND 0x04000000	/* Userland is not allowed to meddle with cpus_allowed */
 #define PF_MCE_EARLY    0x08000000      /* Early kill for mce process policy */
 #define PF_MEMPOLICY	0x10000000	/* Non-default NUMA mempolicy */
 #define PF_MUTEX_TESTER	0x20000000	/* Thread belongs to the rt mutex tester */
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 4c62513fe19f..b0f88306955e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2151,11 +2151,11 @@ static int attach_task_by_pid(struct cgroup *cgrp, u64 pid, bool threadgroup)
 		tsk = tsk->group_leader;
 
 	/*
-	 * Workqueue threads may acquire PF_NO_SETAFFINITY and become
+	 * Workqueue threads may acquire PF_THREAD_BOUND and become
 	 * trapped in a cpuset, or RT worker may be born in a cgroup
 	 * with no rt_runtime allocated.  Just say no.
 	 */
-	if (tsk == kthreadd_task || (tsk->flags & PF_NO_SETAFFINITY)) {
+	if (tsk == kthreadd_task || (tsk->flags & PF_THREAD_BOUND)) {
 		ret = -EINVAL;
 		rcu_read_unlock();
 		goto out_unlock_cgroup;
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 6bf981e13c43..aa9b4ac9fc28 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1475,7 +1475,7 @@ static int cpuset_can_attach(struct cgroup_subsys_state *css,
 		 * before cpus_allowed may be changed.
 		 */
 		ret = -EINVAL;
-		if (task->flags & PF_NO_SETAFFINITY)
+		if (task->flags & PF_THREAD_BOUND)
 			goto out_unlock;
 		ret = security_task_setscheduler(task);
 		if (ret)
diff --git a/kernel/kthread.c b/kernel/kthread.c
index b5ae3ee860a9..626dbba190c8 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -334,7 +334,7 @@ static void __kthread_bind(struct task_struct *p, unsigned int cpu, long state)
 	}
 	/* It's safe because the task is inactive. */
 	do_set_cpus_allowed(p, cpumask_of(cpu));
-	p->flags |= PF_NO_SETAFFINITY;
+	p->flags |= PF_THREAD_BOUND;
 }
 
 /**
diff --git a/kernel/reboot.c b/kernel/reboot.c
index f813b3474646..de0911160e5a 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -116,7 +116,7 @@ static void migrate_to_reboot_cpu(void)
 		cpu = cpumask_first(cpu_online_mask);
 
 	/* Prevent races with other tasks migrating this task */
-	current->flags |= PF_NO_SETAFFINITY;
+	current->flags |= PF_THREAD_BOUND;
 
 	/* Make certain I only run on the appropriate processor */
 	set_cpus_allowed_ptr(current, cpumask_of(cpu));
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 718730dd0480..37f544ef95a7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2407,6 +2407,18 @@ static noinline void __schedule_bug(struct task_struct *prev)
 	add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
 }
 
+static noinline void __schedule_bound_bug(struct task_struct *prev)
+{
+	if (oops_in_progress)
+		return;
+
+	printk(KERN_ERR "BUG: PF_THREAD_BOUND set on an unbound thread: %s/%d\n",
+			prev->comm, prev->pid);
+
+	prev->flags &= ~PF_THREAD_BOUND;
+	add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
+}
+
 /*
  * Various schedule()-time debugging checks and statistics:
  */
@@ -2421,6 +2433,10 @@ static inline void schedule_debug(struct task_struct *prev)
 		__schedule_bug(prev);
 	rcu_sleep_check();
 
+	if (unlikely((prev->flags & PF_THREAD_BOUND) &&
+		     prev->nr_cpus_allowed != 1))
+		__schedule_bound_bug(prev);
+
 	profile_hit(SCHED_PROFILING, __builtin_return_address(0));
 
 	schedstat_inc(this_rq(), sched_count);
@@ -3350,7 +3366,7 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 	get_task_struct(p);
 	rcu_read_unlock();
 
-	if (p->flags & PF_NO_SETAFFINITY) {
+	if (p->flags & PF_THREAD_BOUND) {
 		retval = -EINVAL;
 		goto out_put_task;
 	}
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 987293d03ebc..89314c0b3285 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1744,7 +1744,7 @@ static struct worker *create_worker(struct worker_pool *pool)
 	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
 
 	/* prevent userland from meddling with cpumask of workqueue workers */
-	worker->task->flags |= PF_NO_SETAFFINITY;
+	worker->task->flags |= PF_THREAD_BOUND;
 
 	/*
 	 * The caller is responsible for ensuring %POOL_DISASSOCIATED
@@ -4215,7 +4215,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
 		}
 
 		wq->rescuer = rescuer;
-		rescuer->task->flags |= PF_NO_SETAFFINITY;
+		rescuer->task->flags |= PF_THREAD_BOUND;
 		wake_up_process(rescuer->task);
 	}
 


* Re: [PATCH] sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")
From: Ingo Molnar @ 2013-12-02 15:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tejun Heo, Linus Torvalds, Thomas Gleixner, linux-kernel,
	Andrew Morton


* Peter Zijlstra <peterz@infradead.org> wrote:

> Subject: sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")
> 
> PF_NO_SETAFFINITY, which crudely turns off affinity control and cgroups
> access to tasks, and which is in use by every workqueue thread in Linux (!),
> is conceptually wrong on many levels:
> 
>  - We should strive to never consciously place artificial limitations on
>    kernel functionality; our main reason to place limitations should be
>    correctness.
> 
>    There are cases where limiting affinity is justified: for example the
>    case of single cpu workqueue threads, which are special for their
>    limited concurrency, esp. when coupled with per-cpu resources --
>    allowing such threads to run on other cpus is a correctness violation
>    and can crash the kernel.
> 
>  - But using it outside this case is overly broad; it dis-allows usage
>    that is functionally fine and in some cases desired.
> 
>    In particular; tj argues ( http://lkml.kernel.org/r/20131128143848.GD3925@htj.dyndns.org )
> 
>    "That's just inviting people to do weirdest things and then
>    reporting things like "crypt jobs on some of our 500 machines end up
>    stuck on a single cpu once in a while" which will eventually be
>    tracked down to some weird shell script setting affinity on workers
>    doing something else."
> 
>    While that is exactly what HPC/RT people _want_ in order to avoid
>    disturbing the other CPUs with said crypt work.
> 
>  - Furthermore, the above example is also wrong in that you should not
>    protect root from itself; there's plenty root can do to shoot his
>    own foot off, let alone shoot his head off.
> 
>    Setting affinities is limited to root, and if root messes up the
>    system he can keep the pieces. But limiting in an overly broad
>    fashion stifles the functionality of the system.
> 
>  - Lastly; the flag actually blocks entry into cgroups as well as
>    sched_setaffinity(), so the name is misleading at best.
> 
> The right fix is to only set PF_THREAD_BOUND on per-cpu workers:
> 
>  --- a/kernel/workqueue.c
>  +++ b/kernel/workqueue.c
> 
>          set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> 
>  -       /* prevent userland from meddling with cpumask of workqueue workers */
>  -       worker->task->flags |= PF_NO_SETAFFINITY;
>  -
>          /*
>           * The caller is responsible for ensuring %POOL_DISASSOCIATED
>           * remains stable across this function.  See the comments above the
>           * flag definition for details.
>           */
>  -       if (pool->flags & POOL_DISASSOCIATED)
>  +       if (pool->flags & POOL_DISASSOCIATED) {
>                  worker->flags |= WORKER_UNBOUND;
>  +       } else {
>  +               /*
>  +                * Prevent userland from meddling with cpumask of workqueue
>  +                * workers:
>  +                */
>  +               worker->task->flags |= PF_THREAD_BOUND;
>  +       }
> 
> Therefore revert the patch and add an annoying but non-destructive warning
> check against abuse.

Hm, I missed these problems with the approach, but I think you are 
right.

Tejun, I suspect you concur with Peter's analysis, can I also add 
Peter's workqueue.c fixlet above to workqueue.c to this patch plus 
your Acked-by, so that the two changes are together?

Thanks,

	Ingo


* Re: [PATCH] sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")
From: Tejun Heo @ 2013-12-02 16:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Linus Torvalds, Thomas Gleixner, linux-kernel,
	Andrew Morton

Hello, guys.

On Mon, Dec 02, 2013 at 04:37:46PM +0100, Ingo Molnar wrote:
> >  - We should strive to never consciously place artificial limitations on
> >    kernel functionality; our main reason to place limitations should be
> >    correctness.

It is a correctness issue tho.  We'd be promising, say, NUMA affinity
and then possibly give out workers which aren't affine to the NUMA
node.  Workqueue provides an API to set random affinity and its users
can rightfully expect the configured affinity and code accordingly.

> >    There are cases where limiting affinity is justified: for example the
> >    case of single cpu workqueue threads, which are special for their
> >    limited concurrency, esp. when coupled with per-cpu resources --
> >    allowing such threads to run on other cpus is a correctness violation
> >    and can crash the kernel.

So can any requested affinity if the wq user configured it.  I think
you're too focused on the binary bound / unbound distinction.  Things
are a lot more generic now.

> >  - But using it outside this case is overly broad; it dis-allows usage
> >    that is functionally fine and in some cases desired.
> > 
> >    In particular; tj argues ( http://lkml.kernel.org/r/20131128143848.GD3925@htj.dyndns.org )
> > 
> >    "That's just inviting people to do weirdest things and then
> >    reporting things like "crypt jobs on some of our 500 machines end up
> >    stuck on a single cpu once in a while" which will eventually be
> >    tracked down to some weird shell script setting affinity on workers
> >    doing something else."
> > 
> >    While that is exactly what HPC/RT people _want_ in order to avoid
> >    disturbing the other CPUs with said crypt work.

Hmmm... I think you're missing the point there.  The point is that you
can't really do that reliably by meddling with kworkers from userland
directly.  It's gonna be fragile and dangerous.  Userland would have
to continuously poll for new kworkers while kernel workqueue users
would get their affinity expectations broken.  I don't think that's
what HPC/RT people want.  If HPC/RT people want to limit the cpus that
unbound workqueues without specific affinity configured can run on,
let's please implement that properly.

> >  - Furthermore, the above example is also wrong in that you should not
> >    protect root from itself; there's plenty root can do to shoot his
> >    own foot off, let alone shoot his head off.
> > 
> >    Setting affinities is limited to root, and if root messes up the
> >    system he can keep the pieces. But limiting in an overly broad
> >    fashion stifles the functionality of the system.

This is the same problem with bound workers.  Userland could be
actively breaking configured affinities of kworkers which may be
depended upon by its users and there's no way for userland to tell
which kworker is gonna serve which workqueue - there's no fixed
relationship between them, so it's not like userland can, say, modify
affinities on crypto kworkers in isolation.  Both userland and kernel
wouldn't have much idea of the impact of such fiddling.

> >  - Lastly; the flag actually blocks entry into cgroups as well as
> >    sched_setaffinity(), so the name is misleading at best.

Yeah, the name could be better.

> > The right fix is to only set PF_THREAD_BOUND on per-cpu workers:
> > 
> >  --- a/kernel/workqueue.c
> >  +++ b/kernel/workqueue.c
> > 
> >          set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> > 
> >  -       /* prevent userland from meddling with cpumask of workqueue workers */
> >  -       worker->task->flags |= PF_NO_SETAFFINITY;
> >  -
> >          /*
> >           * The caller is responsible for ensuring %POOL_DISASSOCIATED
> >           * remains stable across this function.  See the comments above the
> >           * flag definition for details.
> >           */
> >  -       if (pool->flags & POOL_DISASSOCIATED)
> >  +       if (pool->flags & POOL_DISASSOCIATED) {
> >                  worker->flags |= WORKER_UNBOUND;
> >  +       } else {
> >  +               /*
> >  +                * Prevent userland from meddling with cpumask of workqueue
> >  +                * workers:
> >  +                */
> >  +               worker->task->flags |= PF_THREAD_BOUND;
> >  +       }
> > 
> > Therefore revert the patch and add an annoying but non-destructive warning
> > check against abuse.
> 
> Hm, I missed these problems with the approach, but I think you are 
> right.
> 
> Tejun, I suspect you concur with Peter's analysis, can I also add 

So, not really.

> Peter's workqueue.c fixlet above to workqueue.c to this patch plus 
> your Acked-by, so that the two changes are together?

The above would trigger the added warning if a new kworker is created
for a per-cpu workqueue while the cpu is down, which may happen, so
the patch itself is somewhat broken.

I don't think this is the right path to accommodate HPC/RT use cases.
Let's please implement something proper if keeping unbound kworkers
off certain cpus is necessary.

Thanks.

-- 
tejun


* Re: [PATCH] sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")
From: Ingo Molnar @ 2013-12-02 17:28 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Peter Zijlstra, Linus Torvalds, Thomas Gleixner, linux-kernel,
	Andrew Morton


* Tejun Heo <htejun@gmail.com> wrote:

> Hello, guys.
> 
> On Mon, Dec 02, 2013 at 04:37:46PM +0100, Ingo Molnar wrote:
> > >  - We should strive to never consciously place artificial limitations on
> > >    kernel functionality; our main reason to place limitations should be
> > >    correctness.
> 
> It is a correctness issue tho.  We'd be promising, say, NUMA affinity
> and then possibly give out workers which aren't affine to the NUMA
> node. [...]

No, NUMA is a performance issue in this specific case, not a 
correctness issue. 'Correctness' for configuration details mostly 
means 'stuff does not crash'.

We try to give good, sensible, well performing defaults out of box, 
but otherwise we try to let root mis-configure (or improve!) their 
systems rather widely.

This is really fundamental: our affinity APIs are generic, and they 
should only ever be limited in such a drastic fashion in the very 
strict 'stuff crashes' case, and that is properly expressed via the 
PF_THREAD_BOUND flag.

(Your other arguments fail for similar reasons.)

> This is the same problem with bound workers.  Userland could be 
> actively breaking configured affinities of kworkers which may be 
> depended upon by its users and there's no way for userland to tell 
> which kworker is gonna serve which workqueue - there's no fixed 
> relationship between them, so it's not like userland can, say, 
> modify affinities on crypto kworkers in isolation.  Both userland 
> and kernel wouldn't have much idea of the impact of such fiddling.

That's only because 'private affinity' attributes detached scheduler 
affinity handling snuck into the workqueue code. It's bad design and 
it needs to be fixed - and the worst aspect is undone via this revert.

> > Peter's workqueue.c fixlet above to workqueue.c to this patch plus 
> > your Acked-by, so that the two changes are together?
> 
> The above would trigger the added warning if a new kworker is 
> created for a per-cpu workqueue while the cpu is down, which may 
> happen, so the patch itself is somewhat broken.

I'll wait for a new submission from Peter.

> I don't think this is the right path to accommodate HPC/RT use cases. 
> Let's please implement something proper if keeping unbound kworkers 
> off certain cpus is necessary.

We already have APIs to manage affinities, in the scheduler. 
Privatizing this function into workqueue.c is limiting and actively 
wrong.

Such APIs should be done properly, by extending existing scheduler 
APIs if needed, not by turning off the old APIs via a crude flag and 
creating some private, per subsystem implementation ...

Thanks,

	Ingo


* Re: [PATCH] sched: Revert 14a40ffccd61 ("sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY")
From: Tejun Heo @ 2013-12-02 19:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Linus Torvalds, Thomas Gleixner, linux-kernel,
	Andrew Morton

Hello, Ingo.

On Mon, Dec 02, 2013 at 06:28:37PM +0100, Ingo Molnar wrote:
> No, NUMA is a performance issue in this specific case, not a 
> correctness issue. 'Correctness' for configuration details mostly 
> means 'stuff does not crash'.

That's exactly the thing.  Workqueue users can set any random affinity
be that per-cpu, per-NUMA or whatever (I'm not talking about the
automatic NUMA affinity here), and they should be able to depend on
the configured affinity being honored as long as that request has
succeeded.  Changing affinity underneath workqueue code actively
breaks the API and has possibility to lead to crashes.

> We try to give good, sensible, well performing defaults out of box, 
> but otherwise we try to let root mis-configure (or improve!) their 
> systems rather widely.

Yeah, sure, that's why there's custom attributes interface which can
be exposed via sysfs.

> That's only because 'private affinity' attributes detached scheduler 
> affinity handling snuck into the workqueue code. It's bad design and 
> it needs to be fixed - and the worst aspect is undone via this revert.

You can't do that because of the very nature of workqueue - the tasks
are completely ephemeral and automatically pooled.  Nothing is
per-task.  There are attributes which aren't even defined for
individual tasks.  For workqueue attributes to be customizable, it
needs some form of custom interface.

> We already have APIs to manage affinities, in the scheduler. 
> Privatizing this function into workqueue.c is limiting and actively 
> wrong.

Those are pool attributes which the tasks are keyed by.  workqueue
needs to know them to put the matching ones in the same pool.

> Such APIs should be done properly, by extending existing scheduler 
> APIs if needed, not by turning off the old APIs via a crude flag and 
> creating some private, per subsystem implementation ...

I don't really follow this logic.  workqueue is a super structure
built on top of tasks and has certain requirements on how its member
tasks behave and will break, in ways which would be difficult to track
down, if those requirements are breached.  This is something
structurally fundamental.

I suppose workqueue can taint / generate warnings if workers which are
breaking out of the requested parameters are detected, if people are
hell-bent on doing it, but the pros aren't clear at all to me.

Thanks.

-- 
tejun

