Re: [PATCH 4/7] sched: implement force_cpus_allowed()

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Tejun Heo <tj@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: tglx@linutronix.de, mingo@elte.hu, avi@redhat.com, efault@gmx.de,
	rusty@rustcorp.com.au, linux-kernel@vger.kernel.org,
	Gautham R Shenoy <ego@in.ibm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH 4/7] sched: implement force_cpus_allowed()
Date: Wed, 09 Dec 2009 14:25:10 +0900	[thread overview]
Message-ID: <4B1F34B6.3090107@kernel.org> (raw)
In-Reply-To: <1260279351.3935.1375.camel@laptop>

Hello,

On 12/08/2009 10:35 PM, Peter Zijlstra wrote:
>> Slow and indeterminism comes in different magnitudes.
> 
> Determinism does _not_ come in magnitudes, its a very binary property,
> either it is or it is not.

It's semantics, right?  The whole process is made closer to being
deterministic by making each component deterministic, so in that way
"more deterministic" is a meaningful expression or it also can
describe how deterministic it feels to human perception.

> As to the order of slowness for unplug, that is about maximal, its _the_
> slowest path in the whole kernel.

Long running works may run for minutes.  They're slow to wait for as
perceived by human beings, so we're talking about pretty different
scales.

> Ok, maybe, but that is not what I would call a generic thread pool.

Sure, it's not completely generic, not yet anyway.  The main focus is
to solve concurrency issues with the current workqueue implementation.
With concurrency issues solved, accomodating long running works
becomes quite easy - all we need to do is to migrate back unbound
workers during cup up - and as we have considerable number of users
which require such usage, it's reasonable to implement it.

> So the reason we have tons of idle workqueues around are purely because
> of deadlock scenarios? Or is there other crap about?

No, that's part of concurrency issues.  Works don't devour and compete
for CPU cycles but they overlap each other frequently.  Works are
often used to provide task context so that the users can use sleeping
synchronization constructs or wait for events.

> So why not start simple and only have one thread per cpu (lets call it
> events/#) and run all works there. Then when you enqueue a work and
> events/# is already busy with a work from anther wq, hand the work to a
> global event thread which will spawn a special single shot kthread for
> it, with a second exception for those reclaim wq's, for which you'll
> have this rescue thread which you'll bind to the right cpu for that
> work.
>
> That should get rid of all these gazillion threads we have, preserve the
> queue property and not be as invasive as your current thing.

That doesn't solve concurrency problem at all and we'll be ending up
bouncing around a LOT of works.

> If they're really as idle as reported you'll never need the fork-fest
> you currently propose, simply because there's not enough work.
>
> So basically, have events/# service the first non-empty cwq, when
> there's more non empty cwqs spawn them single shot threads, or use a
> rescue thread.

Idle doesn't mean they don't overlap.  They aren't cpu cycle hungry
but are context hungry.

It just isn't possible to implement shared worker pool which can scale
according to necessary level of concurrency without doing some sort of
dynamic worker management which necessarily involves forking new ones
when in need and killing some of them off when there are more than
enough.

As for the force_cpus_allowed() bit, I think it's a rather natural
interface to have and maybe we can replace kthread_bind() with it or
make kthread_bind() in terms of it.  It's the basic migration function
which adheres to the cpu hot plug/unplug synchronization rules.

>>   I thought about adding an unbound pool of workers
>> for cpu intensive works for completeness but I really couldn't find
>> much use for that.  If enough number of users would need something
>> like that, we can add an anonymous pool but for now I really don't see
>> the need to worry about that.
> 
> And I though I'd heard multiple parties express interesting in exactly
> that, btrfs, bdi and pohmelfs come to mind, also crypto looks like one
> that could actually do some work.

Yeah, anything cryptography related or crunches large chunk of data
would be a good candidate but still compared to the works and kthreads
we have just to have context and be able to wait for things, they're
minority.  Plus, it's something we can continue to work on if there
are enough reasons to do so.  If accomodating long running works there
makes more sense, we'll do that but for most of them whether bound to
certain cpu or not just isn't intersting at all and all we need to
serve them is the ability to migrate back the threads during CPU UP.
It's pretty isolated path.

Thanks.

-- 
tejun

next prev parent reply	other threads:[~2009-12-09  5:24 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-02  3:56 [PATCHSET tip/sched/core] sched: concurrency managed workqueue related sched patches Tejun Heo
2009-12-02  3:56 ` [PATCH 1/7] sched: revert 498657a478c60be092208422fefa9c7b248729c2 Tejun Heo
2009-12-02 10:42   ` [tip:sched/core] sched: Revert 498657a478c60be092208422fefa9c7b248729c2 tip-bot for Tejun Heo
2009-12-02  3:56 ` [PATCH 2/7] sched: rename preempt_notifiers to sched_notifiers and refactor implementation Tejun Heo
2009-12-02  3:56 ` [PATCH 3/7] sched: refactor try_to_wake_up() Tejun Heo
2009-12-02  9:05   ` Mike Galbraith
2009-12-02  9:51     ` Tejun Heo
2009-12-02 10:10       ` Mike Galbraith
2009-12-02 10:14         ` Tejun Heo
2009-12-02 11:01           ` Peter Zijlstra
2009-12-03  6:11   ` [PATCH UDPATED " Tejun Heo
2009-12-02  3:56 ` [PATCH 4/7] sched: implement force_cpus_allowed() Tejun Heo
2009-12-04 10:40   ` Peter Zijlstra
2009-12-04 10:43     ` Peter Zijlstra
2009-12-07  4:34       ` Tejun Heo
2009-12-07  8:35         ` Peter Zijlstra
2009-12-07 10:34           ` Tejun Heo
2009-12-07 10:54             ` Peter Zijlstra
2009-12-07 11:07               ` Tejun Heo
2009-12-08  8:41                 ` Tejun Heo
2009-12-08  9:02                   ` Peter Zijlstra
2009-12-08  9:12                     ` Tejun Heo
2009-12-08 10:34                       ` Peter Zijlstra
2009-12-08 10:38                         ` Peter Zijlstra
2009-12-08 11:26                           ` Tejun Heo
2009-12-08 11:24                         ` Tejun Heo
2009-12-08 11:48                           ` Peter Zijlstra
2009-12-08 11:56                             ` Tejun Heo
2009-12-08 12:10                               ` Peter Zijlstra
2009-12-08 12:23                                 ` Tejun Heo
2009-12-08 13:35                                   ` Peter Zijlstra
2009-12-09  5:25                                     ` Tejun Heo [this message]
2009-12-09  7:41                                       ` Peter Zijlstra
2009-12-09  8:03                                         ` Tejun Heo
2009-12-02  3:56 ` [PATCH 5/7] sched: make sched_notifiers unconditional Tejun Heo
2009-12-02  3:56 ` [PATCH 6/7] sched: add wakeup/sleep sched_notifiers and allow NULL notifier ops Tejun Heo
2009-12-02  3:56 ` [PATCH 7/7] sched: implement try_to_wake_up_local() Tejun Heo
2009-12-03  6:13   ` [PATCH UPDATED " Tejun Heo
2009-12-04 10:47     ` Peter Zijlstra
2009-12-07  3:31       ` Tejun Heo
2009-12-04 10:44   ` [PATCH " Peter Zijlstra
2009-12-07  3:26     ` Tejun Heo
2009-12-07  8:50       ` Peter Zijlstra
2009-12-07  8:56         ` Peter Zijlstra
2009-12-07 10:27           ` Tejun Heo
2009-12-08  8:53             ` Peter Zijlstra
2009-12-08  9:16               ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B1F34B6.3090107@kernel.org \
    --to=tj@kernel.org \
    --cc=avi@redhat.com \
    --cc=efault@gmx.de \
    --cc=ego@in.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=rusty@rustcorp.com.au \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox