public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: Circular Convolution scheduler
@ 2003-10-16  1:51 Clayton Weaver
  2003-10-21 18:44 ` bill davidsen
  0 siblings, 1 reply; 17+ messages in thread
From: Clayton Weaver @ 2003-10-16  1:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: jamie

----- Original Message -----
From: Jamie Lokier <jamie@shareable.org>
Date: Tue, 14 Oct 2003 11:18:53 +0100
To: Nick Piggin <piggin@cyberone.com.au>
Subject: Re: Circular Convolution scheduler


> Ok, but what is "circular convolution scheduling"?

It was a vague idea that I had for integrating
our current, more-or-less distributed system
of priority scaling heuristics into a single
state model and applying them all at once to a
scheduling decision with a single matrix
multiply. The motivation would be to engineer
out unexpected interactions between different
parts of the heuristic analyses that produce
unpredicted (and potentially unwanted)
behavior in the scheduler.

Ad hoc code is fast, but it depends on
implementers being able to model the implied
state machine in their imagination, with the
implementation of it distributed across
various code points in the scheduler. This
makes isolated simulation and verification
(proof that the scheduler does what we
intend it to do) difficult, and we might get
farther faster by having an implementation
of these scheduling-relevant runtime
heuristic analyses that allows us to
reliably model scheduler state in the
abstract.

I'm not saying that can't be done with the
present system, it's just a lot harder to be
sure that your model really reflects what is
happening at runtime when you start with ad
hoc code and try to model it than if you start
with a model of a set of state transitions
that does what you want and then
implement/optimize the model.

As other people have pointed out, runtime
scheduler performance is the trump card,
and abstract verifiability that a scheduler
(with heuristic priority scaling) meets a
specified set of state transition constraints
is not much help if you can't implement the
model with code that performs at least as well
as a scheduler with ad-hoc heuristics pasted
in "wherever it looked convenient".

(But you are not likely to need to revert stuff
very often with a heuristic-enhanced scheduler
implemented from a verified formal model,
because you aren't guessing what a code change
is going to do to the state machine. You
already know.)

> Nick Piggin wrote:
> > I don't know anything about it, but I don't see what exactly you'd be
> > trying to predict: the kernel's scheduler _dictates_ scheduling behaviour,
> > obviously. Also, "best use of system resources" wrt scheduling is a big
> > ask considering there isn't one ideal scheduling pattern for all but the
> > most trivial loads, even on a single processor computer (fairness, latency,
> priority, throughput, etc). It's difficult to even say one pattern is better
> > than another.

> Hmm.  Prediction is potentially useful.
 
> Instead of an educated ad-hoc pile of heuristics for _dictating_
> scheduling behaviour, you can systematically analyse just what is it
> you're trying to achieve, and design a behaviour which achieves that
> as closely as possible.
 
> This is where good predictors come in: you feed all the possible
> scheduling decisions at any point in time into the predictor, and use
> the output to decide which decision gave the most desired result -
> taking into account the likelihood of future behaviours.  Of course
> you have to optimise this calculation.
 
> This is classical control theory.  In practice it comes up with
> something like what we have already :)  But the design path is
> different, and if you're very thoroughly analytical about it, maybe
> there's a chance of avoiding weird corner behaviours that weren't
> intended.
 
> The down side is that crafted heuristics, like the ones we have, tend
> to run a _lot_ faster.

> -- Jamie

I'm not qualified to comment on the value of
predictive scheduling, although that seems to
me the real intention of heuristics for
interactive process priority adaptation,
to predict that the process is going to need
higher effective priority than its nominal
nice value may indicate, given some
scheduling latency threshold generated from
total time utilization of other concurrent
processes.

Other questions:

What about priority scaling history? Do we want
to try to smooth off spikes in what I think
of as a graph of positive indications of
interactive behavior over time? (I was
remembering someone's comment a while back
about some process doing something that
satisfies one of the heuristics for
interactive behavior and then proceeds to
spend the next 12 hours running code that
doesn't need interactive process priority
scaling. It scaled its effective nice
value to -[whatever] without the user
requesting it, needing it, or wanting it.)

Or do we want to react to spikes in the graph
quickly, with a fast decay in the interactive
process priority scaling factor? While you
are typing, clicking, dragging, digitizing,
etc, you want response right now, but batch
processes running in the background shouldn't
have to pay while you stare at the screen for
10 minutes at a time in between mouse clicks
or keystrokes.

(Or should they? Should people really expect
optimum batch performance on a 2-week
simulation running in the background on the
same box where they read their email, play
games, listen to music, browse the web, and
expect instant response to actions on their
desktop? Seems a bit in the way of
have-your-cake-and-eat-it-too to me, but
that isn't really relevant to the
discussion, which is about how to make the
scheduler heuristics implementation more
predictable.)
 
What about the X environment? What nice value
are people starting the X server, X session
manager, their window manager (and thus any
helper processes forked by the X server) at?
Don't those things need interactive process
priority scaling along with whatever foreground
process the keystroke or mouse click was intended
for, if you are really going to have snappy
desktop performance concurrent with a
substantial load of background batch processing?
Will the interactive process heuristics pick
up the need to scale the effective priority for
a group of other concurrent processes that the
foreground process depends on, too?

I don't really see how, but feel free to enlighten
me. (I think I'd be starting X, xdm, and
window managers at -19 if I wanted fast
type, click, and drag response under any
conditions.)

[make config option]

People who run purely batch servers will
inevitably see scheduler heuristics for
interactive process priority adaptation as
an unnecessary complication. (Designing the
scheduler originally to get good performance
for non-interactive server processes was not
a silly idea. That's a big market for linux.)
Concentrating the application of heuristics
under the umbrella of a single unified model,
whether it uses convolution or not, would seem
to simplify compiling them out for dedicated
batch servers with a make config option.

What do we do exactly when we get a positive
hit on the interactivity heuristics? Jump
queues in the scheduler to a faster tier?
Shorten time slices in front of us?

For some scheduler issues this is going to be
irrelevant, interactive box or not.
Real-time processes should always be
higher-priority than what you can get from
adaptive priority scaling. All processes
always have to get from the tail of the
scheduling queue to the front of the queue
for sure, no matter what kind of adaptive
priority scaling is happening concurrently.
These would be invariants in a scheduler state
model and in implementation probably outside
the application of a convolution to heuristic
results to obtain adaptive priority scaling
values.

I just don't like to see us trying to hack
around a problem with ad hoc special cases.
For hardware we don't have a choice, the
special cases exist a priori, ditto for
network protocols and patchwork standards
compliance on other systems, but we invented
this scheduler ourselves. One would expect that
we could fix problems with it top-down.

(Maybe that's practical, maybe not. Maybe the
intuition is right but a circular convolution
would be an impractical implementation. The
reason to explore the question is avoidance
of "three steps forward and two steps back"
scheduler code development sequences.)

Regards,

Clayton Weaver
<mailto: cgweav@email.com>

-- 
__________________________________________________________
Sign-up for your own personalized E-mail at Mail.com
http://www.mail.com/?sr=signup

CareerBuilder.com has over 400,000 jobs. Be smarter about your job search
http://corp.mail.com/careers


^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: Circular Convolution scheduler
@ 2003-10-19  8:50 Clayton Weaver
  2003-10-21 18:51 ` bill davidsen
  0 siblings, 1 reply; 17+ messages in thread
From: Clayton Weaver @ 2003-10-19  8:50 UTC (permalink / raw)
  To: linux-kernel

(perhaps a bit less vague)

While the long-term time series analyses
would doubtless be interesting for enterprise
networks, I had something more modest in mind.

Most of the heuristic work so far seems to have
been directed toward how to identify interactive
processes as interactive without false positives
on batch processes, making the code correct (no
bugs), making it fast, and a little tuning to
obtain generally ("for most people") usable
values for how fast to scale up the priority
on a process that has matched the heuristics,
yes?

My question about inserting a convolution
is more about what we do with that
information ("we believe that this process
is interactive") once we have it.

Do we just hit a boolean switch,
"this process is interactive, add N to
its nice value," and then leave it that
way until it exits? That might work for
most people with the right N, but it does
not take into account two other values
that we could obtain that (imho) should
modulate the interactive process priority
scaling.

Those two values would be a history value
(how many time slices since the last
heuristic match for this process) and how
much time is owned by other processes in
front of us in the scheduler when we are
requeued.

As the history value counts up, its weight
as a factor in the interactive priority scaling
should decrease (adapting to when the user
falls asleep at the keyboard with half a dozen
interactive applications open, or when
something like a continuous network
traffic display initially matched an
interactivity heuristic).

As the number of processes in front of us
at reschedule time (regardless of whether
we called schedule ourselves or our time
slice expired) increases, that should
increase the weight of that factor in
scaling up the effective process priority
of an "interactive" (matched the heuristics)
process. "In front of us" would mean higher
nominal priority processes, processes with our
own nominal priority that were already queued
when our time slice expired, and processes
with lower priority that have been
migrated upward in the queue by the
"no indefinite starvation" controls.

More processes in front of us means more
latency, so we should advance through
the scheduler at a faster rate (if we
don't want the interactivity response
enhancement to decay with process load).

Heuristic interactive process priority
promotion would thus adapt to actual
frequency of heuristic matches by a process
over time and to whatever latency in the
scheduler is attributable to the time slices
of other processes in front of us.

For batch processes, short circuit around the
whole thing if their heuristic history value
is 0 (never matched an interactive process
heuristic). They get scheduled by the scheduler's
normal priority queue management that happens
independent of interactive process heuristics.
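A rough sketch of how those two factors,
plus the batch short circuit, might combine.
The weights, the 16-slice history window,
and the function name are illustrative
assumptions, not kernel code:

```c
/* Hypothetical combination of the two modulating factors described
 * above.  "slices_since_match" is the history value (time slices
 * since the last heuristic match, or -1 for never matched);
 * "load_ahead" approximates the time owned by processes queued in
 * front of us. */
static int interactive_boost(int slices_since_match, int load_ahead)
{
    int history_weight, boost;

    if (slices_since_match < 0)  /* never matched: batch short circuit */
        return 0;

    /* history factor: full weight right after a match, fading to 0 */
    history_weight = 16 - slices_since_match;
    if (history_weight < 0)
        history_weight = 0;

    /* load factor: more queued time ahead of us, bigger boost */
    boost = (history_weight * load_ahead) / 16;
    if (boost > 10)              /* clamp to an invented maximum */
        boost = 10;
    return boost;
}
```

A pure batch process returns 0 immediately,
a fresh match under heavy load gets the full
clamped boost, and a stale match barely
registers, which is the adaptive behavior
argued for above.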

(Perhaps a circular convolution is overkill
for these fairly simple adaptations. It was
initially interesting to me in this context
for its ability to encode an analog of something
of variable length in a fixed size data
structure and to reflect an approximation
of incremental changes in the variable length
input in its output. And for this task,
approximation is good enough. But a simple
gap-from-last-match history value and
scheduler-time-load-in-front-of-us already
fit in fixed size data structures.)

The key question, of course, is "what does
it cost us to compute these?" Not much for
the history value; I don't know what it
would cost to keep track of schedulable
aggregate time load at a given nominal
priority.

Comments?

Regards,

Clayton Weaver
<mailto: cgweav@email.com>



^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: Circular Convolution scheduler
@ 2003-10-07 20:47 Clayton Weaver
  0 siblings, 0 replies; 17+ messages in thread
From: Clayton Weaver @ 2003-10-07 20:47 UTC (permalink / raw)
  To: linux-kernel

Note: the point of my question was not to
belittle the current work on scheduler
heuristics to get better interactive
usability on workstations.

Here is what the development process for the
scheduler over the last couple of years
looks like to me: we start with a fair
scheduler, bias it for processes that always
have work to do when they are scheduled,
then poke around at it with heuristics to
try to mitigate the starvation of
interactive processes that resulted from
favoring "always have work to do" batch
processes in the initial design.

This seems a bit like reverse-engineering code
that we wrote ourselves to begin with.

Ingo's idea of making scheduling itself
faster regardless of assigned priorities is
a good plan, but it is not really relevant
to my question, which is what sort of
priority adjustment algorithm would take
interactive (or batch) priority adjustment
code tweaks out of the realm of guesswork.

Right now we seem to be writing new code
without knowing what effect it will have on
batch-interactive scheduling until after a
lot of testing, because the code is too
complex to predict the effect of changes
beforehand. This seems like doing things
the hard way, with lots of detours.

Is interactive performance acceptable with a
completely fair scheduler that does not adjust
priorities based on runtime behavior? (I.e.,
could we possibly fix this in a simpler
fashion, not by special-casing interactivity
but rather by un-special-casing something
else, maybe adjustable with a /proc value,
sysctl() call, etc.)

I'm also reminded of "compile for network router"
in earlier kernel configs.

"Compile as workstation? [Y]  "

(Not very intuitive. Everyone that didn't read
Configure.help wouldn't be able to guess what
the difference would be between Yes and No answers
to that question. But that's a pretty simple
way to put it.)

Regards,

Clayton Weaver
<mailto: cgweav@email.com>



^ permalink raw reply	[flat|nested] 17+ messages in thread
* Circular Convolution scheduler
@ 2003-10-06 16:17 Clayton Weaver
  2003-10-07 22:19 ` George Anzinger
  0 siblings, 1 reply; 17+ messages in thread
From: Clayton Weaver @ 2003-10-06 16:17 UTC (permalink / raw)
  To: linux-kernel

Though the mechanism is doubtless familiar
to signal processing and graphics
implementers, it's probably not thought of
much in a process scheduling context
(although there was the Evolution Scheduler
of a few years ago, whose implementer may
have had something like circular convolution
in mind). It just seems to me (intuition)
that the concept of what circular
convolution does is akin to what we've been
feeling around for with these ad hoc
heuristic tweaks to the scheduler to adjust
for interactivity and batch behavior,
searching for an incremental self-adjusting
mechanism that favors interactivity on
demand.

I've never implemented a circular convolver
in any context, so I was wondering if anyone
who has thinks scheduler prioritization
would be simpler if implemented directly as
a circular convolution.

(If nothing else, it seems to me that the
abstract model of what the scheduler
prioritizer is doing would be more coherent
than it is with ad hoc code. This perhaps
reduces the risk of unexpected side effects
of incremental tweaks to the scheduler. The
behavior of an optimizer that implements an
integer approximation of a known
mathematical transform is fairly predictable
when you change its inputs.)

Regards,

Clayton Weaver
<mailto: cgweav@email.com>




^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2003-10-21 21:04 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-10-16  1:51 Circular Convolution scheduler Clayton Weaver
2003-10-21 18:44 ` bill davidsen
2003-10-21 20:15   ` Richard B. Johnson
2003-10-21 20:54     ` bill davidsen
  -- strict thread matches above, loose matches on Subject: below --
2003-10-19  8:50 Clayton Weaver
2003-10-21 18:51 ` bill davidsen
2003-10-07 20:47 Clayton Weaver
2003-10-06 16:17 Clayton Weaver
2003-10-07 22:19 ` George Anzinger
2003-10-14  8:37   ` Piet Delaney
2003-10-14  9:46     ` Jamie Lokier
2003-10-14 10:06       ` Nick Piggin
2003-10-14 10:18         ` Jamie Lokier
2003-10-14 10:28           ` Nick Piggin
2003-10-16  8:34             ` Piet Delaney
2003-10-16  9:03               ` Nick Piggin
2003-10-21 18:09         ` bill davidsen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox