* Scheduling of I/O domains
@ 2004-07-21 17:21 Rob Gardner
2004-07-21 18:54 ` G. Milos
2004-08-02 11:48 ` G. Milos
0 siblings, 2 replies; 13+ messages in thread
From: Rob Gardner @ 2004-07-21 17:21 UTC (permalink / raw)
To: xen-devel
I have been looking for the code in xen that handles scheduling of I/O
domains, but have not succeeded in finding any. Only after some time did
I start thinking that maybe there wasn't any. So I did the following
simple experiment:
In domain 1:
time dd if=/dev/hda of=/dev/null bs=1024k count=100
and this took about 2 seconds.
Then I ran a cpu-consuming process in domain 2, and repeated the dd
command above. The result was more than 8 seconds! This strongly
suggests that domain 0 is not being treated specially by the scheduler,
and is being made to wait before servicing I/O interrupts, thereby
killing I/O performance in the presence of cpu-bound domain activity. It
seems to me that I/O domains need to be scheduled and dispatched
immediately upon receipt of an I/O interrupt.
Am I missing something here? How come nobody has noticed this behavior
before? Is somebody working on this? I
Rob Gardner
-------------------------------------------------------
This SF.Net email is sponsored by BEA Weblogic Workshop
FREE Java Enterprise J2EE developer tools!
Get your free copy of BEA WebLogic Workshop 8.1 today.
http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Scheduling of I/O domains
2004-07-21 17:21 Scheduling of I/O domains Rob Gardner
@ 2004-07-21 18:54 ` G. Milos
2004-07-21 22:25 ` Rob Gardner
2004-08-02 11:48 ` G. Milos
1 sibling, 1 reply; 13+ messages in thread
From: G. Milos @ 2004-07-21 18:54 UTC (permalink / raw)
To: Rob Gardner; +Cc: xen-devel
> I have been looking for the code in xen that handles scheduling of I/O
> domains, but have not succeeded in finding any. Only after some time did
> I start thinking that maybe there wasn't any. So I did the following
> simple experiment:
> In domain 1:
> time dd if=/dev/hda of=/dev/null bs=1024k count=100
> and this this took about 2 seconds.
> Then I ran a cpu-consuming process in domain 2, and repeated the dd
> command above. The result was more than 8 seconds! This strongly
> suggests that domain 0 is not being treated specially by the scheduler,
> and is being made to wait before servicing I/O interrupts, thereby
> killing I/O performance in the presence of cpu-bound domain activity. It
> seems to me that I/O domains need to be scheduled and dispatched
> immediately upon receipt of an I/O interrupt.
>
> Am I missing something here? How come nobody has noticed this behavior
> before? Is somebody working on this? I
>
Yes. We did notice the above behaviour.
Xen's default scheduler is BVT (Borrowed Virtual Time). It
suffers from unfairness when I/O-bound and CPU-bound tasks (or domains in
our case) run against each other, but it allows quite small dispatch
latency (more details about the parameters that control the scheduler
can be found in the docs).
To target the unfairness I am developing a modification of BVT (I
called it Fair Borrowed Virtual Time [FBVT]). You can enable it by
passing the "sched=fbvt" option to Xen at startup. The scheduler is
under development and needs some tweaking to get the best performance
(that is what I am working on at the moment). It would be very helpful
if you could email me the results of your tests with FBVT.
While developing FBVT I noticed a bug which also affects BVT.
Unfortunately I have not had time to fix it yet. Hopefully tomorrow.
The development plan for the schedulers is as follows:
1) Optimise the performance of FBVT.
2) Arrange the default parameters to favour domain 0. So far all domains
are treated exactly the same.
3) Arrange the default parameters for BVT.
4) Implement a working version of Atropos (the current implementation broke
due to a change in the scheduler interfaces).
In order to do 1) I need to create a benchmark suite. So if you have any
thoughts on what tests should be run, send me an email as well.
Cheers
Gregor
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Scheduling of I/O domains
2004-07-21 18:54 ` G. Milos
@ 2004-07-21 22:25 ` Rob Gardner
2004-07-22 0:53 ` Keir Fraser
0 siblings, 1 reply; 13+ messages in thread
From: Rob Gardner @ 2004-07-21 22:25 UTC (permalink / raw)
To: G. Milos; +Cc: xen-devel
G. Milos wrote:
>
> To target the unfairness I am developing a modification of BVT (I called
> it Fair Borrowed Virtual Time [FBVT]). You can enable it by supplying
> "sched=fbvt" command to Xen at the startup. The scheduler is under
> development and it needs some tweaking to get the best performance (that
> is what I am working on at the moment). It would be very helpful if you
> could email me with the results of your tests for FBVT.
I tried booting xen with "sched=fbvt" in the command line in grub.conf.
It didn't change the results at all. And why would it? We are not
dealing with an "I/O bound domain" here, but rather with an "I/O
domain", two very different things.
It seems to me that this problem doesn't have anything to do with the
choice of scheduling policy or parameters; it is about when the
scheduler is called. It appears as though the xen cpu scheduler
currently only runs when the hardware timer ticks. It does not run when
an external interrupt happens. So there is a large latency introduced to
I/O interrupts, and this limits I/O performance. Changing the scheduler
algorithm won't help this.
The only way to avoid this is to immediately dispatch the I/O domain
responsible for a given I/O interrupt as soon as that interrupt occurs.
This means giving I/O domains with pending interrupts scheduling
priority over any "regular" domains. Just as in a "normal" operating
system, interrupt service routines must complete before any user
processes are executed. Otherwise, latencies are introduced that kill
I/O performance.
Rob Gardner
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Scheduling of I/O domains
2004-07-21 22:25 ` Rob Gardner
@ 2004-07-22 0:53 ` Keir Fraser
2004-07-30 13:32 ` Rolf Neugebauer
0 siblings, 1 reply; 13+ messages in thread
From: Keir Fraser @ 2004-07-22 0:53 UTC (permalink / raw)
To: Rob Gardner; +Cc: G. Milos, xen-devel
> It seems to me that this problem doesn't have anything to do with the
> choice of scheduling policy or parameters; It is about when the
> scheduler is called. It appears as though the xen cpu scheduler
> currently only runs when the hardware timer ticks. It does not run when
> an external interrupt happens. So there is a large latency introduced to
> I/O interrupts, and this limits I/O performance. Changing the scheduler
> algorithm won't help this.
>
> The only way to avoid this is to immediately dispatch the I/O domain
> responsible for a given I/O interrupt as soon as that interrupt occurs.
> This means giving I/O domains with pending interrupts scheduling
> priority over any "regular" domains. Just as in a "normal" operating
> system, interrupt service routines must complete before any user
> processes are executed. Otherwise, latencies are introduced that kill
> I/O performance.
When an event is queued for a domain we call a generic wakeup
function. A good deal more of that function ought to be
scheduler-specific, and should do something smarter than our current
default (which is to force a reschedule only if the CPU is idling).
However, fixing this shouldn't be that hard -- we should have saner
scheduling in the next few weeks.
-- Keir
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Scheduling of I/O domains
2004-07-22 0:53 ` Keir Fraser
@ 2004-07-30 13:32 ` Rolf Neugebauer
2004-07-30 14:48 ` G. Milos
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Rolf Neugebauer @ 2004-07-30 13:32 UTC (permalink / raw)
To: Keir Fraser; +Cc: rolf.neugebauer, Rob Gardner, G. Milos, xen-devel
On Thu, 2004-07-22 at 01:53, Keir Fraser wrote:
> > It seems to me that this problem doesn't have anything to do with the
> > choice of scheduling policy or parameters; It is about when the
> > scheduler is called. It appears as though the xen cpu scheduler
> > currently only runs when the hardware timer ticks. It does not run when
> > an external interrupt happens. So there is a large latency introduced to
>
> > I/O interrupts, and this limits I/O performance. Changing the scheduler
> > algorithm won't help this.
> >
> > The only way to avoid this is to immediately dispatch the I/O domain
> > responsible for a given I/O interrupt as soon as that interrupt occurs.
> > This means giving I/O domains with pending interrupts scheduling
> > priority over any "regular" domains. Just as in a "normal" operating
> > system, interrupt service routines must complete before any user
> > processes are executed. Otherwise, latencies are introduced that kill
> > I/O performance.
>
> When an event is queued for a domain we call a generic wakeup
> function. A good deal more of that function ought to be
> scheduler-specific, and should do something smarter than our current
> default (which is to force a reschedule only if the CPU is idling).
> However, fixing this shouldn't be that hard -- we should have saner
> scheduling in the next few weeks.
[Sorry for the delayed reply.]
As Keir pointed out, the problem is in the wakeup function, and in
particular with a BVT hack.
BVT has the notion of a context switch allowance, i.e., the minimum time
a task is allowed to run before it gets preempted, to avoid context
switch thrashing (ctx_allow=5ms in sched_bvt.c). After this time a new
run through the scheduler is performed.
In our BVT implementation we extend this slightly: if there is only one
runnable task, we expand the context switch allowance to 10 times the
normal amount in order to avoid too many runs through the scheduler.
The old (i.e., 1.2) BVT implementation would check, on waking up another
domain, whether the current task had already used up its ctx_allow and,
if it had, would force an immediate run through the scheduler (therefore
ignoring the expanded context switch allowance).
In the -unstable implementation this is no longer the case, as the BVT
scheduling function reports back a time value for the next run through
the scheduler and no hook is provided into specific scheduler
implementations when a domain is unblocking. Therefore, if your CPU hog
is the only runnable task when it is scheduled, it will run for
10*ctx_allow (50ms) irrespective of other tasks becoming runnable during
that time (i.e., your I/O tasks). In the worst case the I/O tasks then
have to wait for 50ms rather than 5ms before they get scheduled.
As a quick fix, could you comment out lines 366-371 (which extend the
context switch allowance if there is only one task running) in
sched_bvt.c and try your experiment again?
The proper fix should be a call into the scheduler when a task unblocks,
which shouldn't be too hard to add.
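The worst case described in this message can be put into a small model.
This is only a sketch of the behaviour as stated here, not the code in
sched_bvt.c: the function and macro names are hypothetical, and only the
5ms allowance and the factor of 10 are taken from the text.

```c
#include <assert.h>

/* Hypothetical names; the values 5 ms and 10x come from the message. */
#define CTX_ALLOW_MS        5
#define SINGLE_TASK_FACTOR  10

/* Slice handed to the chosen task: the normal context switch allowance,
 * stretched tenfold when it is the only runnable task. */
static int slice_ms(int runnable_tasks)
{
    assert(runnable_tasks >= 1);
    if (runnable_tasks == 1)
        return CTX_ALLOW_MS * SINGLE_TASK_FACTOR;
    return CTX_ALLOW_MS;
}

/* Worst-case delay for a domain that wakes up just after the slice
 * began. With a wake-up hook that forces a reschedule once ctx_allow is
 * used up, the delay is capped at ctx_allow; without such a hook the
 * whole slice has to expire first. */
static int worst_case_wait_ms(int runnable_tasks, int has_wake_hook)
{
    int slice = slice_ms(runnable_tasks);

    if (has_wake_hook && slice > CTX_ALLOW_MS)
        return CTX_ALLOW_MS;
    return slice;
}
```

Under these assumptions a lone CPU hog gives worst_case_wait_ms(1, 0) = 50
and worst_case_wait_ms(1, 1) = 5, which are the 50ms and 5ms figures
quoted above.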
Rolf
-------------------------------------------------------
This SF.Net email is sponsored by OSTG. Have you noticed the changes on
Linux.com, ITManagersJournal and NewsForge in the past few weeks? Now,
one more big change to announce. We are now OSTG- Open Source Technology
Group. Come see the changes on the new OSTG site. www.ostg.com
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Scheduling of I/O domains
2004-07-30 13:32 ` Rolf Neugebauer
@ 2004-07-30 14:48 ` G. Milos
2004-07-30 15:23 ` G. Milos
2004-08-05 22:21 ` Rob Gardner
2 siblings, 0 replies; 13+ messages in thread
From: G. Milos @ 2004-07-30 14:48 UTC (permalink / raw)
To: Rolf Neugebauer; +Cc: xen-devel
> [sorry for the delayed reply]
>
> as keir pointed out the problem is in the wakeup function and in
> particular with a BVT hack.
>
> BVT has the notion of a context switch allowance, i.e., the minimum time
> a task is allowed run before it gets preempted, to avoid context switch
> thrashing (ctx_allow=5ms in sched_bvt.c). after this time a new run
> through the scheduler is performed
>
>
> in our BVT implementation we extend this slightly in that if there is
> only one runable task we expand the context switch allowance to 10 times
> the normal amount in order to avoid to many runs through the scheduler.
In my opinion the above is not entirely true. The context switch
allowance was introduced to stop two runnable tasks from switching back
and forth just after one overtakes the other in virtual time. This is
quite different from immediate dispatch when a task (domain) becomes
runnable. In that case the current task should be preempted even if it
has run for less than ctx_allow. Over the last few days I modified the
scheduler interface to push runqueue management and most of the wakeup
work into the specific schedulers (BVT by default), so that they can
decide exactly what to do. Another bug was found in the meantime (but it
only showed up when a domain with a large AVT migrated between
processors). Now I will try to fix the early dispatch bug. (It seems to
me that a simple modification to the wakeup function should do the
trick.)
> The old (i.e., 1.2) BVT implementation would check on waking up another
> domain if the current task already used up the ctx_allow and if it had
> would force an immediate run through the scheduler (therefor ignoring
> the the expanded context switch allowance).
>
> In the -unstable implementation this is not the case anymore as the BVT
> scheduling function reports back a time value for the next run through
> the scheduler and no hook is provided into specific scheduler
> implementations when a domains is unblocking. Therefor, if your CPU hog
> is the only runable task when it is scheduled it will run 10*ctx_allow
> (50ms) irrespective of other tasks becoming runnable during that time
> (ie. your IO tasks). in the worst case the IO tasks then have to wait
> for 50ms rather than 5ms before they get scheduled.
>
> as a quick fix could you comment out line 366-371 (which extends the
> context switch allowance if there is only one task running) in
> sched_bvt.c and try your experiment again.
>
> The proper fix should be a call into the scheduler if a task unblocks,
> which shouldn't be too hard to add.
>
Cheers
Gregor
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Scheduling of I/O domains
2004-07-30 13:32 ` Rolf Neugebauer
2004-07-30 14:48 ` G. Milos
@ 2004-07-30 15:23 ` G. Milos
2004-08-05 22:21 ` Rob Gardner
2 siblings, 0 replies; 13+ messages in thread
From: G. Milos @ 2004-07-30 15:23 UTC (permalink / raw)
To: Rolf Neugebauer; +Cc: Keir Fraser, Rob Gardner, xen-devel
> [sorry for the delayed reply]
>
> as keir pointed out the problem is in the wakeup function and in
> particular with a BVT hack.
>
> BVT has the notion of a context switch allowance, i.e., the minimum time
> a task is allowed run before it gets preempted, to avoid context switch
> thrashing (ctx_allow=5ms in sched_bvt.c). after this time a new run
> through the scheduler is performed
>
> in our BVT implementation we extend this slightly in that if there is
> only one runable task we expand the context switch allowance to 10 times
> the normal amount in order to avoid to many runs through the scheduler.
>
> The old (i.e., 1.2) BVT implementation would check on waking up another
> domain if the current task already used up the ctx_allow and if it had
> would force an immediate run through the scheduler (therefor ignoring
> the the expanded context switch allowance).
>
> In the -unstable implementation this is not the case anymore as the BVT
> scheduling function reports back a time value for the next run through
> the scheduler
True.
> and no hook is provided into specific scheduler
> implementations when a domains is unblocking. Therefor, if your CPU hog
This is not the case.
Firstly, the expanded ctx_allow does not affect the minimum time that a
domain is scheduled to run, as it is always set to 5ms (line 496 in
sched_bvt.c).
Secondly, when a domain is unblocking, the scheduler-specific wakeup
function decides whether to force an immediate run through the scheduler
(line 232 in sched_bvt.c).
> is the only runable task when it is scheduled it will run 10*ctx_allow
> (50ms) irrespective of other tasks becoming runnable during that time
> (ie. your IO tasks). in the worst case the IO tasks then have to wait
> for 50ms rather than 5ms before they get scheduled.
It will still be only 5ms. I have some scheduling traces which
confirm that. Still, we want something close to 0, not 5ms.
> as a quick fix could you comment out line 366-371 (which extends the
> context switch allowance if there is only one task running) in
> sched_bvt.c and try your experiment again.
Also worth noting is that it is the idle domain for which the time
quantum is expanded. If there are more runnable tasks (i.e., at least one
"normal" task) then ctx_allow will be 5ms (line 472 in sched_bvt.c).
> The proper fix should be a call into the scheduler if a task unblocks,
> which shouldn't be too hard to add.
>
That's right. I will try to do that now.
Cheers
Gregor
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Scheduling of I/O domains
2004-07-30 13:32 ` Rolf Neugebauer
2004-07-30 14:48 ` G. Milos
2004-07-30 15:23 ` G. Milos
@ 2004-08-05 22:21 ` Rob Gardner
2004-08-06 9:25 ` G. Milos
2 siblings, 1 reply; 13+ messages in thread
From: Rob Gardner @ 2004-08-05 22:21 UTC (permalink / raw)
To: rolf.neugebauer; +Cc: Keir Fraser, G. Milos, xen-devel
Rolf Neugebauer wrote:
> ...
> The proper fix should be a call into the scheduler if a task unblocks,
> which shouldn't be too hard to add.
>
I found a simple way of doing this. In schedule.c/domain_wake(), I
changed the following code slightly:
    if ( is_idle_task(curr) || (min_time <= now) )
        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
This code causes the scheduler to be run if the current task is the idle
task, or if the current task has already used up its time slice. I
changed this to:
    if ( is_idle_task(curr) || (min_time <= now)
         || IS_CAPABLE_PHYSDEV(d) )
        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
This causes the scheduler also to be run if the domain we are waking up
is a device domain.
The stock BVT scheduling code seems to take care of the rest. Since the
device domain tends to run relatively rarely, its virtual time is
smaller, which causes the BVT algorithm to switch to it right away.
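The two halves of this argument (when to reschedule, and who wins once
the scheduler runs) can be sketched in isolation. This is a simplified
illustration, not Xen code: struct dom, should_reschedule() and
pick_next() are hypothetical stand-ins, and the selection rule is
reduced to "smallest virtual time first".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a Xen domain. */
struct dom {
    long virtual_time;   /* BVT virtual time; the lowest value runs first */
    int  is_idle;        /* non-zero for the idle task */
    int  is_device_dom;  /* non-zero if the domain may drive physical devices */
};

/* Same shape as the modified test in domain_wake(): run the scheduler if
 * the CPU is idle, the current slice is used up, or the domain being
 * woken is a device domain. */
static int should_reschedule(const struct dom *curr, const struct dom *woken,
                             long min_time, long now)
{
    return curr->is_idle || min_time <= now || woken->is_device_dom;
}

/* BVT-style choice, simplified: index of the runnable domain with the
 * smallest virtual time. */
static size_t pick_next(const struct dom *runq, size_t n)
{
    size_t best = 0;

    assert(n > 0);
    for (size_t i = 1; i < n; i++)
        if (runq[i].virtual_time < runq[best].virtual_time)
            best = i;
    return best;
}
```

With a CPU hog at virtual time 900 and a rarely-run device domain at 100,
waking the device domain forces a scheduler run, and pick_next() then
selects the device domain because its virtual time is lower.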
This changes the result of my little 'dd' test to be much closer to nominal:
time dd if=/dev/hda5 bs=1024k count=11 of=/dev/null
takes 1.96s with nothing else running.
takes 2.1s with a cpu intensive domain running concurrently
took over 8s without this change.
Maybe not perfect, but way better.
Rob Gardner
HP
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Scheduling of I/O domains
2004-08-05 22:21 ` Rob Gardner
@ 2004-08-06 9:25 ` G. Milos
2004-08-06 21:05 ` Rob Gardner
0 siblings, 1 reply; 13+ messages in thread
From: G. Milos @ 2004-08-06 9:25 UTC (permalink / raw)
To: Rob Gardner; +Cc: rolf.neugebauer, Keir Fraser, xen-devel
The patch you sent probably does work, but most of the code responsible
for waking up has been pushed into the specific schedulers (i.e. it is
no longer in schedule.c but in sched_bvt.c, sched_rrobin.c etc). I have
replaced the "min <= now" test with what the BVT research paper
suggests, and only a 0.7% difference was observed in the dd test. It
would be nice if you could run the dd test on your machine after
updating your xen code.
Also, yesterday I checked in a working version of the warping mechanism
(does anybody want to test it and report bugs to me?). If we run the
I/O domains warped, the difference in dd will probably be even smaller.
>> The proper fix should be a call into the scheduler if a task unblocks,
>> which shouldn't be too hard to add.
>>
>
> I found a simple way of doing this. In schedule.c/domain_wake(), I changed
> the following code slightly:
> if ( is_idle_task(curr) || (min_time <= now) )
> cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
> This code causes the scheduler to be run if the current task is the idle
> task, or if the current task has already used up its time slice. I changed
> this to:
> if ( is_idle_task(curr) || (min_time <= now)
> || IS_CAPABLE_PHYSDEV(d) )
> cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
>
> This causes the scheduler also to be run if the domain we are waking up is a
> device domain.
>
> The stock BVT scheduling code seems to take care of the rest. Since the
> device domain tends to run relatively rarely, its virtual time is smaller,
> which causes the BVT algorithm to switch to it right away.
>
> This changes the result of my little 'dd' test to be much closer to nominal:
>
> time dd if=/dev/hda5 bs=1024k count=11 of=/dev/null
>
> takes 1.96s with nothing else running.
>
> takes 2.1s with a cpu intensive domain running concurrently
>
> took over 8s without this change.
>
> Maybe not perfect, but way better.
>
>
> Rob Gardner
> HP
>
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Scheduling of I/O domains
2004-08-06 9:25 ` G. Milos
@ 2004-08-06 21:05 ` Rob Gardner
2004-08-23 17:51 ` Ian Pratt
0 siblings, 1 reply; 13+ messages in thread
From: Rob Gardner @ 2004-08-06 21:05 UTC (permalink / raw)
To: G. Milos; +Cc: rolf.neugebauer, Keir Fraser, xen-devel
On Fri, 2004-08-06 at 03:25, G. Milos wrote:
> The patch you sent probably does work, but most of the code responsible
> for waking up got pushed into specific schedulers (i.e. it is not in
> schedule.c but in sched_bvt.c, sched_rrobin.c etc). I have replaced the
> "min <= now" bit by what BVT research paper suggests and only
> 0.7% difference was observed in the dd test. It would be nice if you could
> run the dd test on your machine after updating your xen code.
I ran my dd test with the latest xen bits, checked out today. The result
is:
dd test all by itself: best run 1.93s
dd test with another domain running an infinite loop: best run 2.5s
Well this is certainly much better than the 8s it was taking before, but
it's still giving up more than 25% in performance.
Rob Gardner
HP
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Scheduling of I/O domains
2004-08-06 21:05 ` Rob Gardner
@ 2004-08-23 17:51 ` Ian Pratt
0 siblings, 0 replies; 13+ messages in thread
From: Ian Pratt @ 2004-08-23 17:51 UTC (permalink / raw)
To: Rob Gardner; +Cc: G. Milos, rolf.neugebauer, Keir Fraser, xen-devel, Ian.Pratt
> On Fri, 2004-08-06 at 03:25, G. Milos wrote:
> > The patch you sent probably does work, but most of the code responsible
> > for waking up got pushed into specific schedulers (i.e. it is not in
> > schedule.c but in sched_bvt.c, sched_rrobin.c etc). I have replaced the
> > "min <= now" bit by what BVT research paper suggests and only
> > 0.7% difference was observed in the dd test. It would be nice if you could
> > run the dd test on your machine after updating your xen code.
>
> I ran my dd test with the latest xen bits, checked out today. The result
> is:
> dd test all by itself: best run 1.93s
> dd test with another domain running an infinite loop: best run 2.5s
>
> Well this is certainly much better than the 8s it was taking before, but
> it's still giving up more than 25% in performance.
Is this running on the same processor? If so, I think it's pretty
reasonable that we lose 25% of I/O performance when only getting
50% of the CPU.
The downside of giving the driver domain absolute priority over
guest domains would be that we might end up doing more context
switches than necessary, and lose the benefit of batching.
-------------------------------------------------------
SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media
100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33
Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift.
http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Scheduling of I/O domains
2004-07-21 17:21 Scheduling of I/O domains Rob Gardner
2004-07-21 18:54 ` G. Milos
@ 2004-08-02 11:48 ` G. Milos
1 sibling, 0 replies; 13+ messages in thread
From: G. Milos @ 2004-08-02 11:48 UTC (permalink / raw)
To: Rob Gardner; +Cc: xen-devel
I've just checked in another set of changes to the scheduling code.
Now the default scheduler (BVT) handles I/O bound domains much better. I
ran the test suggested by Rob:
time dd if=/dev/sda1 of=/dev/null bs=1024k count=1000
with the following results:
dd alone:
    run 1: 16.830s
    run 2: 17.058s
    run 3: 16.999s
    avg: 16.962s
dd against cpu bound process in another domain:
    run 1: 17.348s
    run 2: 16.973s
    run 3: 16.931s
    avg: 17.084s
difference: 0.72% (small enough)
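The averages and the percentage above can be reproduced with a few lines
of arithmetic. The helper names mean() and slowdown_percent() are
hypothetical; the sketch assumes the difference is measured relative to
the "dd alone" average.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n timings, in seconds. */
static double mean(const double *v, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Relative slowdown of the loaded run against the baseline, in percent. */
static double slowdown_percent(double baseline, double loaded)
{
    assert(baseline > 0.0);
    return (loaded - baseline) / baseline * 100.0;
}
```

Feeding in the three runs of each case gives averages of about 16.962s
and 17.084s, and a slowdown of roughly 0.72%.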
As soon as the code gets pushed to bkbits it will be available for
testing.
Currently the warp mechanism does not work (it has never been implemented
properly). Changing the warp parameters will have little effect on the
scheduler's behaviour. As soon as I get this to work we can arrange for
domain 0 to run warped, thus receiving CPU in a "privileged" manner.
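For readers unfamiliar with warping, here is a minimal sketch of the idea
as it is described in the BVT paper: a warped domain is ordered by its
actual virtual time minus its warp value, so it is preferred over
domains that have consumed less CPU. This is not the Xen implementation;
struct bvt_dom and both functions are hypothetical names, and the warp
time limits from the paper are left out.

```c
#include <assert.h>

/* Hypothetical per-domain BVT state. */
struct bvt_dom {
    long avt;      /* actual virtual time accumulated so far */
    long warp;     /* how far back the domain may "borrow" when warped */
    int  warping;  /* non-zero while warping is enabled */
};

/* Effective virtual time: AVT, reduced by the warp value when warping. */
static long effective_vt(const struct bvt_dom *d)
{
    assert(d->warp >= 0);
    return d->warping ? d->avt - d->warp : d->avt;
}

/* Non-zero if domain a is dispatched ahead of domain b. */
static int runs_before(const struct bvt_dom *a, const struct bvt_dom *b)
{
    return effective_vt(a) < effective_vt(b);
}
```

With these assumptions, a domain 0 at AVT 1000 with a warp of 500 runs
ahead of a guest at AVT 800 while warping, and behind it otherwise.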
Cheers
Gregor
> I have been looking for the code in xen that handles scheduling of I/O
> domains, but have not succeeded in finding any. Only after some time did
> I start thinking that maybe there wasn't any. So I did the following
> simple experiment:
> In domain 1:
> time dd if=/dev/hda of=/dev/null bs=1024k count=100
> and this this took about 2 seconds.
> Then I ran a cpu-consuming process in domain 2, and repeated the dd
> command above. The result was more than 8 seconds! This strongly
> suggests that domain 0 is not being treated specially by the scheduler,
> and is being made to wait before servicing I/O interrupts, thereby
> killing I/O performance in the presence of cpu-bound domain activity. It
> seems to me that I/O domains need to be scheduled and dispatched
> immediately upon receipt of an I/O interrupt.
>
> Am I missing something here? How come nobody has noticed this behavior
> before? Is somebody working on this? I
>
> Rob Gardner
^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: Scheduling of I/O domains
@ 2004-07-22 0:33 Neugebauer, Rolf
0 siblings, 0 replies; 13+ messages in thread
From: Neugebauer, Rolf @ 2004-07-22 0:33 UTC (permalink / raw)
To: Rob Gardner, G. Milos; +Cc: xen-devel, Neugebauer, Rolf
The scheduler should also be entered if an event (in this case a
virtualized interrupt) is delivered to a domain by Xen (and not just on
a timer interrupt). In particular the event delivery mechanisms should
check if the receiving domain is blocked, if so, wake it up and enter
the scheduler.
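The event delivery rule proposed here can be sketched as follows. This is
an illustration of the proposal only, not Xen's event channel code: the
types and the deliver_event() function are hypothetical, and a real
implementation would raise a scheduling softirq rather than return a
flag.

```c
#include <assert.h>

/* Hypothetical scheduling states of a domain. */
enum dom_state { DOM_RUNNING, DOM_RUNNABLE, DOM_BLOCKED };

struct evt_dom {
    enum dom_state state;
    unsigned int   pending_events;  /* one bit per pending event */
};

/* Mark the event as pending. If the receiver was blocked, wake it and
 * return non-zero to tell the caller to enter the scheduler; a domain
 * that is already running or runnable gets no special treatment. */
static int deliver_event(struct evt_dom *d, unsigned int event_bit)
{
    assert(event_bit < 32);
    d->pending_events |= 1u << event_bit;

    if (d->state == DOM_BLOCKED) {
        d->state = DOM_RUNNABLE;
        return 1;
    }
    return 0;
}
```

A blocked domain that receives an event becomes runnable and triggers a
scheduler run; a second event to the same, now runnable, domain only
sets the pending bit.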
It is debatable whether an already running/runnable domain should be
given special treatment if a virtualized HW interrupt, or an event in
general, is delivered to it, because the scheduler should also provide
an upper bound or some notion of fairness on the CPU time received by
domains if they receive lots of events.
In general we are aware of some scheduler 'issues' with more I/O bound
domains in the presence of CPU bound domains (and in this sense I/O
domains and I/O bound domains are pretty similar).
I will have a look tomorrow once my workstation is back up in a usable
state and, as Gregor pointed out, he is also looking into this.
Hopefully we can resolve this issue reasonably quickly.
rolf
> -----Original Message-----
> From: xen-devel-admin@lists.sourceforge.net [mailto:xen-devel-
> admin@lists.sourceforge.net] On Behalf Of Rob Gardner
> Sent: 21 July 2004 23:25
> To: G. Milos
> Cc: xen-devel@lists.sourceforge.net
> Subject: Re: [Xen-devel] Scheduling of I/O domains
>
> G. Milos wrote:
> >
> > To target the unfairness I am developing a modification of BVT (I called
> > it Fair Borrowed Virtual Time [FBVT]). You can enable it by supplying
> > "sched=fbvt" command to Xen at the startup. The scheduler is under
> > development and it needs some tweaking to get the best performance (that
> > is what I am working on at the moment). It would be very helpful if you
> > could email me with the results of your tests for FBVT.
>
>
> I tried booting xen with "sched=fbvt" in the command line in grub.conf.
> It didn't change the results at all. And why would it? We are not
> dealing with an "I/O bound domain" here, but rather with an "I/O
> domain", two very different things.
>
> It seems to me that this problem doesn't have anything to do with the
> choice of scheduling policy or parameters; It is about when the
> scheduler is called. It appears as though the xen cpu scheduler
> currently only runs when the hardware timer ticks. It does not run when
> an external interrupt happens. So there is a large latency introduced to
> I/O interrupts, and this limits I/O performance. Changing the scheduler
> algorithm won't help this.
>
> The only way to avoid this is to immediately dispatch the I/O domain
> responsible for a given I/O interrupt as soon as that interrupt occurs.
> This means giving I/O domains with pending interrupts scheduling
> priority over any "regular" domains. Just as in a "normal" operating
> system, interrupt service routines must complete before any user
> processes are executed. Otherwise, latencies are introduced that kill
> I/O performance.
>
> Rob Gardner
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/xen-devel
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2004-08-23 17:51 UTC | newest]
Thread overview: 13+ messages
2004-07-21 17:21 Scheduling of I/O domains Rob Gardner
2004-07-21 18:54 ` G. Milos
2004-07-21 22:25 ` Rob Gardner
2004-07-22 0:53 ` Keir Fraser
2004-07-30 13:32 ` Rolf Neugebauer
2004-07-30 14:48 ` G. Milos
2004-07-30 15:23 ` G. Milos
2004-08-05 22:21 ` Rob Gardner
2004-08-06 9:25 ` G. Milos
2004-08-06 21:05 ` Rob Gardner
2004-08-23 17:51 ` Ian Pratt
2004-08-02 11:48 ` G. Milos
-- strict thread matches above, loose matches on Subject: below --
2004-07-22 0:33 Neugebauer, Rolf