* Prioritising dom0 vcpus
@ 2013-05-31 17:18 Marcus Granado
2013-06-03 16:47 ` Dario Faggioli
0 siblings, 1 reply; 2+ messages in thread
From: Marcus Granado @ 2013-05-31 17:18 UTC (permalink / raw)
To: xen-devel
As an experiment trying to reduce the latency when scheduling dom0
vcpus, I applied the following patch to __runq_insert() to xen 4.2:
diff -r 8643ca19d356 -r 91b13479c1a2 xen/common/sched_credit.c
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -205,6 +205,15 @@
BUG_ON( __vcpu_on_runq(svc) );
BUG_ON( cpu != svc->vcpu->processor );
+    /* If svc is a dom0 vcpu, always put it before all the other vcpus
+     * in the runq, so that dom0 vcpus always have priority.
+     */
+    if (svc->vcpu->domain->domain_id == 0) {
+        /* Make sure no vcpu goes in front of this one until this
+         * vcpu is scheduled. */
+        svc->pri = CSCHED_PRI_TS_BOOST;
+        list_add(&svc->runq_elem, (struct list_head *)runq);
+        return;
+    }
+
list_for_each( iter, runq )
{
const struct csched_vcpu * const iter_svc = __runq_elem(iter);
However, this patch seems to have had the opposite effect, and I would
like to understand why. A win7 guest now takes hours to start up, and I
believe this is because dom0 takes on the order of 10 ms to serve each
VM I/O request, even though the dom0 vcpus and the guest vcpu are on
different pcpus.
xenalyze-a.out: http://pastelink.me/getfile.php?key=390a25
xentrace-D-T5.out: http://pastelink.me/getfile.php?key=b3d584
Any ideas why this is the case?
thanks,
Marcus
--
xenalyze-a.out head:
--
0.006977926 ------ x d32768v23 runstate_change d4v0 blocked->runnable
Creating domain 4
Creating vcpu 0 for dom 4
] 0.006979023 ------ x d32768v23 28004(2:8:4) 2 [ 4 0 ]
] 0.006980999 ------ x d32768v23 2800e(2:8:e) 2 [ 7fff edd9df ]
] 0.006981126 ------ x d32768v23 2800f(2:8:f) 3 [ 4 e82 1c9c380 ]
] 0.006981403 ------ x d32768v23 2800a(2:8:a) 4 [ 7fff 17 4 0 ]
0.006981687 ------ x d32768v23 runstate_change d32767v23 running->runnable
Creating vcpu 23 for dom 32767
Using first_tsc for d32767v23 (9024 cycles)
0.006982783 ------ x d?v? runstate_change d4v0 runnable->running
] 0.006996466 ------ x d4v0 28006(2:8:6) 2 [ 4 0 ]
] 0.006997600 ------ x d4v0 2800e(2:8:e) 2 [ 4 4d19 ]
] 0.006997726 ------ x d4v0 2800f(2:8:f) 3 [ 7fff 4d19 ffffffff ]
] 0.006997881 ------ x d4v0 2800a(2:8:a) 4 [ 4 0 7fff 17 ]
0.006998070 ------ x d4v0 runstate_change d4v0 running->blocked
0.006998242 ------ x d?v? runstate_change d32767v23 runnable->running
0.014874949 ----x- - d32767v4 runstate_change d0v4 blocked->runnable
] 0.014879473 ----x- - d32767v4 28004(2:8:4) 2 [ 0 4 ]
0.014880331 -x---- - d32767v1 runstate_change d0v1 blocked->runnable
] 0.014884417 ----x- - d32767v4 2800e(2:8:e) 2 [ 7fff 97fc06 ]
] 0.014884544 ----x- - d32767v4 2800f(2:8:f) 3 [ 0 1978 1c9c380 ]
] 0.014884916 ----x- - d32767v4 2800a(2:8:a) 4 [ 7fff 4 0 4 ]
] 0.014885022 -x---- - d32767v1 28004(2:8:4) 2 [ 0 1 ]
0.014885134 ----x- - d32767v4 runstate_change d32767v4 running->runnable
0.014885251 --x- - - d32767v2 runstate_change d0v2 blocked->runnable
] 0.014889526 -x-- - - d32767v1 2800e(2:8:e) 2 [ 7fff 97cdd8 ]
] 0.014889731 -x-- - - d32767v1 2800f(2:8:f) 3 [ 0 1b68 1c9c380 ]
] 0.014889949 -x-- - - d32767v1 2800a(2:8:a) 4 [ 7fff 1 0 1 ]
0.014890084 ----x- - d?v? runstate_change d0v4 runnable->running
0.014890176 -x--|- - d32767v1 runstate_change d32767v1 running->runnable
] 0.014890291 - x-|- - d32767v2 28004(2:8:4) 2 [ 0 2 ]
0.014890374 - -x|- - d32767v3 runstate_change d0v3 blocked->runnable
0.014891134 -x--|- - d?v? runstate_change d0v1 runnable->running
] 0.014891811 -|x-|- - d32767v2 2800e(2:8:e) 2 [ 7fff 96f8a4 ]
] 0.014891905 -|-x|- - d32767v3 28004(2:8:4) 2 [ 0 3 ]
] 0.014891936 -|x-|- - d32767v2 2800f(2:8:f) 3 [ 0 1c23 1c9c380 ]
] 0.014892155 -|x-|- - d32767v2 2800a(2:8:a) 4 [ 7fff 2 0 2 ]
0.014892362 -|--|x - d32767v5 runstate_change d0v5 blocked->runnable
0.014892395 -|x-|- - d32767v2 runstate_change d32767v2 running->runnable
] 0.014893226 -| x|- - d32767v3 2800e(2:8:e) 2 [ 7fff 982ddb ]
] 0.014893343 -| x|- - d32767v3 2800f(2:8:f) 3 [ 0 c64 1c9c380 ]
0.014893386 -|x-|- - d?v? runstate_change d0v2 runnable->running
] 0.014893556 -||x|- - d32767v3 2800a(2:8:a) 4 [ 7fff 3 0 3 ]
] 0.014893778 -||-|x - d32767v5 28004(2:8:4) 2 [ 0 5 ]
0.014893867 -||x|- - d32767v3 runstate_change d32767v3 running->runnable
0.014894811 -||x|- - d?v? runstate_change d0v3 runnable->running
] 0.014895067 -||||x - d32767v5 2800e(2:8:e) 2 [ 7fff 982654 ]
] 0.014895192 -||||x - d32767v5 2800f(2:8:f) 3 [ 0 c3c 1c9c380 ]
] 0.014895439 -||||x - d32767v5 2800a(2:8:a) 4 [ 7fff 5 0 5 ]
0.014895815 -||||x - d32767v5 runstate_change d32767v5 running->runnable
0.014896751 -||||x - d?v? runstate_change d0v5 runnable->running
] 0.014908155 -|||x| - d0v4 28006(2:8:6) 2 [ 0 4 ]
] 0.014908228 -||||x - d0v5 28006(2:8:6) 2 [ 0 5 ]
] 0.014908405 -x|||| - d0v1 28006(2:8:6) 2 [ 0 1 ]
] 0.014909231 -||x|| - d0v3 28006(2:8:6) 2 [ 0 3 ]
] 0.014910265 -|||x| - d0v4 2800e(2:8:e) 2 [ 0 7f14 ]
] 0.014910384 -|||x| - d0v4 2800f(2:8:f) 3 [ 7fff 7f14 ffffffff ]
] 0.014910550 -|||x| - d0v4 2800a(2:8:a) 4 [ 0 4 7fff 4 ]
] 0.014910566 -x|||| - d0v1 2800e(2:8:e) 2 [ 0 6743 ]
] 0.014910679 -x|||| - d0v1 2800f(2:8:f) 3 [ 7fff 6743 ffffffff ]
] 0.014910707 -||||x - d0v5 2800e(2:8:e) 2 [ 0 3f80 ]
0.014910783 -|||x| - d0v4 runstate_change d0v4 running->blocked
] 0.014910803 -x|| | - d0v1 2800a(2:8:a) 4 [ 0 1 7fff 1 ]
] 0.014910819 -||| x - d0v5 2800f(2:8:f) 3 [ 7fff 3f80 ffffffff ]
] 0.014910944 -||| x - d0v5 2800a(2:8:a) 4 [ 0 5 7fff 5 ]
0.014911030 -x|| | - d0v1 runstate_change d0v1 running->blocked
0.014911109 - || x - d0v5 runstate_change d0v5 running->blocked
] 0.014911307 - |x - d0v3 2800e(2:8:e) 2 [ 0 4c74 ]
0.014911367 - || x - d?v? runstate_change d32767v5 runnable->running
] 0.014911417 - |x - - d0v3 2800f(2:8:f) 3 [ 7fff 4c74 ffffffff ]
0.014911471 - ||x- - d?v? runstate_change d32767v4 runnable->running
0.014911512 -x||-- - d?v? runstate_change d32767v1 runnable->running
] 0.014911530 --|x-- - d0v3 2800a(2:8:a) 4 [ 0 3 7fff 3 ]
0.014911687 --|x-- - d0v3 runstate_change d0v3 running->blocked
0.014912276 --|x-- - d?v? runstate_change d32767v3 runnable->running
] 0.015036914 --x--- - d0v2 28006(2:8:6) 2 [ 0 2 ]
] 0.015038191 --x--- - d0v2 2800e(2:8:e) 2 [ 0 28d83 ]
] 0.015038313 --x--- - d0v2 2800f(2:8:f) 3 [ 7fff 28d83 ffffffff ]
] 0.015038445 --x--- - d0v2 2800a(2:8:a) 4 [ 0 2 7fff 2 ]
0.015038617 --x--- - d0v2 runstate_change d0v2 running->blocked
0.015039232 --x--- - d?v? runstate_change d32767v2 runnable->running
0.020630385 ------ x d32767v23 runstate_change d4v0 blocked->runnable
] 0.020631491 ------ x d32767v23 28004(2:8:4) 2 [ 4 0 ]
] 0.020633401 ------ x d32767v23 2800e(2:8:e) 2 [ 7fff edb796 ]
] 0.020633555 ------ x d32767v23 2800f(2:8:f) 3 [ 4 d97 1c9c380 ]
] 0.020633813 ------ x d32767v23 2800a(2:8:a) 4 [ 7fff 17 4 0 ]
0.020634086 ------ x d32767v23 runstate_change d32767v23 running->runnable
0.020635147 ------ x d?v? runstate_change d4v0 runnable->running
] 0.020650487 ------ x d4v0 28006(2:8:6) 2 [ 4 0 ]
] 0.020651616 ------ x d4v0 2800e(2:8:e) 2 [ 4 5400 ]
] 0.020651739 ------ x d4v0 2800f(2:8:f) 3 [ 7fff 5400 ffffffff ]
] 0.020651876 ------ x d4v0 2800a(2:8:a) 4 [ 4 0 7fff 17 ]
0.020652054 ------ x d4v0 runstate_change d4v0 running->blocked
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: Prioritising dom0 vcpus
2013-05-31 17:18 Prioritising dom0 vcpus Marcus Granado
@ 2013-06-03 16:47 ` Dario Faggioli
0 siblings, 0 replies; 2+ messages in thread
From: Dario Faggioli @ 2013-06-03 16:47 UTC (permalink / raw)
To: Marcus Granado; +Cc: xen-devel
On ven, 2013-05-31 at 18:18 +0100, Marcus Granado wrote:
> As an experiment trying to reduce the latency when scheduling dom0
> vcpus, I applied the following patch to __runq_insert() to xen 4.2:
>
> diff -r 8643ca19d356 -r 91b13479c1a2 xen/common/sched_credit.c
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -205,6 +205,15 @@
> BUG_ON( __vcpu_on_runq(svc) );
> BUG_ON( cpu != svc->vcpu->processor );
>
> +    /* If svc is a dom0 vcpu, always put it before all the other vcpus
> +     * in the runq, so that dom0 vcpus always have priority.
> +     */
> +    if (svc->vcpu->domain->domain_id == 0) {
> +        /* Make sure no vcpu goes in front of this one until this
> +         * vcpu is scheduled. */
> +        svc->pri = CSCHED_PRI_TS_BOOST;
> +        list_add(&svc->runq_elem, (struct list_head *)runq);
> +        return;
> +    }
> +
> list_for_each( iter, runq )
> {
> const struct csched_vcpu * const iter_svc = __runq_elem(iter);
>
Mmm... are we talking about wakeup latency -- which, BTW, is what
TS_BOOST is all about, AFAIUI?
In that case, isn't a waking vcpu, whether or not it belongs to Dom0,
already being boosted in csched_vcpu_wake()? __runq_insert() is called
right after that, so I think it already sees the boosting, without
needing the above.
If it's not only wakeup latency that you're trying to address, then I'm
not sure, but still, __runq_insert() does not look like the right place
for such logic, at least to my personal taste. :-)
> However, this patch seems to have had the opposite effect, and I would
> like to understand why. A win7 guest now takes hours to start up, and I
> believe this is because dom0 takes on the order of 10 ms to serve each
> VM I/O request, even though the dom0 vcpus and the guest vcpu are on
> different pcpus.
>
Well, just shooting in the dark, but __runq_insert() is also called in
csched_schedule(). Perhaps your modification above interacts badly with
the current scheduling logic?
Another way of trying to achieve what you seem to be after could be to
put an "is_it_dom0?" check in csched_vcpu_acct() and, if true, not
clear the boosting. Beware, I'm not saying that it makes sense, or that
I like it; it just seems cleaner (at least to me) than hijacking
__runq_insert().
What do you think?
> xenalyze-a.out: http://pastelink.me/getfile.php?key=390a25
> xentrace-D-T5.out: http://pastelink.me/getfile.php?key=b3d584
>
Sorry, can't look at the traces right now... If I find 5 mins for them
and spot something weird, I'll let you know.
Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)