* [RTDS Patch for Xen4.8] xen: sched_rt.c Check not_tickled Mask
@ 2017-02-22 22:16 Haoran Li
From: Haoran Li @ 2017-02-22 22:16 UTC (permalink / raw)
  To: xen-devel; +Cc: mengxu, naroahlee

From: naroahlee <naroahlee@gmail.com>

   Modify: xen/common/sched_rt.c runq_tickle(): Check not_tickled Mask
for a Cache-Preferred PCPU

The bug was introduced in Xen 4.7, when we converted the RTDS scheduler
from a quantum-driven model to an event-driven model. We assumed that
whenever runq_tickle() is invoked, we would pick a PCPU via the
not-tickled mask. However, in case 1 of runq_tickle() (kick new's
cache-preferred PCPU if it is idle), the idle PCPU is not checked
against the not-tickled CPU mask.

Buggy behavior:
When two VCPUs try to tickle the idle VCPU currently running on their
common cache-preferred PCPU, both will tickle the same PCPU in a row.
However, only one of them is guaranteed to be scheduled, because
runq_pick() is executed only once in rt_schedule().
That means the other VCPU will lose (be descheduled for) a period.

Bug Analysis:
We need to exclude already-tickled PCPUs when evaluating case 1 of
runq_tickle().
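
For context, the surrounding logic in runq_tickle() is roughly the following
(an abridged sketch of the Xen 4.7/4.8 code, not a verbatim excerpt; see
xen/common/sched_rt.c for the exact details): the later cases only look at
CPUs left in the not_tickled mask, while case 1) never consults it.

    cpumask_andnot(&not_tickled, &not_tickled, &prv->tickled);

    /* 1) if new's previous cpu is idle, kick it for cache benefit */
    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) )
    {
        /* BUG: new->vcpu->processor may already be in prv->tickled */
        SCHED_STAT_CRANK(tickled_idle_cpu);
        cpu_to_tickle = new->vcpu->processor;
        goto out;
    }

    /* 2) if there is any idle pcpu among the not-tickled ones, kick it */
    for_each_cpu ( cpu, &not_tickled )
    {
        iter_vc = curr_on_cpu(cpu);
        if ( is_idle_vcpu(iter_vc) )
        {
            SCHED_STAT_CRANK(tickled_idle_cpu);
            cpu_to_tickle = cpu;
            goto out;
        }
        /* ... otherwise remember the CPU running the latest-deadline VCPU ... */
    }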

Signed-off-by: Haoran Li <naroahlee@gmail.com>
---
 xen/common/sched_rt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 1b30014..777192f 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -1175,7 +1175,8 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new)
     cpumask_andnot(&not_tickled, &not_tickled, &prv->tickled);
 
     /* 1) if new's previous cpu is idle, kick it for cache benefit */
-    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) )
+    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) &&
+         cpumask_test_cpu(new->vcpu->processor, &not_tickled))
     {
         SCHED_STAT_CRANK(tickled_idle_cpu);
         cpu_to_tickle = new->vcpu->processor;
-- 
1.9.1

---
CC: <mengxu@cis.upenn.edu>


* Re: [RTDS Patch for Xen4.8] xen: sched_rt.c Check not_tickled Mask
@ 2017-02-22 22:59 Meng Xu
  2017-02-24 16:58 ` Dario Faggioli
From: Meng Xu @ 2017-02-22 22:59 UTC (permalink / raw)
  To: Haoran Li; +Cc: xen-devel@lists.xenproject.org, Dario Faggioli

Hi Haoran,

Thank you for sending this patch out quickly! :-)

The title can be
[PATCH] xen: rtds: only tickle the same cpu once

On Wed, Feb 22, 2017 at 5:16 PM, Haoran Li <naroahlee@gmail.com> wrote:
> From: naroahlee <naroahlee@gmail.com>
>
>    Modify: xen/common/sched_rt.c runq_tickle(): Check not_tickled Mask
> for a Cache-Preferred PCPU

No need for this.

>
> The bug was introduced in Xen 4.7, when we converted the RTDS scheduler
> from a quantum-driven model to an event-driven model. We assumed that
> whenever runq_tickle() is invoked, we would pick a PCPU via the
> not-tickled mask. However, in case 1 of runq_tickle() (kick new's
> cache-preferred PCPU if it is idle), the idle PCPU is not checked
> against the not-tickled CPU mask.
>
> Buggy behavior:
> When two VCPUs try to tickle the idle VCPU currently running on their
> common cache-preferred PCPU, both will tickle the same PCPU in a row.
> However, only one of them is guaranteed to be scheduled, because
> runq_pick() is executed only once in rt_schedule().
> That means the other VCPU will lose (be descheduled for) a period.
>
> Bug Analysis:
> We need to exclude already-tickled PCPUs when evaluating case 1 of
> runq_tickle().

Change the description to the following:

When multiple idle VCPUs that have the same PCPU as their
previous running core invoke runq_tickle(), they will tickle the same
PCPU. The tickled PCPU will only pick at most one VCPU, i.e., the
highest-priority one, to execute. The other VCPUs will not be
scheduled for a period, even when there is an idle core, making these
VCPUs unnecessarily starve for one period.

To fix this issue, we should always tickle a non-tickled PCPU in
runq_tickle().

Meng

>
> Signed-off-by: Haoran Li <naroahlee@gmail.com>
> ---
>  xen/common/sched_rt.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index 1b30014..777192f 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -1175,7 +1175,8 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new)
>      cpumask_andnot(&not_tickled, &not_tickled, &prv->tickled);
>
>      /* 1) if new's previous cpu is idle, kick it for cache benefit */
> -    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) )
> +    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) &&
> +         cpumask_test_cpu(new->vcpu->processor, &not_tickled))

You should have a space before the last ).

Can you resend the patch with the comment resolved?

Thanks,

Meng

-- 
------------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: [RTDS Patch for Xen4.8] xen: sched_rt.c Check not_tickled Mask
  2017-02-22 22:59 Meng Xu
@ 2017-02-24 16:58 ` Dario Faggioli
From: Dario Faggioli @ 2017-02-24 16:58 UTC (permalink / raw)
  To: Meng Xu, Haoran Li; +Cc: xen-devel@lists.xenproject.org



On Wed, 2017-02-22 at 17:59 -0500, Meng Xu wrote:
> Hi Haoran,
> 
> Thank you for sending this patch out quickly! :-)
> 
> The title can be
> [PATCH] xen: rtds: only tickle the same cpu once
> 
Or:
xen: rtds: only tickle non-already tickled CPUs

(Nitpicking, I know, but I don't like how "the same" sounds in there.)

> On Wed, Feb 22, 2017 at 5:16 PM, Haoran Li <naroahlee@gmail.com>
> wrote:
> > 
> > Bug Analysis:
> > We need to exclude already-tickled PCPUs when evaluating case 1 of
> > runq_tickle().
> 
> Change the description to the following:
> 
> When multiple idle VCPUs that have the same PCPU as their
> previous running core invoke runq_tickle(), they will tickle the same
> PCPU. The tickled PCPU will only pick at most one VCPU, i.e., the
> highest-priority one, to execute. The other VCPUs will not be
> scheduled for a period, even when there is an idle core, making these
> VCPUs unnecessarily starve for one period.
> 
Agreed.

> To fix this issue, we should always tickle a non-tickled PCPU in
> runq_tickle().
> 
I'd change this sentence into something like:

"Therefore, always make sure that we only tickle PCPUs that have not
been tickled already."

> > --- a/xen/common/sched_rt.c
> > +++ b/xen/common/sched_rt.c
> > @@ -1175,7 +1175,8 @@ runq_tickle(const struct scheduler *ops,
> > struct rt_vcpu *new)
> >      cpumask_andnot(&not_tickled, &not_tickled, &prv->tickled);
> > 
> >      /* 1) if new's previous cpu is idle, kick it for cache benefit
> > */
> > -    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) )
> > +    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) &&
> > +         cpumask_test_cpu(new->vcpu->processor, &not_tickled))
> 
> You should have a space before the last ).
> 
Indeed.

But it looks to me that we can take the chance to tweak the code a
little bit, get rid of the special casing, and thereby make it more
compact (and hence easier to read), and maybe a tiny bit more efficient
too.

I'm thinking about getting rid entirely of the 'if' above, and then
transforming the loop into something like this:

    cpu = cpumask_test_or_cycle(new->vcpu->processor,
                                &not_tickled);
    while ( cpu != nr_cpu_ids )
    {
        iter_vc = curr_on_cpu(cpu);
        /* ... existing loop body ... */
        cpumask_clear_cpu(cpu, &not_tickled);
        cpu = cpumask_cycle(cpu, &not_tickled);
    }

(I do think this is correct, but please, do double-check.)
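
To make that concrete, here is a rough sketch of the whole loop with the
existing case 2) body folded in, using the iter_svc / latest_deadline_vcpu /
cur_deadline names from that body (untested, so double-check them against the
actual file):

    cpu = cpumask_test_or_cycle(new->vcpu->processor, &not_tickled);
    while ( cpu != nr_cpu_ids )
    {
        iter_vc = curr_on_cpu(cpu);
        if ( is_idle_vcpu(iter_vc) )
        {
            /* An idle, not yet tickled CPU: tickle it and we are done. */
            SCHED_STAT_CRANK(tickled_idle_cpu);
            cpu_to_tickle = cpu;
            goto out;
        }

        /* Busy CPU: remember the one running the latest-deadline VCPU. */
        iter_svc = rt_vcpu(iter_vc);
        if ( latest_deadline_vcpu == NULL ||
             iter_svc->cur_deadline > latest_deadline_vcpu->cur_deadline )
            latest_deadline_vcpu = iter_svc;

        cpumask_clear_cpu(cpu, &not_tickled);
        cpu = cpumask_cycle(cpu, &not_tickled);
    }

This would cover cases 1) and 2) in a single pass, and case 3) (tickling the
CPU running the latest-deadline VCPU, if new has higher priority) would stay
as it is, after the loop.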

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
