From: Tianyang Chen <tiche@seas.upenn.edu>
To: Dario Faggioli <dario.faggioli@citrix.com>,
xen-devel@lists.xenproject.org
Cc: george.dunlap@citrix.com, Dagaen Golomb <dgolomb@seas.upenn.edu>,
Meng Xu <mengxu@cis.upenn.edu>
Subject: Re: [PATCH v6][RFC]xen: sched: convert RTDS from time to event driven model
Date: Fri, 26 Feb 2016 12:28:48 -0500
Message-ID: <56D08B50.5090301@seas.upenn.edu>
In-Reply-To: <1456477891.2959.132.camel@citrix.com>
On 2/26/2016 4:11 AM, Dario Faggioli wrote:
>> It looks like the current code doesn't add a vcpu to the
>> replenishment queue when vcpu_insert() is called. When the scheduler
>> is initialized, all the vcpus are added to the replenishment queue
>> after waking up from sleep. This needs to be changed (add vcpu to
>> replq in vcpu_insert()) to make it consistent, in the sense that when
>> rt_vcpu_insert() is called, it needs to have a corresponding
>> replenishment event queued.
>>
>> This way the ASSERT() is for both cases in __runq_insert(), to
>> enforce the fact that "when a vcpu is inserted to runq/depletedq, a
>> replenishment event is waiting for it".
>>
> I am ok with this (calling replq_insert() in rt_vcpu_insert()). Not
> just doing that unconditionally though, as a vcpu can, e.g., be paused
> when the insert_vcpu hook is called, and in that case, I don't think we
> want to enqueue the replenishment event, do we?
>
Yes.
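Something like this in rt_vcpu_insert() is what I have in mind (only a
sketch reusing the v6 helpers; the vcpu_runnable() guard and its exact
placement relative to the existing runq handling are my guesses, not
code from the patch):

static void
rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu *svc = rt_vcpu(vc);
    spinlock_t *lock;

    BUG_ON( is_idle_vcpu(vc) );

    lock = vcpu_schedule_lock_irq(vc);

    /*
     * Queue the replenishment event only if the vcpu is runnable
     * (e.g. not paused), so that "on runq/depletedq implies on replq"
     * holds without arming events for vcpus that aren't going
     * anywhere yet.
     */
    if ( vcpu_runnable(vc) )
        __replq_insert(ops, svc);

    /* ... the existing runq handling of insert_vcpu stays as it is ... */

    vcpu_schedule_unlock_irq(lock, vc);
}

That keeps the invariant the ASSERT() in __runq_insert() checks, while
still skipping paused vcpus as you suggested.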
>>> static void
>>> rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
>>> {
>>>     struct rt_vcpu *svc = rt_vcpu(vc);
>>>     spinlock_t *lock = vcpu_schedule_lock_irq(vc);
>>>
>>>     clear_bit(__RTDS_scheduled, &svc->flags);
>>>     /* not insert idle vcpu to runq */
>>>     if ( is_idle_vcpu(vc) )
>>>         goto out;
>>>
>>>     if ( test_and_clear_bit(__RTDS_delayed_runq_add, &svc->flags) &&
>>>          likely(vcpu_runnable(vc)) )
>>>     {
>>>         __runq_insert(ops, svc);
>>>         runq_tickle(ops, snext);
>>>     }
>>>     else
>>>         __replq_remove(ops, svc);
>>> out:
>>>     vcpu_schedule_unlock_irq(lock, vc);
>>> }
>>>
>>> And, as said above, if you do this, try also removing the
>>> __replq_remove() call from rt_vcpu_sleep(); this one should cover
>>> for that (and, actually, more than just that!).
>>>
>>> After all this, check whether you still have the assert in
>>> __replq_insert() triggering and let me know.
>> So after moving the __replq_remove() to rt_context_saved(), the
>> assert in __replq_insert() still fails when dom0 boots up.
>>
> Well, maybe removing __replq_remove() from rt_vcpu_sleep() is not
> entirely ok, as if we do that we fail to deal with the case where,
> still in rt_vcpu_sleep(), __vcpu_on_q(svc) is true.
>
> So, I'd say, do as I said above wrt rt_context_saved(). For
> rt_vcpu_sleep(), you can try something like this:
>
> static void
> rt_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
> {
>     struct rt_vcpu * const svc = rt_vcpu(vc);
>
>     BUG_ON( is_idle_vcpu(vc) );
>     SCHED_STAT_CRANK(vcpu_sleep);
>
>     if ( curr_on_cpu(vc->processor) == vc )
>         cpu_raise_softirq(vc->processor, SCHEDULE_SOFTIRQ);
>     else if ( __vcpu_on_q(svc) )
>     {
>         __q_remove(svc);
>         __replq_remove(svc);     <=== *** LOOK HERE ***
>     }
>     else if ( svc->flags & RTDS_delayed_runq_add )
>         clear_bit(__RTDS_delayed_runq_add, &svc->flags);
> }
>
> In fact, in both the first and the third case, we will at some point
> pass through rt_context_saved(), and hit the __replq_remove() that I
> made you put there.
>
> In the middle case, since the vcpu was just queued, and making it go
> to sleep only requires removing it from the runq (or depletedq, in
> the case of this scheduler), we won't go through
> rt_schedule()-->rt_context_saved(), and hence the __replq_remove()
> won't be called.
>
> Sorry for the oversight; can you try this?
>
> That being said...
>
>> (XEN) Assertion '!vcpu_on_replq(svc)' failed at sched_rt.c:524
>> (XEN) ----[ Xen-4.7-unstable x86_64 debug=y Tainted: C ]----
>>
>> (XEN) Xen call trace:
>> (XEN) [<ffff82d080128c07>] sched_rt.c#__replq_insert+0x2b/0x64
>> (XEN) [<ffff82d08012a072>] sched_rt.c#rt_vcpu_wake+0xf2/0x12c
>> (XEN) [<ffff82d08012be2c>] vcpu_wake+0x213/0x3d4
>> (XEN) [<ffff82d08012c347>] vcpu_unblock+0x4b/0x4d
>> (XEN) [<ffff82d080169cea>] vcpu_kick+0x20/0x6f
>> (XEN) [<ffff82d080169d65>] vcpu_mark_events_pending+0x2c/0x2f
>> (XEN) [<ffff82d08010762a>] event_2l.c#evtchn_2l_set_pending+0xa9/0xb9
>> (XEN) [<ffff82d08010822a>] evtchn_send+0x158/0x183
>> (XEN) [<ffff82d0801096fc>] do_event_channel_op+0xe21/0x147d
>> (XEN) [<ffff82d0802439e2>] lstar_enter+0xe2/0x13c
>> (XEN)
>>
> ... This says that we call __replq_insert() from rt_vcpu_wake() and
> find out in there that vcpu_on_replq() is true.
>
> However, in v6 code for rt_vcpu_wake(), I can see this:
>
> + if( !__vcpu_on_replq(svc) )
> + {
> + __replq_insert(ops, svc);
> + ASSERT( vcpu_runnable(vc) );
> + }
>
> which would make me think that, if vcpu_on_replq() is true, we
> shouldn't be calling __replq_insert() in the first place.
>
> So, have you made other changes wrt v6 when trying this?
The v6 I submitted doesn't have that if statement commented out. But I
had tried commenting it out, and that is when the assertion failed.
It fails in v6 with:

rt_vcpu_sleep():    removing the replenishment event in all cases
rt_context_saved(): not removing replenishment events
rt_vcpu_wake():     not checking if the event is already queued.

or with:

rt_vcpu_sleep():    not removing the replenishment event at all
rt_context_saved(): removing replenishment events if not runnable
rt_vcpu_wake():     not checking if the event is already queued.

Also with:

rt_vcpu_sleep():    removing the replenishment event if the vcpu is on
                    the runq/depletedq
rt_context_saved(): removing replenishment events if not runnable
rt_vcpu_wake():     not checking if the event is already queued.
I added debug prints in all these functions and noticed that the failure
could be caused by a race between spurious wakeups and context switching.
See the following trace of events for the last modification above:
(XEN) cpu1 picked idle
(XEN) d0 attempted to change d0v1's CR4 flags 00000620 -> 00040660
(XEN) cpu2 picked idle
(XEN) vcpu1 sleeps on cpu
(XEN) cpu0 picked idle
(XEN) vcpu1 context saved not runnable
(XEN) vcpu1 wakes up nowhere
(XEN) cpu0 picked vcpu1
(XEN) vcpu1 sleeps on cpu
(XEN) cpu0 picked idle
(XEN) vcpu1 context saved not runnable
(XEN) vcpu1 wakes up nowhere
(XEN) cpu0 picked vcpu1
(XEN) cpu0 picked idle
(XEN) vcpu1 context saved not runnable
(XEN) cpu0 picked vcpu0
(XEN) vcpu1 wakes up nowhere
(XEN) cpu1 picked vcpu1 *** vcpu1 is on a cpu
(XEN) cpu1 picked idle *** vcpu1 is waiting to be context switched
(XEN) vcpu2 wakes up nowhere
(XEN) cpu0 picked vcpu0
(XEN) cpu2 picked vcpu2
(XEN) cpu0 picked vcpu0
(XEN) cpu0 picked vcpu0
(XEN) d0 attempted to change d0v2's CR4 flags 00000620 -> 00040660
(XEN) cpu0 picked vcpu0
(XEN) vcpu2 sleeps on cpu
(XEN) vcpu1 wakes up nowhere *** vcpu1 wakes up without sleep?
(XEN) Assertion '!__vcpu_on_replq(svc)' failed at sched_rt.c:526
(XEN) ----[ Xen-4.7-unstable x86_64 debug=y Tainted: C ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82d08012a151>] sched_rt.c#rt_vcpu_wake+0x11f/0x17b
...
(XEN) Xen call trace:
(XEN) [<ffff82d08012a151>] sched_rt.c#rt_vcpu_wake+0x11f/0x17b
(XEN) [<ffff82d08012bf2c>] vcpu_wake+0x213/0x3d4
(XEN) [<ffff82d08012c447>] vcpu_unblock+0x4b/0x4d
(XEN) [<ffff82d080169cea>] vcpu_kick+0x20/0x6f
(XEN) [<ffff82d080169d65>] vcpu_mark_events_pending+0x2c/0x2f
(XEN) [<ffff82d08010762a>] event_2l.c#evtchn_2l_set_pending+0xa9/0xb9
(XEN) [<ffff82d080108312>] send_guest_vcpu_virq+0x9d/0xba
(XEN) [<ffff82d080196cbe>] send_timer_event+0xe/0x10
(XEN) [<ffff82d08012a7b5>] schedule.c#vcpu_singleshot_timer_fn+0x9/0xb
(XEN) [<ffff82d080131978>] timer.c#execute_timer+0x4e/0x6c
(XEN) [<ffff82d080131ab9>] timer.c#timer_softirq_action+0xdd/0x213
(XEN) [<ffff82d08012df32>] softirq.c#__do_softirq+0x82/0x8d
(XEN) [<ffff82d08012df8a>] do_softirq+0x13/0x15
(XEN) [<ffff82d080243ad1>] cpufreq.c#process_softirqs+0x21/0x30
So, it looks like a spurious wakeup for vcpu1 happens before it is
completely context-switched off a cpu. But rt_vcpu_wake() didn't see it
on a cpu via curr_on_cpu(), so it fell through the first two RETURNs.
I guess the replenishment queue check is necessary for this situation?
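In other words, keep something along these lines in rt_vcpu_wake() (a
rough sketch of the structure rather than the exact v6 code; only the
on-replq guard is the hunk quoted above):

static void
rt_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu * const svc = rt_vcpu(vc);

    BUG_ON( is_idle_vcpu(vc) );

    /* First RETURN: the vcpu is currently running on a pcpu. */
    if ( unlikely(curr_on_cpu(vc->processor) == vc) )
        return;

    /* Second RETURN: the vcpu is already on the runq/depletedq. */
    if ( unlikely(__vcpu_on_q(svc)) )
        return;

    /* ... budget/deadline handling ... */

    /*
     * A spurious wakeup can race with rt_context_saved(): the vcpu is
     * neither running nor queued at this point, yet its replenishment
     * event can still be armed, so only insert it if it isn't queued
     * already.
     */
    if ( !__vcpu_on_replq(svc) )
    {
        __replq_insert(ops, svc);
        ASSERT( vcpu_runnable(vc) );
    }

    /* ... delayed runq add / runq insertion / tickling ... */
}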
Thanks,
Tianyang