From: Tianyang Chen <tiche@seas.upenn.edu>
To: Dario Faggioli <dario.faggioli@citrix.com>,
xen-devel@lists.xenproject.org
Cc: george.dunlap@citrix.com, Dagaen Golomb <dgolomb@seas.upenn.edu>,
Meng Xu <mengxu@cis.upenn.edu>
Subject: Re: [PATCH v5][RFC]xen: sched: convert RTDS from time to event driven model
Date: Thu, 25 Feb 2016 12:29:23 -0500
Message-ID: <56CF39F3.5040807@seas.upenn.edu>
In-Reply-To: <1456396456.6288.58.camel@citrix.com>
On 2/25/2016 5:34 AM, Dario Faggioli wrote:
>>>> + * it should be re-inserted back to the replenishment queue.
>>>> + */
>>>> + if ( now >= svc->cur_deadline)
>>>> + {
>>>> + rt_update_deadline(now, svc);
>>>> + __replq_remove(ops, svc);
>>>> + }
>>>> +
>>>> + if( !__vcpu_on_replq(svc) )
>>>> + __replq_insert(ops, svc);
>>>> +
>>> And here I am again: is it really necessary to check whether svc is
>>> not
>>> in the replenishment queue? It looks to me that it really should
>>> not be
>>> there... but maybe it can, because we remove the event from the
>>> queue
>>> when the vcpu sleeps, but *not* when the vcpu blocks?
>> Yeah. That is the case where I keep getting assertion failure if
>> it's
>> removed.
>>
> Which one ASSERT() fails?
>
The one in __replq_insert() fails: the vcpu is already on the replenishment
queue when rt_vcpu_wake() tries to insert it again.
(XEN) Assertion '!__vcpu_on_replq(svc)' failed at sched_rt.c:527
(XEN) ----[ Xen-4.7-unstable x86_64 debug=y Tainted: C ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82d08012a003>] sched_rt.c#rt_vcpu_wake+0xf0/0x17f
(XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor (d0v0)
(XEN) rax: 0000000000000001 rbx: ffff83023b522940 rcx: 0000000000000001
(XEN) rdx: 00000031bb1b9980 rsi: ffff82d080342318 rdi: ffff83023b486ca0
(XEN) rbp: ffff8300bfcffd88 rsp: ffff8300bfcffd58 r8: 0000000000000004
(XEN) r9: 00000000deadbeef r10: ffff82d08025f5c0 r11: 0000000000000206
(XEN) r12: ffff83023b486ca0 r13: ffff8300bfd46000 r14: ffff82d080299b80
(XEN) r15: ffff83023b522d80 cr0: 0000000080050033 cr4: 00000000000406a0
(XEN) cr3: 0000000231c0d000 cr2: ffff880001e80ba8
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff8300bfcffd58:
(XEN) ffff8300bfcffd70 ffff8300bfd46000 0000000216110572 ffff83023b522940
(XEN) ffff82d08032bc00 0000000000000282 ffff8300bfcffdd8 ffff82d08012be0c
(XEN) ffff83023b4b5000 ffff83023b4f1000 ffff8300bfd47000 ffff8300bfd46000
(XEN) 0000000000000000 ffff83023b4b4280 0000000000014440 0000000000000001
(XEN) ffff8300bfcffde8 ffff82d08012c327 ffff8300bfcffe08 ffff82d080169cea
(XEN) ffff83023b4b5000 000000000000000a ffff8300bfcffe18 ffff82d080169d65
(XEN) ffff8300bfcffe38 ffff82d08010762a ffff83023b4b4280 ffff83023b4b5000
(XEN) ffff8300bfcffe68 ffff82d08010822a ffff8300bfcffe68 fffffffffffffff2
(XEN) ffff88022056dcb4 ffff880230c34440 ffff8300bfcffef8 ffff82d0801096fc
(XEN) ffff8300bfcffef8 ffff8300bfcfff18 ffff8300bfcffef8 ffff82d080240e85
(XEN) ffff880200000001 0000000000000000 0000000000000246 ffffffff810013aa
(XEN) 000000000000000a ffffffff810013aa 000000000000e030 ffff8300bfd47000
(XEN) ffff8802206597f0 ffff880230c34440 0000000000014440 0000000000000001
(XEN) 00007cff403000c7 ffff82d0802439e2 ffffffff8100140a 0000000000000020
(XEN) ffff88022063c7d0 ffff88022063c7d0 0000000000000001 000000000000dca0
(XEN) ffff88022056dcb8 ffff880230c34440 0000000000000206 0000000000000004
(XEN) ffff8802230001a0 ffff880220619000 0000000000000020 ffffffff8100140a
(XEN) 0000000000000000 ffff88022056dcb4 0000000000000004 0001010000000000
(XEN) ffffffff8100140a 000000000000e033 0000000000000206 ffff88022056dc90
(XEN) 000000000000e02b 0000000000000000 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff82d08012a003>] sched_rt.c#rt_vcpu_wake+0xf0/0x17f
(XEN) [<ffff82d08012be0c>] vcpu_wake+0x213/0x3d4
(XEN) [<ffff82d08012c327>] vcpu_unblock+0x4b/0x4d
(XEN) [<ffff82d080169cea>] vcpu_kick+0x20/0x6f
(XEN) [<ffff82d080169d65>] vcpu_mark_events_pending+0x2c/0x2f
(XEN) [<ffff82d08010762a>] event_2l.c#evtchn_2l_set_pending+0xa9/0xb9
(XEN) [<ffff82d08010822a>] evtchn_send+0x158/0x183
(XEN) [<ffff82d0801096fc>] do_event_channel_op+0xe21/0x147d
(XEN) [<ffff82d0802439e2>] lstar_enter+0xe2/0x13c
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion '!__vcpu_on_replq(svc)' failed at sched_rt.c:527
(XEN) ****************************************
>> I'm thinking when
>> a vcpu unblocks, it could potentially fall through here.
>>
> Well, when unblocking, wake() is certainly called, yes.
>
>> And like you
>> said, mostly spurious sleep
>> happens when a vcpu is running and it could happen in other cases,
>> although rare.
>>
> I think I said already there's no such thing as "spurious sleep". Or at
> least, I can't think of anything that I would define a spurious sleep,
> if you do, please, explain what situation you're referring to.
>
I meant to say spurious wakeup... If rt_vcpu_sleep() removes the vcpu from
the replenishment queue, it's perfectly safe for rt_vcpu_wake() to insert
it back. So I suspect it's the spurious wakeup that causes trouble, because
in that case the vcpu has not been removed before rt_vcpu_wake() runs.
However, the two early returns at the beginning of rt_vcpu_wake() should
catch that, shouldn't they?
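
To make the scenario concrete, here is a toy model in plain C. It is only a
sketch, not the sched_rt.c code: struct toy_vcpu, the toy_* helpers, and the
choice to make removal a no-op when nothing is queued are all assumptions made
for illustration. It shows why the insert has to be guarded when a vcpu can
reach the wake path with its replenishment event still queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rt_vcpu; only what the sketch needs. */
struct toy_vcpu {
    long cur_deadline;
    long period;
    bool on_replq;          /* plays the role of __vcpu_on_replq() */
};

static bool toy_on_replq(const struct toy_vcpu *v)
{
    return v->on_replq;
}

/* Like the failing ASSERT: inserting an already queued vcpu is a bug. */
static void toy_replq_insert(struct toy_vcpu *v)
{
    assert(!toy_on_replq(v));
    v->on_replq = true;
}

/* Assumption: removal is a no-op if the vcpu is not queued. */
static void toy_replq_remove(struct toy_vcpu *v)
{
    v->on_replq = false;
}

/* Move the deadline forward by whole periods until it is after 'now'. */
static void toy_update_deadline(long now, struct toy_vcpu *v)
{
    long missed = (now - v->cur_deadline) / v->period + 1;

    v->cur_deadline += missed * v->period;
}

/* Sleep path: the replenishment event is taken off the queue. */
static void toy_sleep(struct toy_vcpu *v)
{
    toy_replq_remove(v);
}

/* Wake path, following the shape of the quoted hunk. */
static void toy_wake(long now, struct toy_vcpu *v)
{
    if (now >= v->cur_deadline) {
        toy_update_deadline(now, v);
        toy_replq_remove(v);
    }

    /* Guard: the event may still be queued, e.g. after a block. */
    if (!toy_on_replq(v))
        toy_replq_insert(v);
}

/* Returns 0 if both wake scenarios behave as described. */
int toy_demo(void)
{
    struct toy_vcpu v = { .cur_deadline = 100, .period = 50, .on_replq = true };

    toy_wake(60, &v);       /* still queued, deadline not passed: no insert */
    if (!v.on_replq || v.cur_deadline != 100)
        return 1;

    toy_sleep(&v);          /* event removed */
    toy_wake(120, &v);      /* deadline passed: update, then insert */
    return (v.on_replq && v.cur_deadline == 150) ? 0 : 1;
}
```

Without the toy_on_replq() guard, the first toy_wake() call would trip the
assert in toy_replq_insert(), which is the same shape as the failure in the
log above.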
> In any case, one way of dealing with vcpus blocking/offlining/etc could
> be to, in context_saved(), in case we are not adding the vcpu back to
> the runq, cancel its replenishment event with __replq_remove().
>
> (This may make it possible to avoid doing it in rt_vcpu_sleep() too,
> but you'll need to check and test.)
>
> Can you give this a try.
That makes sense. Doing it in context_saved() implies that whenever a
vcpu is sleeping and taken off the runq, its replenishment event gets
removed. On the other hand, the logic is the same as removing it in
rt_vcpu_sleep(), just at a different time. I have tried it, and the
check still needs to be there in rt_vcpu_wake(). I will send the next
version so it's easier to look at.
Thanks,
Tianyang
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Thread overview: 9+ messages
2016-02-09 4:33 [PATCH v5][RFC]xen: sched: convert RTDS from time to event driven model Tianyang Chen
2016-02-16 3:55 ` Meng Xu
2016-02-18 1:55 ` Tianyang Chen
2016-02-24 15:23 ` Tianyang Chen
2016-02-25 2:02 ` Dario Faggioli
2016-02-25 6:15 ` Tianyang Chen
2016-02-25 10:34 ` Dario Faggioli
2016-02-25 17:29 ` Tianyang Chen [this message]
2016-02-25 17:51 ` Dario Faggioli