From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tianyang Chen <tiche@seas.upenn.edu>
Subject: Re: [PATCH v4] xen: sched: convert RTDS from time to
 event driven model
Date: Mon, 8 Feb 2016 14:04:10 -0500
Message-ID: <56B8E6AA.1000000@seas.upenn.edu>
References: <1454301176-7131-1-git-send-email-tiche@seas.upenn.edu>
	<1454425705.9227.176.camel@citrix.com>
	<56B0F0E7.3070604@seas.upenn.edu>
	<1454503184.9227.276.camel@citrix.com>
	<56B2B628.2020604@seas.upenn.edu>
	<1454683182.9227.431.camel@citrix.com>
	<56B5764C.7090504@seas.upenn.edu>
	<1454930829.9227.555.camel@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <tiche@seas.upenn.edu>) id 1aSr7V-0006wX-Pg
	for xen-devel@lists.xenproject.org; Mon, 08 Feb 2016 19:05:10 +0000
In-Reply-To: <1454930829.9227.555.camel@citrix.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Dario Faggioli <dario.faggioli@citrix.com>, xen-devel@lists.xenproject.org
Cc: george.dunlap@citrix.com, Dagaen Golomb <dgolomb@seas.upenn.edu>, Meng Xu <mengxu@cis.upenn.edu>
List-Id: xen-devel@lists.xenproject.org


On 2/8/2016 6:27 AM, Dario Faggioli wrote:
> On Fri, 2016-02-05 at 23:27 -0500, Tianyang Chen wrote:
>>
>> On 2/5/2016 9:39 AM, Dario Faggioli wrote:
>>> On Wed, 2016-02-03 at 21:23 -0500, Tianyang Chen wrote:
>>>>
>> I see. So I'm just curious what can cause spurious wakeup? Does it
>> only
>> happen to a running vcpu (currently on pcpu), and a vcpu that is on
>> either runq or depletedq?
>>
> Hehe, "that's virt, baby!" :-P
>
> No, seriously, vcpu being already running or already in a queue, is
> indeed what makes me call a wakeup a "spurious" wakeup. You can check
> at the performance counters that we update in those cases, to see when
> they happen most. In my experience, on Credit, I've seen them happening
> mostly when a vcpu is running. For instance:
>
> root@Zhaman:~# xenperf |grep vcpu_wake
> sched: vcpu_wake_running            T=        39
> sched: vcpu_wake_onrunq             T=         0
> sched: vcpu_wake_runnable           T=     59875
> sched: vcpu_wake_not_runnable       T=         0
>
>>> So we realized that this is just a spurious wakeup, and get back to
>>> what we were doing.
>>>
>>> What's wrong with this I just said?
>>>
>>
>> The reason why I wanted to add a vcpu back to replq is because it
>> could
>> be taken off from the timer handler. Now, because of the spurious
>> wakeup, I think the timer shouldn't take any vcpus off of replq, in
>> which case wake() should be doing nothing just like other schedulers
>> when it's a spurious wakeup. Also, only sleep() should remove events
>> from replq.
>>
> Err... well, that depends on the how the code ends up looking like, and
> it's start to become difficult to reason on it like this.
>
> Perhaps, considering that it looks to me that we are in agreement on
> all the important aspects, you can draft a new version and we'll reason
> and comment on top of that?
>
>>> Mmm... no, I think we should know/figure out whether a
>>> replenishment is
>>> pending already and we are ok with it, or if we need a new one.
>>>
>>
>> Just to make sure, spurious sleep/wakeup are for a vcpu that is on
>> pcpu
>> or any queue right?
>>
> Yes, spurious wakeup is how I'm calling wakeups that needs no action
> from the schedule, because the vcpu is already awake (and either
> running or runnable).
>
> I don't think there is such a thing as spurious sleep... When sleep is
> called, we go to sleep. There are cases where the sleep-->wakeup
> turnaround is really really fast, but I would not call them "spurious
> sleeps", and they should not need much special treatment, not in
> sched_rt.c at least.
>
> Oh, maybe you mean the cases where you wakeup and you find out that
> context_saved() has not run yet (just occurred by looking at the code
> below)? Well, yes... But I wouldn't call those "spurious sleeps"
> either.
>
>> If a real sleep happens, it should be taken off from replq. However,
>> in
>> spurious wakeup (which I assume corresponds to a spurious sleep?),
>> it
>> shouldn't be taken off from replq. Adding back to replq should
>> happen
>> for those vcpus which were taken off because of "real sleep".
>>
> I don't really follow, but I have the feeling that you're making it
> sound more complicated like it is in reality... :-)
>

So my reasoning is that, when sleep() is called in sched_rt.c, a vcpu 
will need to be taken off from the replenishment event queue. However, a 
vcpu can be put to sleep/not-runnable without even calling sleep(), 
which corresponds to a later "spurious wakeup". Is there any way that 
RTDS can know when this happens? If not, in rt_vcpu_wake(), it needs to 
check, in all situations, if a vcpu is on the replenishment event queue 
or not. If not, add it back. I'm I right?

vcpu running      --> sleep(), taken off from replq	
vcpu on queue     --> sleep(), taken off from replq
vcpu not on queue --> sleep(), taken off from replq


However,

vcpu running/on queque/not on queue --> just not runnable anymore (still 
on replq) --> spurious wakeup (still on replq)

vcpus shouldn't be added back to replq again.

So in all cases, rt_vcpu_wake() always need to check if a vcpu is on the 
replq or not.

Now I think because of the addition of a replenishment queue, spurious 
wake up cannot be simply treated as NOP. Just like V4, the "Out" label 
basically maintains the replenishment queue.

Anyways, I agree it's gonna be easier to reason on v5 where all changes 
are applied.

Thanks,
Tianyang