From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH v2] xen:rtds:Fix bug in budget accounting Date: Wed, 26 Oct 2016 16:43:33 +0200 Message-ID: <1477493013.7733.34.camel@citrix.com> References: <1477102276-4116-1-git-send-email-mengxu@cis.upenn.edu> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============3311515031199304304==" Return-path: Received: from mail6.bemta6.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bzPQg-0003wU-7z for xen-devel@lists.xenproject.org; Wed, 26 Oct 2016 14:43:46 +0000 In-Reply-To: <1477102276-4116-1-git-send-email-mengxu@cis.upenn.edu> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: Meng Xu , xen-devel@lists.xenproject.org Cc: Wei Liu , Dagaen Golomb , George Dunlap , Haoran Li , Linh Thi Xuan Phan , Meng Xu , Tianyang Chen List-Id: xen-devel@lists.xenproject.org --===============3311515031199304304== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-Wez83uSBQWqQCbYad4FF" --=-Wez83uSBQWqQCbYad4FF Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Fri, 2016-10-21 at 22:11 -0400, Meng Xu wrote: > Bug scenario: > While a VCPU is running on a core, it may misses its deadline. > May be useful to mention why (at least the most common reasons, like overhead and/or system being overloaded... are there others?). > Then repl_timer_handler() will update the VCPU period and deadline. > The VCPU may have high priority with the new deadline and > repl_timer_handler() > decides to keep the VCPU running on the core, but with new period and > deadline. > repl_timer_handler() does not decide what to run. It's probably better to say something like "If the VCPU is still the highest priority one, even with the new deadline, it will continue to run" > However, the budget enforcement timer for the previous period is > still armed. > When rt_schedule() is called in the new period to enforce the budget > for the previous period, current burn_budget() will deduct the time > spent > in previous period from the budget in current period, which is > incorrect. >=20 Ok, so, at=20time t=3D10 the replenishment timer triggers. It finds the vcpu in the following status: =C2=A0v->cur_deadline =3D 10 =C2=A0v->cur_budget =3D 3 =C2=A0v->last_start =3D 2 I.e., the vcpu still has some budget left, but, e.g., because the system is overloaded, could not use it in its previous period. Assuming v->period=3D10 and v->budget=3D5, rt_update_deadline(), called from within repl_timer_handler(), will put the vcpu in the following state: =C2=A0v->curr_deadline =3D 20 =C2=A0v->curr_budget =3D 5 =C2=A0v->last_start =3D 2 Then, at t=3D15 rt_schedule() is invoked, it in turns calls burn_budget() which does: =C2=A0delta =3D NOW() - v->last_start =3D 15 - 2 =3D 13 =C2=A0v->cur_budget -=3D 13 Which is too much. Is this what we are talking about? > Fix: > We keeps last_start always within the current period for a VCPU, so > that > We only deduct the time spent in the current period from the VCPU > budget. >=20 Well, this is sort of ok, I guess. In general, I think that cur_deadline, cur_budget and last_start are tightly related between each others. Especially cur_budget and last_start, considering that the former is updated in a way that depends from the value of the latter. The problem here seems to me to be that, indeed, we don't update last_start all the time that we update cur_budget. If, for instance, you look at Credit2, start_time is updated inside burn_credits(), while in RTDS, last_start is not updated in burn_budget(). So (1), one thing that I would do is to set svc->last_start=3Dnow; in burn_budget(). However, just doing that won't solve the problem, because repl_timer_handler() does not call burn_budget(). I've wondered a bit whether that is correct or not... I mean, especially in the situation we are discussing --when repl_timer_handler() runs "before" rt_schedule()-- how do we account for the budget spent from the last invocation of burn_budget() and where in time we are now? Well, I'm not really sure whether or not this is a problem. Since we don't allow the budget to go negative, and since we are giving both new deadline and new budget to the vcpu=20anyway, it may not be a big deal, actually. Or maybe it could be an issue, but only if either the overhead and/or the overload are so big (e.g., if the vcpu is overrunning and doing that for more than one full period), that the system is out of the window already. So, let's not consider this, for the moment... And let's focus on the fact that, in rt_update_deadline(), we are changing cur_deadline and cur_budget, so we also need to update last_start (as we said above they're related!). I guess we either call burn_budget from rt_update_deadline(), or we open-code svc->last_start=3Dnow; burn_budget() does things that we probably don't need and don't want to do in rt_update_deadline(). In fact, if we don't allow the budget to go negative, and ignore the (potential, and maybe non-existent) accounting issue described above, there's no point in marking the vcpu as depleted (in burn_budget()) and then (since we're calling it from rt_update_deadline()) replenishing it right away! :-O So (2), I guess the best thing to do is "just" to update last_start in rt_update_deadline() to, which is sort of what the patch is doing. Yet, there's something I don't understand, i.e.: > Signed-off-by: Meng Xu > Reported-by: Dagaen Golomb >=C2=A0 > --- a/xen/common/sched_rt.c > +++ b/xen/common/sched_rt.c > @@ -407,6 +407,14 @@ rt_update_deadline(s_time_t now, struct rt_vcpu > *svc) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0svc->cur_deadline = +=3D count * svc->period; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0 > +=C2=A0=C2=A0=C2=A0=C2=A0/* > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* svc may be scheduled to run immediately= after it misses > deadline > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* Then rt_update_deadline is called befor= e rt_schedule, which=C2=A0 > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* should only deduct the time spent in cu= rrent period from the > budget > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*/ > +=C2=A0=C2=A0=C2=A0=C2=A0if ( svc->last_start < (svc->cur_deadline - svc= ->period) ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0svc->last_start =3D svc= ->cur_deadline - svc->period; > + =C2=A0- Do we need the if()? Isn't it safe to just always update last_star= t? =C2=A0 =C2=A0If it is, as it looks to me to be, I'd prefer it done like th= at. =C2=A0- Why cur_deadline-period, and not NOW() (well, actually, now)? =C2=A0 =C2=A0Well, in the scenario above, and in all the others I can thin= k of, =C2=A0 =C2=A0it would be almost the same. And yet, I'd prefer using now, f= or =C2=A0 =C2=A0clarity and consistency. In summary, what I think we need is a patch that does both (1) and (2), i.e., it makes sure that we always update last_start to NOW() all the times that we assign to a vcpu its new budget. That would make the code more consistent and should also fix the buggy behavior you're seeing. What do you think? Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-Wez83uSBQWqQCbYad4FF Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYEMEWAAoJEBZCeImluHPuOc8P+wWKdrJimh7jbMELsEuhCAGp 1+smMWOmYRiLFsccHBZ9Ffco894TkBOBNnkFnZv01OAbtKyVuNVyyghoJs21a5XB Zxl7gGV5WhfAWHQ3P6RMNVguyvsOTttprvRXOrVFC6loQvqS0vnai388xCuIvz0V qed4kWsqBjdBN9wG4QhL0sk2BcJWa+nmgfxLjoiuPM9DVkkiSSX1kRkMw4UvIh/B SQGb/9KS85ylCsZ7UozhRQw0Eqv/Fm1m40qgj+qani6oSbAuJHv/T+fIqeJI38j4 6AgfWP7dUJlezMxv4D/FqkAe2h9HyYC7tRoIF9AluVD72hieJi7yTk1eo5sczHXd 5p4lRcybXXf+9W6vnG87lo1/3kxriNuPy8PzXkm7nUc4bE1wGm5H8I/Xo8EAbVGm hRZwSJ05I6JsuSMoRPKJLnU9GZolE0s9Lr/1D11smQK9V0l0cS829SGXsI9iZI7u 1PLPtdcrgg88JmPzXioBooCuXX4MF6h4T3IhltcntXS3QItvxdLT1tWUZAUuC5uG AD/csAEaeEERcTHxxdC7vr0LdEAP8o0W7axNfCPGKp0+I+q+gZMtw9BcvoL1JvRN mi1hhOBUSxUq3BfpzJWpMZfNW8WmGtFQjuCThA6B3pYDiiCrWj2UfR7Z2V5H45ny 2AH3E5TuLsg4Z79QTBgj =g9IB -----END PGP SIGNATURE----- --=-Wez83uSBQWqQCbYad4FF-- --===============3311515031199304304== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --===============3311515031199304304==--