From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH v2 1/4] xen: credit2: implement utilization cap Date: Tue, 5 Sep 2017 19:53:14 +0200 Message-ID: <1504633994.30217.4.camel@citrix.com> References: <150307081385.6642.6516202758428761422.stgit@Solace.fritz.box> <150307145322.6642.8867195330176310748.stgit@Solace.fritz.box> <09069855-91e3-1538-64c5-177210573b2b@gmail.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============3305600450292299206==" Return-path: Received: from mail6.bemta6.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dpILq-0002EJ-8k for xen-devel@lists.xenproject.org; Tue, 05 Sep 2017 18:13:30 +0000 In-Reply-To: <09069855-91e3-1538-64c5-177210573b2b@gmail.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: Anshul Makkar , xen-devel@lists.xenproject.org Cc: George Dunlap , Andrew Cooper , Wei Liu , Ian Jackson , Jan Beulich List-Id: xen-devel@lists.xenproject.org --===============3305600450292299206== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-U2b74opDd/OxjIy/OOan" --=-U2b74opDd/OxjIy/OOan Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, 2017-08-24 at 20:42 +0100, Anshul Makkar wrote: > On 8/18/17 4:50 PM, Dario Faggioli wrote: > > =C2=A0=C2=A0 > > @@ -1515,7 +1633,16 @@ static void reset_credit(const struct > > scheduler *ops, int cpu, s_time_t now, > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* tha= t the credit it has spent so far get accounted. > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*/ > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( svc->v= cpu =3D=3D curr_on_cpu(svc_cpu) ) > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0burn_credits(rqd, svc, now); > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0/* > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0* And, similarly, in case it has run out of budget, > > as a > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0* consequence of this round of accounting, we also > > must inform > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0* its pCPU that it's time to park it, and pick up > > someone else. > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0*/ > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0if ( unlikely(svc->budget <=3D 0) ) > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0tickle_cpu(svc_cpu, rqd); >=20 > This is for accounting of credit. Why it willl impact the budget. Do > you=C2=A0 > intend to refer that > budget of current vcpu expired while doing calculation for credit ?? > burn_credits() burns does budget acounting too now. So, it's entirely possible that the vCPU has actually run out of budget, and we figure it out now (and we should take appropriate actions!). > > @@ -1571,27 +1698,35 @@ void burn_credits(struct > > csched2_runqueue_data *rqd, > > =C2=A0=C2=A0 > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0delta =3D now - svc->start_time; > > =C2=A0=C2=A0 > > -=C2=A0=C2=A0=C2=A0=C2=A0if ( likely(delta > 0) ) > > -=C2=A0=C2=A0=C2=A0=C2=A0{ > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0SCHED_STAT_CRANK(burn_= credits_t2c); > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0t2c_update(rqd, delta,= svc); > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0svc->start_time =3D no= w; > > -=C2=A0=C2=A0=C2=A0=C2=A0} > > -=C2=A0=C2=A0=C2=A0=C2=A0else if ( delta < 0 ) > > +=C2=A0=C2=A0=C2=A0=C2=A0if ( unlikely(delta <=3D 0) ) > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > >=20 > > +static void replenish_domain_budget(void* data) > > +{ > > +=C2=A0=C2=A0=C2=A0=C2=A0struct csched2_dom *sdom =3D data; > > +=C2=A0=C2=A0=C2=A0=C2=A0unsigned long flags; > > +=C2=A0=C2=A0=C2=A0=C2=A0s_time_t now; > > +=C2=A0=C2=A0=C2=A0=C2=A0LIST_HEAD(parked); > > + > > +=C2=A0=C2=A0=C2=A0=C2=A0spin_lock_irqsave(&sdom->budget_lock, flags); > > + > > +=C2=A0=C2=A0=C2=A0=C2=A0now =3D NOW(); > > + > > +=C2=A0=C2=A0=C2=A0=C2=A0/* > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* Let's do the replenishment. Note, thou= gh, that a domain may > > overrun, > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* which means the budget would have gone= below 0 (reasons may > > be system > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* overbooking, accounting issues, etc.).= It also may happen > > that we are > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* handling the replenishment (much) late= r than we should > > (reasons may > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* again be overbooking, or issues with t= imers). > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* Even in cases of overrun or delay, how= ever, we expect that > > in 99% of > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* cases, doing just one replenishment wi= ll be good enough for > > being able > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* to unpark the vCPUs that are waiting f= or some budget. > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*/ > > +=C2=A0=C2=A0=C2=A0=C2=A0do_replenish(sdom); > > + > > +=C2=A0=C2=A0=C2=A0=C2=A0/* > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* And now, the special cases: > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* 1) if we are late enough to have skipp= ed (at least) one > > full period, > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* what we must do is doing more replenis= hments. Note that, > > however, > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* every time we add tot_budget to the bu= dget, we also move > > next_repl > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* away by CSCHED2_BDGT_REPL_PERIOD, to m= ake sure the cap is > > always > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* respected. > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*/ > > +=C2=A0=C2=A0=C2=A0=C2=A0if ( unlikely(sdom->next_repl <=3D now) ) > > +=C2=A0=C2=A0=C2=A0=C2=A0{ > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0do > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0do_replenish(sdom); > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0while ( sdom->next_rep= l <=3D now ); > > +=C2=A0=C2=A0=C2=A0=C2=A0} >=20 > Just a bit confused. Have you seen this kind of scenario. Please can > you=C2=A0 > explain it. > Is this condition necessary. > This was discussed (with George) during v1 review. It's a corner case, which should never happen, and I in fact have never seen it happening in my tests. But we can't rule out that it won't occur, so it makes sense to deal with it (instead of just ignoring it, causing the cap mechanism to [temporarily] malfunction / become inaccurate). > > +=C2=A0=C2=A0=C2=A0=C2=A0/* > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* 2) if we overrun by more than tot_budg= et, then > > budget+tot_budget is > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* still < 0, which means that we can't u= npark the vCPUs. > > Let's bail, > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* and wait for future replenishments. > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*/ > > +=C2=A0=C2=A0=C2=A0=C2=A0if ( unlikely(sdom->budget <=3D 0) ) > > +=C2=A0=C2=A0=C2=A0=C2=A0{ > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0spin_unlock_irqrestore= (&sdom->budget_lock, flags); > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0goto out; > > +=C2=A0=C2=A0=C2=A0=C2=A0} >=20 > "if we overran by more than tot_budget in previous run", make is > more=C2=A0 > clear.. > Mmm... perhaps, but not so much, IMO. It's quite clear to which time window we are referring to already, and I don't feel like re-sending for this. Let's see if there are other comments/requests. > Rest, looks good to me. >=20 Thanks and Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-U2b74opDd/OxjIy/OOan Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJZruSKAAoJEBZCeImluHPuv0MQANziuPNsqUVEvqTk5X5MOSJZ R9jlsn4jsTMQyu99Q+SarQk98WIAbk2c8rzvCyWMDrK2I3aozQRsZFEx8UQrs/Xt e27Tq2zO8sQBGoC0sBmq4FBElPGuGdG3Hh41awI56srEHAaZZGPqVsCAa5mlliJs KGYSDdPuhEwZ6f05L66q1eXbgBFIjjdhBWfLbR0AvKUf9H1ulrcMJICGtdzCJA3g B6WiR+ZXiHTuKZaoq3XRlRcxX7qJx0JtRvV84OP3NZNOuD7KcR/TYZZzlKWH336B 1m8vvO6fl4ysIFeXzVge50O7UnYt4q7bsaXJGmy7UluoNnmSSeaoUNgRA7ebL69C /p+ZIRN/U/5yiQ3bSY3lqih9ZvoqTUN2hZoLkO+2O2iVSMHR/46xnkMx0iC95fd8 374iug4myCAtw69J2WVhA1kK3yu30QAxPCJAHdBe7+JmS8miBAQB50opFoNPixf5 InoDg2x4wXBWvgtEmO/CS8nd8FWKGnb41T66qmYf01lzf6UctOpESSBIM1XX9tcK qdF8suHuCAUIYby537puKpJ0fSdSx2dgtiaCBY0yKfn4EbTOnGNHFBzcqHHip8ur ZiLmyGkt0A0256l0nWgHXG5YLF8Ea2c8EAx8WOOvKrr94lKhaM2rAn7Q7cJfPT64 39knVR9IyzsAip7wdDxG =G/a8 -----END PGP SIGNATURE----- --=-U2b74opDd/OxjIy/OOan-- --===============3305600450292299206== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --===============3305600450292299206==--