From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH RFC v1] xen:rtds: towards work conserving RTDS Date: Wed, 2 Aug 2017 19:46:38 +0200 Message-ID: <1501695998.19956.10.camel@citrix.com> References: <1501611210-5232-1-git-send-email-mengxu@cis.upenn.edu> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============5392281665496861009==" Return-path: Received: from mail6.bemta6.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dcxjR-00088h-0l for xen-devel@lists.xenproject.org; Wed, 02 Aug 2017 17:46:53 +0000 In-Reply-To: <1501611210-5232-1-git-send-email-mengxu@cis.upenn.edu> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: Meng Xu , xen-devel@lists.xenproject.org Cc: george.dunlap@eu.citrix.com, xumengpanda@gmail.com List-Id: xen-devel@lists.xenproject.org --===============5392281665496861009== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-UHuw08WlCcLZnG7wau1W" --=-UHuw08WlCcLZnG7wau1W Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hey, Meng! It's really cool to see progress on this... There was quite a bit of interest in scheduling in general at the Summit in Budapest, and one important thing for making sure RTDS will be really useful, is for it to have a work conserving mode! :-) On Tue, 2017-08-01 at 14:13 -0400, Meng Xu wrote: > Make RTDS scheduler work conserving to utilize the idle resource, > without breaking the real-time guarantees. Just kill the "to utilize the idle resource". We can expect that people that are interested in this commit, also know what 'work conserving' means. :-) > VCPU model: > Each real-time VCPU is extended to have a work conserving flag > and a priority_level field. > When a VCPU's budget is depleted in the current period, > if it has work conserving flag set, > its priority_level will increase by 1 and its budget will be > refilled; > othewrise, the VCPU will be moved to the depletedq. >=20 Mmm... Ok. But is the budget burned, while the vCPU executes at priority_level 1? If yes, doesn't this mean we risk having less budget when we get back to priority_lvevel 0? Oh, wait, maybe it's the case that, when we get back to priority_level 0, we also get another replenishment, is that the case? If yes, I actually think it's fine... > diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c > index 39f6bee..740a712 100644 > --- a/xen/common/sched_rt.c > +++ b/xen/common/sched_rt.c > @@ -191,6 +195,7 @@ struct rt_vcpu { > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* VCPU parameters, in nanoseconds */ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0s_time_t period; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0s_time_t budget; > +=C2=A0=C2=A0=C2=A0=C2=A0bool_t is_work_conserving;=C2=A0=C2=A0=C2=A0/* i= s vcpu work conserving */ > =C2=A0 > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* VCPU current infomation in nanosecond */ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0s_time_t cur_budget;=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* current budget */ > @@ -201,6 +206,8 @@ struct rt_vcpu { > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0struct rt_dom *sdom; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0struct vcpu *vcpu; > =C2=A0 > +=C2=A0=C2=A0=C2=A0=C2=A0unsigned priority_level; > + > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0unsigned flags;=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* mark __RTDS_sch= eduled, etc.. */ > So, since we've got a 'flags' field already, can the flag be one of its bit, instead of adding a new bool in the struct: /* * RTDS_work_conserving: Can the vcpu run in the time that is * not part of any real-time reservation, and would therefore * be otherwise left idle? */ __RTDS_work_conserving 4 #define RTDS_work_conserving (1<<__RTDS_work_conserving) > @@ -245,6 +252,11 @@ static inline struct list_head *rt_replq(const > struct scheduler *ops) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return &rt_priv(ops)->replq; > =C2=A0} > =C2=A0 > +static inline bool_t is_work_conserving(const struct rt_vcpu *svc) > +{ > Use bool. > @@ -273,6 +285,20 @@ vcpu_on_replq(const struct rt_vcpu *svc) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return !list_empty(&svc->replq_elem); > =C2=A0} > =C2=A0 > +/* If v1 priority >=3D v2 priority, return value > 0 > + * Otherwise, return value < 0 > + */ > Comment style. Apart from that, do you want this to return >0 if v1 should have priority over v2, and <0 if vice-versa, right? If yes... > +static int > +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu > *v2) > +{ > +=C2=A0=C2=A0=C2=A0=C2=A0if ( v1->priority_level < v2->priority_level || > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0( v1->priority_lev= el =3D=3D v2->priority_level &&=C2=A0 > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0v1->cur_deadline <=3D v2->cur_deadline ) ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= return 1; > +=C2=A0=C2=A0=C2=A0=C2=A0else > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return -1; > int prio =3D v2->priority_level - v1->priority_level; if ( prio =3D=3D 0 ) return v2->cur_deadline - v1->cur_deadline; return prio; Return type has to become s_time_t, and there's a chance that it'll return 0, if they are at the same level, and have the same absolute deadline. But I think you can deal with this in the caller. > @@ -966,8 +1001,16 @@ burn_budget(const struct scheduler *ops, struct > rt_vcpu *svc, s_time_t now) > =C2=A0 > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( svc->cur_budget <=3D 0 ) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0svc->cur_budget =3D 0; > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0__set_bit(__RTDS_deplete= d, &svc->flags); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( is_work_conserving(= svc) ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= svc->priority_level++; > ASSERT(svc->priority_level <=3D 1); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= svc->cur_budget =3D svc->budget; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0else > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= svc->cur_budget =3D 0; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= __set_bit(__RTDS_depleted, &svc->flags); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0 The rest looks good to me. Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-UHuw08WlCcLZnG7wau1W Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJZgg//AAoJEBZCeImluHPufkQQAK5QjCTFISpcNGsDZ/cNwjvh DIEFXy9T1ohV7aWacdNUuZuYf86ZM1MFTg7jpKMegNTVDkoe7CrDuyuscIKOhP46 U+AUfwnBUIUgC+HIlftYgeohlJSMz79emXIJFWxkBAuDNlxTCEwF298VpbKYPIrt Irjio++Cl/n7t3SFB7coPV3pYti0keXX7BkqNphRnLk+3wZ7fThc+TDlI0ELLzjV Z85HR+xMEWDh5quKaQimz+QOh3FH7Vs2trrsNRxPYuxhPGl887/ORQT3iybOIljN 6UdihQ+5RFUYlpwyta9mjndhIr7YpIFFganuxo8bNxBpN022rGIpSGwbLgFgUcUF 94ZkGKnICmfpkJRUTLZPi60NJlLajFPXFIMX1NRFcM3JfHwpdID3UYqIQiJo+EYs zbDWJ5qkyssW2GdOPbV7pjI37mNveCAiclslIuO0M1Sme+eBe/Qyh6xnKv87grE7 tvHDhQ0Gtxr1XxaRFOd0/LJiexLoVcDRXRrEqmqa3bb+we6LtIl5alU1QlzIJ3hB IkRtgGyxxiIseHYnl1AYl0qa0Kk62wZSi8TG29vtzc3Flf8CgLJSss95vEM7D1YA aF8fhqwchYq8wyZZ9XqCQFxMLMTEz0iQj4sWFuGI8N5hFeDtpb6bVKMLZnXG8H+9 +9dlFt+swmLyRqgPTgDm =oTPC -----END PGP SIGNATURE----- --=-UHuw08WlCcLZnG7wau1W-- --===============5392281665496861009== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --===============5392281665496861009==--