From: Dario Faggioli
Subject: Re: Introduce rt real-time scheduler for Xen
Date: Fri, 11 Jul 2014 18:19:17 +0200
Message-ID: <1405095557.29306.530.camel@Solace>
In-Reply-To: <1405054198-29106-1-git-send-email-mengxu@cis.upenn.edu>
References: <1405054198-29106-1-git-send-email-mengxu@cis.upenn.edu>
To: Meng Xu
Cc: ian.campbell@citrix.com, xisisu@gmail.com,
    stefano.stabellini@eu.citrix.com, george.dunlap@eu.citrix.com,
    ian.jackson@eu.citrix.com, xen-devel@lists.xen.org,
    xumengpanda@gmail.com, lichong659@gmail.com, dgolomb@seas.upenn.edu
List-Id: xen-devel@lists.xenproject.org

On Fri, 2014-07-11 at 00:49 -0400, Meng Xu wrote:
> This series of patches adds the rt real-time scheduler to Xen.
>
Hey Meng, Sisu!

Nice to see you here on xen-devel with this nice code-drop! :-P

> In summary, it supports:
> 1) Preemptive Global Earliest Deadline First scheduling policy by using a global RunQ for the scheduler;
> 2) Assign/display each VCPU's parameters of each domain;
> 3) Supports CPU Pool
>
Great, thanks for making the effort of extracting this from your code
base and submitting it here. :-)

Having looked at the series carefully, I think it's a nice piece of work
already. There are quite a few modifications and cleanups to do, and I
think there's room for quite a bit of improvement, but I really like the
fact that all the features are basically there already.
In particular, proper SMP support, per-VCPU scheduling parameters, and a
sane and theoretically sound budgeting scheme are what we're missing in
SEDF[*], and we need these things badly!

[*] Josh's RFC is improving this, but only wrt the latter one (a sane
    scheduling algorithm).

> ------------------------------------------------------------------------
> One scenario to show the functionality of this rt scheduler is as follows:
> //list each vcpu's parameters of each domain in cpu pools using rt scheduler
> #xl sched-rt
> Cpupool Pool-0: sched=EDF
> Name                                ID VCPU Period Budget
> Domain-0                             0    0     10     10
> Domain-0                             0    1     20     20
> Domain-0                             0    2     30     30
> Domain-0                             0    3     10     10
> litmus1                              1    0     10      4
> litmus1                              1    1     10      4
>
> [...]
>
Thanks for showing this also.

> ------------------------------------------------------------------------
> The differences between this new rt real-time scheduler and the sedf scheduler are as follows:
> 1) rt scheduler supports global EDF scheduling, while sedf only supports partitioned scheduling. With the support of vcpu mask, rt scheduler can also be used as partitioned scheduling by setting each VCPU’s cpumask to a specific cpu.
>
Which is the biggest and most important difference. In fact, although
the implementation of this scheduler can be improved (AFAICT) wrt this
aspect too, adding SMP support to SEDF would be much much harder...

> 2) rt scheduler supports setting and getting each VCPU’s parameters of a domain. A domain can have multiple vcpus with different parameters, rt scheduler can let user get/set the parameters of each VCPU of a specific domain; (sedf scheduler does not support it now)
> 3) rt scheduler supports cpupool.
>
Right. Well, to be fair, SEDF supports cpupools as well.
:-)

> 4) rt scheduler uses deferrable server to burn/replenish budget of a VCPU, while sedf uses constant bandwidth server to burn/replenish budget of a VCPU. These are just two options for implementing a global EDF real-time scheduler, and both options’ real-time performance has already been proved in academia.
>
So, can you put some links to some of your works on top of RT-Xen, which
is where this scheduler comes from? Or, if that's not possible, at
least the titles? I really don't expect people to jump on research
papers, but I've seen a few, and the experimental sections were nice
to read and quite useful.

> ------------------------------------------------------------------------
> TODO:
>
Allow me to add a few items here, in some sort of priority order (at
least mine):

*) Deal with budget overrun in the algorithm [medium]
*) Split runnable and depleted (=no budget left) VCPU queues [easy]

> 1) Improve the code of getting/setting each VCPU’s parameters. [easy]
>     Right now, it creates an array with LIBXL_XEN_LEGACY_MAX_VCPUS (i.e., 32) elements to bounce all VCPUs’ parameters of a domain between xen tool and xen to get all VCPUs’ parameters of a domain. It is unnecessary to have LIBXL_XEN_LEGACY_MAX_VCPUS elements for this array.
>     The current work is to first get the exact number of VCPUs of a domain and then create an array with that exact number of elements to bounce between xen tool and xen.
> 2) Provide microsecond time precision in xl interface instead of millisecond time precision. [easy]
>     Right now, rt scheduler lets the user specify each VCPU’s parameters (period, budget) in milliseconds (i.e., ms). In some real-time applications, the user may want to specify VCPUs’ parameters in microseconds (i.e., us).
>     The next work is to let the user specify VCPUs’ parameters in microseconds and count the time in microseconds (or nanoseconds) in the xen rt scheduler as well.
>
*) Subject Dom0 to the EDF+DS scheduling, as all other domains [easy]
    We can discuss what default Dom0 parameters should be, but we
    certainly want it to be scheduled as all other domains, and not
    getting too much of a special treatment.

> 3) Add Xen trace into the rt scheduler. [easy]
>     We will add a few xentrace tracepoints, like TRC_CSCHED2_RUNQ_POS in credit2 scheduler, in rt scheduler, to debug via tracing.
>
*) Try using timers for replenishment, instead of scanning the full
   runqueue every now and then [medium]

> 4) Method of improving the performance of rt scheduler [future work]
>     VCPUs of the same domain may preempt each other based on the preemptive global EDF scheduling policy. This self-switch issue does not bring benefit to the domain but introduces more overhead. When this situation happens, we can simply promote the current running lower-priority VCPU’s priority and let it borrow budget from higher priority VCPUs to avoid such self-switch issue.
>
> Timeline of implementing the TODOs:
> We plan to finish the TODO 1), 2) and 3) within 3-4 weeks (or earlier).
> Because TODO 4) will make the scheduling policy not pure GEDF, (people who want the real GEDF may not be happy with this.) we look forward to hearing people’s opinions.
>
That one is definitely something we can concentrate on later.

> ------------------------------------------------------------------------
> Special huge thanks to Dario Faggioli for his helpful and detailed comments on the preview version of this rt scheduler.
> :-)
>
:-)

Regards,
Dario

-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
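
For readers who want a concrete picture of the global EDF + deferrable
server scheme discussed in this thread, here is a minimal sketch in
Python. It is not the code from the patch series: the names (VCPU,
replenish, burn, pick_next) are made up for illustration, time is in
abstract ticks, and only a single physical CPU is picking from the queue.
It assumes a VCPU gets its full budget at the start of each period, keeps
unused budget until the period ends, and that the runnable VCPU with
budget left and the earliest deadline is chosen from one global queue.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VCPU:
    name: str
    period: int          # replenishment period, in abstract ticks
    budget: int          # budget granted at the start of each period
    cur_budget: int = 0  # budget left in the current period
    deadline: int = 0    # absolute end of the current period
    runnable: bool = True


def replenish(vcpu: VCPU, now: int) -> None:
    """Start a new period if the current one is over.

    Deferrable server behaviour (as assumed here): unused budget is kept
    until the deadline, then discarded and replaced by a full budget.
    """
    if now >= vcpu.deadline:
        # Skip any periods that elapsed entirely while we were not looking.
        periods = (now - vcpu.deadline) // vcpu.period + 1
        vcpu.deadline += periods * vcpu.period
        vcpu.cur_budget = vcpu.budget


def burn(vcpu: VCPU, ran_for: int) -> None:
    """Charge the VCPU for the time it ran, never going below zero."""
    vcpu.cur_budget = max(0, vcpu.cur_budget - ran_for)


def pick_next(runq: List[VCPU], now: int) -> Optional[VCPU]:
    """Global EDF: among runnable VCPUs with budget, earliest deadline wins."""
    for v in runq:
        replenish(v, now)
    eligible = [v for v in runq if v.runnable and v.cur_budget > 0]
    return min(eligible, key=lambda v: v.deadline, default=None)


# Parameters borrowed from the xl sched-rt listing quoted earlier.
runq = [VCPU("Domain-0 v1", period=20, budget=20),
        VCPU("litmus1 v0", period=10, budget=4)]
first = pick_next(runq, now=0)   # litmus1 v0: deadline 10 beats deadline 20
burn(first, 4)                   # litmus1 v0 is now depleted
second = pick_next(runq, now=4)  # Domain-0 v1 is the only one with budget
```

Note that pick_next() scans the whole queue on every call to replenish
budgets, which is the behaviour the "use timers for replenishment" item
above suggests avoiding; the sketch keeps the scan only because it is the
shortest way to show the accounting.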