From mboxrd@z Thu Jan  1 00:00:00 1970
From: Dario Faggioli <dario.faggioli@citrix.com>
Subject: Re: [PATCH v1 1/4] xen: add real time scheduler rt
Date: Wed, 3 Sep 2014 18:57:30 +0200
Message-ID: <1409763450.2673.140.camel@Solace.lan>
References: <1408921125-21470-1-git-send-email-mengxu@cis.upenn.edu>
	<1408921125-21470-2-git-send-email-mengxu@cis.upenn.edu>
	<CAFLBxZZb-xiZ81ne=02ucQ=OPXq0U3e4h-ZCzjBR8Nx6Mdeu+A@mail.gmail.com>
	<CAENZ-+kS5q7XTjhHB630929sUsEbawpuV8caDesungVVvPwkhA@mail.gmail.com>
	<CAFLBxZZhx3As_c5+AOWG+1ADCoa6Cqe2E0tXX=cZNgL9JqdaLQ@mail.gmail.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============8544336277372593325=="
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <CAFLBxZZhx3As_c5+AOWG+1ADCoa6Cqe2E0tXX=cZNgL9JqdaLQ@mail.gmail.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>, Sisu Xi <xisisu@gmail.com>, Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Ian Jackson <ian.jackson@eu.citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Meng Xu <xumengpanda@gmail.com>, Meng Xu <mengxu@cis.upenn.edu>, Jan Beulich <JBeulich@suse.com>, Chao Wang <chaowang@wustl.edu>, Chong Li <lichong659@gmail.com>, Dagaen Golomb <dgolomb@seas.upenn.edu>
List-Id: xen-devel@lists.xenproject.org

--===============8544336277372593325==
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="=-fwgreX5KJHixJsELTn5t"

--=-fwgreX5KJHixJsELTn5t
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On mer, 2014-09-03 at 17:06 +0100, George Dunlap wrote:
> On Wed, Sep 3, 2014 at 4:13 PM, Meng Xu <xumengpanda@gmail.com> wrote:
> > Hi George,
> >
> >
> > 2014-09-03 10:20 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:
> >
> >> On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote=
:
> >> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl=
.h
> >> > index 5b11bbf..27d01c1 100644
> >> > --- a/xen/include/public/domctl.h
> >> > +++ b/xen/include/public/domctl.h
> >> > @@ -339,6 +339,19 @@ struct xen_domctl_max_vcpus {
> >> >  typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
> >> >  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
> >> >
> >> > +/*
> >> > + * This structure is used to pass to rt scheduler from a
> >> > + * privileged domain to Xen
> >> > + */
> >> > +struct xen_domctl_sched_rt_params {
> >> > +    /* get vcpus' info */
> >> > +    uint64_t period; /* s_time_t type */
> >> > +    uint64_t budget;
> >> > +    uint16_t index;
> >> > +    uint16_t padding[3];
> >>
> >> Why the padding?
> >>
> >
> > I did this because of Jan's comment "Also, you need to pad the structur=
e to
> > a multiple of 8 bytes, or
> > its layout will differ between 32- and 64-bit (tool stack) callers." I =
think
> > what he said make sense so I added the padding here. :-)
> >
> > Here is the link: http://marc.info/?l=3Dxen-devel&m=3D140661680931179&w=
=3D2
>=20
> Right. :-)  I personally prefer to handle that by re-arranging the
> elements rather than adding padding, unless absolutely necessary.  In
> this case that shouldn't be too hard, particularly once we pare the
> interface down so we only have one interface (either all one vcpu at a
> time, or all batched vcpus).
>=20
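On the padding point, a minimal sketch of what Jan's rule buys us,
assuming a copy of the structure from the patch (the rt_params_padded
name and the two helper functions are hypothetical, for illustration
only). With the explicit padding, the size is a multiple of 8 bytes, so
no ABI has to add implicit tail padding of its own:

```c
#include <stddef.h>
#include <stdint.h>

/* Copy of the structure from the patch, for illustration only. */
struct rt_params_padded {
    uint64_t period;
    uint64_t budget;
    uint16_t index;
    uint16_t padding[3];   /* explicit tail padding: 3 * 2 bytes */
};

/* Fails to compile if the size is not a multiple of 8 bytes. */
_Static_assert(!(sizeof(struct rt_params_padded) % 8),
               "size must be a multiple of 8 bytes");

/* Hypothetical helpers, so the layout can be inspected at run time. */
size_t rt_params_size(void)
{
    return sizeof(struct rt_params_padded);
}

size_t rt_params_index_offset(void)
{
    return offsetof(struct rt_params_padded, index);
}
```

Without the padding[] member, an ABI that aligns uint64_t to 4 bytes
(e.g., 32-bit x86) would make the structure 20 bytes, while a 64-bit
one would round it up to 24.
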
> > I think it's a better idea to
> >  pass in an array with information about vcpus to get/set vcpus'
> > information.
> >
> > I only need to change the code related to setting a vcpu's information.
> > I have a question:
> > When we set a vcpu's information by using an array, we have two choices=
:
> >
> > a) just create an array with one vcpu element, and specify the index of=
 the
> > vcpu to modify; The concern to this method is that we only uses one ele=
ment
> > of this array, so is it a good idea to use an array with only one eleme=
nt?
> > b) create an array with all vcpus of this domain, modify the parameters=
 of
> > the vcpu users want to change, and then bounce the array to hypervisor =
to
> > reset these vcpus' parameters. The concern to this method is that we do=
n't
> > need any other vcpus' information to set a specific vcpu's parameters.
> > Bouncing the whole array with all vcpus information seems expensive and
> > unnecessary?
> >
> > Do you have any suggestion/advice/preference on this?
> >
> > I don't really like about the idea of reading the vcpu's information
> > one-by-one. :-) If a domain has many vcpus, say 12 vcpus, we will issue=
 12
> > hypercalls to get all vcpus' information of this domain. Because we onl=
y
> > need to issue one hypercall to get all information we want, the extra
> > hypercalls causes more overhead. This did simplify the implementation, =
but
> > may cause more overhead.
>=20
> For convenience for users, I think it's definitely the case that libxl
> should provide an interface to get and set all the vcpu parameters at
> once.  Then it can either batch them all into a single hypercall (if
> that's what we decide), or it can make the individual calls for each
> vcpu.
>=20
Indeed.

> The main reason I would think to batch the hypercalls is for
> consistency: it seems like you may want to change the period / budget
> of vcpus atomically, rather than setting one, possibly having dom0
> de-scheduled for a few hundred milliseconds, and then setting another.
> Same thing goes for reading: I would think you would want a consistent
> "snapshot" of some existing state, rather than having the possibility
> of reading half the state, then having someone change it, and then
> reading the other half.
>=20
That is actually the reason why I'd have both things. A "change this
one" variant is handy if one actually has to change only one vcpu, or a
few, and does not mind the non-atomicity.

The batched variant is there for both overhead and atomicity reasons.

> Re the per-vcpu settings, though: Is it really that common for RT
> domains to want different parameters for different vcpus? =20
>
Whether it's common is hard to say, but yes, it has to be possible.

For instance, I can put, in an SMP guest, two real-time applications
with different timing requirements, and pin each one to a different
(v)cpu (I mean pin *inside* the guest). At this point, I'd like for each
vcpu to have a set of RT scheduling parameters, at the Xen level, that
matches the timing requirements of what's running inside.

This may not look so typical in a server/cloud environment, but can
happen (at least in my experience) in a mobile/embedded env.

> Are these
> parameters exposed to the guest in any way, so that it can make more
> reasonable decisions as to where to run what kinds of workloads?
>=20
Not right now, AFAICS, but forms of 'scheduling paravirtualization', or
in general this kind of interaction/communication could be very useful
in real-time virtualization, so we may want to support that in future.

In any case, even without that in place right now, I think different
parameters for different vcpus is certainly something we want from an RT
scheduler.

Regards,
Dario

--=20
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


--=-fwgreX5KJHixJsELTn5t
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iEYEABECAAYFAlQHSHoACgkQk4XaBE3IOsRcDQCggI0f5WSFpFj+8qcb1Oe6tFL+
nhgAn1wdqydtr2zkWl0ua3GPufNYT/ra
=Gt0K
-----END PGP SIGNATURE-----

--=-fwgreX5KJHixJsELTn5t--


--===============8544336277372593325==--