From: Dario Faggioli
Subject: Re: [PATCH] xen/arm: introduce vwfi parameter
Date: Mon, 20 Feb 2017 12:15:30 +0100
Message-ID: <1487589330.6732.174.camel@citrix.com>
References: <1487286292-29502-1-git-send-email-sstabellini@kernel.org> <1487382463.6732.146.camel@citrix.com>
To: Julien Grall, Stefano Stabellini
Cc: edgar.iglesias@xilinx.com, george.dunlap@eu.citrix.com, nd@arm.com, Punit Agrawal, xen-devel@lists.xenproject.org

On Sun, 2017-02-19 at 21:27 +0000, Julien Grall wrote:
> Hi Dario,
>
Hi,

> On 02/18/2017 01:47 AM, Dario Faggioli wrote:
> >  - vcpu A yields, and there are no runnable but not running vcpus
> >    around. In this case, A gets to run again. Full stop.
>
> Which turn to be the busy looping I was mentioning when one vCPU is
> assigned to a pCPU.
>
Absolutely. Actually, it would mean busy looping, no matter whether
vCPUs are assigned or not.

As I said already, it's not exactly identical, but it would behave very
similarly to Linux's idle=poll option:

http://tomoyo.osdn.jp/cgi-bin/lxr/source/Documentation/kernel-parameters.txt?v=linux-4.9.9#L1576

        idle=           [X86]
                        Format: idle=poll, idle=halt, idle=nomwait
                        Poll forces a polling idle loop that can slightly
                        improve the performance of waking up a idle CPU, but
                        will use a lot of power and make the system run hot.
                        Not recommended.

And as I've also said, I don't see it as a solution to wakeup latency
problems, or at least not one that I'd recommend using outside of
testing and debugging. It may perhaps be a useful testing and debugging
aid, though.

> This is not the goal of WFI and I would be really
> surprised that embedded folks will be happy with a solution using
> more power.
>
Me neither. It's a showstopper for anything that's battery powered or
may incur thermal/cooling issues.

So, just to be clear, I'm happy to help and assist in understanding the
scheduling background and implications, but I am equally happy to leave
to you the decision of whether or not this is something nice or
desirable to have (as an option) on ARM. :-)

I've never been a fan of it, and never used it, on Linux on x86, not
even when actually working on real-time and low-latency stuff.

That being said, I also personally think that having the option would do
no harm, but I understand the concern that, when an option is there,
people will try to use it in the weirdest ways, and then complain at
your 'door' if their CPU catches fire! :-O

> > What will never happen is that a yielding vcpu, by busy looping,
> > prevents other runnable (and non yielding) vcpus to run. And if it
> > does, it's a bug. :-)
>
> I didn't say it will prevent another vCPU to run. But it will at
> least use slot that could have been used for good purpose by another
> pCPU.
>
Not really. Maybe I wasn't clear in explaining yielding, or maybe I'm
not getting what you're trying to say. It does depend a little on how
yield is implemented, but busy looping on yield() on a pCPU will not (or
at least must not) be much different from that pCPU sleeping in a deep
C-state (or the ARM equivalent). Performance aside, of course.

> So in similar workload Xen will perform worst with vwfi=idle, not
> even mentioning the power consumption...
>
It'd probably be a little less efficient, even performance-wise, if,
e.g., the scheduler-specific yielding code acquires locks, or if
yielding means that there is one more vCPU in the runqueues to be dealt
with, but nothing more than that. And whether or not this would be
significant or noticeable, I don't know (it should be measured, if of
interest).

> > In fact, in work conserving schedulers, if pCPU x becomes idle, it
> > means there is _nothing_ that can execute on x itself around.
> > And our schedulers are (with the exception of ARRINC, and if not
> > using caps in Credit1) work conserving, or at least they want and
> > try to be an as much work conserving as possible.
>
> My knowledge of the scheduler is limited. Does the scheduler take
> into account the cost of context switch when scheduling? When do you
> decide when to run the idle vCPU? Is it only the no other vCPU are
> runnable or do you have an heuristic?
>
Again, not sure I understand. Context switches between running vCPUs
must happen where the scheduling algorithm decides they must happen. You
can try to design an algorithm that does not require too many context
switches, or introduce countermeasures (we have something like that),
but apart from these, I don't know what (else?) you may be referring to
when asking about "take into account the cost of context switch".

We do try to take into account the cost of migration, i.e., moving a
vCPU from one pCPU to another... but that's an entirely different thing.

About the idle vCPU... I think the answer to your question is yes.
Credit and Credit2 are work conserving schedulers, so they only let a
pCPU go idle if there is no one wanting to run in the system (well, in
Credit2, this may not be 100% true until the load balancer gets to
execute, but in practice that happens very rarely).
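
As an illustration only, the yield and idle behaviour described above
can be modelled with a toy sketch in Python. Everything in it is an
assumption made for the example: a single pCPU, a plain FIFO runqueue,
and hypothetical names (ToyRunqueue, yield_current, block_current). It
is not how Credit or Credit2 are actually implemented.

```python
from collections import deque

IDLE = "idle"  # stand-in for the idle vCPU (hypothetical name)


class ToyRunqueue:
    """Toy work-conserving runqueue for one pCPU (not Xen's real code)."""

    def __init__(self, vcpus=()):
        self.runnable = deque(vcpus)  # runnable but not running vCPUs
        self.current = IDLE           # what is running on the pCPU now

    def wake(self, vcpu):
        """A vCPU becomes runnable and queues up behind the others."""
        self.runnable.append(vcpu)

    def _pick(self):
        # Work conserving: idle is chosen only if nothing is runnable.
        self.current = self.runnable.popleft() if self.runnable else IDLE
        return self.current

    def yield_current(self):
        """Current vCPU stays runnable but lets queued vCPUs go first."""
        if self.current is not IDLE:
            self.runnable.append(self.current)
        return self._pick()

    def block_current(self):
        """Current vCPU goes to sleep (e.g. a WFI that is trapped)."""
        return self._pick()


rq = ToyRunqueue(["A"])
print(rq.yield_current())  # A: first scheduling decision picks A
print(rq.yield_current())  # A: nobody else is runnable, A runs again
rq.wake("B")
print(rq.yield_current())  # B: a yielding A cannot keep B from running
print(rq.block_current())  # A: B blocks, A is the only runnable vCPU
print(rq.block_current())  # idle: nothing left to run
```

With only A runnable, repeated yields keep picking A, which is the busy
looping case. As soon as B wakes up, A's next yield hands the pCPU to B,
and idle is selected only once both vCPUs have blocked.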

Regards,
Dario
-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)