From mboxrd@z Thu Jan  1 00:00:00 1970
From: Boris Ostrovsky
Subject: Re: [PATCH v3 4/9] mm: Scrub memory from idle loop
Date: Thu, 11 May 2017 13:05:10 -0400
To: Dario Faggioli, xen-devel@lists.xen.org
Cc: sstabellini@kernel.org, wei.liu2@citrix.com, George.Dunlap@eu.citrix.com,
 andrew.cooper3@citrix.com, ian.jackson@eu.citrix.com, tim@xen.org,
 jbeulich@suse.com
References: <1492184258-3277-1-git-send-email-boris.ostrovsky@oracle.com>
 <1492184258-3277-5-git-send-email-boris.ostrovsky@oracle.com>
 <1494498375.7393.5.camel@citrix.com>
 <34644cef-567d-b558-c580-5ed7d8116fa5@oracle.com>
 <1494517737.7393.7.camel@citrix.com>
In-Reply-To: <1494517737.7393.7.camel@citrix.com>
List-Id: xen-devel@lists.xenproject.org

On 05/11/2017 11:48 AM, Dario Faggioli wrote:
> On Thu, 2017-05-11 at 10:19 -0400, Boris Ostrovsky wrote:
>>>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>>>> index 90e2b1f..a5f62b5 100644
>>>> --- a/xen/arch/x86/domain.c
>>>> +++ b/xen/arch/x86/domain.c
>>>> @@ -118,7 +118,8 @@ static void idle_loop(void)
>>>>  {
>>>>      if ( cpu_is_offline(smp_processor_id()) )
>>>>          play_dead();
>>>> -    (*pm_idle)();
>>>> +    if ( !scrub_free_pages() )
>>>> +        (*pm_idle)();
>>>>      do_tasklet();
>>>>
>>> This means that, if we got here to run a tasklet (as in, if the idle
>>> vCPU has been forced into execution because there was a vCPU-context
>>> tasklet wanting to run), we will (potentially) do some scrubbing
>>> first.
>>>
>> We can move do_tasklet() above scrub_free_pages(). A new tasklet
>> queued after that would result in a softirq being set, so we'd do an
>> early exit from scrub_free_pages().
>>
> How early?
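By "early" I mean at page granularity. Here is a toy model (not Xen code; the names and the one-page work unit are made up for illustration) of a scrub loop that re-checks for pending softirqs before each page, so it bails out before touching the next page once a softirq is raised:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real per-cpu state. */
static bool softirq_is_pending;
static int pages_dirty;
static int pages_scrubbed;

/*
 * Sketch of the early-exit behaviour: scrub one page at a time,
 * re-checking for pending softirqs before each page.  Returns true
 * if scrubbing work remains (i.e. we bailed out early).
 */
static bool scrub_free_pages_model(void)
{
    while (pages_dirty > 0) {
        if (softirq_is_pending)
            return true;        /* bail before the next page */
        pages_dirty--;
        pages_scrubbed++;
    }
    return false;               /* all clean; safe to enter pm_idle */
}
```

So at most one page's worth of work happens after the softirq is raised, depending on where in the loop we are when it arrives.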
>
> In fact, right now, if there is one tasklet queued, this is what
> happens:
>
>   tasklet_schedule(t)
>     tasklet_enqueue(t)
>       test_and_set_bit(_TASKLET_enqueued, tasklet_work_to_do);
>       raise_softirq(SCHEDULE_SOFTIRQ);
>   schedule()
>     set_bit(_TASKLET_scheduled, tasklet_work_to_do)
>     tasklet_work_scheduled = 1;
>     do_schedule(tasklet_work_scheduled)
>       csched_schedule(tasklet_work_to_do)
>         snext = CSCHED_VCPU(idle_vcpu[cpu]);
>   idle_loop()
>     (*pm_idle)()
>       if ( !cpu_is_haltable() ) return;
>     do_tasklet() /* runs tasklet t */
>       clear_bit(_TASKLET_enqueued, work_to_do); /* list_empty(list) == true */
>       raise_softirq(SCHEDULE_SOFTIRQ);
>   do_softirq()
>     schedule()
>       clear_bit(_TASKLET_scheduled, tasklet_work);
>       tasklet_work_scheduled = 0;
>       do_schedule(tasklet_work_scheduled)
>         csched_schedule(tasklet_work_to_do)
>           snext = CSCHED_VCPU(idle_vcpu[cpu]);
>   idle_loop()
>     (*pm_idle)()
>       if ( !cpu_is_haltable() )
>   ...
>
> If we move do_tasklet up, as you suggest, this is what happens:
>
>   tasklet_schedule(t)
>     tasklet_enqueue(t)
>       test_and_set_bit(_TASKLET_enqueued, tasklet_work_to_do);
>       raise_softirq(SCHEDULE_SOFTIRQ);
>   schedule()
>     set_bit(_TASKLET_scheduled, tasklet_work_to_do)
>     tasklet_work_scheduled = 1;
>     do_schedule(tasklet_work_scheduled)
>       csched_schedule(tasklet_work_to_do)
>         snext = CSCHED_VCPU(idle_vcpu[cpu]);
>   idle_loop()
>     do_tasklet() /* runs tasklet t */
>       clear_bit(_TASKLET_enqueued, work_to_do); /* list_empty(list) == true */
>       raise_softirq(SCHEDULE_SOFTIRQ);
>     if ( !scrub_free_pages() )
>       /* does some scrubbing, but softirq_pending() is true, so returns 1 */
>   do_softirq()
>     schedule()
>       clear_bit(_TASKLET_scheduled, tasklet_work);
>       tasklet_work_scheduled = 0;
>       do_schedule(tasklet_work_scheduled)
>         csched_schedule(tasklet_work_to_do)
>           snext = CSCHED_VCPU(idle_vcpu[cpu]);
>   idle_loop()
>     if ( !scrub_free_pages() )
>       /* does the scrubbing, returns 0, so we enter the if */
>     (*pm_idle)()
>       if ( !cpu_is_haltable() )
>   ...
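If it helps, here is how I read your second trace as a toy loop (again invented names, not the real code): each schedule() -> idle_loop() trip runs one tasklet, which re-raises the softirq, so scrub_free_pages() gets only a slice of work per trip. But it does do that slice, which I take to be your point:

```c
#include <assert.h>
#include <stdbool.h>

static int tasklets_queued;
static bool softirq_is_pending;
static int pages_dirty, pages_scrubbed, idle_trips;

/* Scrub variant that notices the softirq only *after* doing a page of
 * work; that is what makes a slice of scrubbing happen on each trip. */
static bool scrub_slice(void)
{
    while (pages_dirty > 0) {
        pages_dirty--;
        pages_scrubbed++;
        if (softirq_is_pending)
            return true;        /* work remains */
    }
    return false;
}

static void do_tasklet_model(void)
{
    if (tasklets_queued > 0) {
        tasklets_queued--;
        softirq_is_pending = true;  /* SCHEDULE_SOFTIRQ re-raised */
    }
}

/* One schedule() -> idle_loop() trip with do_tasklet() moved up. */
static void idle_trip(void)
{
    softirq_is_pending = false;     /* consumed on entry to schedule() */
    do_tasklet_model();
    if (!scrub_slice())
        ;                           /* (*pm_idle)() would run here */
    idle_trips++;
}
```

With three tasklets queued, three trips through this loop scrub three pages, even though the CPU was never actually idle.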
>
> IOW (provided I'm understanding your code right, of course), I still
> see it happening that we switch to idle *not* because the system is
> idle, but to run a tasklet, and yet we end up doing at least some
> scrubbing (as if the system were idle).
>
> Which still feels wrong to me.
>
> If there is more than one tasklet queued (or another one, or more,
> is/are queued before tasklet t is processed), it's even worse, because
> we go through the whole schedule()->idle_loop()->do_tasklet() again
> and again, and at each step we do a bit of scrubbing before going
> back to schedule().
>
> It probably would be at least a bit better if scrub_free_pages()
> checked for softirqs _before_ starting any scrubbing (which I don't
> think it does right now, am I right?).

Right. I didn't realize that do_tasklet() also raises a softirq.

So you are suggesting something along the lines of

    do_tasklet();
    if ( !softirq_pending(smp_processor_id()) && !scrub_free_pages() )
        (*pm_idle)();
    do_softirq();

>
>> OTOH, since, as you say, we only get to the idle loop if no tasklet
>> is pending (the cpu_is_haltable() test), would even that be needed?
>>
> Err... sorry, not getting it. It's the other way round: one of the
> reasons why we end up executing idle_loop() is that there is at least
> a tasklet pending.

Never mind that. I was thinking we enter idle_loop() based on
cpu_is_haltable().

-boris

>
> What we only get to if there's nothing pending is calling
> (*pm_idle)().
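To make sure we mean the same thing, here is a toy model of that gated ordering (hypothetical names again): when the trip into idle happened only to run a tasklet, the already-pending softirq makes us skip scrub_free_pages() entirely, and we scrub only on a genuinely idle pass.

```c
#include <assert.h>
#include <stdbool.h>

static bool softirq_is_pending;
static int pages_dirty, pages_scrubbed, pm_idle_calls;

/* Scrub model that checks for softirqs before each page of work. */
static bool scrub_model(void)
{
    while (pages_dirty > 0) {
        if (softirq_is_pending)
            return true;
        pages_dirty--;
        pages_scrubbed++;
    }
    return false;
}

/* Idle-loop body with the softirq_pending() gate in front. */
static void idle_body_gated(bool tasklet_ran)
{
    if (tasklet_ran)
        softirq_is_pending = true;  /* do_tasklet() re-raised it */
    if (!softirq_is_pending && !scrub_model())
        pm_idle_calls++;            /* models (*pm_idle)() */
}
```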
>
> Regards,
> Dario