From mboxrd@z Thu Jan  1 00:00:00 1970
From: Boris Ostrovsky
Subject: Re: [PATCH v3 4/9] mm: Scrub memory from idle loop
Date: Thu, 11 May 2017 13:05:10 -0400
To: Dario Faggioli, xen-devel@lists.xen.org
Cc: sstabellini@kernel.org, wei.liu2@citrix.com, George.Dunlap@eu.citrix.com,
 andrew.cooper3@citrix.com, ian.jackson@eu.citrix.com, tim@xen.org,
 jbeulich@suse.com
References: <1492184258-3277-1-git-send-email-boris.ostrovsky@oracle.com>
 <1492184258-3277-5-git-send-email-boris.ostrovsky@oracle.com>
 <1494498375.7393.5.camel@citrix.com>
 <34644cef-567d-b558-c580-5ed7d8116fa5@oracle.com>
 <1494517737.7393.7.camel@citrix.com>
In-Reply-To: <1494517737.7393.7.camel@citrix.com>
List-Id: xen-devel@lists.xenproject.org

On 05/11/2017 11:48 AM, Dario Faggioli wrote:
> On Thu, 2017-05-11 at 10:19 -0400, Boris Ostrovsky wrote:
>>>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>>>> index 90e2b1f..a5f62b5 100644
>>>> --- a/xen/arch/x86/domain.c
>>>> +++ b/xen/arch/x86/domain.c
>>>> @@ -118,7 +118,8 @@ static void idle_loop(void)
>>>>  {
>>>>      if ( cpu_is_offline(smp_processor_id()) )
>>>>          play_dead();
>>>> -    (*pm_idle)();
>>>> +    if ( !scrub_free_pages() )
>>>> +        (*pm_idle)();
>>>>      do_tasklet();
>>>>
>>> This means that, if we got here to run a tasklet (as in, if the idle
>>> vCPU has been forced into execution because there was a vCPU-context
>>> tasklet wanting to run), we will (potentially) do some scrubbing
>>> first.
>>>
>> We can move do_tasklet() above scrub_free_pages(). A new tasklet
>> queued after that would result in a softirq being set, so we'd do an
>> early exit from scrub_free_pages().
>>
> How early?
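By "early" I mean at page granularity. Here is a toy model (not Xen code; the names and the one-page work unit are made up for illustration) of a scrub loop that re-checks for pending softirqs before each page, so it bails out before touching the next page once a softirq is raised:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real per-cpu state. */
static bool softirq_is_pending;
static int pages_dirty;
static int pages_scrubbed;

/*
 * Sketch of the early-exit behaviour: scrub one page at a time,
 * re-checking for pending softirqs before each page.  Returns true
 * if scrubbing work remains (i.e. we bailed out early).
 */
static bool scrub_free_pages_model(void)
{
    while (pages_dirty > 0) {
        if (softirq_is_pending)
            return true;        /* bail before the next page */
        pages_dirty--;
        pages_scrubbed++;
    }
    return false;               /* all clean; safe to enter pm_idle */
}
```

So at most one page's worth of work happens after the softirq is raised, depending on where in the loop we are when it arrives.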
>
> In fact, right now, if there is one tasklet queued, this is what
> happens:
>
>   tasklet_schedule(t)
>     tasklet_enqueue(t)
>       test_and_set_bit(_TASKLET_enqueued, tasklet_work_to_do);
>       raise_softirq(SCHEDULE_SOFTIRQ);
>   schedule()
>     set_bit(_TASKLET_scheduled, tasklet_work_to_do)
>     tasklet_work_scheduled = 1;
>     do_schedule(tasklet_work_scheduled)
>       csched_schedule(tasklet_work_to_do)
>         snext = CSCHED_VCPU(idle_vcpu[cpu]);
>   idle_loop()
>     (*pm_idle)()
>       if ( !cpu_is_haltable() ) return;
>     do_tasklet() /* runs tasklet t */
>       clear_bit(_TASKLET_enqueued, work_to_do); /* list_empty(list) == true */
>       raise_softirq(SCHEDULE_SOFTIRQ);
>   do_softirq()
>     schedule()
>       clear_bit(_TASKLET_scheduled, tasklet_work);
>       tasklet_work_scheduled = 0;
>       do_schedule(tasklet_work_scheduled)
>         csched_schedule(tasklet_work_to_do)
>           snext = CSCHED_VCPU(idle_vcpu[cpu]);
>   idle_loop()
>     (*pm_idle)()
>       if ( !cpu_is_haltable() )
>   ...
>
> If we move do_tasklet up, as you suggest, this is what happens:
>
>   tasklet_schedule(t)
>     tasklet_enqueue(t)
>       test_and_set_bit(_TASKLET_enqueued, tasklet_work_to_do);
>       raise_softirq(SCHEDULE_SOFTIRQ);
>   schedule()
>     set_bit(_TASKLET_scheduled, tasklet_work_to_do)
>     tasklet_work_scheduled = 1;
>     do_schedule(tasklet_work_scheduled)
>       csched_schedule(tasklet_work_to_do)
>         snext = CSCHED_VCPU(idle_vcpu[cpu]);
>   idle_loop()
>     do_tasklet() /* runs tasklet t */
>       clear_bit(_TASKLET_enqueued, work_to_do); /* list_empty(list) == true */
>       raise_softirq(SCHEDULE_SOFTIRQ);
>     if ( !scrub_free_pages() )
>       /* does some scrubbing, but softirq_pending() is true, so returns 1 */
>   do_softirq()
>     schedule()
>       clear_bit(_TASKLET_scheduled, tasklet_work);
>       tasklet_work_scheduled = 0;
>       do_schedule(tasklet_work_scheduled)
>         csched_schedule(tasklet_work_to_do)
>           snext = CSCHED_VCPU(idle_vcpu[cpu]);
>   idle_loop()
>     if ( !scrub_free_pages() )
>       /* does the scrubbing, returns 0, so we enter the if */
>     (*pm_idle)()
>       if ( !cpu_is_haltable() )
>   ...
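If it helps, here is how I read your second trace as a toy loop (again invented names, not the real code): each schedule() -> idle_loop() trip runs one tasklet, which re-raises the softirq, so scrub_free_pages() gets only a slice of work per trip. But it does do that slice, which I take to be your point:

```c
#include <assert.h>
#include <stdbool.h>

static int tasklets_queued;
static bool softirq_is_pending;
static int pages_dirty, pages_scrubbed, idle_trips;

/* Scrub variant that notices the softirq only *after* doing a page of
 * work; that is what makes a slice of scrubbing happen on each trip. */
static bool scrub_slice(void)
{
    while (pages_dirty > 0) {
        pages_dirty--;
        pages_scrubbed++;
        if (softirq_is_pending)
            return true;        /* work remains */
    }
    return false;
}

static void do_tasklet_model(void)
{
    if (tasklets_queued > 0) {
        tasklets_queued--;
        softirq_is_pending = true;  /* SCHEDULE_SOFTIRQ re-raised */
    }
}

/* One schedule() -> idle_loop() trip with do_tasklet() moved up. */
static void idle_trip(void)
{
    softirq_is_pending = false;     /* consumed on entry to schedule() */
    do_tasklet_model();
    if (!scrub_slice())
        ;                           /* (*pm_idle)() would run here */
    idle_trips++;
}
```

With three tasklets queued, three trips through this loop scrub three pages, even though the CPU was never actually idle.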
>
> IOW (provided I'm understanding your code right, of course), I still
> see it happening that we switch to idle *not* because the system is
> idle, but to run a tasklet, and yet we end up doing at least some
> scrubbing (as if the system were idle).
>
> Which still feels wrong to me.
>
> If there is more than one tasklet queued (or another one, or more,
> is/are queued before tasklet t is processed), it's even worse, because
> we go through the whole schedule()->idle_loop()->do_tasklet() again
> and again, and at each step we do a bit of scrubbing before going
> back to schedule().
>
> It probably would be at least a bit better if scrub_free_pages()
> checked for softirqs _before_ starting any scrubbing (which I don't
> think it does right now, am I right?).

Right. I didn't realize that do_tasklet() also raises a softirq.

So you are suggesting something along the lines of

    do_tasklet();
    if ( !softirq_pending(smp_processor_id()) && !scrub_free_pages() )
        (*pm_idle)();
    do_softirq();

>
>> OTOH, since, as you say, we only get to the idle loop if no tasklet
>> is pending (the cpu_is_haltable() test), would even that be needed?
>>
> Err... sorry, not getting it. It's the other way round: one of the
> reasons why we end up executing idle_loop() is that there is at least
> a tasklet pending.

Never mind that. I was thinking we enter idle_loop() based on
cpu_is_haltable().

-boris

>
> What we only get to if there's nothing pending is calling
> (*pm_idle)().
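To make sure we mean the same thing, here is a toy model of that gated ordering (hypothetical names again): when the trip into idle happened only to run a tasklet, the already-pending softirq makes us skip scrub_free_pages() entirely, and we scrub only on a genuinely idle pass.

```c
#include <assert.h>
#include <stdbool.h>

static bool softirq_is_pending;
static int pages_dirty, pages_scrubbed, pm_idle_calls;

/* Scrub model that checks for softirqs before each page of work. */
static bool scrub_model(void)
{
    while (pages_dirty > 0) {
        if (softirq_is_pending)
            return true;
        pages_dirty--;
        pages_scrubbed++;
    }
    return false;
}

/* Idle-loop body with the softirq_pending() gate in front. */
static void idle_body_gated(bool tasklet_ran)
{
    if (tasklet_ran)
        softirq_is_pending = true;  /* do_tasklet() re-raised it */
    if (!softirq_is_pending && !scrub_model())
        pm_idle_calls++;            /* models (*pm_idle)() */
}
```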
>
> Regards,
> Dario