All of lore.kernel.org
 help / color / mirror / Atom feed
From: Valentin Schneider <valentin.schneider@arm.com>
To: Tao Zhou <ouwen210@hotmail.com>
Cc: linux-kernel@vger.kernel.org, tglx@linutronix.de,
	mingo@kernel.org, bigeasy@linutronix.de, qais.yousef@arm.com,
	swood@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	bristot@redhat.com, vincent.donnefort@arm.com, tj@kernel.org,
	ouwen210@hotmail.com
Subject: Re: [RFC PATCH] sched/core: Fix premature p->migration_pending completion
Date: Thu, 28 Jan 2021 18:56:01 +0000	[thread overview]
Message-ID: <jhj4kj028n2.mognet@arm.com> (raw)
In-Reply-To: <BN8PR12MB2978A9A4435A01EDC97D27E89ABA9@BN8PR12MB2978.namprd12.prod.outlook.com>

On 29/01/21 01:02, Tao Zhou wrote:
> Hi,
>
> On Wed, Jan 27, 2021 at 07:30:35PM +0000, Valentin Schneider wrote:
>
>> Fiddling some more with a TLA+ model of set_cpus_allowed_ptr() & friends
>> unearthed one more outstanding issue. This doesn't even involve
>> migrate_disable(), but rather affinity changes and execution of the stopper
>> racing with each other.
>>
>> My own interpretation of the (lengthy) TLA+ splat (note the potential for
>> errors at each level) is:
>>
>>   Initial conditions:
>>     victim.cpus_mask = {CPU0, CPU1}
>>
>>   CPU0                             CPU1                             CPU<don't care>
>>
>>   switch_to(victim)
>>                                                                  set_cpus_allowed(victim, {CPU1})
>>                                                                    kick CPU0 migration_cpu_stop({.dest_cpu = CPU1})
>>   switch_to(stopper/0)
>>                                                                  // e.g. CFS load balance
>>                                                                  move_queued_task(CPU0, victim, CPU1);
>                                     ^^^^^^^^^^^^^^^^
>
> Why is move_queued_task() not attach_task()/detach_task() for CFS load..
>

Heh I expected that one; it is indeed detach_task()/attach_task() for CFS
LB. I didn't want to make this any longer than it needed to, and I figured
that move_queued_task() being a composition of detach_task(), attach_task() and
rq_locks, this would get the point across.

This does raise an "interesting" point that ATM I think this issue cannot
actually involve move_queued_task(), since all current move_queued_task()
callsites are issued either from a stopper or from set_cpus_allowed_ptr().

CFS' detach_task() + attach_task() could do it, though.

>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 06b449942adf..b57326b0a742 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -1923,20 +1923,28 @@ static int migration_cpu_stop(void *data)
>>                      complete = true;
>>              }
>>
>> -		/* migrate_enable() --  we must not race against SCA */
>> -		if (dest_cpu < 0) {
>> -			/*
>> -			 * When this was migrate_enable() but we no longer
>> -			 * have a @pending, a concurrent SCA 'fixed' things
>> -			 * and we should be valid again. Nothing to do.
>> -			 */
>> -			if (!pending) {
>> -				WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), &p->cpus_mask));
>> -				goto out;
>> -			}
>> +	       /*
>> +		* When this was migrate_enable() but we no longer
>> +		* have a @pending, a concurrent SCA 'fixed' things
>> +		* and we should be valid again.
>> +		*
>> +		* This can also be a stopper invocation that was 'fixed' by an
>> +		* earlier one.
>> +		*
>> +		* Nothing to do.
>> +		*/
>> +		if ((dest_cpu < 0 || dest_cpu == cpu_of(rq)) && !pending) {
>
> When the condition 'dest_cpu == cpu_of(rq)' is true, pending is not NULL.
> The condition may be like this:
>
>     if ((dest_cpu < 0 && !pending) || dest_cpu == cpu_of(rq))
>
> We want to choose one cpu in the new(currently modified) cpu_mask and
> complete all.
>

Consider the execution of migration_cpu_stop() in above trace with
migrate_task_to(). We do have:
- dest_cpu == cpu_of(rq)
- p->migration_pending

but we do *not* want to bail out at this condition, because we need to fix
up dest_cpu.

  parent reply	other threads:[~2021-01-28 19:02 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-27 19:30 [RFC PATCH] sched/core: Fix premature p->migration_pending completion Valentin Schneider
     [not found] ` <BN8PR12MB2978A9A4435A01EDC97D27E89ABA9@BN8PR12MB2978.namprd12.prod.outlook.com>
2021-01-28 18:56   ` Valentin Schneider [this message]
2021-02-03 17:23 ` Qais Yousef
2021-02-03 18:59   ` Valentin Schneider
2021-02-04 15:30     ` Qais Yousef
2021-02-05 11:02       ` Valentin Schneider

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jhj4kj028n2.mognet@arm.com \
    --to=valentin.schneider@arm.com \
    --cc=bigeasy@linutronix.de \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@kernel.org \
    --cc=ouwen210@hotmail.com \
    --cc=peterz@infradead.org \
    --cc=qais.yousef@arm.com \
    --cc=rostedt@goodmis.org \
    --cc=swood@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=vincent.donnefort@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.