Linux-mediatek Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Jing-Ting Wu <jing-ting.wu@mediatek.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Valentin Schneider <vschneid@redhat.com>, <tglx@linutronix.de>
Cc: <wsd_upstream@mediatek.com>, <linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-mediatek@lists.infradead.org>,
	 <Jonathan.JMChen@mediatek.com>,
	"chris.redpath@arm.com" <chris.redpath@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Vincent Donnefort <vdonnefort@gmail.com>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, "Mel Gorman" <mgorman@suse.de>,
	Christian Brauner <brauner@kernel.org>
Subject: [Bug] Race condition between CPU hotplug off flow and __sched_setscheduler()
Date: Thu, 19 May 2022 20:53:15 +0800	[thread overview]
Message-ID: <4a0aa13c99ffd6aea6426f83314aa2a91bc8933f.camel@mediatek.com> (raw)

Hi all


There is a race condition between CPU hotplug off flow and
__sched_setscheduler(), which will cause hang-up in CPU hotplug off
flow.

Syndrome:
During hotplug off flow in CPU_A, it blocks on
CPUHP_AP_SCHED_WAIT_EMPTY state when enters rcuwait_wait_event().
In that moment, CPU_A stays in idle and cannot wake up stopper
thread(cpuhp/A) to continue CPU_A hotplug off flow.

Root cause:
Balance_push() callback has been stolen by CPU_B in executing
__sched_setscheduler() func., which should be executed in idle task of
CPU_A to wake up stopper thread (cpuhp/A) through calling
rcuwait_wake_up(&rq->hotplug_wait).

Racing flow as below:
CPU_A is going to hotplug off and set rq->balance_callback =
&balance_push_callback, then CPU_A should use balance_push() to push
the task out and release rq_lock.
But if CPU_B do __sched_setscheduler() before CPU_A switch to
swapper/A, CPU_B use splice_balance_callbacks() to steal rq-
>balance_callback and set the CPU_A rq->balance_callback = NULL.
Due to rq->balance_callback is NULL,
so swapper/A could not do balance_push() at CPU_A,
Due to rq(rq_A) != this_rq(rq_B),
so swapper/A could not do rcuwait_wake_up() at CPU_B.


Racing flow:
-----------------------------------------------------------------------
CPU_A (Hotplug down)                          
-----------------------------------------------------------------------
State: CPUHP_AP_ACTIVE
sched_cpu_deactivate()
-> balance_push_set(cpu, true)
   -> rq_A->balance_callback = &balance_push_callback
      => CPU_A set rq_A balance_callback here.
                                
State: CPUHP_AP_SCHED_WAIT_EMPTY
sched_cpu_wait_empty()
-> balance_hotplug_wait()
   -> rcuwait_wait_event(&rq->hotplug_wait)
      => CPU_A do while loop to push task out from CPU_A,
         until swapper/A wake up cpuhp/A.
      -> schedule()
         -> rq_lock(rq, &rf)
            -> context_switch() 
              -> finish_lock_switch()
                 -> __balance_callbacks(rq_A)
                    -> do_balance_callbacks(rq,
                                      splice_balance_callbacks(rq))
                       -> balance_push(rq_A)
                 -> raw_spin_rq_unlock_irq(rq_A)
                    => CPU_A release rq_A lock.

CPU_A release rq_A lock, CPU_B can get rq_A lock.

-----------------------------------------------------------------------
CPU_B (do __sched_setscheduler(), set rq_A->balance_callback = NULL)
-----------------------------------------------------------------------
__sched_setscheduler(p) => task_rq(p) is rq_A
-> task_rq_lock(rq_A)
  -> splice_balance_callbacks(rq_A)
     -> if (head)
           rq_A->balance_callback = NULL
           => CPU_B steal rq_A->balance_callback.
     -> task_rq_unlock(rq_A)


CPU_B release rq_A lock, CPU_A can get rq_A lock and switch to
swapper/A.

-----------------------------------------------------------------------
CPU_A (Hotplug down)                           
-----------------------------------------------------------------------
switch to swapper/A:
schedule()
-> rq_lock(rq, &rf)
   -> context_switch()                                     
      -> finish_lock_switch()
         -> __balance_callbacks(rq_A)
            -> do_balance_callbacks(rq, NULL) 
               => Because rq_A->balance_callback = NULL,
                  swapper/A could not do rcuwait_wake_up().
         -> raw_spin_rq_unlock_irq(rq_A)   

-----------------------------------------------------------------------
CPU_B (do __sched_setscheduler(), set rq_A->balance_callback = NULL) 
-----------------------------------------------------------------------
balance_callbacks(rq_A, head)
-> balance_push(rq_A)
   -> rq->balance_callback = &balance_push_callback;
     -> if (rq != this_rq())
           return;
           => Because rq = rq_A, this_rq = rq_B,
              swapper/A could not do rcuwait_wake_up().

-----------------------------------------------------------------------
CPU_A (Hotplug down)                           
-----------------------------------------------------------------------
rcuwait_wait_event(&rq->hotplug_wait)
=> swapper/A could not do rcuwait_wake_up(),
   it cannot wake up stopper thread(cpuhp/A),
   so system could not exit the while loop at rcuwait_wait_event.




Do you have any suggestion or solution for this issue?
Thank you.



Best regards,
Jing-Ting Wu


_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

             reply	other threads:[~2022-05-19 13:01 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-19 12:53 Jing-Ting Wu [this message]
2022-05-19 13:14 ` [Bug] Race condition between CPU hotplug off flow and __sched_setscheduler() Peter Zijlstra
2022-05-19 13:19   ` Peter Zijlstra
2022-05-19 13:47 ` Peter Zijlstra
2022-05-23  7:12   ` [SPAM]Re: " Jing-Ting Wu
2022-05-26  5:57     ` Jing-Ting Wu
2022-06-02 16:15       ` Jing-Ting Wu
2022-06-07 20:40         ` Peter Zijlstra
2022-06-07 21:39           ` [PATCH] sched: Fix balance_push() vs __sched_setscheduler() Peter Zijlstra
2022-06-08 14:16             ` Jing-Ting Wu
2022-06-08 14:48               ` Peter Zijlstra
     [not found] <6013fb81cdb18892b75c796ab0e6756ee0e9cf71.camel@mediatek.com>
     [not found] ` <a34049745934652483e3958e0a030a45b6fcfb40.camel@mediatek.com>
2022-05-06  7:47   ` [Bug] Race condition between CPU hotplug off flow and __sched_setscheduler() Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4a0aa13c99ffd6aea6426f83314aa2a91bc8933f.camel@mediatek.com \
    --to=jing-ting.wu@mediatek.com \
    --cc=Jonathan.JMChen@mediatek.com \
    --cc=brauner@kernel.org \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=chris.redpath@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=vdonnefort@gmail.com \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=wsd_upstream@mediatek.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox