From: Neeraj Upadhyay <neeraju@codeaurora.org>
To: paulmck@linux.vnet.ibm.com, josh@joshtriplett.org,
	rostedt@goodmis.org, mathieu.desnoyers@efficios.com,
	jiangshanlai@gmail.com
Cc: linux-kernel@vger.kernel.org, sramana@codeaurora.org,
	prsood@codeaurora.org
Subject: Query regarding synchronize_sched_expedited and resched_cpu
Date: Fri, 15 Sep 2017 16:44:38 +0530
Message-ID: <8f33e48e-ac6d-2c88-e16f-20b698c06292@codeaurora.org>

Hi,

We have a question about the behavior of RCU expedited grace periods
in the scenario where resched_cpu(), called from sync_sched_exp_handler(),
fails to acquire the rq lock and returns without setting need_resched.
In this case, how do we ensure that the CPU notifies RCU about the end
of the sched grace period (schedule() -> __schedule() ->
rcu_note_context_switch(cpu) -> rcu_sched_qs()) when the tick is
stopped on that CPU? Is it implied by the rq lock acquisition failure
that the owner of the rq lock will enforce a context switch?

Also, for which scenarios in the RCU paths (the function is used only
in RCU code) do we need the trylock check in resched_cpu()?

void resched_cpu(int cpu)
{
	struct rq *rq = cpu_rq(cpu);
	unsigned long flags;

	if (!raw_spin_trylock_irqsave(&rq->lock, flags))
		return;
	resched_curr(rq);
	raw_spin_unlock_irqrestore(&rq->lock, flags);
}
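
To make the concern concrete, here is a minimal user-space sketch of the
trylock-and-return pattern. It is only a model under stated assumptions:
struct model_rq, model_rq_trylock(), model_rq_unlock(), model_resched_cpu()
and model_request_resched() are hypothetical names, not kernel APIs. A C11
atomic_flag stands in for rq->lock and a plain bool stands in for the
need_resched flag that resched_curr() would set.

```c
/* User-space model only: all names below are hypothetical stand-ins. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_rq {
	atomic_flag lock;	/* stands in for rq->lock */
	bool need_resched;	/* stands in for need_resched on rq->curr */
};

static bool model_rq_trylock(struct model_rq *rq)
{
	/* test_and_set returns the old value: false means we took the lock. */
	return !atomic_flag_test_and_set(&rq->lock);
}

static void model_rq_unlock(struct model_rq *rq)
{
	atomic_flag_clear(&rq->lock);
}

/* Same shape as resched_cpu(): return silently if the lock is contended. */
static void model_resched_cpu(struct model_rq *rq)
{
	if (!model_rq_trylock(rq))
		return;			/* request dropped, flag stays clear */
	rq->need_resched = true;	/* stands in for resched_curr(rq) */
	model_rq_unlock(rq);
}

/*
 * Issue one resched request and return the resulting need_resched value.
 * If lock_held_elsewhere is true, the lock is taken first to model another
 * context owning rq->lock when the request arrives.
 */
static bool model_request_resched(bool lock_held_elsewhere)
{
	struct model_rq rq = { .lock = ATOMIC_FLAG_INIT, .need_resched = false };

	if (lock_held_elsewhere) {
		bool got = model_rq_trylock(&rq);

		assert(got);	/* the fresh lock must be free */
		(void)got;
	}
	model_resched_cpu(&rq);
	return rq.need_resched;
}
```

In this model, model_request_resched(false) leaves the flag set, while
model_request_resched(true) leaves it clear, and the caller gets no
indication that the request was dropped.
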

This issue was observed in the scenario below, where one of the CPUs
(CPU1) started synchronize_sched_expedited() and sent an IPI to CPU5,
which was in the idle path but handled the sync_sched_exp_handler() IPI
before rcu_idle_enter(). As resched_cpu() failed to acquire the rq lock,
need_resched was not set and CPU5 went idle, resulting in an expedited
stall being reported by CPU1.

Below is the scenario:

•    CPU1 is waiting for the expedited grace period to complete:

sync_rcu_exp_select_cpus
     rdp->exp_dynticks_snap & 0x1   // returns 1 for CPU5
     IPI sent to CPU5

synchronize_sched_expedited_wait
     ret = swait_event_timeout(rsp->expedited_wq,
                               sync_rcu_preempt_exp_done(rnp_root),
                               jiffies_stall);
     expmask = 0x20, and CPU5 is in the idle path (in cpuidle_enter())

•    CPU5 handles the IPI and fails to acquire the rq lock:

Handles IPI
     sync_sched_exp_handler
         resched_cpu
             returns because the trylock on rq->lock fails
         need_resched is not set

•    CPU5 calls rcu_idle_enter() and, as need_resched is not set, goes
     idle (schedule() is not called).

•    CPU1 reports an RCU stall.

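
The sequence above can be written down as a small sequential model. This
is a sketch of the scenario as described in this mail, not of the real RCU
state machine: struct model_cpu, model_exp_handler(), model_idle_entry()
and model_scenario() are hypothetical names, and the assumption that a set
need_resched always leads to schedule() and a quiescent-state report is a
simplification.

```c
/* User-space model only: all names below are hypothetical stand-ins. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct model_cpu {
	bool rq_lock_held;	/* rq->lock is held when the IPI arrives */
	bool need_resched;
};

/* IPI handler step: same trylock-and-return shape as resched_cpu(). */
static void model_exp_handler(struct model_cpu *cpu)
{
	if (cpu->rq_lock_held)
		return;		/* trylock fails, nothing is set */
	cpu->need_resched = true;
}

/*
 * Idle entry step. Modeling assumption: only a set need_resched leads to
 * schedule() and a quiescent-state report, which clears this CPU's bit in
 * the expedited mask. Otherwise the CPU goes idle and the bit stays set.
 */
static uint32_t model_idle_entry(const struct model_cpu *cpu,
				 uint32_t expmask, int cpu_nr)
{
	assert(cpu_nr >= 0 && cpu_nr < 32);
	if (cpu->need_resched)
		expmask &= ~(UINT32_C(1) << cpu_nr);
	return expmask;
}

/* Mask that CPU1 would still be waiting on once CPU5 has reached idle. */
static uint32_t model_scenario(bool rq_lock_held)
{
	struct model_cpu cpu5 = {
		.rq_lock_held = rq_lock_held,
		.need_resched = false,
	};
	uint32_t expmask = UINT32_C(1) << 5;	/* 0x20: only CPU5 pending */

	model_exp_handler(&cpu5);
	return model_idle_entry(&cpu5, expmask, 5);
}
```

With rq_lock_held set, model_scenario() returns 0x20, matching the expmask
seen on CPU1; with the lock free, it returns 0.
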
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
Thread overview: 35+ messages
2017-09-15 11:14 Neeraj Upadhyay [this message]
2017-09-17  1:00 ` Query regarding synchronize_sched_expedited and resched_cpu Paul E. McKenney
2017-09-17  6:07   ` Neeraj Upadhyay
2017-09-18 15:11     ` Steven Rostedt
2017-09-18 16:01       ` Paul E. McKenney
2017-09-18 16:12         ` Steven Rostedt
2017-09-18 16:24           ` Paul E. McKenney
2017-09-18 16:29             ` Steven Rostedt
2017-09-18 16:55               ` Paul E. McKenney
2017-09-18 23:53                 ` Paul E. McKenney
2017-09-19  1:23                   ` Steven Rostedt
2017-09-19  2:26                     ` Paul E. McKenney
2017-09-19  1:50                   ` Byungchul Park
2017-09-19  2:06                     ` Byungchul Park
2017-09-19  2:33                       ` Paul E. McKenney
2017-09-19  2:48                         ` Byungchul Park
2017-09-19  4:04                           ` Paul E. McKenney
2017-09-19  5:37                             ` Boqun Feng
2017-09-19  6:11                               ` Mike Galbraith
2017-09-19  6:53                                 ` Byungchul Park
2017-09-19 13:40                                 ` Paul E. McKenney
2017-09-21 13:57                 ` Peter Zijlstra
2017-09-21 15:33                   ` Paul E. McKenney
2017-09-19  1:55               ` Byungchul Park
2017-09-19 15:31             ` Paul E. McKenney
2017-09-19 15:58               ` Steven Rostedt
2017-09-19 16:12                 ` Paul E. McKenney
2017-09-21 13:59               ` Peter Zijlstra
2017-09-21 16:00                 ` Paul E. McKenney
2017-09-21 16:30                   ` Peter Zijlstra
2017-09-21 16:47                     ` Paul E. McKenney
2017-09-21 13:55       ` Peter Zijlstra
2017-09-21 15:31         ` Paul E. McKenney
2017-09-21 16:18           ` Peter Zijlstra
2017-09-21 15:46         ` Steven Rostedt