From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
    john.stultz@linaro.org, linux-kernel@vger.kernel.org,
    Pratyush Anand <panand@redhat.com>
Subject: Re: RCU stall when using function_graph
Date: Thu, 3 Aug 2017 05:44:21 -0700
Message-ID: <20170803124421.GP3730@linux.vnet.ibm.com>
In-Reply-To: <11d179df-d8a9-5d3e-3bc4-080df464e85d@linaro.org>

On Thu, Aug 03, 2017 at 01:41:11PM +0200, Daniel Lezcano wrote:
> On 02/08/2017 15:07, Steven Rostedt wrote:
> > On Wed, 2 Aug 2017 14:42:39 +0200
> > Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
> >
> >> On Tue, Aug 01, 2017 at 08:12:14PM -0400, Steven Rostedt wrote:
> >>> On Wed, 2 Aug 2017 00:15:44 +0200
> >>> Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
> >>>
> >>>> On 02/08/2017 00:04, Paul E. McKenney wrote:
> >>>>>> Hi Paul,
> >>>>>>
> >>>>>> I have been trying to set the function_graph tracer for ftrace and each time I
> >>>>>> get a CPU stall.
> >>>>>>
> >>>>>> How to reproduce:
> >>>>>> -----------------
> >>>>>>
> >>>>>> echo function_graph > /sys/kernel/debug/tracing/current_tracer
> >>>>>>
> >>>>>> This error appears with v4.13-rc3 and v4.12-rc6.
> >>>
> >>> Can you bisect this? It may be due to this commit:
> >>>
> >>> 0598e4f08 ("ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines")
> >>
> >> Hi Steve,
> >>
> >> I git bisected but each time the issue occurred. I went through the different
> >> versions down to v4.4, where the board was not fully supported, and it ended up
> >> having the same issue.
> >>
> >> Finally, I had the intuition it could be related to the wall time (there is no
> >> RTC clock with battery on the board and the wall time is Jan 1st, 1970).
> >>
> >> Setting up the time with ntpdate solved the problem.
>
> Actually, it did not solve the problem. The function_graph trace is set,
> I can use the system, but after a while (no tracing enabled at any time),
> the stall appears.
>
> >> Even if it is rarely the case that the time is not set, is it normal to have an
> >> RCU CPU stall?
> >>
> >>
> >
> > BTW, function_graph tracer is the most invasive of the tracers. It's 4x
> > slower than function tracer. I'm wondering if the tracer isn't the
> > cause, but just slows things down enough to cause some other race
> > condition that triggers the bug.
>
> Yes, that could be true.
>
> I tried the following scenario:
>
> - cpufreq governor => userspace + max_freq (1.2GHz)
> - function_graph set ==> OK
>
> - cpufreq governor => userspace + min_freq (200MHz)
> - function_graph set ==> RCU stall
>
> Besides that, I realized the board is constantly processing SOF interrupts
> every 124us, which adds more overhead.
>
> After removing USB support, and with it the processing of the SOF
> interrupts, I no longer see the RCU stall.

Looks like Steve called this one! ;-)
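
For a rough sense of why the CPU frequency matters in the scenario above, here is a back-of-the-envelope sketch of the cycle budget between two SOF interrupts. The frequencies and the 124us period are the ones Daniel reported; the helper name is hypothetical, and the sketch says nothing about how many cycles the handler or the tracer actually consume.

```sh
# Rough CPU-cycle budget between two consecutive SOF interrupts.
# cycles_per_interval is a hypothetical helper, not an existing tool.
cycles_per_interval() {
    # $1 = CPU frequency in MHz, $2 = interrupt period in microseconds.
    # 1 MHz is one cycle per microsecond, so MHz * us = cycles.
    echo $(( $1 * $2 ))
}

cycles_per_interval 1200 124   # max_freq case (function_graph OK)
cycles_per_interval 200 124    # min_freq case (RCU stall)
```

At min_freq the budget per interval is one sixth of the max_freq budget, so any fixed per-interrupt cost, inflated further by function_graph, eats a proportionally larger share of the CPU.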

> Is it the expected behavior for the system to hang after an RCU stall
> is raised?

No, but if NMI stack traces are enabled and there are any NMI problems,
bad things can happen. In addition, the bulk of output can cause problems
if you have a slow console connection.
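
To put the slow-console point in numbers, a minimal sketch of how long a burst of output takes to drain over a serial line. It assumes 8N1 framing (10 bits on the wire per byte); the byte count and baud rates are illustrative and were not measured on this board, and the helper name is hypothetical.

```sh
# Approximate time to drain a burst of console output over a serial line.
# console_drain_ms is a hypothetical helper; 8N1 framing is assumed,
# i.e. 10 bits on the wire for every byte printed.
console_drain_ms() {
    # $1 = bytes to print, $2 = baud rate; result in milliseconds
    echo $(( $1 * 10 * 1000 / $2 ))
}

console_drain_ms 4096 115200   # illustrative 4 KiB burst at 115200 baud
console_drain_ms 4096 9600     # same burst at 9600 baud
```

A few kilobytes of stack traces per CPU can therefore occupy a slow console for seconds.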

							Thanx, Paul