* IRQS off tracer - is it useful or not?
@ 2011-06-25 13:21 Russell King - ARM Linux
2011-06-25 13:42 ` murali at embeddedwireless.com
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Russell King - ARM Linux @ 2011-06-25 13:21 UTC (permalink / raw)
To: linux-arm-kernel
I've just been looking at the IRQS off tracer for the first time. I'm
getting output such as:
<idle>-0 0d.s3 0us!: _raw_spin_lock_irqsave <-_raw_spin_lock_irqsave
<idle>-0 0dNs4 1709us+: _raw_spin_unlock_irqrestore <-_raw_spin_unlock_irqrestore
<idle>-0 0dNs4 1770us : time_hardirqs_on <-_raw_spin_unlock_irqrestore
<idle>-0 0dNs4 1770us : <stack trace>
from it, which doesn't seem to be very useful. Figuring out that it
may be because the EABI unwinder doesn't work too well with my toolchain,
I decided to try going for the more reliable frame pointer stuff. This
gives me:
kjournal-423 0d.s4 0us : trace_hardirqs_on <-_raw_spin_unlock_irq
kjournal-423 0d.s4 0us : time_hardirqs_on <-_raw_spin_unlock_irq
kjournal-423 0d.s3 0us!: trace_hardirqs_off <-_raw_spin_lock_irqsave
kjournal-423 0d.s4 1709us+: trace_hardirqs_on <-_raw_spin_unlock_irqrestore
kjournal-423 0d.s4 1770us : time_hardirqs_on <-_raw_spin_unlock_irqrestore
kjournal-423 0d.s4 1770us : <stack trace>
=> time_hardirqs_on
=> trace_hardirqs_on_caller
=> trace_hardirqs_on
=> _raw_spin_unlock_irqrestore
=> cfq_idle_slice_timer
=> run_timer_softirq
=> __do_softirq
=> irq_exit
which is no better. It's telling me that {trace,time}_hardirqs_o{n,ff} is
involved is absurd - of course that function is involved, because that's
how these events are reported and that detail is just not interesting.
And yet again, we still don't get to find out where IRQs were disabled.
So, from what I can see, the irqsoff tracing support is next to useless,
and given that, anyone got a good reason why I should care about its
hooks? If I should care about them, it really needs to be fixed so it
actually provides useful information.
^ permalink raw reply [flat|nested] 8+ messages in thread* IRQS off tracer - is it useful or not? 2011-06-25 13:21 IRQS off tracer - is it useful or not? Russell King - ARM Linux @ 2011-06-25 13:42 ` murali at embeddedwireless.com 2011-06-27 16:26 ` Stephen Boyd ` (2 subsequent siblings) 3 siblings, 0 replies; 8+ messages in thread From: murali at embeddedwireless.com @ 2011-06-25 13:42 UTC (permalink / raw) To: linux-arm-kernel Sent by Maxis from my BlackBerry? smartphone -----Original Message----- From: Russell King - ARM Linux <linux@arm.linux.org.uk> Sender: <linux-arm-kernel-bounces@lists.infradead.org> Date: Sat, 25 Jun 2011 14:21:12 To: <linux-arm-kernel@lists.infradead.org> Subject: IRQS off tracer - is it useful or not? I've just been looking at the IRQS off tracer for the first time. I'm getting output such as: <idle>-0 0d.s3 0us!: _raw_spin_lock_irqsave <-_raw_spin_lock_irqsave <idle>-0 0dNs4 1709us+: _raw_spin_unlock_irqrestore <-_raw_spin_unlock_irqrestore <idle>-0 0dNs4 1770us : time_hardirqs_on <-_raw_spin_unlock_irqrestore <idle>-0 0dNs4 1770us : <stack trace> from it, which doesn't seem to be very useful. Figuring out that it may be because the EABI unwinder doesn't work too well with my toolchain, I decided to try going for the more reliable frame pointer stuff. This gives me: kjournal-423 0d.s4 0us : trace_hardirqs_on <-_raw_spin_unlock_irq kjournal-423 0d.s4 0us : time_hardirqs_on <-_raw_spin_unlock_irq kjournal-423 0d.s3 0us!: trace_hardirqs_off <-_raw_spin_lock_irqsave kjournal-423 0d.s4 1709us+: trace_hardirqs_on <-_raw_spin_unlock_irqrestore kjournal-423 0d.s4 1770us : time_hardirqs_on <-_raw_spin_unlock_irqrestore kjournal-423 0d.s4 1770us : <stack trace> => time_hardirqs_on => trace_hardirqs_on_caller => trace_hardirqs_on => _raw_spin_unlock_irqrestore => cfq_idle_slice_timer => run_timer_softirq => __do_softirq => irq_exit which is no better. It's telling me that {trace,time}_hardirqs_o{n,ff} is involved is absurd - of course that function is involved, because that's how these events are reported and that detail is just not interesting. And yet again, we still don't get to find out where IRQs were disabled. So, from what I can see, the irqsoff tracing support is next to useless, and given that, anyone got a good reason why I should care about its hooks? If I should care about them, it really needs to be fixed so it actually provides useful information. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel at lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 8+ messages in thread
* IRQS off tracer - is it useful or not? 2011-06-25 13:21 IRQS off tracer - is it useful or not? Russell King - ARM Linux 2011-06-25 13:42 ` murali at embeddedwireless.com @ 2011-06-27 16:26 ` Stephen Boyd 2011-06-27 16:54 ` Russell King - ARM Linux 2011-06-27 20:38 ` Todd Poynor 2011-06-28 23:08 ` Frank Rowand 3 siblings, 1 reply; 8+ messages in thread From: Stephen Boyd @ 2011-06-27 16:26 UTC (permalink / raw) To: linux-arm-kernel On 6/25/2011 6:21 AM, Russell King - ARM Linux wrote: > I've just been looking at the IRQS off tracer for the first time. I'm > getting output such as: > > <idle>-0 0d.s3 0us!: _raw_spin_lock_irqsave <-_raw_spin_lock_irqsave > <idle>-0 0dNs4 1709us+: _raw_spin_unlock_irqrestore <-_raw_spin_unlock_irqrestore > <idle>-0 0dNs4 1770us : time_hardirqs_on <-_raw_spin_unlock_irqrestore > <idle>-0 0dNs4 1770us : <stack trace> > > from it, which doesn't seem to be very useful. Figuring out that it > may be because the EABI unwinder doesn't work too well with my toolchain, > I decided to try going for the more reliable frame pointer stuff. This > gives me: > > kjournal-423 0d.s4 0us : trace_hardirqs_on <-_raw_spin_unlock_irq > kjournal-423 0d.s4 0us : time_hardirqs_on <-_raw_spin_unlock_irq > kjournal-423 0d.s3 0us!: trace_hardirqs_off <-_raw_spin_lock_irqsave > kjournal-423 0d.s4 1709us+: trace_hardirqs_on <-_raw_spin_unlock_irqrestore > kjournal-423 0d.s4 1770us : time_hardirqs_on <-_raw_spin_unlock_irqrestore > kjournal-423 0d.s4 1770us : <stack trace> > => time_hardirqs_on > => trace_hardirqs_on_caller > => trace_hardirqs_on > => _raw_spin_unlock_irqrestore > => cfq_idle_slice_timer > => run_timer_softirq > => __do_softirq > => irq_exit > > which is no better. It's telling me that {trace,time}_hardirqs_o{n,ff} is > involved is absurd - of course that function is involved, because that's > how these events are reported and that detail is just not interesting. > And yet again, we still don't get to find out where IRQs were disabled. Is ftrace enabled (/proc/sys/kernel/ftrace_enabled)? If it is you should a least see the functions that were called while irqs were off. There should also be a => started at: func_irq_off => ended at: func_irq_on near the top of the latency trace although it may not be entirely useful unless spinlocks are inlined. Perhaps we should start inlining spinlocks? -- Sent by an employee of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum. ^ permalink raw reply [flat|nested] 8+ messages in thread
* IRQS off tracer - is it useful or not? 2011-06-27 16:26 ` Stephen Boyd @ 2011-06-27 16:54 ` Russell King - ARM Linux 2011-06-27 17:31 ` Nicolas Pitre 2011-06-27 20:17 ` Gilles Chanteperdrix 0 siblings, 2 replies; 8+ messages in thread From: Russell King - ARM Linux @ 2011-06-27 16:54 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jun 27, 2011 at 09:26:11AM -0700, Stephen Boyd wrote: > On 6/25/2011 6:21 AM, Russell King - ARM Linux wrote: > > I've just been looking at the IRQS off tracer for the first time. I'm > > getting output such as: > > > > <idle>-0 0d.s3 0us!: _raw_spin_lock_irqsave <-_raw_spin_lock_irqsave > > <idle>-0 0dNs4 1709us+: _raw_spin_unlock_irqrestore <-_raw_spin_unlock_irqrestore > > <idle>-0 0dNs4 1770us : time_hardirqs_on <-_raw_spin_unlock_irqrestore > > <idle>-0 0dNs4 1770us : <stack trace> > > > > from it, which doesn't seem to be very useful. Figuring out that it > > may be because the EABI unwinder doesn't work too well with my toolchain, > > I decided to try going for the more reliable frame pointer stuff. This > > gives me: > > > > kjournal-423 0d.s4 0us : trace_hardirqs_on <-_raw_spin_unlock_irq > > kjournal-423 0d.s4 0us : time_hardirqs_on <-_raw_spin_unlock_irq > > kjournal-423 0d.s3 0us!: trace_hardirqs_off <-_raw_spin_lock_irqsave > > kjournal-423 0d.s4 1709us+: trace_hardirqs_on <-_raw_spin_unlock_irqrestore > > kjournal-423 0d.s4 1770us : time_hardirqs_on <-_raw_spin_unlock_irqrestore > > kjournal-423 0d.s4 1770us : <stack trace> > > => time_hardirqs_on > > => trace_hardirqs_on_caller > > => trace_hardirqs_on > > => _raw_spin_unlock_irqrestore > > => cfq_idle_slice_timer > > => run_timer_softirq > > => __do_softirq > > => irq_exit > > > > which is no better. It's telling me that {trace,time}_hardirqs_o{n,ff} is > > involved is absurd - of course that function is involved, because that's > > how these events are reported and that detail is just not interesting. > > And yet again, we still don't get to find out where IRQs were disabled. > > Is ftrace enabled (/proc/sys/kernel/ftrace_enabled)? If it is you should > a least see the functions that were called while irqs were off. > > There should also be a > > => started at: func_irq_off > => ended at: func_irq_on > > near the top of the latency trace although it may not be entirely useful > unless spinlocks are inlined. Perhaps we should start inlining spinlocks? I tried enabling ftrace too, and while that improved things, it also made the latency results unbelievable. 1. ftrace didn't log entry into every function in paths - some were skipped despite there being an mcount call. 2. (and this is the more serious) the overhead of ftrace is _far_ too large to draw any meaningful conclusions from the results. Are IRQs really disabled for 3ms on OMAP - does it really take 120us to get into an interrupt handler through all the NO_HZ stuff - does it really take 250us to do a few writes to the DMA controller? If I'm wanting to find out how long IRQs are disabled and where that happens, I'd feel a lot more confident about writing my own IRQ latency tracing for this rather than trying to use this stuff. So what I'm trying to get to the bottom of: should I care about the IRQS off hooks in the entry assembly code given that the existing IRQs off tracing seems to be useless for performing that task. My own personal experience is saying a very big no to that. ^ permalink raw reply [flat|nested] 8+ messages in thread
* IRQS off tracer - is it useful or not? 2011-06-27 16:54 ` Russell King - ARM Linux @ 2011-06-27 17:31 ` Nicolas Pitre 2011-06-27 20:17 ` Gilles Chanteperdrix 1 sibling, 0 replies; 8+ messages in thread From: Nicolas Pitre @ 2011-06-27 17:31 UTC (permalink / raw) To: linux-arm-kernel On Mon, 27 Jun 2011, Russell King - ARM Linux wrote: > So what I'm trying to get to the bottom of: should I care about the > IRQS off hooks in the entry assembly code given that the existing IRQs > off tracing seems to be useless for performing that task. My own > personal experience is saying a very big no to that. Maybe using git to find out who submitted them and ask them if they still care? This stuff must have worked for them at some point. Nicolas ^ permalink raw reply [flat|nested] 8+ messages in thread
* IRQS off tracer - is it useful or not? 2011-06-27 16:54 ` Russell King - ARM Linux 2011-06-27 17:31 ` Nicolas Pitre @ 2011-06-27 20:17 ` Gilles Chanteperdrix 1 sibling, 0 replies; 8+ messages in thread From: Gilles Chanteperdrix @ 2011-06-27 20:17 UTC (permalink / raw) To: linux-arm-kernel On 06/27/2011 06:54 PM, Russell King - ARM Linux wrote: > On Mon, Jun 27, 2011 at 09:26:11AM -0700, Stephen Boyd wrote: >> On 6/25/2011 6:21 AM, Russell King - ARM Linux wrote: >>> I've just been looking at the IRQS off tracer for the first time. I'm >>> getting output such as: >>> >>> <idle>-0 0d.s3 0us!: _raw_spin_lock_irqsave <-_raw_spin_lock_irqsave >>> <idle>-0 0dNs4 1709us+: _raw_spin_unlock_irqrestore <-_raw_spin_unlock_irqrestore >>> <idle>-0 0dNs4 1770us : time_hardirqs_on <-_raw_spin_unlock_irqrestore >>> <idle>-0 0dNs4 1770us : <stack trace> >>> >>> from it, which doesn't seem to be very useful. Figuring out that it >>> may be because the EABI unwinder doesn't work too well with my toolchain, >>> I decided to try going for the more reliable frame pointer stuff. This >>> gives me: >>> >>> kjournal-423 0d.s4 0us : trace_hardirqs_on <-_raw_spin_unlock_irq >>> kjournal-423 0d.s4 0us : time_hardirqs_on <-_raw_spin_unlock_irq >>> kjournal-423 0d.s3 0us!: trace_hardirqs_off <-_raw_spin_lock_irqsave >>> kjournal-423 0d.s4 1709us+: trace_hardirqs_on <-_raw_spin_unlock_irqrestore >>> kjournal-423 0d.s4 1770us : time_hardirqs_on <-_raw_spin_unlock_irqrestore >>> kjournal-423 0d.s4 1770us : <stack trace> >>> => time_hardirqs_on >>> => trace_hardirqs_on_caller >>> => trace_hardirqs_on >>> => _raw_spin_unlock_irqrestore >>> => cfq_idle_slice_timer >>> => run_timer_softirq >>> => __do_softirq >>> => irq_exit >>> >>> which is no better. It's telling me that {trace,time}_hardirqs_o{n,ff} is >>> involved is absurd - of course that function is involved, because that's >>> how these events are reported and that detail is just not interesting. >>> And yet again, we still don't get to find out where IRQs were disabled. >> >> Is ftrace enabled (/proc/sys/kernel/ftrace_enabled)? If it is you should >> a least see the functions that were called while irqs were off. >> >> There should also be a >> >> => started at: func_irq_off >> => ended at: func_irq_on >> >> near the top of the latency trace although it may not be entirely useful >> unless spinlocks are inlined. Perhaps we should start inlining spinlocks? > > I tried enabling ftrace too, and while that improved things, it also > made the latency results unbelievable. > > 1. ftrace didn't log entry into every function in paths - some were > skipped despite there being an mcount call. > 2. (and this is the more serious) the overhead of ftrace is _far_ too > large to draw any meaningful conclusions from the results. > > Are IRQs really disabled for 3ms on OMAP - does it really take 120us > to get into an interrupt handler through all the NO_HZ stuff - does it > really take 250us to do a few writes to the DMA controller? Hi Russell, Maybe this post is off-topic, but I worked on reducing latencies on omap3 with xenomai. Most paths are common with Linux, maybe this could be of interest for this list. The reasons for big latencies with xenomai on omap3 were: - the gpio masking/acking function calls nesting, this can be reduced by inlining some functions, especially when not compiling for multiple omaps (but my guess is that if you want the lowest latency, you are not going to compile for all the omaps anyway); - the gpio demuxing, which is one big irqs off section, this can be reduced by re-enabling irqs in the middle of it; - the gptimer 0/1 programming latency, this can be reduced by using other gptimer, and configuring the timer used as clockevent to be "non posted". On omap4, the reasons for high latencies (on 2.6.37, sorry, may be outdated) are: - gic mask/ack handlers which take a spinlock then call writel, which ends up with a call to outer_clean, which takes another spinlock. Solved by usin a writel_relaxed, and remove the spinlock for mask and ack, and also implement the mask_ack callback to avoid a function call by address, which ends up calling two other functions by address, which take two spinlocks, etc... - the l2 cache flushes in general, protected by a spinlock, currently unsolved. - the gptimer again, but here, we have another clocksource, the cortex a9 global timer (but maybe not usable with CONFIG_PM?). - the twd had a lot of early shots because the frequency calibration was too imprecise (503.6 MHz instead of 504MHz on the board I have), this is solved by the patches relying on clock frequency instead. Regards. -- Gilles. ^ permalink raw reply [flat|nested] 8+ messages in thread
* IRQS off tracer - is it useful or not? 2011-06-25 13:21 IRQS off tracer - is it useful or not? Russell King - ARM Linux 2011-06-25 13:42 ` murali at embeddedwireless.com 2011-06-27 16:26 ` Stephen Boyd @ 2011-06-27 20:38 ` Todd Poynor 2011-06-28 23:08 ` Frank Rowand 3 siblings, 0 replies; 8+ messages in thread From: Todd Poynor @ 2011-06-27 20:38 UTC (permalink / raw) To: linux-arm-kernel I used IRQs off tracing on 2.6.36 on Tegra (and am responsible for sending the fixups to make it work better on ARM). The tracing did help find probably 2-3 real problems with drivers performing too much processing with IRQs disabled (I can go dig up which problems if that would be useful). I didn't usually pay close attention to any values printed that were in the sub-millisecond range, mostly looking for larger outliers. I did have a couple of problems with it. One, as already noted, the tracing points aren't necessarily positioned such that the return address points to the caller of interest. As I recall, it looked like various config options could change whether the place with the trace point was called directly by the code of interest or whether that function was likely to be an intermediary -- options such as CONFIG_INLINE_SPIN_UNLOCK_IRQRESTORE=n probably broke the assumptions, and other things like CONFIG_DEBUG_LOCK_ALLOC and CONFIG_DEBUG_SPINLOCK could also affect what was inlined. I did hack these up to get to the real caller where I found something of possible interest. There was another problem with an approximate 2500ms time delta in impossible places. I debugged it some and found that it was likely due to something already been fixed in later kernels, so I just discarded those results. (I made a note at the time that it was probably fixed in 69d1b839 "ring-buffer: Bind time extend and data events together", but I can't recall the details at the moment.) Having stuck my neck for out for the feature having some use, I can volunteer my time to help clean things up if folks think it's worth it. Todd ^ permalink raw reply [flat|nested] 8+ messages in thread
* IRQS off tracer - is it useful or not? 2011-06-25 13:21 IRQS off tracer - is it useful or not? Russell King - ARM Linux ` (2 preceding siblings ...) 2011-06-27 20:38 ` Todd Poynor @ 2011-06-28 23:08 ` Frank Rowand 3 siblings, 0 replies; 8+ messages in thread From: Frank Rowand @ 2011-06-28 23:08 UTC (permalink / raw) To: linux-arm-kernel On 06/25/11 06:21, Russell King - ARM Linux wrote: > I've just been looking at the IRQS off tracer for the first time. I'm > getting output such as: < snip > > So, from what I can see, the irqsoff tracing support is next to useless, > and given that, anyone got a good reason why I should care about its > hooks? If I should care about them, it really needs to be fixed so it > actually provides useful information. Here are my experiences as a data point that I hope can be useful... Having irqsoff instrumentation has been crucial for my work. Unfortunately I don't have experience with the current mainline implementation. I started using irqsoff measurements in 2.6.24 plus the RT_PREEMPT patch 2.6.24-rt1. I found the overhead of the tools at that time to be larger than desired for my purposes. So I took the concepts and framework from the RT_PREEMPT patch 2.6.24-rt1 and wrote a more light weight (but much less user friendly and much less mainline acceptable) version of irqsoff and preempt off measurement tools. The team that I wrote the tools for found enough value in them to port them forward to 2.6.29 plus the RT_PREEMPT patches when 2.6.29 arrived. That team is now considering moving the tools forward to 2.6.35 plus the RT_PREEMPT patches. To sum up, the RT_PREEMPT irqsoff instrumentation that eventually evolved into what is in mainline today had too much overhead for __my purposes__. But my kernel development teams and product development teams have derived found similar (but lower overhead) instrumentation to be critical for their tuning needs. If Todd Poyner helps clean things up (as he offered in an earlier reply), maybe I'll be able to find some time to help him out (not sure how much I can promise to squeeze into my schedule yet). -Frank Rowand Sony Network Entertainment ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-06-28 23:08 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-06-25 13:21 IRQS off tracer - is it useful or not? Russell King - ARM Linux 2011-06-25 13:42 ` murali at embeddedwireless.com 2011-06-27 16:26 ` Stephen Boyd 2011-06-27 16:54 ` Russell King - ARM Linux 2011-06-27 17:31 ` Nicolas Pitre 2011-06-27 20:17 ` Gilles Chanteperdrix 2011-06-27 20:38 ` Todd Poynor 2011-06-28 23:08 ` Frank Rowand
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).