* Latency data - 2.6.14-rc3-rt13
@ 2005-10-10 20:16 Mark Knecht
2005-10-10 20:49 ` Daniel Walker
0 siblings, 1 reply; 26+ messages in thread
From: Mark Knecht @ 2005-10-10 20:16 UTC (permalink / raw)
To: linux-kernel; +Cc: Ingo Molnar, Lee Revell
Hi,
I'm still suffering the occasional xrun with Jack, so I'm trying to
understand what information I can feed back on this. I turned on the
histogram feature for both IRQ-off and preempt-off. From what I can
see so far, it seems that something is turning IRQs off for long
periods of time. I think this is similar to what Lee Revell showed
the other day, but I don't know how to produce the same sort of
printout he showed.
This is the Jack data:
nperiods = 2 for capture
nperiods = 2 for playback
08:16:26.401 Server configuration saved to "/home/mark/.jackdrc".
08:16:26.401 Statistics reset.
08:16:26.637 Client activated.
08:16:26.639 Audio connection change.
08:16:26.642 Audio connection graph change.
08:16:28.395 MIDI connection graph change.
08:16:28.472 MIDI connection change.
08:16:29.664 Audio connection graph change.
08:19:50.071 Audio connection graph change.
08:19:50.263 MIDI connection graph change.
08:19:50.273 MIDI connection graph change.
10:04:36.179 Audio connection graph change.
10:04:36.362 Audio connection change.
10:04:38.800 Audio connection graph change.
10:04:38.810 Audio connection graph change.
12:34:09.601 XRUN callback (1).
**** alsa_pcm: xrun of at least 0.133 msecs
13:09:41.344 MIDI connection graph change.
13:09:41.439 MIDI connection change.
13:09:42.427 Audio connection graph change.
13:09:45.919 Audio connection graph change.
13:09:45.941 Audio connection graph change.
13:09:45.960 MIDI connection graph change.
This is what I see for preempt-off latency. This looks really good to me:
#Minimum latency: 0 microseconds.
#Average latency: 0 microseconds.
#Maximum latency: 1207040 microseconds.
#Total samples: 196871313
#There are 2 samples greater or equal than 10240 microseconds
#usecs samples
0 190504250
1 5881275
2 379533
3 69966
4 13858
5 5708
6 2287
7 7726
8 1771
9 465
10 649
11 935
12 948
13 720
14 573
15 329
16 139
17 71
18 46
19 27
20 21
21 7
22 3
23 2
24 1
25 0
26 0
27 0
However, for IRQ-off latency I have far larger numbers. I assume it's
probably pretty bad that interrupts can be turned off for as long as
4 ms when I'm trying to run at sub-3 ms periods?
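For scale, the deadline Jack must meet follows from the buffer settings. A sketch of the arithmetic, where 64 frames/period at 48 kHz are assumed example values (only nperiods = 2 comes from the report above):

```shell
# Estimate the JACK wakeup deadline from the ALSA buffer settings.
# frames and rate are assumed illustration values; nperiods = 2 is
# from the configuration above. An IRQ-off window longer than the
# full buffer time guarantees an xrun; one close to it makes xruns likely.
frames=64      # frames per period (assumed)
rate=48000     # sample rate in Hz (assumed)
nperiods=2     # periods per buffer (from the jack output above)
period_us=$(( frames * 1000000 / rate ))
total_us=$(( nperiods * frames * 1000000 / rate ))
echo "period = ${period_us} us, full buffer = ${total_us} us"
```

With these assumed settings the full buffer is about 2.7 ms, which is why a 4 ms interrupts-off stretch would be fatal.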
#Minimum latency: 0 microseconds.
#Average latency: 4 microseconds.
#Maximum latency: 3998 microseconds.
#Total samples: 3102665680
#There are 0 samples greater or equal than 10240 microseconds
#usecs samples
0 3066905198
1 3817807
2 15630448
3 347330
4 55362
5 40098
6 27080
7 27740
8 22484
9 20298
10 15727
11 15346
12 14870
13 15258
14 101559
15 16152
<SNIP>
1365 18434
1366 15529
1367 12471
1368 11392
1369 13927
1370 21813
1371 39568
1372 87654
1373 250408
1374 535990
1375 577197
1376 370609
1377 120045
1378 20431
1379 5312
1380 5349
1381 8203
1382 14773
1383 15352
1384 9515
1385 5020
1386 3618
1387 3738
1388 5554
1389 8669
1390 8244
1391 6448
1392 4948
1393 4151
1394 5984
1395 10599
1396 19058
1397 41222
1398 109732
1399 360362
1400 354579
1401 113700
1402 11014
1403 1240
1404 754
1405 772
1406 963
1407 850
1408 445
1409 276
<SNIP>
3976 62
3977 57
3978 71
3979 70
3980 59
3981 61
3982 69
3983 76
3984 75
3985 72
3986 76
3987 80
3988 65
3989 64
3990 78
3991 70
3992 85
3993 128
3994 236
3995 358
3996 96
3997 0
3998 1
3999 0
4000 0
4001 0
4002 0
How can I get data that would be more useful for real debugging?
Thanks,
Mark
^ permalink raw reply [flat|nested] 26+ messages in thread

* Re: Latency data - 2.6.14-rc3-rt13
From: Daniel Walker @ 2005-10-10 20:49 UTC
To: Mark Knecht; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On Mon, 2005-10-10 at 13:16 -0700, Mark Knecht wrote:
> Hi,
>
> How can I get data that would be more useful for real debugging?

The IRQ-off times look like the worst. If you do "make menuconfig", go to
Kernel Hacking and select "Interrupts-off critical section latency timing",
then select "Latency tracing".

Then, once the system has booted, run the following:

"echo 0 > /proc/sys/kernel/preempt_max_latency"

That will record a trace for every new maximum IRQ-off latency observed.
You can view the trace with "cat /proc/latency_trace", and you can attach
the trace to an email to LKML so we can review it (compress it if it's
big, though).

Daniel
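The steps above can be collected into one sequence (a sketch; these /proc interfaces belong to the 2005-era -rt latency tracer and only exist when the Kernel Hacking options named above are enabled, so the paths are assumptions on any other kernel):

```shell
# Sketch of the -rt latency-tracing workflow described above.
# Requires the "Interrupts-off critical section latency timing" and
# "Latency tracing" options from Kernel Hacking; paths are specific
# to the 2.6.14-rt patch set and will not exist elsewhere.

# Reset the recorded maximum so the next latency peak is traced.
echo 0 > /proc/sys/kernel/preempt_max_latency

# ... reproduce the xrun workload here ...

# Read back the trace of the worst critical section seen since the reset.
cat /proc/latency_trace

# Compress the trace before attaching it to a mail if it is large.
gzip -c /proc/latency_trace > latency_trace.gz
```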
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-10 21:12 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Daniel Walker <dwalker@mvista.com> wrote:
<SNIP>
> Then, once the system has booted, run the following:
>
> "echo 0 > /proc/sys/kernel/preempt_max_latency"

So this disables the tracing of preempt times but keeps IRQ times on? Cool.

> That will record a trace for every new maximum IRQ-off latency observed.
> You can view the trace with "cat /proc/latency_trace", and you can attach
> the trace to an email to LKML so we can review it (compress it if it's
> big, though).
>
> Daniel

Will do. Building now. I'll be back later.

Is there anything specific I should look for in the traces myself? Any way
to help narrow it down?

Cheers,
Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-10 21:28 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Mark Knecht <markknecht@gmail.com> wrote:
<SNIP>
> > go to Kernel Hacking and select
> > "Interrupts-off critical section latency timing",
> > then select "Latency tracing"

Only had to turn on Latency tracing. The others I had on...

<SNIP>

Unfortunately I didn't think I'd be back this fast. I built the new kernel
and rebooted. The boot starts, gets down to the point where it tells me
that the preempt debug stuff is on, and then jumps to an endlessly
repeating error message:

init[1]: segfault at ffffffff8010fce0 rip ffffffff8010fce0 rsp
00007fffffcb09b8 error 15

This error repeats endlessly until I reboot.

Good thing I had another kernel I could boot back into! ;-)

So, something isn't happy. Is this a -rt thing or a kernel issue?

Cheers,
Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Daniel Walker @ 2005-10-10 21:44 UTC
To: Mark Knecht; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On Mon, 2005-10-10 at 14:28 -0700, Mark Knecht wrote:
<SNIP>
> init[1]: segfault at ffffffff8010fce0 rip ffffffff8010fce0 rsp
> 00007fffffcb09b8 error 15
<SNIP>
> So, something isn't happy. Is this a -rt thing or a kernel issue?

Hmm, it looks like latency tracing doesn't work on x86_64. I guess you'll
have to wait till someone fixes it.

Another option is to turn off "Latency tracing", then reboot, like it was
before but w/o the histogram. Then run:

"echo 0 > /proc/sys/kernel/preempt_max_latency"

Whenever a new maximum latency is observed, it will be logged with a stack
trace in the system logs. You can report that back here on LKML.

You can view the system log with the command "dmesg", I think.

Daniel
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-10 22:09 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Daniel Walker <dwalker@mvista.com> wrote:
<SNIP>
> Whenever a new maximum latency is observed, it will be logged with a
> stack trace in the system logs. You can report that back here on LKML.
>
> You can view the system log with the command "dmesg", I think.
>
> Daniel

Yes, already that looks interesting. Do I have something going on with
ACPI? This is before running Jack; I'm in Gnome. There are a lot of these
messages, but they've stopped. I suppose the 3997 is the delay? That would
correspond with the info I sent earlier.

( softirq-timer/0-3 |#0): new 3997 us maximum-latency critical section.
 => started at timestamp 185717289: <acpi_processor_idle+0x20/0x379>
 => ended at timestamp 185721287: <thread_return+0xb5/0x11a>

Call Trace:
 <ffffffff801519ac>{check_critical_timing+492}
 <ffffffff80151e05>{sub_preempt_count_ti+133}
 <ffffffff803f632c>{thread_return+70}
 <ffffffff803f639b>{thread_return+181}
 <ffffffff803f6505>{schedule+261}
 <ffffffff8013a84a>{ksoftirqd+138}
 <ffffffff8013a7c0>{ksoftirqd+0}
 <ffffffff8014aa8d>{kthread+205}
 <ffffffff8010f716>{child_rip+8}
 <ffffffff8014a9c0>{kthread+0}
 <ffffffff8010f70e>{child_rip+0}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f5df8>] .... __schedule+0xb8/0x5a6
.....[<00000000>] .. ( <= 0x0)
 => dump-end timestamp 185721364

mark@lightning ~ $

I started up Jack, mounted my 1394 drives and started streaming a little
bit of data. Basically my ultra-light load test. I see nothing new in
dmesg yet. I'll keep watching.

Thanks,
Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Daniel Walker @ 2005-10-10 22:28 UTC
To: Mark Knecht; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On Mon, 2005-10-10 at 15:09 -0700, Mark Knecht wrote:
<SNIP>
> Yes, already that looks interesting. Do I have something going on with
> ACPI? This is before running Jack; I'm in Gnome. There are a lot of
> these messages, but they've stopped. I suppose the 3997 is the delay?
> That would correspond with the info I sent earlier.

I think this is a false reading. Sometimes when a system goes idle it will
cause interrupt-disable latency that isn't real (due to the processor
halting). You could try turning ACPI off if you can, and turn off power
management.

Daniel
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-10 23:33 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Daniel Walker <dwalker@mvista.com> wrote:
<SNIP>
> I think this is a false reading. Sometimes when a system goes idle it
> will cause interrupt-disable latency that isn't real (due to the
> processor halting). You could try turning ACPI off if you can, and turn
> off power management.
>
> Daniel

OK. No immediate difference with most of the power management stuff turned
off. I'll just let it run a few hours and see if I get an xrun. If I do,
I'll look at the logs again and report back.

Thanks for the help. I feel like I've got a chance of spotting something.

Cheers,
Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-10 23:49 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Mark Knecht <markknecht@gmail.com> wrote:
<SNIP>
> OK. No immediate difference with most of the power management stuff
> turned off. I'll just let it run a few hours and see if I get an xrun.
> If I do, I'll look at the logs again and report back.

Well, I'm disappointed again. Some xruns came, but no additional data was
put into the dmesg output:

( softirq-timer/0-3 |#0): new 3997 us maximum-latency critical section.
 => started at timestamp 253995015: <acpi_processor_idle+0x20/0x379>
 => ended at timestamp 253999013: <thread_return+0xb5/0x11a>

Call Trace:
 <ffffffff8014e32c>{check_critical_timing+492}
 <ffffffff8014e785>{sub_preempt_count_ti+133}
 <ffffffff803ec76c>{thread_return+70}
 <ffffffff803ec7db>{thread_return+181}
 <ffffffff803ec945>{schedule+261}
 <ffffffff801371ca>{ksoftirqd+138}
 <ffffffff80137140>{ksoftirqd+0}
 <ffffffff8014740d>{kthread+205}
 <ffffffff8010e6e6>{child_rip+8}
 <ffffffff80147340>{kthread+0}
 <ffffffff8010e6de>{child_rip+0}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803ec238>] .... __schedule+0xb8/0x5a6
.....[<00000000>] .. ( <= 0x0)
 => dump-end timestamp 253999092

kjournald starting. Commit interval 5 seconds
EXT3 FS on sdb1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.

lightning ~ #

Basically the trace above was there pretty much right after booting. I
mounted a 1394 drive, which is shown, and then started running apps. I got
3 xruns:

16:42:59.826 XRUN callback (1).
**** alsa_pcm: xrun of at least 0.273 msecs
16:43:03.764 XRUN callback (2).
**** alsa_pcm: xrun of at least 2.082 msecs
16:43:41.618 XRUN callback (3).
**** alsa_pcm: xrun of at least 2.040 msecs

but no additional data in the logs. So possibly this is an IRQ-off
latency? Or am I jumping to conclusions? The trace above, if I understand
it, indicates a delay of nearly 4 ms, and again, since I'm trying to run
at sub-3 ms periods, that would seem like a potential problem.

I'm going to completely shut off power management as a test just to see
where that leads.

- Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-11 3:45 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Mark Knecht <markknecht@gmail.com> wrote:
>
> ( softirq-timer/0-3 |#0): new 3997 us maximum-latency critical section.

So the root cause of this 4 ms delay is the 250 Hz timer. If I change the
system to use the 1 kHz timer, then the time in this section drops, as
expected, to 1 ms:

( softirq-timer/0-3 |#0): new 998 us maximum-latency critical section.
 => started at timestamp 121040020: <acpi_processor_idle+0x20/0x379>
 => ended at timestamp 121041019: <thread_return+0xb5/0x11a>

Very interesting, I think. Back to the drawing board as to my xruns.

- Mark
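The two readings line up with the tick length, 1/HZ: an idle CPU can sit halted for up to one full timer period, and that stretch is what the measurement was reporting as an interrupts-off section. A quick check of the arithmetic:

```shell
# Maximum idle stretch between timer ticks is one period, 1000000/HZ us.
# HZ=250 gives ~4000 us (matching the 3997 us reading above);
# HZ=1000 gives ~1000 us (matching the 998 us reading).
for hz in 250 1000; do
    echo "HZ=${hz}: tick period = $(( 1000000 / hz )) us"
done
```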
* Re: Latency data - 2.6.14-rc3-rt13
From: Steven Rostedt @ 2005-10-11 6:46 UTC
To: Mark Knecht; +Cc: Daniel Walker, linux-kernel, Ingo Molnar, Lee Revell

On Mon, 10 Oct 2005, Mark Knecht wrote:
<SNIP>
> So the root cause of this 4 ms delay is the 250 Hz timer. If I change
> the system to use the 1 kHz timer, then the time in this section drops,
> as expected, to 1 ms.

Are your xruns showing the same type of latency? If you switch to 1000 Hz,
do you get only 1 ms latency on your xruns too?

This sounds like the application might have gone to sleep and didn't wake
up until something scheduled it in. Or something else with the scheduler.
I doubt that this has to do with preemption or interrupts being off, but
rather something that uses jiffies to calculate.

-- Steve
* Re: Latency data - 2.6.14-rc3-rt13
From: Sven-Thorsten Dietrich @ 2005-10-11 8:56 UTC
To: Mark Knecht; +Cc: Daniel Walker, linux-kernel, Ingo Molnar, Lee Revell

On Mon, 2005-10-10 at 20:45 -0700, Mark Knecht wrote:
<SNIP>
> So the root cause of this 4 ms delay is the 250 Hz timer. If I change
> the system to use the 1 kHz timer, then the time in this section drops,
> as expected, to 1 ms.

Maximum preemption latencies near the timer period are sometimes
indicative of mismatched preempt_disable/preempt_enable pairs in the code.
Usually the scheduler warns about that on the console, however.

Sven

--
Sven-Thorsten Dietrich
Real-Time Software Architect
MontaVista Software, Inc.
http://www.mvista.com
* Re: Latency data - 2.6.14-rc3-rt13
From: Ingo Molnar @ 2005-10-11 11:17 UTC
To: Mark Knecht; +Cc: Daniel Walker, linux-kernel, Lee Revell

* Mark Knecht <markknecht@gmail.com> wrote:
> > ( softirq-timer/0-3 |#0): new 3997 us maximum-latency critical section.
>
> So the root cause of this 4 ms delay is the 250 Hz timer. If I change
> the system to use the 1 kHz timer, then the time in this section drops,
> as expected, to 1 ms.

this was a bug in the critical-section-latency measurement code of x64.
The timer irq is the one that leaves userspace running for the longest
time between two kernel calls.

I have fixed these bugs in -rc4-rt1, could you try it? It should report
much lower latencies, regardless of PM settings.

	Ingo
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-11 22:45 UTC
To: Ingo Molnar; +Cc: Daniel Walker, linux-kernel, Lee Revell

On 10/11/05, Ingo Molnar <mingo@elte.hu> wrote:
<SNIP>
> I have fixed these bugs in -rc4-rt1, could you try it? It should report
> much lower latencies, regardless of PM settings.

Ingo,
This test now reports much more interesting data:

( dmesg-8010 |#0): new 13 us maximum-latency critical section.
 => started at timestamp 117628604: <do_IRQ+0x29/0x50>
 => ended at timestamp 117628618: <do_IRQ+0x39/0x50>

Call Trace: <IRQ>
 <ffffffff8014c42c>{check_critical_timing+492}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff80110159>{do_IRQ+57}
 <ffffffff8010e16c>{ret_from_intr+0} <EOI>
 <ffffffff80249a75>{copy_page+5}
 <ffffffff801671f1>{do_no_page+737}
 <ffffffff801674ee>{__handle_mm_fault+414}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff803eac99>{do_page_fault+1049}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff80168eee>{vma_link+286}
 <ffffffff8010e4c1>{error_exit+0}
 <ffffffff8024a866>{__clear_user+22}
 <ffffffff801a46fb>{padzero+27}
 <ffffffff801a4b42>{load_elf_interp+850}
 <ffffffff801a5b1d>{load_elf_binary+3341}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff801a4e10>{load_elf_binary+0}
 <ffffffff80182ee0>{search_binary_handler+272}
 <ffffffff801831f5>{do_execve+405}
 <ffffffff8010dbc6>{system_call+126}
 <ffffffff8010c634>{sys_execve+68}
 <ffffffff8010dfea>{stub_execve+106}
 => dump-end timestamp 117628777

( nautilus-7955 |#0): new 14 us maximum-latency critical section.
 => started at timestamp 127874927: <do_IRQ+0x29/0x50>
 => ended at timestamp 127874941: <do_IRQ+0x39/0x50>

Call Trace: <IRQ>
 <ffffffff8014c42c>{check_critical_timing+492}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff80110159>{do_IRQ+57}
 <ffffffff8010e16c>{ret_from_intr+0} <EOI>
 <ffffffff8010dbc6>{system_call+126}
 => dump-end timestamp 127874996

( nautilus-7955 |#0): new 14 us maximum-latency critical section.
 => started at timestamp 128647499: <do_IRQ+0x29/0x50>
 => ended at timestamp 128647514: <do_IRQ+0x39/0x50>

Call Trace: <IRQ>
 <ffffffff8014c42c>{check_critical_timing+492}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff80110159>{do_IRQ+57}
 <ffffffff8010e16c>{ret_from_intr+0} <EOI>
 => dump-end timestamp 128647567

lightning ~ #

NOTE: In my first kernel build I turned on latency measurement for both
max preempt and IRQ-off. The kernel segfaulted immediately after the boot
messages reminding me that these were on. I rebuilt the kernel with only
the max preempt and that worked. I suspect some problem with the IRQ-off
section, since that's the only change I made and it did not segfault in
-rc3-rt13.

Thanks,
Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Lee Revell @ 2005-10-12 0:02 UTC
To: Mark Knecht; +Cc: Ingo Molnar, Daniel Walker, linux-kernel

On Tue, 2005-10-11 at 15:45 -0700, Mark Knecht wrote:
<SNIP>
> This test now reports much more interesting data:
>
> ( dmesg-8010 |#0): new 13 us maximum-latency critical section.
>  => started at timestamp 117628604: <do_IRQ+0x29/0x50>
>  => ended at timestamp 117628618: <do_IRQ+0x39/0x50>

This is the expected, correct behavior - very small maximum-latency
critical sections. Do you get anything longer (say 300 usecs or more) if
you leave it running?

So far the latency tracer on my much slower system has only gone up to
123 usecs. So the bug seems to be fixed, at least on i386.

Lee
* Re: Latency data - 2.6.14-rc3-rt13 2005-10-12 0:02 ` Lee Revell @ 2005-10-12 1:09 ` Mark Knecht 2005-10-12 1:21 ` Lee Revell 0 siblings, 1 reply; 26+ messages in thread From: Mark Knecht @ 2005-10-12 1:09 UTC (permalink / raw) To: Lee Revell; +Cc: Ingo Molnar, Daniel Walker, linux-kernel On 10/11/05, Lee Revell <rlrevell@joe-job.com> wrote: > On Tue, 2005-10-11 at 15:45 -0700, Mark Knecht wrote: > > On 10/11/05, Ingo Molnar <mingo@elte.hu> wrote: > > > > > > * Mark Knecht <markknecht@gmail.com> wrote: > > > > > > > > ( softirq-timer/0-3 |#0): new 3997 us maximum-latency critical section. > > > > > > > > So the root cause of this 4mS delay is the 250Hz timer. If I change > > > > the system to use the 1Khz timer then the time in this section drops, > > > > as expected, to 1mS. > > > > > > this was a bug in the critical-section-latency measurement code of x64. > > > The timer irq is the one that leaves userspace running for the longest > > > time, between two kernel calls. > > > > > > I have fixed these bugs in -rc4-rt1, could you try it? It should report > > > much lower latencies, regardless of PM settings. > > > > > > Ingo > > > > > > > Ingo, > > This test now reports much more intersting data: > > > > ( dmesg-8010 |#0): new 13 us maximum-latency critical section. > > => started at timestamp 117628604: <do_IRQ+0x29/0x50> > > => ended at timestamp 117628618: <do_IRQ+0x39/0x50> > > This is the expected, correct behavior - very small maximum latency > critical sections. Do you get anything longer (say 300 usecs or more) > if you leave it running? > > So far the latency tracer on my much slower system has only gone up to > 123 usecs. So the bug seems to be fixed at least on i386. > > Lee > > I've been watching this now for a couple of hours. Still no xruns. Max latency has only gone up to 19uS, so this is all good. However the thing I am seeing is that all my free memory has gone away and I've now swapped out just a little nit. 
There was really no change in the state of the machine, other than xscreensaver starting and running for a while, which is standard for me. None the less here's the sort of state things are in - 17uS right after booting, and only 19uS now. That's good. But in the middle look at the change in memory statistics during a one hour time period. ( X-7860 |#0): new 16 us maximum-latency critical section. => started at timestamp 259455017: <__schedule+0xb8/0x596> => ended at timestamp 259455034: <thread_return+0xb4/0x10a> Call Trace:<ffffffff8014c42c>{check_critical_timing+492} <ffffffff8014c64b>{sub_preempt_count_ti+75} <ffffffff803e7d9a>{thread_return+180} <ffffffff803e7ef5>{schedule+261} <ffffffff803e89e4>{schedule_timeout+148} <ffffffff8013a660>{process_timeout+0} <ffffffff8018b767>{do_select+967} <ffffffff8018b2b0>{__pollwait+0} <ffffffff8018baad>{sys_select+749} <ffffffff8010d9e9>{sys_rt_sigreturn+553} <ffffffff8010dbc6>{system_call+126} => dump-end timestamp 259455125 ( X-7860 |#0): new 17 us maximum-latency critical section. => started at timestamp 375451745: <__schedule+0xb8/0x596> => ended at timestamp 375451762: <thread_return+0xb4/0x10a> Call Trace:<ffffffff8014c42c>{check_critical_timing+492} <ffffffff8014c64b>{sub_preempt_count_ti+75} <ffffffff803e7d9a>{thread_return+180} <ffffffff803e7ef5>{schedule+261} <ffffffff803e89e4>{schedule_timeout+148} <ffffffff8013a660>{process_timeout+0} <ffffffff8018b767>{do_select+967} <ffffffff8018b2b0>{__pollwait+0} <ffffffff8018baad>{sys_select+749} <ffffffff8010d9e9>{sys_rt_sigreturn+553} <ffffffff8010dbc6>{system_call+126} => dump-end timestamp 375451852 kjournald starting. Commit interval 5 seconds EXT3 FS on sdc1, internal journal EXT3-fs: mounted filesystem with ordered data mode. 
lightning ~ # date && free
Tue Oct 11 16:21:41 PDT 2005
             total       used       free     shared    buffers     cached
Mem:        510460     410816      99644          0      15896     169664
-/+ buffers/cache:     225256     285204
Swap:       996020          0     996020

lightning ~ # date && free
Tue Oct 11 17:39:03 PDT 2005
             total       used       free     shared    buffers     cached
Mem:        510460     504464       5996          0      39828     119012
-/+ buffers/cache:     345624     164836
Swap:       996020        304     995716
lightning ~ #

( hdspmixer-8071 |#0): new 18 us maximum-latency critical section.
 => started at timestamp 2132546112: <snd_ctl_open+0x71/0x200 [snd]>
 => ended at timestamp 2132546131: <snd_ctl_open+0x8a/0x200 [snd]>

Call Trace:<ffffffff8014c42c>{check_critical_timing+492}
       <ffffffff8014c64b>{sub_preempt_count_ti+75}
       <ffffffff88006a0a>{:snd:snd_ctl_open+138}
       <ffffffff8800414b>{:snd:snd_open+267}
       <ffffffff801806d9>{chrdev_open+441}
       <ffffffff801766a8>{__dentry_open+280}
       <ffffffff80176877>{filp_open+135}
       <ffffffff8014c64b>{sub_preempt_count_ti+75}
       <ffffffff80176a06>{get_unused_fd+230}
       <ffffffff80176b71>{do_sys_open+81}
       <ffffffff8010dbc6>{system_call+126}
 => dump-end timestamp 2132546239

( IRQ 58-4526 |#0): new 19 us maximum-latency critical section.
 => started at timestamp 8439787551: <__schedule+0xb8/0x596>
 => ended at timestamp 8439787571: <thread_return+0xb4/0x10a>

Call Trace:<ffffffff8014c42c>{check_critical_timing+492}
       <ffffffff8014c64b>{sub_preempt_count_ti+75}
       <ffffffff803e7d9a>{thread_return+180}
       <ffffffff80146550>{keventd_create_kthread+0}
       <ffffffff803e7ef5>{schedule+261}
       <ffffffff80156062>{do_irqd+514}
       <ffffffff80155e60>{do_irqd+0}
       <ffffffff8014650d>{kthread+205}
       <ffffffff8010e676>{child_rip+8}
       <ffffffff80146550>{keventd_create_kthread+0}
       <ffffffff80354d00>{pci_conf1_read+0}
       <ffffffff80146440>{kthread+0}
       <ffffffff8010e66e>{child_rip+0}
 => dump-end timestamp 8439787658

lightning ~ #

Now, I know these command-line methods are probably frowned upon by
folks in the know, so I'm happy to do this in a more rigorous way. (Is
it called valgrind?)
Should free memory drop like that over time? The only apps running
were Aqualung, hdspmixer and Firefox. Is one of them leaking, is the
kernel leaking, or is this normal?

I will continue to let this run for another 4-6 hours today and hope I
catch another xrun. If not, I'll start it up again tomorrow and
certainly will get something.

Thanks much,
Mark

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-12  1:09 ` Mark Knecht
@ 2005-10-12  1:21 ` Lee Revell
  2005-10-12  1:25 ` Mark Knecht
  2005-10-12  6:38 ` Steven Rostedt
  0 siblings, 2 replies; 26+ messages in thread
From: Lee Revell @ 2005-10-12 1:21 UTC (permalink / raw)
To: Mark Knecht; +Cc: Ingo Molnar, Daniel Walker, linux-kernel

On Tue, 2005-10-11 at 18:09 -0700, Mark Knecht wrote:
> Should free memory drop like that over time?

Yes, this is perfectly normal. When a system first boots, all the
memory your apps aren't using is initially free. As applications
access more data over time, it is cached in memory until free memory
drops to near zero.

"Free memory" is actually wasted memory - it's better to use all
available RAM for caching.

Lee

^ permalink raw reply [flat|nested] 26+ messages in thread
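Lee's point can be checked directly against the `free` snapshots Mark posted. The sketch below is illustrative only: it hard-codes the "Mem:" line from Mark's second snapshot and assumes the old procps `free` column layout shown in the session (total/used/free/shared/buffers/cached). Memory held by buffers and the page cache is reclaimable, so the number to watch is free + buffers + cached, not the raw "free" column.

```shell
#!/bin/sh
# Recompute "effectively free" memory from a pasted `free` Mem: line.
# Sample data: the second snapshot from the session above.
mem_line='Mem:        510460     504464       5996          0      39828     119012'

# Field positions per the old procps layout:
# $2=total $3=used $4=free $5=shared $6=buffers $7=cached
summary=$(echo "$mem_line" | awk '{
    total = $2; free = $4; buffers = $6; cached = $7
    avail = free + buffers + cached
    printf "raw free: %d kB; reclaimable cache: %d kB; effectively free: %d kB (%.0f%% of %d kB)", free, buffers + cached, avail, 100 * avail / total, total
}')
echo "$summary"
```

The "effectively free" figure this produces, 164836 kB, is exactly the free value on the "-/+ buffers/cache" line of the same snapshot - about a third of RAM is still available, even though the raw "free" column has dropped to 5996 kB.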
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-12  1:21 ` Lee Revell
@ 2005-10-12  1:25 ` Mark Knecht
  2005-10-12  6:38 ` Steven Rostedt
  1 sibling, 0 replies; 26+ messages in thread
From: Mark Knecht @ 2005-10-12 1:25 UTC (permalink / raw)
To: Lee Revell; +Cc: Ingo Molnar, Daniel Walker, linux-kernel

On 10/11/05, Lee Revell <rlrevell@joe-job.com> wrote:
> On Tue, 2005-10-11 at 18:09 -0700, Mark Knecht wrote:
> > Should free memory drop like that over time?
>
> Yes this is perfectly normal. When a system first boots all the memory
> your apps aren't using is initially free. As applications access more
> data over time then it will be cached in memory until free memory drops
> to near zero.
>
> "Free memory" is actually wasted memory - it's better to use all
> available RAM for caching.
>
> Lee

OK, non-intuitive for a guitar player, but good to know it's normal.
Thanks! In that case I'll just wait for an xrun and hope I catch
something in dmesg. We'll just have to wait and see.

Cheers,
Mark

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-12  1:21 ` Lee Revell
  2005-10-12  1:25 ` Mark Knecht
@ 2005-10-12  6:38 ` Steven Rostedt
  2005-10-12 16:18 ` Lee Revell
  1 sibling, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2005-10-12 6:38 UTC (permalink / raw)
To: Lee Revell; +Cc: Mark Knecht, Ingo Molnar, Daniel Walker, linux-kernel

On Tue, 11 Oct 2005, Lee Revell wrote:
> On Tue, 2005-10-11 at 18:09 -0700, Mark Knecht wrote:
> > Should free memory drop like that over time?
>
> Yes this is perfectly normal. When a system first boots all the memory
> your apps aren't using is initially free. As applications access more
> data over time then it will be cached in memory until free memory drops
> to near zero.
>
> "Free memory" is actually wasted memory - it's better to use all
> available RAM for caching.

But the swap being touched bothers me, although I've had problems with
Mozilla leaking when I leave it up for long periods. Without Mozilla
running, and running lots of other apps, I have almost 100% of memory
used, but 0% swap.

If the swap usage starts to increase slowly over time, you _do_ have a
leak somewhere - probably not in the kernel, since kernel memory never
goes into swap. But if you want to see whether the kernel is leaking,
examine /proc/slabinfo once in a while; if you see something there
constantly growing, that might indicate a leak. Just pay attention to
the first column.

-- Steve

^ permalink raw reply [flat|nested] 26+ messages in thread
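Steve's /proc/slabinfo suggestion can be automated with a small awk diff over two snapshots. The sketch below uses made-up sample data and a made-up growth threshold (10000 objects) purely for illustration; on a real system you would capture the snapshots with `cat /proc/slabinfo > snap1`, wait an hour, capture `snap2`, and run the same awk over them. The first numeric column is the active-object count he refers to.

```shell
#!/bin/sh
# Flag slab caches whose active-object count (first numeric column of
# /proc/slabinfo) grew sharply between two snapshots.
# Sample snapshots below are fabricated; replace with real captures.
cat > snap1 <<'EOF'
dentry_cache      12000 12500 224
buffer_head       30000 31000  88
size-512            400   450 512
EOF
cat > snap2 <<'EOF'
dentry_cache      12100 12500 224
buffer_head       52000 52500  88
size-512            400   450 512
EOF

# Report caches that grew by more than 10000 objects between snapshots.
growth=$(awk 'NR==FNR { before[$1] = $2; next }
              ($1 in before) && ($2 - before[$1] > 10000) {
                  printf "%s grew from %d to %d objects\n", $1, before[$1], $2
              }' snap1 snap2)
echo "$growth"
rm -f snap1 snap2
```

With the sample data, only `buffer_head` trips the threshold; steady growth like that across many snapshots is the pattern that would hint at a kernel-side leak.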
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-12  6:38 ` Steven Rostedt
@ 2005-10-12 16:18 ` Lee Revell
  2005-10-12 17:03 ` Steven Rostedt
  0 siblings, 1 reply; 26+ messages in thread
From: Lee Revell @ 2005-10-12 16:18 UTC (permalink / raw)
To: Steven Rostedt; +Cc: Mark Knecht, Ingo Molnar, Daniel Walker, linux-kernel

On Wed, 2005-10-12 at 02:38 -0400, Steven Rostedt wrote:
> On Tue, 11 Oct 2005, Lee Revell wrote:
> > On Tue, 2005-10-11 at 18:09 -0700, Mark Knecht wrote:
> > > Should free memory drop like that over time?
> >
> > Yes this is perfectly normal. When a system first boots all the memory
> > your apps aren't using is initially free. As applications access more
> > data over time then it will be cached in memory until free memory drops
> > to near zero.
> >
> > "Free memory" is actually wasted memory - it's better to use all
> > available RAM for caching.
>
> But the swap being touched bothers me. Although I've had problems with
> leaving Mozilla up for long times and it leaking. Without Mozilla running
> and running lots of other apps, I have almost 100% memory used, but 0%
> swap.

I believe this is the expected behavior under 2.6 unless you set
/proc/sys/vm/swappiness to 0. If an app allocates memory and then
never touches it, those pages will eventually be swapped out to make
room for hot ones.

Lee

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-12 16:18 ` Lee Revell
@ 2005-10-12 17:03 ` Steven Rostedt
  0 siblings, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2005-10-12 17:03 UTC (permalink / raw)
To: Lee Revell; +Cc: Mark Knecht, Ingo Molnar, Daniel Walker, linux-kernel

On Wed, 12 Oct 2005, Lee Revell wrote:
>
> I believe this is the expected behavior under 2.6 unless you
> set /proc/sys/vm/swappiness to 0. If an app allocates memory and then
> never touches it then those pages will eventually be swapped out to make
> room for hot ones.

OK, thanks for the info. I guess my apps don't allocate enough memory
to be eventually swapped out (I obviously run 2.6). Or those apps use
the memory they allocate often. Whatever..

-- Steve

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-10 23:49 ` Mark Knecht
  2005-10-11  3:45 ` Mark Knecht
@ 2005-10-11  3:56 ` Matan Peled
  2005-10-11  4:45 ` Mark Knecht
  1 sibling, 1 reply; 26+ messages in thread
From: Matan Peled @ 2005-10-11 3:56 UTC (permalink / raw)
To: Mark Knecht; +Cc: Daniel Walker, linux-kernel, Ingo Molnar, Lee Revell

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mark Knecht wrote:
> Well, I'm disappointed again. Some xruns came but no additional data
> was put into the dmesage file:

"dmesage file"? You mean "/var/log/dmesg"? No, that's not the file you
want... At least here, that file contains the logs as they were when
the boot finished.

The information you want can be retrieved by running the command
(an actual program to run - type it in a terminal) "dmesg". Or you
might be able to look at what's at the end of the /var/log/messages
file, if you have one.

- --
[Name      ] :: [Matan I. Peled    ]
[Location  ] :: [Israel            ]
[Public Key] :: [0xD6F42CA5        ]
[Keyserver ] :: [keyserver.kjsl.com]
encrypted/signed plain text preferred

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFDSzgDA7Qvptb0LKURApE4AKCOxBS3Xos5yEAnd5QbItjRmYlIsQCePSmm
ULBWIDcIOPxXAnMSodZNgCg=
=qmrM
-----END PGP SIGNATURE-----

^ permalink raw reply [flat|nested] 26+ messages in thread
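Matan's distinction - the live kernel ring buffer versus the boot-time snapshot in /var/log/dmesg - can be captured in a couple of lines of shell. This is only a sketch: the output filename is made up, and `dmesg` may be restricted in some environments (e.g. unprivileged containers), so the script falls back gracefully.

```shell
#!/bin/sh
# Read the *live* kernel ring buffer with dmesg(1), not the
# /var/log/dmesg file, which only holds messages from boot time.
if dmesg >/dev/null 2>&1; then
    dmesg | tail -n 20 > latest_kernel_messages.txt  # 20 most recent lines
    result="read live ring buffer"
else
    # Fallback Matan mentions: the syslog file, if the system keeps one.
    result="dmesg unavailable; check the end of /var/log/messages instead"
fi
echo "$result"
```

Running `dmesg` periodically (or after an xrun) is what captures the latency-trace dumps discussed in this thread, since they go to the ring buffer as they happen.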
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-11  3:56 ` Matan Peled
@ 2005-10-11  4:45 ` Mark Knecht
  2005-10-11  6:14 ` Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Knecht @ 2005-10-11 4:45 UTC (permalink / raw)
To: chaosite; +Cc: Daniel Walker, linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Matan Peled <chaosite@gmail.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Mark Knecht wrote:
> > Well, I'm disappointed again. Some xruns came but no additional data
> > was put into the dmesage file:
>
> "dmesage file"? You mean, "/var/log/dmesg"? No, thats not the file you want...
> At least here, that file contains the logs as they were when the boot finishes.
>
> The information you want can be retrieved by running the command (actual program
> to run, type it in a terminal) "dmesg".

Right, sorry. Obviously bad typing on my part. I am running 'dmesg' at
the command line to get the traces. My bad...

Anyway, these latencies seem understood at this point. They are just
the timer that you set in the kernel hacking section. No big deal.
However, none of the messages yet give any clues (that I understand)
as to the cause of the timing misses I'm seeing with 2.6.14-rc3-rt13.
I shall look into a 2.6.14-rc4-rtX tomorrow.

Cheers,
Mark

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-11  4:45 ` Mark Knecht
@ 2005-10-11  6:14 ` Ingo Molnar
  2005-10-11  7:53 ` Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Ingo Molnar @ 2005-10-11 6:14 UTC (permalink / raw)
To: Mark Knecht; +Cc: chaosite, Daniel Walker, linux-kernel, Lee Revell

* Mark Knecht <markknecht@gmail.com> wrote:

> Anyway, these latencies seem understood at this point. They are just
> the timer that you set in the kernel hacking section. No big deal.
> However, none of the messages yet give any clues (that I understand)
> as to the cause of the timing misses I'm seeing with 2.6.14-rc3-rt13.
> I shall look into a 2.6.14-rc4-rtX tomorrow.

do you have 64-bit userspace too? If you have 32-bit userspace then
could you try running the x86 kernel? Generally the 64-bit kernel has
less mature debugging options in the -rt tree: e.g. latency tracing
itself doesn't work (due to gcc silliness).

        Ingo

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-11  6:14 ` Ingo Molnar
@ 2005-10-11  7:53 ` Ingo Molnar
  0 siblings, 0 replies; 26+ messages in thread
From: Ingo Molnar @ 2005-10-11 7:53 UTC (permalink / raw)
To: Mark Knecht; +Cc: chaosite, Daniel Walker, linux-kernel, Lee Revell

* Ingo Molnar <mingo@elte.hu> wrote:

> > Anyway, these latencies seem understood at this point. They are just
> > the timer that you set in the kernel hacking section. No big deal.
> > However, none of the messages yet give any clues (that I understand)
> > as to the cause of the timing misses I'm seeing with 2.6.14-rc3-rt13.
> > I shall look into a 2.6.14-rc4-rtX tomorrow.
>
> do you have 64-bit userspace too? If you have 32-bit userspace then
> could you try running the x86 kernel? Generally the 64-bit kernel has
> less mature debugging options in the -rt tree: e.g. latency tracing
> itself doesnt work [...]

let me take that back - latency tracing does work on x64 too.

        Ingo

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-10 21:12 ` Mark Knecht
  2005-10-10 21:28 ` Mark Knecht
@ 2005-10-10 21:39 ` Daniel Walker
  1 sibling, 0 replies; 26+ messages in thread
From: Daniel Walker @ 2005-10-10 21:39 UTC (permalink / raw)
To: Mark Knecht; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On Mon, 2005-10-10 at 14:12 -0700, Mark Knecht wrote:
> On 10/10/05, Daniel Walker <dwalker@mvista.com> wrote:
> > On Mon, 2005-10-10 at 13:16 -0700, Mark Knecht wrote:
> > > Hi,
> > >
> > > How can I get data that would be more useful in terms of real debug?
> >
> > The IRQ-off times look like the worst. If you do "make menuconfig",
> > then go to Kernel Hacking and select
> >   "Interrupts-off critical section latency timing"
> > Then select "Latency tracing"
> >
> > Then when you boot the system, run the following before anything else:
> >
> >   "echo 0 > /proc/sys/kernel/preempt_max_latency"
>
> So this disables the tracing of preempt times but keeps IRQ times on? Cool.

Yes, that's what the compile options are doing.

> > That will record a trace for every maximum latency observed for IRQ
> > latency. You can view the trace with this command
> > "cat /proc/latency_trace", and you can attach the trace to an email to
> > LKML so we can review it (compress it if it's big though).
> >
> > Daniel
>
> Will do. Building now. I'll be back later.
>
> Is there anything specific I should look for in the traces myself?
> Any way to help narrow it down?

You want the trace to represent the largest latency. For instance, the
histogram showed IRQ latency of ~3000 us or greater, so you want the
trace to show at least that large a latency.

Daniel

^ permalink raw reply [flat|nested] 26+ messages in thread
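Daniel's capture steps can be collected into one small script. This is a sketch of his workflow, not a definitive tool: the /proc files below exist only on a kernel built with the -rt latency-tracing options he names, the 60-second wait is an arbitrary placeholder for "let the audio workload run", and the output filename is made up. It probes for the files first, so on a stock kernel it simply reports that tracing is unavailable.

```shell
#!/bin/sh
# Sketch of the -rt latency-trace capture workflow from this thread.
# Must run as root on a kernel with the latency-tracing config options.
MAX_LAT=/proc/sys/kernel/preempt_max_latency
TRACE=/proc/latency_trace

if [ -w "$MAX_LAT" ] && [ -r "$TRACE" ]; then
    echo 0 > "$MAX_LAT"                  # reset so every new maximum is recorded
    sleep 60                             # placeholder: let the workload hit a latency
    gzip -c "$TRACE" > latency_trace.gz  # compressed, ready to attach to a mail
    status="trace captured"
else
    status="latency tracing files not present on this kernel"
fi
echo "$status"
```

The `echo 0` is the same reset Daniel gives; after that, the largest latency seen so far is what `cat /proc/latency_trace` shows, which is why you want the trace to be at least as large as the latencies in the histogram before mailing it.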
end of thread, other threads:[~2005-10-12 17:04 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --
2005-10-10 20:16 Latency data - 2.6.14-rc3-rt13 Mark Knecht
2005-10-10 20:49 ` Daniel Walker
2005-10-10 21:12 ` Mark Knecht
2005-10-10 21:28 ` Mark Knecht
2005-10-10 21:44 ` Daniel Walker
2005-10-10 22:09 ` Mark Knecht
2005-10-10 22:28 ` Daniel Walker
2005-10-10 23:33 ` Mark Knecht
2005-10-10 23:49 ` Mark Knecht
2005-10-11  3:45 ` Mark Knecht
2005-10-11  6:46 ` Steven Rostedt
2005-10-11  8:56 ` Sven-Thorsten Dietrich
2005-10-11 11:17 ` Ingo Molnar
2005-10-11 22:45 ` Mark Knecht
2005-10-12  0:02 ` Lee Revell
2005-10-12  1:09 ` Mark Knecht
2005-10-12  1:21 ` Lee Revell
2005-10-12  1:25 ` Mark Knecht
2005-10-12  6:38 ` Steven Rostedt
2005-10-12 16:18 ` Lee Revell
2005-10-12 17:03 ` Steven Rostedt
2005-10-11  3:56 ` Matan Peled
2005-10-11  4:45 ` Mark Knecht
2005-10-11  6:14 ` Ingo Molnar
2005-10-11  7:53 ` Ingo Molnar
2005-10-10 21:39 ` Daniel Walker