* Latency data - 2.6.14-rc3-rt13
@ 2005-10-10 20:16 Mark Knecht
2005-10-10 20:49 ` Daniel Walker
0 siblings, 1 reply; 26+ messages in thread
From: Mark Knecht @ 2005-10-10 20:16 UTC (permalink / raw)
To: linux-kernel; +Cc: Ingo Molnar, Lee Revell
Hi,
I'm still suffering the occasional xrun with Jack, so I'm trying to
understand what information I can feed back on this. I turned on the
histogram feature for both IRQ-off and preempt-off. From what I can
see so far, it seems that something is turning IRQs off for long
periods of time. I think this is similar to what Lee Revell showed
the other day, but I don't know how to produce the same sort of
printout he showed.
This is the Jack data:
nperiods = 2 for capture
nperiods = 2 for playback
08:16:26.401 Server configuration saved to "/home/mark/.jackdrc".
08:16:26.401 Statistics reset.
08:16:26.637 Client activated.
08:16:26.639 Audio connection change.
08:16:26.642 Audio connection graph change.
08:16:28.395 MIDI connection graph change.
08:16:28.472 MIDI connection change.
08:16:29.664 Audio connection graph change.
08:19:50.071 Audio connection graph change.
08:19:50.263 MIDI connection graph change.
08:19:50.273 MIDI connection graph change.
10:04:36.179 Audio connection graph change.
10:04:36.362 Audio connection change.
10:04:38.800 Audio connection graph change.
10:04:38.810 Audio connection graph change.
12:34:09.601 XRUN callback (1).
**** alsa_pcm: xrun of at least 0.133 msecs
13:09:41.344 MIDI connection graph change.
13:09:41.439 MIDI connection change.
13:09:42.427 Audio connection graph change.
13:09:45.919 Audio connection graph change.
13:09:45.941 Audio connection graph change.
13:09:45.960 MIDI connection graph change.
This is what I see for preempt-off latency. This looks really good to me:
#Minimum latency: 0 microseconds.
#Average latency: 0 microseconds.
#Maximum latency: 1207040 microseconds.
#Total samples: 196871313
#There are 2 samples greater or equal than 10240 microseconds
#usecs samples
0 190504250
1 5881275
2 379533
3 69966
4 13858
5 5708
6 2287
7 7726
8 1771
9 465
10 649
11 935
12 948
13 720
14 573
15 329
16 139
17 71
18 46
19 27
20 21
21 7
22 3
23 2
24 1
25 0
26 0
27 0
However, for IRQ-off latency I have far larger numbers. I assume it's
probably pretty bad that interrupts can be turned off for as long as
4 ms when I'm trying to run at sub-3 ms periods?
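For scale, the deadline Jack must meet follows from the buffer settings. A sketch of the arithmetic, where 64 frames/period at 48 kHz are assumed example values (only nperiods = 2 comes from the report above):

```shell
# Estimate the JACK wakeup deadline from the ALSA buffer settings.
# frames and rate are assumed illustration values; nperiods = 2 is
# from the configuration above. An IRQ-off window longer than the
# full buffer time guarantees an xrun; one close to it makes xruns likely.
frames=64      # frames per period (assumed)
rate=48000     # sample rate in Hz (assumed)
nperiods=2     # periods per buffer (from the jack output above)
period_us=$(( frames * 1000000 / rate ))
total_us=$(( nperiods * frames * 1000000 / rate ))
echo "period = ${period_us} us, full buffer = ${total_us} us"
```

With these assumed settings the full buffer is about 2.7 ms, which is why a 4 ms interrupts-off stretch would be fatal.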
#Minimum latency: 0 microseconds.
#Average latency: 4 microseconds.
#Maximum latency: 3998 microseconds.
#Total samples: 3102665680
#There are 0 samples greater or equal than 10240 microseconds
#usecs samples
0 3066905198
1 3817807
2 15630448
3 347330
4 55362
5 40098
6 27080
7 27740
8 22484
9 20298
10 15727
11 15346
12 14870
13 15258
14 101559
15 16152
<SNIP>
1365 18434
1366 15529
1367 12471
1368 11392
1369 13927
1370 21813
1371 39568
1372 87654
1373 250408
1374 535990
1375 577197
1376 370609
1377 120045
1378 20431
1379 5312
1380 5349
1381 8203
1382 14773
1383 15352
1384 9515
1385 5020
1386 3618
1387 3738
1388 5554
1389 8669
1390 8244
1391 6448
1392 4948
1393 4151
1394 5984
1395 10599
1396 19058
1397 41222
1398 109732
1399 360362
1400 354579
1401 113700
1402 11014
1403 1240
1404 754
1405 772
1406 963
1407 850
1408 445
1409 276
<SNIP>
3976 62
3977 57
3978 71
3979 70
3980 59
3981 61
3982 69
3983 76
3984 75
3985 72
3986 76
3987 80
3988 65
3989 64
3990 78
3991 70
3992 85
3993 128
3994 236
3995 358
3996 96
3997 0
3998 1
3999 0
4000 0
4001 0
4002 0
How can I get data that would be more useful for real debugging?
Thanks,
Mark
^ permalink raw reply [flat|nested] 26+ messages in thread

* Re: Latency data - 2.6.14-rc3-rt13
From: Daniel Walker @ 2005-10-10 20:49 UTC
To: Mark Knecht; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On Mon, 2005-10-10 at 13:16 -0700, Mark Knecht wrote:
> Hi,
>
> How can I get data that would be more useful for real debugging?

The IRQ-off times look like the worst. If you do "make menuconfig", go to
Kernel Hacking and select "Interrupts-off critical section latency timing",
then select "Latency tracing".

Then, once the system has booted, run the following:

"echo 0 > /proc/sys/kernel/preempt_max_latency"

That will record a trace for every new maximum IRQ-off latency observed.
You can view the trace with "cat /proc/latency_trace", and you can attach
the trace to an email to LKML so we can review it (compress it if it's
big, though).

Daniel
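The steps above can be collected into one sequence (a sketch; these /proc interfaces belong to the 2005-era -rt latency tracer and only exist when the Kernel Hacking options named above are enabled, so the paths are assumptions on any other kernel):

```shell
# Sketch of the -rt latency-tracing workflow described above.
# Requires the "Interrupts-off critical section latency timing" and
# "Latency tracing" options from Kernel Hacking; paths are specific
# to the 2.6.14-rt patch set and will not exist elsewhere.

# Reset the recorded maximum so the next latency peak is traced.
echo 0 > /proc/sys/kernel/preempt_max_latency

# ... reproduce the xrun workload here ...

# Read back the trace of the worst critical section seen since the reset.
cat /proc/latency_trace

# Compress the trace before attaching it to a mail if it is large.
gzip -c /proc/latency_trace > latency_trace.gz
```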
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-10 21:12 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Daniel Walker <dwalker@mvista.com> wrote:
<SNIP>
> Then, once the system has booted, run the following:
>
> "echo 0 > /proc/sys/kernel/preempt_max_latency"

So this disables the tracing of preempt times but keeps IRQ times on? Cool.

> That will record a trace for every new maximum IRQ-off latency observed.
> You can view the trace with "cat /proc/latency_trace", and you can attach
> the trace to an email to LKML so we can review it (compress it if it's
> big, though).
>
> Daniel

Will do. Building now. I'll be back later.

Is there anything specific I should look for in the traces myself? Any way
to help narrow it down?

Cheers,
Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-10 21:28 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Mark Knecht <markknecht@gmail.com> wrote:
<SNIP>
> > go to Kernel Hacking and select
> > "Interrupts-off critical section latency timing",
> > then select "Latency tracing"

Only had to turn on Latency tracing. The others I had on...

<SNIP>

Unfortunately I didn't think I'd be back this fast. I built the new kernel
and rebooted. The boot starts, gets down to the point where it tells me
that the preempt debug stuff is on, and then jumps to an endlessly
repeating error message:

init[1]: segfault at ffffffff8010fce0 rip ffffffff8010fce0 rsp
00007fffffcb09b8 error 15

This error repeats endlessly until I reboot.

Good thing I had another kernel I could boot back into! ;-)

So, something isn't happy. Is this a -rt thing or a kernel issue?

Cheers,
Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Daniel Walker @ 2005-10-10 21:44 UTC
To: Mark Knecht; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On Mon, 2005-10-10 at 14:28 -0700, Mark Knecht wrote:
<SNIP>
> init[1]: segfault at ffffffff8010fce0 rip ffffffff8010fce0 rsp
> 00007fffffcb09b8 error 15
<SNIP>
> So, something isn't happy. Is this a -rt thing or a kernel issue?

Hmm, it looks like latency tracing doesn't work on x86_64. I guess you'll
have to wait till someone fixes it.

Another option is to turn off "Latency tracing", then reboot, like it was
before but w/o the histogram. Then run:

"echo 0 > /proc/sys/kernel/preempt_max_latency"

Whenever a new maximum latency is observed, it will be logged with a stack
trace in the system logs. You can report that back here on LKML.

You can view the system log with the command "dmesg", I think.

Daniel
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-10 22:09 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Daniel Walker <dwalker@mvista.com> wrote:
<SNIP>
> Whenever a new maximum latency is observed, it will be logged with a
> stack trace in the system logs. You can report that back here on LKML.
>
> You can view the system log with the command "dmesg", I think.
>
> Daniel

Yes, already that looks interesting. Do I have something going on with
ACPI? This is before running Jack; I'm in Gnome. There are a lot of these
messages, but they've stopped. I suppose the 3997 is the delay? That would
correspond with the info I sent earlier.

( softirq-timer/0-3 |#0): new 3997 us maximum-latency critical section.
 => started at timestamp 185717289: <acpi_processor_idle+0x20/0x379>
 => ended at timestamp 185721287: <thread_return+0xb5/0x11a>

Call Trace:
 <ffffffff801519ac>{check_critical_timing+492}
 <ffffffff80151e05>{sub_preempt_count_ti+133}
 <ffffffff803f632c>{thread_return+70}
 <ffffffff803f639b>{thread_return+181}
 <ffffffff803f6505>{schedule+261}
 <ffffffff8013a84a>{ksoftirqd+138}
 <ffffffff8013a7c0>{ksoftirqd+0}
 <ffffffff8014aa8d>{kthread+205}
 <ffffffff8010f716>{child_rip+8}
 <ffffffff8014a9c0>{kthread+0}
 <ffffffff8010f70e>{child_rip+0}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f5df8>] .... __schedule+0xb8/0x5a6
.....[<00000000>] .. ( <= 0x0)
 => dump-end timestamp 185721364

mark@lightning ~ $

I started up Jack, mounted my 1394 drives and started streaming a little
bit of data. Basically my ultra-light load test. I see nothing new in
dmesg yet. I'll keep watching.

Thanks,
Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Daniel Walker @ 2005-10-10 22:28 UTC
To: Mark Knecht; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On Mon, 2005-10-10 at 15:09 -0700, Mark Knecht wrote:
<SNIP>
> Yes, already that looks interesting. Do I have something going on with
> ACPI? This is before running Jack; I'm in Gnome. There are a lot of
> these messages, but they've stopped. I suppose the 3997 is the delay?
> That would correspond with the info I sent earlier.

I think this is a false reading. Sometimes when a system goes idle it will
cause interrupt-disable latency that isn't real (due to the processor
halting). You could try turning ACPI off if you can, and turn off power
management.

Daniel
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-10 23:33 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Daniel Walker <dwalker@mvista.com> wrote:
<SNIP>
> I think this is a false reading. Sometimes when a system goes idle it
> will cause interrupt-disable latency that isn't real (due to the
> processor halting). You could try turning ACPI off if you can, and turn
> off power management.
>
> Daniel

OK. No immediate difference with most of the power management stuff turned
off. I'll just let it run a few hours and see if I get an xrun. If I do,
I'll look at the logs again and report back.

Thanks for the help. I feel like I've got a chance of spotting something.

Cheers,
Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-10 23:49 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Mark Knecht <markknecht@gmail.com> wrote:
<SNIP>
> OK. No immediate difference with most of the power management stuff
> turned off. I'll just let it run a few hours and see if I get an xrun.
> If I do, I'll look at the logs again and report back.

Well, I'm disappointed again. Some xruns came, but no additional data was
put into the dmesg output:

( softirq-timer/0-3 |#0): new 3997 us maximum-latency critical section.
 => started at timestamp 253995015: <acpi_processor_idle+0x20/0x379>
 => ended at timestamp 253999013: <thread_return+0xb5/0x11a>

Call Trace:
 <ffffffff8014e32c>{check_critical_timing+492}
 <ffffffff8014e785>{sub_preempt_count_ti+133}
 <ffffffff803ec76c>{thread_return+70}
 <ffffffff803ec7db>{thread_return+181}
 <ffffffff803ec945>{schedule+261}
 <ffffffff801371ca>{ksoftirqd+138}
 <ffffffff80137140>{ksoftirqd+0}
 <ffffffff8014740d>{kthread+205}
 <ffffffff8010e6e6>{child_rip+8}
 <ffffffff80147340>{kthread+0}
 <ffffffff8010e6de>{child_rip+0}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803ec238>] .... __schedule+0xb8/0x5a6
.....[<00000000>] .. ( <= 0x0)
 => dump-end timestamp 253999092

kjournald starting. Commit interval 5 seconds
EXT3 FS on sdb1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.

lightning ~ #

Basically the trace above was there pretty much right after booting. I
mounted a 1394 drive, which is shown, and then started running apps. I got
3 xruns:

16:42:59.826 XRUN callback (1).
**** alsa_pcm: xrun of at least 0.273 msecs
16:43:03.764 XRUN callback (2).
**** alsa_pcm: xrun of at least 2.082 msecs
16:43:41.618 XRUN callback (3).
**** alsa_pcm: xrun of at least 2.040 msecs

but no additional data in the logs. So possibly this is an IRQ-off
latency? Or am I jumping to conclusions? The trace above, if I understand
it, indicates a delay of nearly 4 ms, and again, since I'm trying to run
at sub-3 ms periods, that would seem like a potential problem.

I'm going to completely shut off power management as a test just to see
where that leads.

- Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-11 3:45 UTC
To: Daniel Walker; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Mark Knecht <markknecht@gmail.com> wrote:
>
> ( softirq-timer/0-3 |#0): new 3997 us maximum-latency critical section.

So the root cause of this 4 ms delay is the 250 Hz timer. If I change the
system to use the 1 kHz timer, then the time in this section drops, as
expected, to 1 ms:

( softirq-timer/0-3 |#0): new 998 us maximum-latency critical section.
 => started at timestamp 121040020: <acpi_processor_idle+0x20/0x379>
 => ended at timestamp 121041019: <thread_return+0xb5/0x11a>

Very interesting, I think. Back to the drawing board as to my xruns.

- Mark
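The two readings line up with the tick length, 1/HZ: an idle CPU can sit halted for up to one full timer period, and that stretch is what the measurement was reporting as an interrupts-off section. A quick check of the arithmetic:

```shell
# Maximum idle stretch between timer ticks is one period, 1000000/HZ us.
# HZ=250 gives ~4000 us (matching the 3997 us reading above);
# HZ=1000 gives ~1000 us (matching the 998 us reading).
for hz in 250 1000; do
    echo "HZ=${hz}: tick period = $(( 1000000 / hz )) us"
done
```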
* Re: Latency data - 2.6.14-rc3-rt13
From: Steven Rostedt @ 2005-10-11 6:46 UTC
To: Mark Knecht; +Cc: Daniel Walker, linux-kernel, Ingo Molnar, Lee Revell

On Mon, 10 Oct 2005, Mark Knecht wrote:
<SNIP>
> So the root cause of this 4 ms delay is the 250 Hz timer. If I change
> the system to use the 1 kHz timer, then the time in this section drops,
> as expected, to 1 ms.

Are your xruns showing the same type of latency? If you switch to 1000 Hz,
do you get only 1 ms latency on your xruns too?

This sounds like the application might have gone to sleep and didn't wake
up until something scheduled it in. Or something else with the scheduler.
I doubt that this has to do with preemption or interrupts being off, but
rather something that uses jiffies to calculate.

-- Steve
* Re: Latency data - 2.6.14-rc3-rt13
From: Sven-Thorsten Dietrich @ 2005-10-11 8:56 UTC
To: Mark Knecht; +Cc: Daniel Walker, linux-kernel, Ingo Molnar, Lee Revell

On Mon, 2005-10-10 at 20:45 -0700, Mark Knecht wrote:
<SNIP>
> So the root cause of this 4 ms delay is the 250 Hz timer. If I change
> the system to use the 1 kHz timer, then the time in this section drops,
> as expected, to 1 ms.

Maximum preemption latencies near the timer period are sometimes
indicative of mismatched preempt_disable/preempt_enable pairs in the code.
Usually the scheduler warns about that on the console, however.

Sven

--
Sven-Thorsten Dietrich
Real-Time Software Architect
MontaVista Software, Inc.
http://www.mvista.com
* Re: Latency data - 2.6.14-rc3-rt13
From: Ingo Molnar @ 2005-10-11 11:17 UTC
To: Mark Knecht; +Cc: Daniel Walker, linux-kernel, Lee Revell

* Mark Knecht <markknecht@gmail.com> wrote:
> > ( softirq-timer/0-3 |#0): new 3997 us maximum-latency critical section.
>
> So the root cause of this 4 ms delay is the 250 Hz timer. If I change
> the system to use the 1 kHz timer, then the time in this section drops,
> as expected, to 1 ms.

this was a bug in the critical-section-latency measurement code of x64.
The timer irq is the one that leaves userspace running for the longest
time between two kernel calls.

I have fixed these bugs in -rc4-rt1, could you try it? It should report
much lower latencies, regardless of PM settings.

	Ingo
* Re: Latency data - 2.6.14-rc3-rt13
From: Mark Knecht @ 2005-10-11 22:45 UTC
To: Ingo Molnar; +Cc: Daniel Walker, linux-kernel, Lee Revell

On 10/11/05, Ingo Molnar <mingo@elte.hu> wrote:
<SNIP>
> I have fixed these bugs in -rc4-rt1, could you try it? It should report
> much lower latencies, regardless of PM settings.

Ingo,
This test now reports much more interesting data:

( dmesg-8010 |#0): new 13 us maximum-latency critical section.
 => started at timestamp 117628604: <do_IRQ+0x29/0x50>
 => ended at timestamp 117628618: <do_IRQ+0x39/0x50>

Call Trace: <IRQ>
 <ffffffff8014c42c>{check_critical_timing+492}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff80110159>{do_IRQ+57}
 <ffffffff8010e16c>{ret_from_intr+0} <EOI>
 <ffffffff80249a75>{copy_page+5}
 <ffffffff801671f1>{do_no_page+737}
 <ffffffff801674ee>{__handle_mm_fault+414}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff803eac99>{do_page_fault+1049}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff80168eee>{vma_link+286}
 <ffffffff8010e4c1>{error_exit+0}
 <ffffffff8024a866>{__clear_user+22}
 <ffffffff801a46fb>{padzero+27}
 <ffffffff801a4b42>{load_elf_interp+850}
 <ffffffff801a5b1d>{load_elf_binary+3341}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff801a4e10>{load_elf_binary+0}
 <ffffffff80182ee0>{search_binary_handler+272}
 <ffffffff801831f5>{do_execve+405}
 <ffffffff8010dbc6>{system_call+126}
 <ffffffff8010c634>{sys_execve+68}
 <ffffffff8010dfea>{stub_execve+106}
 => dump-end timestamp 117628777

( nautilus-7955 |#0): new 14 us maximum-latency critical section.
 => started at timestamp 127874927: <do_IRQ+0x29/0x50>
 => ended at timestamp 127874941: <do_IRQ+0x39/0x50>

Call Trace: <IRQ>
 <ffffffff8014c42c>{check_critical_timing+492}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff80110159>{do_IRQ+57}
 <ffffffff8010e16c>{ret_from_intr+0} <EOI>
 <ffffffff8010dbc6>{system_call+126}
 => dump-end timestamp 127874996

( nautilus-7955 |#0): new 14 us maximum-latency critical section.
 => started at timestamp 128647499: <do_IRQ+0x29/0x50>
 => ended at timestamp 128647514: <do_IRQ+0x39/0x50>

Call Trace: <IRQ>
 <ffffffff8014c42c>{check_critical_timing+492}
 <ffffffff8014c64b>{sub_preempt_count_ti+75}
 <ffffffff80110159>{do_IRQ+57}
 <ffffffff8010e16c>{ret_from_intr+0} <EOI>
 => dump-end timestamp 128647567

lightning ~ #

NOTE: In my first kernel build I turned on latency measurement for both
max preempt and IRQ-off. The kernel segfaulted immediately after the boot
messages reminding me that these were on. I rebuilt the kernel with only
the max preempt and that worked. I suspect some problem with the IRQ-off
section, since that's the only change I made and it did not segfault in
-rc3-rt13.

Thanks,
Mark
* Re: Latency data - 2.6.14-rc3-rt13
From: Lee Revell @ 2005-10-12 0:02 UTC
To: Mark Knecht; +Cc: Ingo Molnar, Daniel Walker, linux-kernel

On Tue, 2005-10-11 at 15:45 -0700, Mark Knecht wrote:
<SNIP>
> This test now reports much more interesting data:
>
> ( dmesg-8010 |#0): new 13 us maximum-latency critical section.
>  => started at timestamp 117628604: <do_IRQ+0x29/0x50>
>  => ended at timestamp 117628618: <do_IRQ+0x39/0x50>

This is the expected, correct behavior - very small maximum-latency
critical sections. Do you get anything longer (say 300 usecs or more) if
you leave it running?

So far the latency tracer on my much slower system has only gone up to
123 usecs. So the bug seems to be fixed, at least on i386.

Lee
* Re: Latency data - 2.6.14-rc3-rt13 2005-10-12 0:02 ` Lee Revell @ 2005-10-12 1:09 ` Mark Knecht 2005-10-12 1:21 ` Lee Revell 0 siblings, 1 reply; 26+ messages in thread From: Mark Knecht @ 2005-10-12 1:09 UTC (permalink / raw) To: Lee Revell; +Cc: Ingo Molnar, Daniel Walker, linux-kernel On 10/11/05, Lee Revell <rlrevell@joe-job.com> wrote: > On Tue, 2005-10-11 at 15:45 -0700, Mark Knecht wrote: > > On 10/11/05, Ingo Molnar <mingo@elte.hu> wrote: > > > > > > * Mark Knecht <markknecht@gmail.com> wrote: > > > > > > > > ( softirq-timer/0-3 |#0): new 3997 us maximum-latency critical section. > > > > > > > > So the root cause of this 4mS delay is the 250Hz timer. If I change > > > > the system to use the 1Khz timer then the time in this section drops, > > > > as expected, to 1mS. > > > > > > this was a bug in the critical-section-latency measurement code of x64. > > > The timer irq is the one that leaves userspace running for the longest > > > time, between two kernel calls. > > > > > > I have fixed these bugs in -rc4-rt1, could you try it? It should report > > > much lower latencies, regardless of PM settings. > > > > > > Ingo > > > > > > > Ingo, > > This test now reports much more intersting data: > > > > ( dmesg-8010 |#0): new 13 us maximum-latency critical section. > > => started at timestamp 117628604: <do_IRQ+0x29/0x50> > > => ended at timestamp 117628618: <do_IRQ+0x39/0x50> > > This is the expected, correct behavior - very small maximum latency > critical sections. Do you get anything longer (say 300 usecs or more) > if you leave it running? > > So far the latency tracer on my much slower system has only gone up to > 123 usecs. So the bug seems to be fixed at least on i386. > > Lee > > I've been watching this now for a couple of hours. Still no xruns. Max latency has only gone up to 19uS, so this is all good. However the thing I am seeing is that all my free memory has gone away and I've now swapped out just a little nit. 
There was really no change in the state of the machine, other than xscreensaver starting and running for a while, which is standard for me. None the less here's the sort of state things are in - 17uS right after booting, and only 19uS now. That's good. But in the middle look at the change in memory statistics during a one hour time period. ( X-7860 |#0): new 16 us maximum-latency critical section. => started at timestamp 259455017: <__schedule+0xb8/0x596> => ended at timestamp 259455034: <thread_return+0xb4/0x10a> Call Trace:<ffffffff8014c42c>{check_critical_timing+492} <ffffffff8014c64b>{sub_preempt_count_ti+75} <ffffffff803e7d9a>{thread_return+180} <ffffffff803e7ef5>{schedule+261} <ffffffff803e89e4>{schedule_timeout+148} <ffffffff8013a660>{process_timeout+0} <ffffffff8018b767>{do_select+967} <ffffffff8018b2b0>{__pollwait+0} <ffffffff8018baad>{sys_select+749} <ffffffff8010d9e9>{sys_rt_sigreturn+553} <ffffffff8010dbc6>{system_call+126} => dump-end timestamp 259455125 ( X-7860 |#0): new 17 us maximum-latency critical section. => started at timestamp 375451745: <__schedule+0xb8/0x596> => ended at timestamp 375451762: <thread_return+0xb4/0x10a> Call Trace:<ffffffff8014c42c>{check_critical_timing+492} <ffffffff8014c64b>{sub_preempt_count_ti+75} <ffffffff803e7d9a>{thread_return+180} <ffffffff803e7ef5>{schedule+261} <ffffffff803e89e4>{schedule_timeout+148} <ffffffff8013a660>{process_timeout+0} <ffffffff8018b767>{do_select+967} <ffffffff8018b2b0>{__pollwait+0} <ffffffff8018baad>{sys_select+749} <ffffffff8010d9e9>{sys_rt_sigreturn+553} <ffffffff8010dbc6>{system_call+126} => dump-end timestamp 375451852 kjournald starting. Commit interval 5 seconds EXT3 FS on sdc1, internal journal EXT3-fs: mounted filesystem with ordered data mode. 
lightning ~ # date && free
Tue Oct 11 16:21:41 PDT 2005
             total       used       free     shared    buffers     cached
Mem:        510460     410816      99644          0      15896     169664
-/+ buffers/cache:     225256     285204
Swap:       996020          0     996020

lightning ~ # date && free
Tue Oct 11 17:39:03 PDT 2005
             total       used       free     shared    buffers     cached
Mem:        510460     504464       5996          0      39828     119012
-/+ buffers/cache:     345624     164836
Swap:       996020        304     995716
lightning ~ #

( hdspmixer-8071 |#0): new 18 us maximum-latency critical section.
 => started at timestamp 2132546112: <snd_ctl_open+0x71/0x200 [snd]>
 => ended at timestamp 2132546131: <snd_ctl_open+0x8a/0x200 [snd]>

Call Trace:<ffffffff8014c42c>{check_critical_timing+492}
       <ffffffff8014c64b>{sub_preempt_count_ti+75}
       <ffffffff88006a0a>{:snd:snd_ctl_open+138}
       <ffffffff8800414b>{:snd:snd_open+267}
       <ffffffff801806d9>{chrdev_open+441}
       <ffffffff801766a8>{__dentry_open+280}
       <ffffffff80176877>{filp_open+135}
       <ffffffff8014c64b>{sub_preempt_count_ti+75}
       <ffffffff80176a06>{get_unused_fd+230}
       <ffffffff80176b71>{do_sys_open+81}
       <ffffffff8010dbc6>{system_call+126}
 => dump-end timestamp 2132546239

( IRQ 58-4526 |#0): new 19 us maximum-latency critical section.
 => started at timestamp 8439787551: <__schedule+0xb8/0x596>
 => ended at timestamp 8439787571: <thread_return+0xb4/0x10a>

Call Trace:<ffffffff8014c42c>{check_critical_timing+492}
       <ffffffff8014c64b>{sub_preempt_count_ti+75}
       <ffffffff803e7d9a>{thread_return+180}
       <ffffffff80146550>{keventd_create_kthread+0}
       <ffffffff803e7ef5>{schedule+261}
       <ffffffff80156062>{do_irqd+514}
       <ffffffff80155e60>{do_irqd+0}
       <ffffffff8014650d>{kthread+205}
       <ffffffff8010e676>{child_rip+8}
       <ffffffff80146550>{keventd_create_kthread+0}
       <ffffffff80354d00>{pci_conf1_read+0}
       <ffffffff80146440>{kthread+0}
       <ffffffff8010e66e>{child_rip+0}
 => dump-end timestamp 8439787658

lightning ~ #

Now, I know these command-line methods are probably frowned upon by
folks in the know, so I'm happy to do this in a more rigorous way. (Is
it called valgrind?)
Should free memory drop like that over time? The only apps running
were Aqualung, hdspmixer and Firefox. Is one of them leaking, is the
kernel leaking, or is this normal?

I will continue to let this run for another 4-6 hours today and hope I
catch another xrun. If not, I'll start it up again tomorrow and
certainly will get something.

Thanks much,
Mark

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-12  1:09 ` Mark Knecht
@ 2005-10-12  1:21 ` Lee Revell
  2005-10-12  1:25 ` Mark Knecht
  2005-10-12  6:38 ` Steven Rostedt
  0 siblings, 2 replies; 26+ messages in thread
From: Lee Revell @ 2005-10-12 1:21 UTC (permalink / raw)
To: Mark Knecht; +Cc: Ingo Molnar, Daniel Walker, linux-kernel

On Tue, 2005-10-11 at 18:09 -0700, Mark Knecht wrote:
> Should free memory drop like that over time?

Yes, this is perfectly normal. When a system first boots, all the
memory your apps aren't using is initially free. As applications
access more data over time, it is cached in memory until free memory
drops to near zero.

"Free memory" is actually wasted memory - it's better to use all
available RAM for caching.

Lee

^ permalink raw reply [flat|nested] 26+ messages in thread
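Lee's point can be checked directly against the `free` snapshots Mark posted. The sketch below is illustrative only: it hard-codes the "Mem:" line from Mark's second snapshot and assumes the old procps `free` column layout shown in the session (total/used/free/shared/buffers/cached). Memory held by buffers and the page cache is reclaimable, so the number to watch is free + buffers + cached, not the raw "free" column.

```shell
#!/bin/sh
# Recompute "effectively free" memory from a pasted `free` Mem: line.
# Sample data: the second snapshot from the session above.
mem_line='Mem:        510460     504464       5996          0      39828     119012'

# Field positions per the old procps layout:
# $2=total $3=used $4=free $5=shared $6=buffers $7=cached
summary=$(echo "$mem_line" | awk '{
    total = $2; free = $4; buffers = $6; cached = $7
    avail = free + buffers + cached
    printf "raw free: %d kB; reclaimable cache: %d kB; effectively free: %d kB (%.0f%% of %d kB)", free, buffers + cached, avail, 100 * avail / total, total
}')
echo "$summary"
```

The "effectively free" figure this produces, 164836 kB, is exactly the free value on the "-/+ buffers/cache" line of the same snapshot - about a third of RAM is still available, even though the raw "free" column has dropped to 5996 kB.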
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-12  1:21 ` Lee Revell
@ 2005-10-12  1:25 ` Mark Knecht
  2005-10-12  6:38 ` Steven Rostedt
  1 sibling, 0 replies; 26+ messages in thread
From: Mark Knecht @ 2005-10-12 1:25 UTC (permalink / raw)
To: Lee Revell; +Cc: Ingo Molnar, Daniel Walker, linux-kernel

On 10/11/05, Lee Revell <rlrevell@joe-job.com> wrote:
> On Tue, 2005-10-11 at 18:09 -0700, Mark Knecht wrote:
> > Should free memory drop like that over time?
>
> Yes this is perfectly normal. When a system first boots all the memory
> your apps aren't using is initially free. As applications access more
> data over time then it will be cached in memory until free memory drops
> to near zero.
>
> "Free memory" is actually wasted memory - it's better to use all
> available RAM for caching.
>
> Lee

OK, non-intuitive for a guitar player, but good to know it's normal.
Thanks! In that case I'll just wait for an xrun and hope I catch
something in dmesg. We'll just have to wait and see.

Cheers,
Mark

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-12  1:21 ` Lee Revell
  2005-10-12  1:25 ` Mark Knecht
@ 2005-10-12  6:38 ` Steven Rostedt
  2005-10-12 16:18 ` Lee Revell
  1 sibling, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2005-10-12 6:38 UTC (permalink / raw)
To: Lee Revell; +Cc: Mark Knecht, Ingo Molnar, Daniel Walker, linux-kernel

On Tue, 11 Oct 2005, Lee Revell wrote:
> On Tue, 2005-10-11 at 18:09 -0700, Mark Knecht wrote:
> > Should free memory drop like that over time?
>
> Yes this is perfectly normal. When a system first boots all the memory
> your apps aren't using is initially free. As applications access more
> data over time then it will be cached in memory until free memory drops
> to near zero.
>
> "Free memory" is actually wasted memory - it's better to use all
> available RAM for caching.

But the swap being touched bothers me, although I've had problems with
Mozilla leaking when I leave it up for long periods. Without Mozilla
running, and running lots of other apps, I have almost 100% of memory
used, but 0% swap.

If the swap usage starts to increase slowly over time, you _do_ have a
leak somewhere - probably not in the kernel, since kernel memory never
goes into swap. But if you want to see whether the kernel is leaking,
examine /proc/slabinfo once in a while; if you see something there
constantly growing, that might indicate a leak. Just pay attention to
the first column.

-- Steve

^ permalink raw reply [flat|nested] 26+ messages in thread
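Steve's /proc/slabinfo suggestion can be automated with a small awk diff over two snapshots. The sketch below uses made-up sample data and a made-up growth threshold (10000 objects) purely for illustration; on a real system you would capture the snapshots with `cat /proc/slabinfo > snap1`, wait an hour, capture `snap2`, and run the same awk over them. The first numeric column is the active-object count he refers to.

```shell
#!/bin/sh
# Flag slab caches whose active-object count (first numeric column of
# /proc/slabinfo) grew sharply between two snapshots.
# Sample snapshots below are fabricated; replace with real captures.
cat > snap1 <<'EOF'
dentry_cache      12000 12500 224
buffer_head       30000 31000  88
size-512            400   450 512
EOF
cat > snap2 <<'EOF'
dentry_cache      12100 12500 224
buffer_head       52000 52500  88
size-512            400   450 512
EOF

# Report caches that grew by more than 10000 objects between snapshots.
growth=$(awk 'NR==FNR { before[$1] = $2; next }
              ($1 in before) && ($2 - before[$1] > 10000) {
                  printf "%s grew from %d to %d objects\n", $1, before[$1], $2
              }' snap1 snap2)
echo "$growth"
rm -f snap1 snap2
```

With the sample data, only `buffer_head` trips the threshold; steady growth like that across many snapshots is the pattern that would hint at a kernel-side leak.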
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-12  6:38 ` Steven Rostedt
@ 2005-10-12 16:18 ` Lee Revell
  2005-10-12 17:03 ` Steven Rostedt
  0 siblings, 1 reply; 26+ messages in thread
From: Lee Revell @ 2005-10-12 16:18 UTC (permalink / raw)
To: Steven Rostedt; +Cc: Mark Knecht, Ingo Molnar, Daniel Walker, linux-kernel

On Wed, 2005-10-12 at 02:38 -0400, Steven Rostedt wrote:
> On Tue, 11 Oct 2005, Lee Revell wrote:
> > On Tue, 2005-10-11 at 18:09 -0700, Mark Knecht wrote:
> > > Should free memory drop like that over time?
> >
> > Yes this is perfectly normal. When a system first boots all the memory
> > your apps aren't using is initially free. As applications access more
> > data over time then it will be cached in memory until free memory drops
> > to near zero.
> >
> > "Free memory" is actually wasted memory - it's better to use all
> > available RAM for caching.
>
> But the swap being touched bothers me. Although I've had problems with
> leaving Mozilla up for long times and it leaking. Without Mozilla running
> and running lots of other apps, I have almost 100% memory used, but 0%
> swap.

I believe this is the expected behavior under 2.6 unless you set
/proc/sys/vm/swappiness to 0. If an app allocates memory and then
never touches it, those pages will eventually be swapped out to make
room for hot ones.

Lee

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-12 16:18 ` Lee Revell
@ 2005-10-12 17:03 ` Steven Rostedt
  0 siblings, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2005-10-12 17:03 UTC (permalink / raw)
To: Lee Revell; +Cc: Mark Knecht, Ingo Molnar, Daniel Walker, linux-kernel

On Wed, 12 Oct 2005, Lee Revell wrote:
>
> I believe this is the expected behavior under 2.6 unless you
> set /proc/sys/vm/swappiness to 0. If an app allocates memory and then
> never touches it then those pages will eventually be swapped out to make
> room for hot ones.

OK, thanks for the info. I guess my apps don't allocate enough memory
to be eventually swapped out (I obviously run 2.6). Or those apps use
the memory they allocate often. Whatever..

-- Steve

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-10 23:49 ` Mark Knecht
  2005-10-11  3:45 ` Mark Knecht
@ 2005-10-11  3:56 ` Matan Peled
  2005-10-11  4:45 ` Mark Knecht
  1 sibling, 1 reply; 26+ messages in thread
From: Matan Peled @ 2005-10-11 3:56 UTC (permalink / raw)
To: Mark Knecht; +Cc: Daniel Walker, linux-kernel, Ingo Molnar, Lee Revell

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mark Knecht wrote:
> Well, I'm disappointed again. Some xruns came but no additional data
> was put into the dmesage file:

"dmesage file"? You mean "/var/log/dmesg"? No, that's not the file you
want... At least here, that file contains the logs as they were when
the boot finished.

The information you want can be retrieved by running the command
(an actual program to run - type it in a terminal) "dmesg". Or you
might be able to look at what's at the end of the /var/log/messages
file, if you have one.

- --
[Name      ] :: [Matan I. Peled    ]
[Location  ] :: [Israel            ]
[Public Key] :: [0xD6F42CA5        ]
[Keyserver ] :: [keyserver.kjsl.com]
encrypted/signed plain text preferred

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFDSzgDA7Qvptb0LKURApE4AKCOxBS3Xos5yEAnd5QbItjRmYlIsQCePSmm
ULBWIDcIOPxXAnMSodZNgCg=
=qmrM
-----END PGP SIGNATURE-----

^ permalink raw reply [flat|nested] 26+ messages in thread
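Matan's distinction - the live kernel ring buffer versus the boot-time snapshot in /var/log/dmesg - can be captured in a couple of lines of shell. This is only a sketch: the output filename is made up, and `dmesg` may be restricted in some environments (e.g. unprivileged containers), so the script falls back gracefully.

```shell
#!/bin/sh
# Read the *live* kernel ring buffer with dmesg(1), not the
# /var/log/dmesg file, which only holds messages from boot time.
if dmesg >/dev/null 2>&1; then
    dmesg | tail -n 20 > latest_kernel_messages.txt  # 20 most recent lines
    result="read live ring buffer"
else
    # Fallback Matan mentions: the syslog file, if the system keeps one.
    result="dmesg unavailable; check the end of /var/log/messages instead"
fi
echo "$result"
```

Running `dmesg` periodically (or after an xrun) is what captures the latency-trace dumps discussed in this thread, since they go to the ring buffer as they happen.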
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-11  3:56 ` Matan Peled
@ 2005-10-11  4:45 ` Mark Knecht
  2005-10-11  6:14 ` Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Knecht @ 2005-10-11 4:45 UTC (permalink / raw)
To: chaosite; +Cc: Daniel Walker, linux-kernel, Ingo Molnar, Lee Revell

On 10/10/05, Matan Peled <chaosite@gmail.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Mark Knecht wrote:
> > Well, I'm disappointed again. Some xruns came but no additional data
> > was put into the dmesage file:
>
> "dmesage file"? You mean, "/var/log/dmesg"? No, thats not the file you want...
> At least here, that file contains the logs as they were when the boot finishes.
>
> The information you want can be retrieved by running the command (actual program
> to run, type it in a terminal) "dmesg".

Right, sorry. Obviously bad typing on my part. I am running 'dmesg' at
the command line to get the traces. My bad...

Anyway, these latencies seem understood at this point. They are just
the timer that you set in the kernel hacking section. No big deal.
However, none of the messages yet give any clues (that I understand)
as to the cause of the timing misses I'm seeing with 2.6.14-rc3-rt13.
I shall look into a 2.6.14-rc4-rtX tomorrow.

Cheers,
Mark

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-11  4:45 ` Mark Knecht
@ 2005-10-11  6:14 ` Ingo Molnar
  2005-10-11  7:53 ` Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Ingo Molnar @ 2005-10-11 6:14 UTC (permalink / raw)
To: Mark Knecht; +Cc: chaosite, Daniel Walker, linux-kernel, Lee Revell

* Mark Knecht <markknecht@gmail.com> wrote:

> Anyway, these latencies seem understood at this point. They are just
> the timer that you set in the kernel hacking section. No big deal.
> However, none of the messages yet give any clues (that I understand)
> as to the cause of the timing misses I'm seeing with 2.6.14-rc3-rt13.
> I shall look into a 2.6.14-rc4-rtX tomorrow.

do you have 64-bit userspace too? If you have 32-bit userspace then
could you try running the x86 kernel? Generally the 64-bit kernel has
less mature debugging options in the -rt tree: e.g. latency tracing
itself doesn't work (due to gcc silliness).

        Ingo

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-11  6:14 ` Ingo Molnar
@ 2005-10-11  7:53 ` Ingo Molnar
  0 siblings, 0 replies; 26+ messages in thread
From: Ingo Molnar @ 2005-10-11 7:53 UTC (permalink / raw)
To: Mark Knecht; +Cc: chaosite, Daniel Walker, linux-kernel, Lee Revell

* Ingo Molnar <mingo@elte.hu> wrote:

> > Anyway, these latencies seem understood at this point. They are just
> > the timer that you set in the kernel hacking section. No big deal.
> > However, none of the messages yet give any clues (that I understand)
> > as to the cause of the timing misses I'm seeing with 2.6.14-rc3-rt13.
> > I shall look into a 2.6.14-rc4-rtX tomorrow.
>
> do you have 64-bit userspace too? If you have 32-bit userspace then
> could you try running the x86 kernel? Generally the 64-bit kernel has
> less mature debugging options in the -rt tree: e.g. latency tracing
> itself doesnt work [...]

let me take that back - latency tracing does work on x64 too.

        Ingo

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Latency data - 2.6.14-rc3-rt13
  2005-10-10 21:12 ` Mark Knecht
  2005-10-10 21:28 ` Mark Knecht
@ 2005-10-10 21:39 ` Daniel Walker
  1 sibling, 0 replies; 26+ messages in thread
From: Daniel Walker @ 2005-10-10 21:39 UTC (permalink / raw)
To: Mark Knecht; +Cc: linux-kernel, Ingo Molnar, Lee Revell

On Mon, 2005-10-10 at 14:12 -0700, Mark Knecht wrote:
> On 10/10/05, Daniel Walker <dwalker@mvista.com> wrote:
> > On Mon, 2005-10-10 at 13:16 -0700, Mark Knecht wrote:
> > > Hi,
> > >
> > > How can I get data that would be more useful in terms of real debug?
> >
> > The IRQ-off times look like the worst. If you do "make menuconfig",
> > then go to Kernel Hacking and select
> >   "Interrupts-off critical section latency timing"
> > Then select "Latency tracing"
> >
> > Then when you boot the system, run the following before anything else:
> >
> >   "echo 0 > /proc/sys/kernel/preempt_max_latency"
>
> So this disables the tracing of preempt times but keeps IRQ times on? Cool.

Yes, that's what the compile options are doing.

> > That will record a trace for every maximum latency observed for IRQ
> > latency. You can view the trace with this command
> > "cat /proc/latency_trace", and you can attach the trace to an email to
> > LKML so we can review it (compress it if it's big though).
> >
> > Daniel
>
> Will do. Building now. I'll be back later.
>
> Is there anything specific I should look for in the traces myself?
> Any way to help narrow it down?

You want the trace to represent the largest latency. For instance, the
histogram showed IRQ latency of ~3000 us or greater, so you want the
trace to show at least that large a latency.

Daniel

^ permalink raw reply [flat|nested] 26+ messages in thread
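Daniel's capture steps can be collected into one small script. This is a sketch of his workflow, not a definitive tool: the /proc files below exist only on a kernel built with the -rt latency-tracing options he names, the 60-second wait is an arbitrary placeholder for "let the audio workload run", and the output filename is made up. It probes for the files first, so on a stock kernel it simply reports that tracing is unavailable.

```shell
#!/bin/sh
# Sketch of the -rt latency-trace capture workflow from this thread.
# Must run as root on a kernel with the latency-tracing config options.
MAX_LAT=/proc/sys/kernel/preempt_max_latency
TRACE=/proc/latency_trace

if [ -w "$MAX_LAT" ] && [ -r "$TRACE" ]; then
    echo 0 > "$MAX_LAT"                  # reset so every new maximum is recorded
    sleep 60                             # placeholder: let the workload hit a latency
    gzip -c "$TRACE" > latency_trace.gz  # compressed, ready to attach to a mail
    status="trace captured"
else
    status="latency tracing files not present on this kernel"
fi
echo "$status"
```

The `echo 0` is the same reset Daniel gives; after that, the largest latency seen so far is what `cat /proc/latency_trace` shows, which is why you want the trace to be at least as large as the latencies in the histogram before mailing it.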
end of thread, other threads:[~2005-10-12 17:04 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --
2005-10-10 20:16 Latency data - 2.6.14-rc3-rt13 Mark Knecht
2005-10-10 20:49 ` Daniel Walker
2005-10-10 21:12 ` Mark Knecht
2005-10-10 21:28 ` Mark Knecht
2005-10-10 21:44 ` Daniel Walker
2005-10-10 22:09 ` Mark Knecht
2005-10-10 22:28 ` Daniel Walker
2005-10-10 23:33 ` Mark Knecht
2005-10-10 23:49 ` Mark Knecht
2005-10-11  3:45 ` Mark Knecht
2005-10-11  6:46 ` Steven Rostedt
2005-10-11  8:56 ` Sven-Thorsten Dietrich
2005-10-11 11:17 ` Ingo Molnar
2005-10-11 22:45 ` Mark Knecht
2005-10-12  0:02 ` Lee Revell
2005-10-12  1:09 ` Mark Knecht
2005-10-12  1:21 ` Lee Revell
2005-10-12  1:25 ` Mark Knecht
2005-10-12  6:38 ` Steven Rostedt
2005-10-12 16:18 ` Lee Revell
2005-10-12 17:03 ` Steven Rostedt
2005-10-11  3:56 ` Matan Peled
2005-10-11  4:45 ` Mark Knecht
2005-10-11  6:14 ` Ingo Molnar
2005-10-11  7:53 ` Ingo Molnar
2005-10-10 21:39 ` Daniel Walker