* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.23
@ 2004-11-10 14:08 Mark_H_Johnson
  2004-11-10 15:25 ` Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Mark_H_Johnson @ 2004-11-10 14:08 UTC
  To: Ingo Molnar
  Cc: Amit Shah, Karsten Wiese, Bill Huey, Adam Heath, emann,
	Gunther Persoons, K.R. Foley, linux-kernel, Florian Schmidt,
	Fernando Pablo Lopez-Lezcano, Lee Revell, Rui Nuno Capela,
	Shane Shrybman, Thomas Gleixner, Michal Schmidt

>> [4] Application level latencies are OK but not great.
>>  X test - only 90% of CPU loops are within 100 usec of nominal value.
>> In previous RT kernels I got > 99% with 100 usec.
>
>this might be a side-effect of the chrt-ing of events/[0|1] and/or
>ksoftirqd (which we did to debug the 'freeze' problems) - are those
>still chrt-ed?
For reference:
# ps -eo pid,pri,rtprio,cmd | grep '\['
    1  23      - init [5]
    2 139     99 [migration/0]
    3  34      - [ksoftirqd/0]
    4  34      - [desched/0]
    5 139     99 [migration/1]
    6  34      - [ksoftirqd/1]
    7  34      - [desched/1]
    8  41      1 [events/0]
    9  41      1 [events/1]
   10  34      - [khelper]
   15  32      - [kthread]
   27  34      - [kblockd/0]
   28  34      - [kblockd/1]
   36  24      - [khubd]
  103  23      - [kswapd0]
  104  32      - [aio/0]
  105  33      - [aio/1]
  180 139     99 [IRQ 8]
  195  14      - [kseriod]
  201 139     99 [IRQ 12]
  237 139     99 [IRQ 14]
  239 139     99 [IRQ 15]
  278 139     99 [IRQ 1]
  310  24      - [kirqd]
  313 139     99 [IRQ 4]
  320  24      - [kjournald]
  605 139     99 [IRQ 10]
 1206  24      - [kjournald]
 1207  24      - [kjournald]
 1309 139     99 [IRQ 3]
 1323  31      - [IRQ 7]
 1494 139     99 [IRQ 6]
 1748 139     99 [IRQ 11]
14131  23      - [pdflush]
14242  24      - [pdflush]
17337  21      - grep \[

>Please review and double-check all SCHED_FIFO tasks in
>the system and keep only those that are absolutely necessary for
>latencytest's operation [i.e. the soundcard IRQ and latencytest itself]
>- everything else should be SCHED_OTHER. Do latencies get any better if
>you do this?
I can, but that is not necessarily an "apples to apples" comparison.
When I compare with 2.4 preempt + low latency kernels, the X stress
test had > 99% of the samples within 100 usec of the nominal value.
Don't forget - on a 2.4 kernel, the IRQs are all unthreaded. On
the 2.4 kernel, heavy disk I/O is where I get the worst behavior
and even then, I get > 90% of samples within 100 usec.
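
The review requested in the quoted text (finding every task that still runs with a real-time priority) can be sketched as a small filter over the `ps -eo pid,pri,rtprio,cmd` output shown earlier. This is only an illustration: the function name `realtime_tasks` is hypothetical, and the RTPRIO column alone does not tell SCHED_FIFO apart from SCHED_RR.

```python
import re

def realtime_tasks(ps_output):
    """Return (pid, rtprio, cmd) for every task that has a real-time priority.

    Expects the text produced by `ps -eo pid,pri,rtprio,cmd`; a task with no
    real-time priority shows "-" in the RTPRIO column.
    """
    tasks = []
    for line in ps_output.splitlines():
        m = re.match(r"\s*(\d+)\s+(\d+)\s+(\S+)\s+(.*)", line)
        if not m:
            continue  # header line or blank line
        pid, _pri, rtprio, cmd = m.groups()
        if rtprio != "-":
            tasks.append((int(pid), int(rtprio), cmd.strip()))
    return tasks

# A few rows copied from the listing above.
sample = """\
    2 139     99 [migration/0]
    3  34      - [ksoftirqd/0]
    8  41      1 [events/0]
 1323  31      - [IRQ 7]
 1748 139     99 [IRQ 11]
"""
for pid, rtprio, cmd in realtime_tasks(sample):
    print(pid, rtprio, cmd)
```

Everything this prints is a candidate for dropping back to SCHED_OTHER, apart from the soundcard IRQ thread and latencytest itself.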

I still maintain that a 2.6 RT kernel has to do as well or better
than a 2.4 RT kernel (or else, why would I step up??).

  --Mark


* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.23
@ 2004-11-10 17:35 Mark_H_Johnson
  2004-11-11 11:38 ` Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Mark_H_Johnson @ 2004-11-10 17:35 UTC
  To: Ingo Molnar
  Cc: Amit Shah, Karsten Wiese, Bill Huey, Adam Heath, emann,
	Gunther Persoons, K.R. Foley, linux-kernel, Florian Schmidt,
	Fernando Pablo Lopez-Lezcano, Lee Revell, Rui Nuno Capela,
	Shane Shrybman, Thomas Gleixner, Michal Schmidt

>you have to build another kernel for them. irqs-off and preempt-off
>timing can be mixed freely (and both can be enabled in the same kernel),
>but wakeup timing deserves its own .config space and since it's not
>mixable with the other two methods i didnt see much point in enabling
>all 3 at once with strange dependencies between them. Is this a big
>issue? Normally i think the wakeup timing is more than enough to get a
>feel of latencies, and if something specific is suspected the other ones
>can be turned on.

Just that it takes an hour or so to rebuild the kernel plus the
disk storage to keep two kernels instead of one.

The wakeup latencies I am seeing are all quite small, but the overhead
I am seeing at the application level has been quite high. Only 40 wakeup
latencies > 50 usec in a half hour of testing. I guess I'll build a
.24 without wakeup timing to see what kind of problems I'm having.
I can send you the wakeup timing traces if you are interested but
they are all really short (< 100 usec) or appear to indicate a hardware
contention problem (one step at 100 usec or so).

 --Mark


* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.23
@ 2004-11-10 17:31 Mark_H_Johnson
  0 siblings, 0 replies; 26+ messages in thread
From: Mark_H_Johnson @ 2004-11-10 17:31 UTC
  To: Ingo Molnar
  Cc: Amit Shah, Karsten Wiese, Bill Huey, Adam Heath, emann,
	Gunther Persoons, K.R. Foley, linux-kernel, Florian Schmidt,
	Fernando Pablo Lopez-Lezcano, Lee Revell, Rui Nuno Capela,
	Shane Shrybman, Thomas Gleixner, Michal Schmidt

>I may do a variant on this anyway. I think it's important to see if
>the symptom (> 100 usec CPU delay) is really:
>- lots of short delays
>OR
>- relatively few long delays
>and I have an idea of how to code that up and add to latencytrace.

A follow up on this message. My first test completed with the following
results. The new code indicates:

X     - Min delay was     0. Max delay was     3. Ave delay was  0.015295.
top   - Min delay was     0. Max delay was    23. Ave delay was  0.025659.
netO  - Min delay was     0. Max delay was    31. Ave delay was  1.169024.
netI  - Min delay was     1. Max delay was    35. Ave delay was  1.182864.
diskW - Min delay was     0. Max delay was    18. Ave delay was  1.166944.
diskC - Min delay was     0. Max delay was    18. Ave delay was  1.080036.
diskR - Min delay was     0. Max delay was     7. Ave delay was  0.803804.

A "delay" was counted if 1000 iterations of the CPU inner loop took
longer than 10 usec. For comparison, I moved this code to my 2.4 system
and got the following results:

X     - Min delay was     0. Max delay was    17. Ave delay was  1.277730.
top   - Min delay was     0. Max delay was    12. Ave delay was  1.452692.
netO  - Min delay was     0. Max delay was    12. Ave delay was  1.633742.
netI  - Min delay was     0. Max delay was    12. Ave delay was  1.626565.
diskW - Min delay was     0. Max delay was    14. Ave delay was  1.566188.
diskC - Min delay was     0. Max delay was    12. Ave delay was  1.701542.
diskR - Min delay was     0. Max delay was    12. Ave delay was  1.650909.

Looks pretty comparable at this level. The 2.4 results appear to be more
consistent.

Grr. The new code does have an impact on the application measurements
under 2.4. It appears the TSC accesses are being delayed while the disk
is active. The within 100 usec rate was only 77% (was 90%) but the peak
is still pretty close (2.60 vs. 2.38 msec).

I am also not sure this is the "right" measurement. I probably
need to count the delays or divide the overall loop time by the number
of delays to see if that is a more meaningful value.
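
A minimal sketch of the bookkeeping described above, assuming each sample of the CPU loop is split into chunks (1000 inner iterations each) and a chunk counts as a "delay" when it takes longer than 10 usec. It assumes the Min/Max/Ave figures are per-sample delay counts, which the message does not state explicitly. The names `count_delays` and `summarize` and the sample data are hypothetical; the last field is the alternative figure mentioned in the previous paragraph (overall loop time divided by the number of delays).

```python
def count_delays(chunk_times_usec, threshold_usec=10.0):
    """Number of chunks in one sample that ran longer than the threshold."""
    return sum(1 for t in chunk_times_usec if t > threshold_usec)

def summarize(samples, threshold_usec=10.0):
    """Min, max and average delay count over all samples, plus the
    total loop time divided by the total number of delays."""
    counts = [count_delays(s, threshold_usec) for s in samples]
    total_time = sum(sum(s) for s in samples)
    total_delays = sum(counts)
    return {
        "min": min(counts),
        "max": max(counts),
        "ave": sum(counts) / len(counts),
        "usec_per_delay": total_time / total_delays if total_delays else None,
    }

# Made-up chunk timings (usec) for three samples, for illustration only.
samples = [
    [4.0, 4.1, 4.0, 4.2],
    [4.0, 25.0, 4.1, 12.0],
    [4.0, 4.0, 110.0, 4.0],
]
print(summarize(samples))
```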

  --Mark


* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.23
@ 2004-11-10 15:57 Mark_H_Johnson
  2004-11-10 17:06 ` Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Mark_H_Johnson @ 2004-11-10 15:57 UTC
  To: Ingo Molnar
  Cc: Amit Shah, Karsten Wiese, Bill Huey, Adam Heath, emann,
	Gunther Persoons, K.R. Foley, linux-kernel, Florian Schmidt,
	Fernando Pablo Lopez-Lezcano, Lee Revell, Rui Nuno Capela,
	Shane Shrybman, Thomas Gleixner, Michal Schmidt

>> [...] I also noticed that /proc/sys/kernel/preempt_wakeup_timing was
>> removed at .20 but not sure if that was deliberate. [...]
>
>yeah, this was deliberate - it's a side-effect of separating it from the
>other timing options.

OK. So maybe I didn't understand what you said previously. Now, if I build
to get maximum-latency wakeup values, I can't get the IRQ off or
preempt off timing and traces? If that's not true, how do I switch
between the different sampling methods?

--Mark H Johnson
  <mailto:Mark_H_Johnson@raytheon.com>


* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.23
@ 2004-11-10 14:51 Mark_H_Johnson
  2004-11-11 12:51 ` Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Mark_H_Johnson @ 2004-11-10 14:51 UTC
  To: Ingo Molnar
  Cc: Amit Shah, Karsten Wiese, Bill Huey, Adam Heath, emann,
	Gunther Persoons, K.R. Foley, linux-kernel, Florian Schmidt,
	Fernando Pablo Lopez-Lezcano, Lee Revell, Rui Nuno Capela,
	Shane Shrybman, Thomas Gleixner, Michal Schmidt

>* Mark_H_Johnson@raytheon.com <Mark_H_Johnson@raytheon.com> wrote:
>
>> >- everything else should be SCHED_OTHER. Do latencies get any better if
>> >you do this?
>
>> I can, but that is not necessarily an "apples to apples" comparison.
>
>the goal now would be to simplify the test and work down the issues in
>isolation, instead of looking at a complex setup of mixed workloads and
>just seeing 'it sucks' without knowing which component causes what.

However, based on the results of the last several weeks, it is apparent
to me that the simple tests are finding only a subset of the problems.
The stressful series of tests is finding a number of symptoms much
sooner and more repeatably than those simple tests.

I was thinking about this problem this morning and was wondering if
we could do something like an "end trigger" to help determine the cause
of some of these pauses. Something like:
 - start to fill / refresh the trace buffer (already doing this?)
 - run RT CPU loop & sample TSC every 100 iterations or so
 - if delta T exceeds 100 usec (or so), then set "end trigger" and
dump the data from /proc/latency_trace.
Repeat with some rate limit so we don't get too much data.
I can still run the stressful test cases to cause the situations and
get the "just in time" data for the analysis. Perhaps a variant of
the interface you provided before on tracing a specific path.
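
The user-space half of the "end trigger" idea above can be sketched as follows. This is an illustration under stated assumptions, not the latencytest code: the function name `run_with_end_trigger`, its parameters, and the `dump` callback are hypothetical, and a portable clock stands in for reading the TSC. In a real test the callback would copy /proc/latency_trace somewhere safe; how the kernel side would be told to stop tracing is left open here, as it is in the message.

```python
import time

def run_with_end_trigger(work, iterations, dump, check_every=100,
                         limit_usec=100.0, min_gap_sec=1.0,
                         clock=time.perf_counter):
    """Call work() repeatedly and time each block of check_every calls.

    When a block takes longer than limit_usec, call dump(elapsed_usec),
    but no more often than once per min_gap_sec (the rate limit).
    Returns the number of dumps that were taken.
    """
    dumps = 0
    last_dump = None
    block_start = clock()
    for i in range(1, iterations + 1):
        work()
        if i % check_every:
            continue  # not at a block boundary yet
        now = clock()
        elapsed_usec = (now - block_start) * 1e6
        if elapsed_usec > limit_usec:
            if last_dump is None or now - last_dump >= min_gap_sec:
                dump(elapsed_usec)
                dumps += 1
                last_dump = now
        block_start = clock()  # restart timing after any dump overhead
    return dumps

# Example: report slow blocks instead of saving a trace.
run_with_end_trigger(lambda: sum(range(50)), 10_000,
                     lambda usec: print(f"block took {usec:.0f} usec"),
                     limit_usec=5000.0)
```

The clock is passed in as a parameter so the trigger and rate-limit logic can be exercised with scripted timestamps, independent of how fast the host happens to be.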

I may do a variant on this anyway. I think it's important to see if
the symptom (> 100 usec CPU delay) is really:
 - lots of short delays
OR
 - relatively few long delays
and I have an idea of how to code that up and add to latencytrace.
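
One way to tell the two cases apart, sketched under the assumption that the individual delay durations have already been collected: split them at a cut-off and compare how much total time each class contributes. The 50 usec cut-off and the names `split_delays` and `dominant_cause` are hypothetical choices made for illustration.

```python
def split_delays(delays_usec, long_usec=50.0):
    """Return (count, total_usec) for short and for long delays."""
    short = [d for d in delays_usec if d < long_usec]
    long_ = [d for d in delays_usec if d >= long_usec]
    return {
        "short": (len(short), sum(short)),
        "long": (len(long_), sum(long_)),
    }

def dominant_cause(delays_usec, long_usec=50.0):
    """Which class of delay accounts for more of the lost time."""
    parts = split_delays(delays_usec, long_usec)
    return "long" if parts["long"][1] > parts["short"][1] else "short"

# Made-up data: ten 12 usec delays outweigh a single 60 usec one...
print(dominant_cause([12.0] * 10 + [60.0]))
# ...while one 110 usec delay outweighs three 12 usec ones.
print(dominant_cause([12.0] * 3 + [110.0]))
```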

--Mark H Johnson
  <mailto:Mark_H_Johnson@raytheon.com>


* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.23
@ 2004-11-10 14:12 Mark_H_Johnson
  0 siblings, 0 replies; 26+ messages in thread
From: Mark_H_Johnson @ 2004-11-10 14:12 UTC
  To: Ingo Molnar
  Cc: Amit Shah, Karsten Wiese, Bill Huey, Adam Heath, emann,
	Gunther Persoons, K.R. Foley, linux-kernel, Florian Schmidt,
	Fernando Pablo Lopez-Lezcano, Lee Revell, Rui Nuno Capela,
	Shane Shrybman, Thomas Gleixner, Michal Schmidt

>* Mark_H_Johnson@raytheon.com <Mark_H_Johnson@raytheon.com> wrote:
>
>> [3] I am not so sure that the latency tracing works. I do not get any
>> trace output, even if I set preempt_max_latency to zero.
>
>What is the value of preempt_thresh?
It was zero at boot time. Now that I check, it was set somewhere to 200.
Setting it back to zero, I now see that I have some extremely
small reports, max so far is 63 usec. Will run my big test again
to see what turns up.

  --Mark


* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.23
@ 2004-11-09 21:07 Mark_H_Johnson
  2004-11-09 23:20 ` Ingo Molnar
  2004-11-09 23:26 ` Ingo Molnar
  0 siblings, 2 replies; 26+ messages in thread
From: Mark_H_Johnson @ 2004-11-09 21:07 UTC
  To: Ingo Molnar
  Cc: Amit Shah, Karsten Wiese, Bill Huey, Adam Heath, emann,
	Gunther Persoons, K.R. Foley, linux-kernel, Florian Schmidt,
	Fernando Pablo Lopez-Lezcano, Lee Revell, Rui Nuno Capela,
	Shane Shrybman, Thomas Gleixner, Michal Schmidt

>i have released the -V0.7.23 Real-Time Preemption patch, which can be
>downloaded from the usual place:
>
>    http://redhat.com/~mingo/realtime-preempt/

A few notes on results from running with this patch (well, actually -EA
that Ingo provided separately).

[1] Build problems (separately reported to Ingo) when CONFIG_PREEMPT_RT
is enabled on x86 and modules use kunmap_atomic. Fix by adding
kunmap_virt and kmap_to_page to the list of exports.

[2] The live lock that I was having seems to have been killed based
on an hour of testing (I could usually cause it in 5 minutes or less).

[3] I am not so sure that the latency tracing works. I do not get any
trace output, even if I set preempt_max_latency to zero. I also noticed
that /proc/sys/kernel/preempt_wakeup_timing was removed at .20 but
not sure if that was deliberate. As a result, I have no reports from
the kernel tracing.

[4] Application level latencies are OK but not great.
 X test - only 90% of CPU loops are within 100 usec of nominal value.
In previous RT kernels I got > 99% with 100 usec.
 top test - looks much nicer than X test, but still have up to 30%
overhead on CPU loop.
 network I/O tests - smoothest of all test results, very nice
 disk write - very noisy, bursts of long delays with only 82% within
100 usec and worst case has over 100% overhead (2.5 msec vs 1.16 nominal)
 disk copy - fewer bursts, but worst case is similar to disk write.
About 95% within 100 usec.
 disk read - relatively clean with a pair of modest bursts early in
the test and then settled out a little worse than the network tests.
99.9% of samples within 100 usec and max of 1.65 msec.
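
For reference, the two figures quoted throughout item [4] can be computed as below. This is a sketch of the arithmetic only; the function names and the sample values are hypothetical. With the disk write numbers above, the worst-case overhead is (2.5 - 1.16) / 1.16, roughly 1.16, i.e. a little over 100%.

```python
def within_fraction(samples_usec, nominal_usec, tolerance_usec=100.0):
    """Fraction of CPU loop samples within tolerance of the nominal time."""
    hits = sum(1 for s in samples_usec
               if abs(s - nominal_usec) <= tolerance_usec)
    return hits / len(samples_usec)

def worst_case_overhead(samples_usec, nominal_usec):
    """Extra time of the slowest sample, relative to the nominal time."""
    return (max(samples_usec) - nominal_usec) / nominal_usec

# Made-up loop times in usec around a 1160 usec (1.16 msec) nominal value.
samples = [1160.0, 1175.0, 1240.0, 1300.0, 2500.0]
print(f"{within_fraction(samples, 1160.0):.0%} within 100 usec")
print(f"{worst_case_overhead(samples, 1160.0):.0%} worst-case overhead")
```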

[5] concurrent ping of system had over 13% lost packets (1089 out
of 10723 - plus I let this run after the tests had completed). The
2.4 RT kernel I use has no lost packets.

[6] I did a separate run of a script Ingo suggested that samples
the kernel profile data. It shows basically no CPU time for the
events/0 and events/1 tasks. I took 11 samples in about 1/2 hour
of testing but nothing seems to jump out of the data.

--Mark H Johnson
  <mailto:Mark_H_Johnson@raytheon.com>


* [patch] Real-Time Preemption, -RT-2.6.9-rc4-mm1-U9
@ 2004-10-21 13:27 Ingo Molnar
  2004-10-22 13:35 ` [patch] Real-Time Preemption, -RT-2.6.9-rc4-mm1-U9.3 Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Ingo Molnar @ 2004-10-21 13:27 UTC
  To: linux-kernel
  Cc: Lee Revell, Rui Nuno Capela, Mark_H_Johnson, K.R. Foley,
	Bill Huey, Adam Heath, Florian Schmidt, Thomas Gleixner,
	Michal Schmidt, Fernando Pablo Lopez-Lezcano


i have released the -U9 Real-Time Preemption patch, which can be
downloaded from:

  http://redhat.com/~mingo/realtime-preempt/

this too is a fixes-only release. It includes more driver fixes and
improvements from Thomas Gleixner.

Changes since -U8.1:

 - USB semaphore->completion conversion from Thomas Gleixner

 - netconsole fixes from Michal Schmidt

 - fbcon fixes

 - added counted semaphores, this is now used by firewire, XFS and ACPI. 
   This could fix the firewire breakage - but testing would be welcome.

 - PREEMPT_ACTIVE irqs-enabled critical path removal.

 - fixed irqs-off raw spinlock primitives on UP: they enabled irqs 
   before enabling preemption, creating a window for an interrupt to
   slip in and increase the critical path.

 - made the deadlock detector not crash the current process - it will
   just hang. This produces far nicer log output while still not
   endangering stability. Also, fixed a bug in the detector that happens 
   if the trace buffer overflows.

 - made the atomic-counter-underflow detector non-fatal as well, for the
   same reasons.

to create a -U9 tree from scratch, the patching order is:

   http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
 + http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
 + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
 + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U9
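
The patching order above can be turned into a command list along these lines. This is a sketch, assuming the four files have already been downloaded into the current directory and that the patches apply with `-p1` from the top of the unpacked tree (the usual convention for kernel patches, but not stated in the message). The helper name `patch_commands` is hypothetical; it only builds the commands and runs nothing.

```python
def patch_commands(tarball_url, patch_urls):
    """Shell commands that unpack the base tree and apply patches in order."""
    tarball = tarball_url.rsplit("/", 1)[-1]
    tree = tarball[: -len(".tar.bz2")]
    cmds = [f"tar xjf {tarball}", f"cd {tree}"]
    for url in patch_urls:
        name = url.rsplit("/", 1)[-1]
        if name.endswith(".bz2"):
            # bzip2-compressed patch: decompress on the fly
            cmds.append(f"bzcat ../{name} | patch -p1")
        else:
            # plain-text patch
            cmds.append(f"patch -p1 < ../{name}")
    return cmds

cmds = patch_commands(
    "http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2",
    [
        "http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2",
        "http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2",
        "http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U9",
    ],
)
print("\n".join(cmds))
```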

	Ingo


end of thread, other threads:[~2004-11-11 12:27 UTC | newest]

Thread overview: 26+ messages
2004-11-10 14:08 [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.23 Mark_H_Johnson
2004-11-10 15:25 ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2004-11-10 17:35 Mark_H_Johnson
2004-11-11 11:38 ` Ingo Molnar
2004-11-10 17:31 Mark_H_Johnson
2004-11-10 15:57 Mark_H_Johnson
2004-11-10 17:06 ` Ingo Molnar
2004-11-10 14:51 Mark_H_Johnson
2004-11-11 12:51 ` Ingo Molnar
2004-11-10 14:12 Mark_H_Johnson
2004-11-09 21:07 Mark_H_Johnson
2004-11-09 23:20 ` Ingo Molnar
2004-11-09 23:26 ` Ingo Molnar
2004-10-21 13:27 [patch] Real-Time Preemption, -RT-2.6.9-rc4-mm1-U9 Ingo Molnar
2004-10-22 13:35 ` [patch] Real-Time Preemption, -RT-2.6.9-rc4-mm1-U9.3 Ingo Molnar
2004-10-22 15:50   ` [patch] Real-Time Preemption, -RT-2.6.9-mm1-U10 Ingo Molnar
2004-10-22 17:56     ` [patch] Real-Time Preemption, -RT-2.6.9-mm1-U10.2 Ingo Molnar
2004-10-25 10:40       ` [patch] Real-Time Preemption, -RT-2.6.9-mm1-V0 Ingo Molnar
2004-10-27  0:15         ` [patch] Real-Time Preemption, -RT-2.6.9-mm1-V0.3 Ingo Molnar
2004-11-03 10:58           ` [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm2-V0.7.1 Ingo Molnar
2004-11-06 15:57             ` [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.18 Ingo Molnar
2004-11-08  9:16               ` [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.19 Ingo Molnar
2004-11-08 16:57                 ` [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.21 Ingo Molnar
2004-11-09 16:05                   ` [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.23 Ingo Molnar
2004-11-10 13:52                     ` Karsten Wiese
2004-11-10 13:58                       ` Karsten Wiese
2004-11-10 15:01                       ` Ingo Molnar
2004-11-10 14:20                         ` Karsten Wiese
2004-11-10 14:50                           ` Karsten Wiese
2004-11-10 15:33                           ` Ingo Molnar
2004-11-11  4:34                     ` K.R. Foley
2004-11-11  5:01                     ` K.R. Foley
2004-11-11  9:52                       ` Ingo Molnar
2004-11-11 10:20                       ` Ingo Molnar
2004-11-11 13:05                         ` Ingo Molnar
2004-11-11 12:27                           ` K.R. Foley
