Re: Pin-pointing the root of unusual application latencies

All of lore.kernel.org
 help / color / mirror / Atom feed

From: John Sigler <linux.kernel@free.fr>
To: linux-rt-users@vger.kernel.org, oprofile-list@lists.sourceforge.net
Subject: Re: Pin-pointing the root of unusual application latencies
Date: Fri, 13 Jul 2007 10:32:37 +0200	[thread overview]
Message-ID: <469738A5.2040306@free.fr> (raw)
In-Reply-To: <469600F7.3060603@free.fr>

John Sigler wrote:

> [ Note: cross-posted to linux-rt-users and oprofile-list ]
> 
> I'm running kernel 2.6.20.7-rt8 on x86 (1267-MHz P3) with the
> following debug options:
> 
> CONFIG_TRACE_IRQFLAGS_SUPPORT=y
> # CONFIG_PRINTK_TIME is not set
> # CONFIG_ENABLE_MUST_CHECK is not set
> # CONFIG_MAGIC_SYSRQ is not set
> # CONFIG_UNUSED_SYMBOLS is not set
> # CONFIG_DEBUG_FS is not set
> # CONFIG_HEADERS_CHECK is not set
> # CONFIG_DEBUG_KERNEL is not set
> CONFIG_LOG_BUF_SHIFT=14
> CONFIG_STACKTRACE=y
> CONFIG_EVENT_TRACE=y
> CONFIG_FUNCTION_TRACE=y
> CONFIG_WAKEUP_TIMING=y
> CONFIG_LATENCY_TRACE=y
> # CONFIG_CRITICAL_PREEMPT_TIMING is not set
> # CONFIG_CRITICAL_IRQSOFF_TIMING is not set
> CONFIG_WAKEUP_LATENCY_HIST=y
> CONFIG_LATENCY_TIMING=y
> CONFIG_LATENCY_HIST=y
> CONFIG_MCOUNT=y
> CONFIG_DEBUG_BUGVERBOSE=y
> CONFIG_FRAME_POINTER=y
> CONFIG_UNWIND_INFO=y
> CONFIG_STACK_UNWIND=y
> CONFIG_EARLY_PRINTK=y
> CONFIG_X86_FIND_SMP_CONFIG=y
> CONFIG_X86_MPPARSE=y
> CONFIG_DOUBLEFAULT=y
> 
> http://linux.kernel.free.fr/latency/config-2.6.20.7-rt8-adlink-latency
> 
> Here's my situation:
> 
> A SCHED_RR process with priority 80 is performing blocking reads of 
> 1316-byte chunks from a PCI device. The data comes in at ~38 Mbit/s.
> 
> Therefore, in a perfect world, my process would wake up periodically 
> every 1316*8 / 38e6 = 277 µs to handle the latest chunk.
> 
> However, periodically, my process blocks up to 400 µs instead of 277 µs.
> Naturally, the next iteration, my process blocks less than 277 µs (the 
> PCI board has 1316 bytes ready every 277 µs regardless).
> 
> I've captured a histogram of read latencies.
> It is expressed in units of 512 ns.
> e.g. there were 23,126,460 occurences of LAT=278 µs (543 * 512 ns)
> http://linux.kernel.free.fr/latency/histogram.txt
> http://linux.kernel.free.fr/latency/read_latency.png
> 
> The system is otherwise completely idle. In other words, 100% of the CPU 
> is dedicated to the single task of waiting for an interrupt request, 
> servicing it, and pushing the data to user land.
> 
> I used OProfile to frequently sample the instruction pointer while my 
> application ran all night long.
> 
> http://linux.kernel.free.fr/latency/oprofile_report_all_night.txt
> 
> CPU: PIII, speed 1266.7 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a 
> unit mask of 0x00 (No unit mask) count 60000
> samples      %      symbol name
> 958075714  90.6470   default_idle
> 25637397    2.4256   mask_and_ack_8259A
> 10697948    1.0122   __copy_to_user_ll
> 9345347     0.8842   oprofiled
> 9246368     0.8748   Dta1xxIsr
> 6732131     0.6370   enable_8259A_irq
> 5153650     0.4876   __schedule
> 1873157     0.1772   irq_entries_start
> 1729897     0.1637   increment_tail
> 1565354     0.1481   clock_gettime
> 1429908     0.1353   Dta1xxDmaTransfer
> 1428107     0.1351   ioread16
> 1252095     0.1185   system_call
> 1081768     0.1023   try_to_wake_up
> 983687      0.0931   cpu_idle
> 
> I thought default_idle called HLT via safe_halt(). Since I'm only 
> counting UNHALTED clock cycles, I'm surprised to see so many samples 
> taken in default_idle. Does someone understand that?
> 
> Is it possible that one of these functions periodically delays the 
> delivery of data to my SCHED_RR application?
> 
> Here's a /proc/latency_trace dump. What is there to understand?
> 
> # cat /proc/latency_trace
> preemption latency trace v1.1.5 on 2.6.20.7-rt8
> --------------------------------------------------------------------
>  latency: 26 us, #2/2, CPU#0 | (M:rt VP:0, KP:0, SP:1 HP:1)
>     -----------------
>     | task: softirq-timer/0-4 (uid:0 nice:0 policy:1 rt_prio:50)
>     -----------------
> 
>                  _------=> CPU#
>                 / _-----=> irqs-off
>                | / _----=> need-resched
>                || / _---=> hardirq/softirq
>                ||| / _--=> preempt-depth
>                |||| /
>                |||||     delay
>    cmd     pid ||||| time  |   caller
>       \   /    |||||   \   |   /
>    <...>-4     0D..1   26us : trace_stop_sched_switched 
> (__sched_text_start)
> 
> 
> vim:ft=help

Shouldn't there be many lines in this output?
Do I need an additional kernel configuration option?

> Is there a different diagnostic I can run to tell why data delivery
> to my app is periodically delayed 100 µs?

I'm open to all suggestions. I'm running out of ideas.

> # cat /proc/latency_hist/wakeup_latency/CPU0
> #Minimum latency: 0 microseconds.
> #Average latency: 0 microseconds.
> #Maximum latency: 26 microseconds.
> #Total samples: 14925698
> #There are 0 samples greater or equal than 10240 microseconds
> #usecs           samples
>     0            3682745
>     1            9487737
>     2            1748372
>     3               1192
>     4               3689
>     5                 50
>     6                 14
>     7                 14
>     8                  9
>     9                 46
>    10                995
>    11                799
>    12                  6
>    13                  8
>    14                  4
>    15                  6
>    16                  3
>    17                  0
>    18                  1
>    19                  0
>    20                  0
>    21                  0
>    22                  3
>    23                  1
>    24                  0
>    25                  1
>    26                  3
> 
> I would think that I might have a clearer picture if I could request the 
> last N instruction pointer samples from OProfile, to figure out what the 
> CPU has been doing for the past several hundred µs.

I'd like to hear what OProfile devs think of this approach.
I seems far from trivial to implement?

Regards.

next prev parent reply	other threads:[~2007-07-13  8:32 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-12 10:22 Pin-pointing the root of unusual application latencies John Sigler
2007-07-13  8:32 ` John Sigler [this message]
2007-07-23  8:25   ` John Sigler
2007-07-23  9:53 ` Ingo Molnar
2007-07-23 14:14   ` John Sigler
2007-07-23 16:04     ` Ingo Molnar
2007-07-23 16:04       ` Ingo Molnar
2007-07-23 16:44       ` John Sigler
2007-07-24  8:31       ` John Sigler
2007-07-24  9:20         ` John Sigler
2007-07-25 13:04         ` John Sigler
2007-07-25 13:05           ` Ingo Molnar
2007-07-25 13:05             ` Ingo Molnar
2007-07-25 13:20             ` John Sigler
2007-07-25 13:38               ` Ingo Molnar
2007-07-25 14:05                 ` John Sigler
2007-07-25 14:13                   ` Alessio Igor Bogani
2007-07-25 14:35                     ` John Sigler
2007-07-25 15:00                       ` Alessio Igor Bogani
2007-07-25 15:21                         ` John Sigler
2007-07-25 15:35                           ` Alessio Igor Bogani
2007-07-25 15:53                             ` John Sigler
2007-07-25 15:28                   ` Karsten Wiese
2007-07-25 15:46                     ` John Sigler
2007-07-25 16:31                       ` Karsten Wiese
2007-07-25 17:09                   ` Len Brown
2007-07-26  8:35                     ` John Sigler
2007-07-26 10:45                       ` John Sigler
2007-07-26 12:02                         ` John Sigler
2007-07-26 15:16         ` John Sigler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=469738A5.2040306@free.fr \
    --to=linux.kernel@free.fr \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=oprofile-list@lists.sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.