From: Ingo Molnar <mingo@elte.hu>
To: Mark_H_Johnson@raytheon.com
Cc: Lee Revell <rlrevell@joe-job.com>,
	Free Ekanayaka <free@agnula.org>,
	Eric St-Laurent <ericstl34@sympatico.ca>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"K.R. Foley" <kr@cybsft.com>,
	Felipe Alfaro Solana <lkml@felipe-alfaro.com>,
	Daniel Schmitt <pnambic@unu.nu>,
	"P.O. Gaillard" <pierre-olivier.gaillard@fr.thalesgroup.com>,
	nando@ccrma.stanford.edu, luke@audioslack.com, free78@tin.it
Subject: Re: [patch] voluntary-preempt-2.6.9-rc1-bk4-R1
Date: Thu, 9 Sep 2004 21:47:37 +0200
Message-ID: <20040909194737.GA2778@elte.hu>
In-Reply-To: <OF1EEB0481.83AB73CE-ON86256F0A.006A8955-86256F0A.006A8968@raytheon.com>


* Mark_H_Johnson@raytheon.com <Mark_H_Johnson@raytheon.com> wrote:

> PIO trace
> =========

> 00000001 0.000ms (+0.370ms): touch_preempt_timing (ide_outsl)
> 00000001 0.370ms (+0.000ms): touch_preempt_timing (ide_outsl)

Please decrease the 128U chunking to 8U or so.

> I have several traces where send_IPI_mask_bitmask (flush_tlb_others)
> shows up. For example...

flush_tlb_others() is a good indicator of irqs-off sections on the other
CPU. The code does the following when it flushes TLBs: it sends an IPI
(inter-processor interrupt) to all other CPUs (one CPU in your case) and
waits for arrival of that IRQ and completion while spinning on a flag.
The IPI normally takes 10 usecs or so to process so this is not an
issue. BUT if the CPU has IRQs disabled then the IPI is delayed and the
IRQs-off latency shows up as flush_tlb_others() latency.
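
To make the mechanism above concrete, here is a toy single-threaded model of it. This is a sketch only, not kernel code: the struct, the function name and the 1-usec tick are all made up for illustration, and the 10-usec handler cost is the rough figure quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, NOT kernel code. Time advances in 1-usec ticks. */
struct remote_cpu {
	unsigned int irqs_off_usecs;	/* how long the remote CPU keeps IRQs disabled */
	unsigned int ipi_cost_usecs;	/* time to run the flush handler once it is taken */
};

/*
 * The IPI is sent at time 0 and the sender spins until the remote CPU
 * has finished the handler. Returns how many usecs the sender spins.
 */
static unsigned int flush_wait_usecs(const struct remote_cpu *cpu)
{
	unsigned int now = 0;
	unsigned int handler_left;

	assert(cpu != NULL);
	handler_left = cpu->ipi_cost_usecs;

	while (now < cpu->irqs_off_usecs || handler_left > 0) {
		/* The handler only makes progress once IRQs are enabled again. */
		if (now >= cpu->irqs_off_usecs)
			handler_left--;
		now++;
	}
	return now;
}
```

With IRQs enabled on the remote CPU the sender waits only for the handler itself; with a 132-usec IRQs-off section the same call waits 142 usecs, so the whole IRQs-off time is charged to the flushing CPU.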

> 00000003 0.014ms (+0.132ms): send_IPI_mask_bitmask (flush_tlb_others)
> 00010003 0.147ms (+0.000ms): do_nmi (flush_tlb_others)
> 00010003 0.147ms (+0.001ms): do_nmi (ide_outsl)

Since the other CPU's do_nmi() implicates ide_outsl it could be that we
are doing ide_outsl with IRQs disabled? Could you add something like
this to the ide_outsl code:

	if (irqs_disabled() && printk_ratelimit())
		dump_stack();

(the most common irqs-off section is the latency printout itself - this
triggers if the latency message goes to the console - i.e. 'dmesg -n 1'
wasn't done.)

> Buried inside a pretty long trace in kswapd0, I saw the following...

> 00000007 0.111ms (+0.000ms): _spin_unlock_irqrestore (try_to_wake_up)
> 00000006 0.111ms (+0.298ms): preempt_schedule (try_to_wake_up)
> 00000005 0.409ms (+0.000ms): _spin_unlock (flush_tlb_others)

this too is flush_tlb_others() related.

> So the long wait on paths through sched and timer_tsc appear to be
> eliminated with PIO to the disk.

yeah, nice. I'd still like to make sure that we've not hidden latencies
by working down the ide_outsl() latency and its apparent IRQs-off
property.

> Is there some "for sure" way to limit the size and/or duration of DMA
> transfers so I get reasonable performance from the disk (and other DMA
> devices) and reasonable latency?

the 'unit' of the 'weird' delays seems to be around 70 usecs, the maximum
seems to be around 560 usecs. Note the 1:8 relationship between the two.
You have 32 KB as max_sectors, so the 70 usecs unit is for a single 4K
transfer which is a single scatter-gather entry: it all makes perfect
sense. 4K per 70 usecs means a DMA rate of ~57 MB/sec which sounds
reasonable.
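
The arithmetic behind this can be checked with a few lines of C. This is a sketch under the assumptions stated above (one scatter-gather entry per 4K, 70 usecs per entry); the function names are hypothetical. Note that the exact rate depends on how the units are counted: 4000 bytes per 70 usecs gives ~57 MB/sec, 4096 bytes gives ~58.5 MB/sec.

```c
#include <assert.h>

/*
 * DMA rate in MB/sec (10^6 bytes per second). Bytes per usec is
 * numerically the same thing, so no scaling is needed.
 */
static double dma_rate_mb_per_sec(double bytes, double usecs)
{
	assert(usecs > 0.0);
	return bytes / usecs;
}

/*
 * Number of scatter-gather entries in one maximum-sized request,
 * assuming every entry covers exactly entry_kb.
 */
static unsigned int sg_entries(unsigned int max_request_kb,
			       unsigned int entry_kb)
{
	assert(entry_kb > 0);
	return max_request_kb / entry_kb;
}
```

sg_entries(32, 4) is 8, which matches the 1:8 ratio between the 70-usec unit and the 560-usec maximum.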

so if these assumptions are true i'd suggest decreasing max_sectors
from 32K to 16K - that alone should halve these random latencies from
560 usecs to 280 usecs. Unless you see stability you might want to try
an 8K setting as well, which would reduce the random delays to 140
usecs - this will likely decrease your IO rates noticeably though.
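
The expected scaling can be written down as a small helper. Again this is only a sketch: it assumes the delay grows linearly with the number of 4K scatter-gather entries at 70 usecs each, which is inferred from the traces and not verified, and the names below are invented for the example.

```c
#include <assert.h>

#define SG_ENTRY_KB	4	/* assumed size of one scatter-gather entry */
#define SG_ENTRY_USECS	70	/* delay 'unit' seen in the traces */

/* Estimated worst-case DMA-induced delay for a given max request size. */
static unsigned int worst_case_delay_usecs(unsigned int max_request_kb)
{
	assert(max_request_kb % SG_ENTRY_KB == 0);
	return (max_request_kb / SG_ENTRY_KB) * SG_ENTRY_USECS;
}
```

Under these assumptions 32K, 16K and 8K give 560, 280 and 140 usecs respectively.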

but the real fix would be to tweak the IDE controller to not do so
aggressive DMA! Are there any BIOS settings that somehow deal with it?
Try increasing the PCI latency value? Is the disk using UDMA - if yes,
could you downgrade it to normal IDE DMA? Perhaps that tweaks the
controller to be 'nice' to the CPU. Is your IDE chipset integrated on
the motherboard? Could you send me your full bootlog (off-list)?

there are also tools that tweak chipsets directly - powertweak and the
PCI latency settings. Maybe something tweaks the IDE controller just the
right way. Also, try disabling specific controller support in the
.config (or turn it on) - by chance the generic IDE code could program
the IDE controller in a way that generates nicer DMA.

	Ingo
