From: Ralf Baechle <ralf@linux-mips.org>
To: "David VomLehn (dvomlehn)" <dvomlehn@cisco.com>
Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp>,
	ddaney@caviumnetworks.com,
	"Michael Sundius -X (msundius - Yoh Services LLC at Cisco)" 
	<msundius@cisco.com>,
	linux-mips@linux-mips.org, msundius@sundius.com
Subject: Re: memcpy and prefetch
Date: Wed, 4 Feb 2009 21:27:46 +0000
Message-ID: <20090204212746.GB13138@linux-mips.org>
In-Reply-To: <FF038EB85946AA46B18DFEE6E6F8A2898237E1@xmb-rtp-218.amer.cisco.com>

On Thu, Jan 29, 2009 at 10:39:37PM -0500, David VomLehn (dvomlehn) wrote:

> > The idea here is that we have two issues with prefetching:
> > 
> >  o Prefetching beyond the end of the source or destination range on a
> >    in-coherent range might bring back stale values from a DMA I/O
> >    buffer resulting in data corruption.  Hardware DMA coherency will
> >    avoid this issue.
> > 
> >  o IP27 has full blown hardware coherency.  Historically
> >    CONFIG_DMA_COHERENT was not able to cope with something of the
> >    complexity of IP27, so there was a separate CONFIG_DMA_IP27 and the
> >    broken logic expression was meant to treat CONFIG_DMA_COHERENT and
> >    CONFIG_DMA_IP27 the same as for prefetching.
> > 
> >  o Prefetching beyond the end of physical memory can cause exceptions on
> >    some systems.  The Malta has this problem.
> > 
> > Thus no prefetching on Malta or non-coherent systems.

> It seems to me as though we could avoid the first and third problems
> with a memcpy that doesn't prefetch past the end of the buffer, the
> thought being that if we are reading or writing a memory region, we
> really shouldn't be doing DMA to or from that location. This would
> probably be slightly suboptimal, performance-wise, for those systems
> that do have DMA coherence. It seems as though we could have two
> mutually exclusive versions, selectable via the CONFIG_DMA_COHERENT
> flag. For those of us without DMA coherence, it would probably give our
> memcpy performance a bit of a kick in the pants over using no prefetch
> at all.

Unnecessary prefetching can come at a high cost due to memory latencies
and cache pollution.  So you want to avoid unnecessary prefetches rather
than hoping for hardware cache coherency to sort out the mess software
left behind.
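
To make the "don't prefetch past the end of the buffer" idea concrete,
here is a minimal C sketch.  It is not the kernel's memcpy: the name
memcpy_bounded_prefetch, the 32-byte LINE_SIZE and the PREF_AHEAD
distance are all assumptions made for illustration, and
__builtin_prefetch is the GCC/Clang builtin standing in for a MIPS pref
instruction.  Only the source side is prefetched here.

```c
#include <stddef.h>
#include <string.h>

#define LINE_SIZE   32			/* assumed cache line size in bytes */
#define PREF_AHEAD  (4 * LINE_SIZE)	/* assumed prefetch distance */

/* Hypothetical helper for illustration; not the kernel implementation. */
void *memcpy_bounded_prefetch(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	const unsigned char *end = s + n;

	while ((size_t)(end - s) >= LINE_SIZE) {
		/*
		 * Prefetch only when the target address is still inside
		 * [src, src + n), so the prefetch touches no cache line
		 * that the copy itself would not read anyway.
		 */
		if ((size_t)(end - s) > PREF_AHEAD)
			__builtin_prefetch(s + PREF_AHEAD, 0, 0);

		memcpy(d, s, LINE_SIZE);	/* fixed-size block copy */
		d += LINE_SIZE;
		s += LINE_SIZE;
	}

	memcpy(d, s, (size_t)(end - s));	/* tail, no prefetch */
	return dst;
}
```

The extra compare in the loop is the price of the bound; whether that
is cheaper than a wasted prefetch depends on the core.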

The general expectation is that prefetching will help - but depending on
the pipeline structure prefetching can be hard to exploit optimally.  For
example there are MIPS cores where the optimal sequence is something like

  load store load store load store load store

But on others it's

  load load load load store store store store

Placing prefetch instructions into loops built from such blocks can
produce very surprising results.
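
The two orderings above can be written down in C as a rough sketch.
The function names copy_interleaved and copy_grouped are hypothetical,
and a C compiler is free to reschedule the loads and stores, so this
only shows the shape of the two loop bodies; pinning down the actual
instruction order is one reason such routines tend to be written in
assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* load store load store load store load store */
void copy_interleaved(uint32_t *dst, const uint32_t *src, size_t words)
{
	size_t i;

	for (i = 0; i + 4 <= words; i += 4) {
		uint32_t t;

		t = src[i + 0]; dst[i + 0] = t;
		t = src[i + 1]; dst[i + 1] = t;
		t = src[i + 2]; dst[i + 2] = t;
		t = src[i + 3]; dst[i + 3] = t;
	}
	for (; i < words; i++)		/* leftover words */
		dst[i] = src[i];
}

/* load load load load store store store store */
void copy_grouped(uint32_t *dst, const uint32_t *src, size_t words)
{
	size_t i;

	for (i = 0; i + 4 <= words; i += 4) {
		uint32_t t0 = src[i + 0];
		uint32_t t1 = src[i + 1];
		uint32_t t2 = src[i + 2];
		uint32_t t3 = src[i + 3];

		dst[i + 0] = t0;
		dst[i + 1] = t1;
		dst[i + 2] = t2;
		dst[i + 3] = t3;
	}
	for (; i < words; i++)		/* leftover words */
		dst[i] = src[i];
}
```

A prefetch dropped into either body lands between different neighbours
(a load and a store in the first, two loads or two stores in the
second), which is where the core-specific behaviour comes from.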

> If this makes sense, we might be able to sign up to do the work. Anyone
> have a good, caching-aware memcpy test?

Testing memcpy is an interesting little project.  Correctness is one
thing but a good implementation needs to make a few performance tradeoffs
which are best measured with real world, not synthetic workloads.
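
For the correctness half only, a small sweep over alignments and
lengths with guard bytes on both sides goes a long way.  The sketch
below is an assumption-laden example rather than an existing test: the
names check_copy and broken_copy and the sizes MAX_LEN, MAX_OFF and
GUARD are made up for illustration.  It says nothing about performance.

```c
#include <stddef.h>
#include <string.h>

#define MAX_LEN   64	/* longest copy tried (assumed) */
#define MAX_OFF   8	/* alignment offsets 0..7 (assumed) */
#define GUARD     16	/* guard bytes before and after the buffer */
#define BUF_SIZE  (GUARD + MAX_OFF + MAX_LEN + GUARD)

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Returns 0 if the copy routine passes every case, -1 otherwise. */
int check_copy(copy_fn copy)
{
	unsigned char src[BUF_SIZE], dst[BUF_SIZE], ref[BUF_SIZE];
	size_t so, doff, len, i;

	for (i = 0; i < BUF_SIZE; i++)
		src[i] = (unsigned char)(i * 31 + 7);

	for (so = 0; so < MAX_OFF; so++)
	for (doff = 0; doff < MAX_OFF; doff++)
	for (len = 0; len <= MAX_LEN; len++) {
		unsigned char *d = dst + GUARD + doff;
		const unsigned char *s = src + GUARD + so;

		memset(dst, 0xA5, BUF_SIZE);
		memset(ref, 0xA5, BUF_SIZE);
		memcpy(ref + GUARD + doff, s, len);	/* reference result */

		if (copy(d, s, len) != d)
			return -1;
		/* Whole-buffer compare also catches writes into the guards. */
		if (memcmp(dst, ref, BUF_SIZE) != 0)
			return -1;
	}
	return 0;
}

/* Deliberately wrong copy (writes one byte too many) to show a failure. */
void *broken_copy(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n + 1);
}
```

check_copy(memcpy) should return 0 and check_copy(broken_copy) should
return -1, since the extra byte lands outside the expected range.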

  Ralf

Thread overview: 14+ messages
2009-01-27 23:00 memcpy and prefetch Michael Sundius
2009-01-27 23:07 ` David Daney
2009-01-28 10:37   ` Ralf Baechle
2009-01-28 15:28     ` Atsushi Nemoto
2009-01-28 18:30       ` Ralf Baechle
2009-01-29 12:36         ` Atsushi Nemoto
2009-01-29 15:58           ` Ralf Baechle
2009-01-30  3:39             ` David VomLehn (dvomlehn)
2009-01-30  3:39               ` David VomLehn (dvomlehn)
2009-02-04 21:27               ` Ralf Baechle [this message]
2009-02-05 15:31                 ` Atsushi Nemoto
2009-01-28 19:28   ` Michael Sundius
2009-01-28 19:54     ` David Daney
2009-01-28 21:52       ` Chad Reese
