From: Ralf Baechle <ralf@linux-mips.org>
To: Dominic Sweetman <dom@mips.com>
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>,
	Thiemo Seufer <ths@networkno.de>,
	linux-mips@linux-mips.org
Subject: Re: Performance bug in c-r4k.c cache handling code
Date: Tue, 20 Sep 2005 13:22:43 +0100
Message-ID: <20050920122243.GE3159@linux-mips.org>
In-Reply-To: <17199.53696.27856.801284@mips.com>

On Tue, Sep 20, 2005 at 10:09:20AM +0100, Dominic Sweetman wrote:

> > > I found a performance bug in c-r4k.c:r4k_dma_cache_inv, where a
> > > Hit_Writeback_Inv instead of Hit_Invalidate is done.
> 
> The MIPS64 spec (which is really all there is to set standards in this
> area) regards Hit_Invalidate as optional.  So it would be nice not to
> use it.  CPUs have no standard "configuration" register you can read
> to establish which cacheops work, so to identify capable CPUs you must
> use a table of CPU attributes indexed by the CPU ID, which encourages
> the crime of building software which can't possibly run on a new CPU...
> 
> So long as the buffer is in fact clean, then in most implementations a
> Hit_Writeback_Invalidate will be just as efficient.

These are R4700 numbers, the only ones I was able to find in a quick search.

  Hit_Invalidate_D		 7 cycles for a cache miss
				 9 cycles for a cache hit
  Hit_Writeback_Invalidate_D	 7 cycles for a cache miss
				12 cycles for a cache hit if the cache line
				   is clean.
				14 cycles for a cache hit if the cache line
				   is dirty (Writeback).
  Hit_Writeback_D		 7 cycles for a cache miss
				10 cycles for a cache hit if the cache line
				   is clean
				14 cycles for a cache hit if the cache line
				   is dirty (Writeback).
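
To put the table in per-buffer terms, here is a rough sketch that multiplies
the per-line costs by the number of lines in a buffer.  The 32-byte line size
and the 1536-byte buffer used in the note below are assumptions for
illustration, not figures from this thread; the helper names are hypothetical,
and the model ignores loop overhead, alignment and any stall on the write
buffer.

```c
#include <stddef.h>

/* Per-line costs in cycles, copied from the R4700 table above. */
enum r4700_cost {
	COST_MISS            = 7,	/* any of the three ops, line not cached */
	COST_HIT_INV         = 9,	/* Hit_Invalidate_D, cache hit */
	COST_HIT_WBINV_CLEAN = 12,	/* Hit_Writeback_Invalidate_D, clean hit */
	COST_HIT_WBINV_DIRTY = 14,	/* Hit_Writeback_Invalidate_D, dirty hit */
};

/* Number of cache lines covered by a line-aligned buffer of len bytes. */
static unsigned long lines_in(size_t len, size_t line_size)
{
	return (unsigned long)((len + line_size - 1) / line_size);
}

/* Cycles to sweep the buffer when every line costs cost_per_line. */
static unsigned long sweep_cycles(size_t len, size_t line_size,
				  unsigned int cost_per_line)
{
	return lines_in(len, line_size) * cost_per_line;
}
```

Under those assumptions a fully cached 1536-byte buffer spans 48 lines, so
sweep_cycles(1536, 32, COST_HIT_INV) gives 432 cycles, against 576 for
COST_HIT_WBINV_CLEAN and 672 for COST_HIT_WBINV_DIRTY.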

> Moreover, CPUs always "post" writes to some extent, so a small
> percentage of dirty lines can be handled without any great overhead.
> So a significant advantage can only occur when the buffer you want to
> invalidate (prior to DMA-in) was fairly recently densely written by
> the CPU; and this is only safe when all that data can be guaranteed to
> now be of no importance to anyone.

Linux has a well-defined API that DMA drivers are supposed to use; the
functions of this API that perform cache flushes also take a DMA
direction argument, based on which the implementation can decide what
the best flush function for a particular case will be.
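
As a rough illustration of that decision, the following is a user-space
model, not the kernel's code: the enum and function names are hypothetical,
and the mapping is only one plausible policy.  It assumes the CPU implements
a plain hit-invalidate operation and that the buffer covers whole cache
lines; a partial line at either end would still need a writeback before
being invalidated.

```c
/* Hypothetical model of the transfer direction a driver passes in. */
enum model_dma_dir {
	MODEL_TO_DEVICE,	/* CPU wrote the buffer, device will read it */
	MODEL_FROM_DEVICE,	/* device will write the buffer, CPU reads later */
	MODEL_BIDIRECTIONAL,	/* either side may have written */
};

/* Hypothetical names for the three kinds of hit-type cache operation. */
enum model_cacheop {
	OP_WRITEBACK,
	OP_INVALIDATE,
	OP_WRITEBACK_INVALIDATE,
};

/* Pick the cheapest operation that is still safe for the direction. */
static enum model_cacheop pick_cacheop(enum model_dma_dir dir)
{
	switch (dir) {
	case MODEL_TO_DEVICE:
		/* Dirty data must reach memory; cached copies may stay. */
		return OP_WRITEBACK;
	case MODEL_FROM_DEVICE:
		/* Old contents are about to be overwritten by the device. */
		return OP_INVALIDATE;
	case MODEL_BIDIRECTIONAL:
	default:
		/* Nothing is known, so take the conservative option. */
		return OP_WRITEBACK_INVALIDATE;
	}
}
```

Only the MODEL_FROM_DEVICE case can benefit from the cheaper invalidate, which
is the case r4k_dma_cache_inv is meant to cover.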

> Randomly and retrospectively discarding writes could generate some
> very interesting bugs, or (indeed) usually hide some very interesting
> bugs.  It's the kind of thing one would like to avoid!
> 
> I suppose where DMA data subsequently gets decorated by the CPU then
> handed on to some other layer, then the buffer is freed...?
> 
> > FYI, for R4k DECstations the need to flush the cache for newly allocated 
> > skbs reduces throughput of FDDI reception by about a half (!), down from 
> > about 90Mbps (that's for the /260)...

With software coherency, many server / client type operations will
approximate the worst case, as none of the data will reside in caches.
Routers are going to be somewhat better off - as long as they don't peek
too deep into the packets, that is.

> How did you measure the high throughput?  Have you got a
> machine with DMA-coherency you can turn on and off?

AFAIK, AMD Alchemy processors have configurable coherency.

  Ralf

Thread overview: 10+ messages
2005-09-19 15:40 Performance bug in c-r4k.c cache handling code Thiemo Seufer
2005-09-19 16:54 ` Atsushi Nemoto
2005-09-19 18:31   ` Thiemo Seufer
2005-09-19 17:01 ` Maciej W. Rozycki
2005-09-20  9:09   ` Dominic Sweetman
2005-09-20 12:22     ` Ralf Baechle [this message]
2005-09-20 12:37     ` Maciej W. Rozycki
2005-09-20 13:18       ` Dominic Sweetman
2005-09-20 14:51         ` Maciej W. Rozycki
2005-09-19 17:31 ` peter fuerst