Linux MIPS Architecture development
* Question concerning cache coherency
From: Jeff Harrell @ 2000-01-19 17:00 UTC (permalink / raw)
  To: sgi-mips; +Cc: Ralf Baechle, bbrown, vwells, kmcdonald, mhassler

I have an interesting issue that I would like to run past the MIPS/Linux
newsgroup.  I am currently porting the MIPS/Linux code to a development
board that has an IDT64475 MIPS core (64-bit R4xxx core).  I notice that
this part does not have any method of maintaining cache coherency (i.e.,
no hardware support for cache coherency).  It is highly likely that we
will be plugging in a network card on a PCI bus that would be DMA'ing to a
shared memory space in SDRAM.  I assume that the problem of cache
coherency is fixed by mapping the shared memory as uncached.  I have not
dug into the network drivers (or the kernel) enough to know whether this
is how the problem is addressed on typical MIPS architectures.  I guess I
have two questions related to this issue.  First, do devices that DMA
typically access uncached memory, and if so, is a second buffer required
to copy from kernel to user space?  Second, what is the performance hit of
running out of uncached memory: have people seen significant performance
degradation when using uncached memory?  Any insight that anybody can
provide would be greatly appreciated.

Thanks,
Jeff


--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jeff Harrell                    Work:  (801) 619-6104
Broadband Access group/TI
jharrell@ti.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Re: Question concerning cache coherency
From: Ralf Baechle @ 2000-01-19 23:22 UTC (permalink / raw)
  To: Jeff Harrell; +Cc: sgi-mips, bbrown, vwells, kmcdonald, mhassler

On Wed, Jan 19, 2000 at 10:00:11AM -0700, Jeff Harrell wrote:

> I have an interesting issue that I would like to run past the MIPS/Linux
> newsgroup.  I am currently porting the MIPS/Linux code to a development
> board that has a IDT64475 MIPS core (64-bit R4xxx core).  I notice that
> this part does not have any method of maintaining cache coherency (i.e.,
> no hardware support for cache coherency).  It is highly likely that we
> will be plugging in a network card on a PCI bus that would be DMA'ing to a
> shared memory space in SDRAM.  I assume that the problem of cache
> coherency is fixed by mapping the shared memory as uncached.  I have not
> dug into the network drivers (or the kernel) enough to know whether this
> is how the problem is addressed on typical MIPS architectures.  I guess I
> have two questions related to this issue; Do devices that DMA, typically
> access uncached memory and if so, is a second buffer required to copy from
> kernel to user space?  The second question is concerning the performance
> hit in running out of uncached memory, Have people seen significant
> performance degradation when using uncached memory.  Any insight that
> anybody can provide would be greatly appreciated.

The performance hit from using uncached memory is tremendous.  Avoid it if
you can.  Even if you cannot exploit the locality effects of caches, you will
still gain from cached access because of prefetch / burst access and write
gathering.

There is one special case where you cannot use caching: when a cache line's
worth of data might be manipulated concurrently by both the processor and a
DMA device.  The typical example is a processor with 32-byte cache lines,
like the R4000, paired with an Ethernet chip like the Sonic, which has ring
entries of only 16 bytes.  For such a configuration there is a case where

  1)  processor fetches cacheline
  2)                                   NIC writes to that cacheline
  3)  processor writes cacheline back

-> the processor just corrupted the data written by the NIC.

The only way you can deal with that is either by stopping the NIC, which you
don't want to do, or by using uncached access.
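
The three-step sequence above can be reproduced in a small user-space
model.  This is only a sketch under stated assumptions: the 32-byte line
and 16-byte ring entry sizes come from the example, but `struct toy_line`
and all function names are hypothetical, and a plain memcpy() stands in
for the cache fill and write-back that real hardware would perform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 32   /* cache line size taken from the example above */
#define DESC_SIZE 16   /* ring entry size taken from the example above */

/* Toy model: one line of "RAM" plus the CPU's write-back copy of it. */
struct toy_line {
    uint8_t ram[LINE_SIZE];
    uint8_t cached[LINE_SIZE];
};

/* Step 1: the CPU touches ring entry 0 and pulls in the whole line. */
static void cpu_fetch_line(struct toy_line *l)
{
    memcpy(l->cached, l->ram, LINE_SIZE);
}

/* Step 2: the NIC DMAs into ring entry 1, bypassing the cache. */
static void nic_write_entry1(struct toy_line *l, uint8_t value)
{
    memset(l->ram + DESC_SIZE, value, DESC_SIZE);
}

/* Step 3: the CPU modifies entry 0 and the dirty line is written back. */
static void cpu_update_entry0_and_writeback(struct toy_line *l, uint8_t value)
{
    memset(l->cached, value, DESC_SIZE);
    memcpy(l->ram, l->cached, LINE_SIZE);
}

/* Returns 1 if the NIC's write survived the sequence, 0 if it was lost. */
static int nic_data_survives(void)
{
    struct toy_line l;

    assert(LINE_SIZE == 2 * DESC_SIZE);  /* two ring entries share a line */
    memset(&l, 0, sizeof l);
    cpu_fetch_line(&l);
    nic_write_entry1(&l, 0xAA);
    cpu_update_entry0_and_writeback(&l, 0x55);
    return l.ram[DESC_SIZE] == 0xAA;
}
```

In this model nic_data_survives() reports that the NIC's bytes are gone:
the write-back in step 3 carries the stale copy of entry 1 along with the
CPU's update to entry 0.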

Take a look at the bottom of <asm/io.h>, which defines three functions that
do the cache flushing for you.  On machines that are cache coherent by
hardware, like SGI's Origins, these functions will simply be no-ops.

  Ralf

* Re: Question concerning cache coherency
From: Kevin D. Kissell @ 2000-01-20  0:49 UTC (permalink / raw)
  To: Jeff Harrell, sgi-mips; +Cc: Ralf Baechle, bbrown, vwells, kmcdonald, mhassler

>I have an interesting issue that I would like to run past the MIPS/Linux
>newsgroup.  I am currently porting the MIPS/Linux code to a development
>board that has a IDT64475 MIPS core (64-bit R4xxx core).  I notice that
>this part does not have any method of maintaining cache coherency (i.e.,
>no hardware support for cache coherency).  It is highly likely that we
>will be plugging in a network card on a PCI bus that would be DMA'ing to a
>shared memory space in SDRAM.  I assume that the problem of cache
>coherency is fixed by mapping the shared memory as uncached.  I have not
>dug into the network drivers (or the kernel) enough to know whether this
>is how the problem is addressed on typical MIPS architectures.  I guess I
>have two questions related to this issue;  Do devices that DMA, typically
>access uncached memory  and if so, is a second buffer required to copy from
>kernel to user space?  The second question is concerning the performance
>hit in running out of uncached memory,  Have people seen significant
>performance degradation when using uncached memory.  Any insight that
>anybody can provide would be greatly appreciated.


While some MIPS CPUs have mechanisms for hardware
cache coherence, many of them do not, and even systems
with coherent-I/O-capable CPUs often do not implement
the necessary protocol.

There are two basic options for dealing with caches
and DMA I/O: flush the caches, or operate on
non-cached memory.  Sometimes one does both.
A random buffer being handed to a driver must be
assumed to have some portion of its contents cached,
and must be explicitly flushed to memory (via
hit_writeback_invalidate Cache instructions, or
dma_cache_wback_inv() calls in Linux)
before being presented to a DMA device.
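
Because hit-type cache operations act on whole cache lines, flushing an
arbitrary buffer really means flushing every line the buffer touches.
The sketch below only computes that line-aligned range; it does not
issue any cache operations.  The names `flush_range` and
`flush_range_for` are hypothetical, the line size is passed in by the
caller, and whether a given kernel flush routine does this rounding
itself is an assumption you should check for your platform.

```c
#include <assert.h>
#include <stdint.h>

/* Line-aligned region that covers a buffer (hypothetical helper type). */
struct flush_range {
    uintptr_t start;  /* first byte of the first line touched */
    uintptr_t size;   /* whole number of lines, in bytes */
};

/*
 * Round [addr, addr + len) outwards to cache line boundaries.
 * `line` is the cache line size and must be a power of two.
 */
static struct flush_range flush_range_for(uintptr_t addr, uintptr_t len,
                                          uintptr_t line)
{
    struct flush_range r;
    uintptr_t end;

    assert(line != 0 && (line & (line - 1)) == 0);
    r.start = addr & ~(line - 1);                  /* round start down */
    end = (addr + len + line - 1) & ~(line - 1);   /* round end up */
    r.size = end - r.start;
    return r;
}
```

For instance, a 0x30-byte buffer at 0x1010 with 32-byte lines yields a
range starting at 0x1000 of size 0x40, so the flush also covers 16 bytes
in front of the buffer that may belong to someone else.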

There's a bit more discretion for data structures that
are private to the driver/device.  If a data structure
is going to be manipulated a great deal by the CPU
before being DMAed, it will be worthwhile to treat it
as cached and flush it out to memory when it is
released to the I/O device.  If a data structure is
constantly shared between CPU and I/O, it may be
better to treat it as uncached rather than constantly
invoke the cache flush procedure.  There's a lot of
grey area in between where the optimal choice is
implementation and application dependent.

In an Ethernet driver for a chip like a Lance or a Tulip,
for example, which autonomously processes lists of
buffers, the shared buffer descriptor lists might be treated
as uncached by the CPU, but transmit buffers coming
in from further up the protocol stack and empty receive
buffers allocated from the general memory pool might
be explicitly flushed before being turned over to the I/O
device.

Simple OSes like Linux (at least through 2.2.x) map the
kernel code and data through the kseg0/kseg1 mappings
to physical memory, which makes it really simple to create
an uncached data structure.  Including asm/io.h provides
a KSEG1ADDR() macro which just does an AND and an
OR to generate an uncached alias.  This only works
for systems with 512M or less of memory, BTW.
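
The AND and the OR can be illustrated with plain 32-bit arithmetic.
This is a user-space sketch, not the kernel's macro: the `toy_` names
are hypothetical, while the segment bases (kseg0 at 0x80000000, kseg1
at 0xa0000000) and the 29-bit physical mask are the standard 32-bit
MIPS values.  The 29-bit mask is also where the 512M limit comes from.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_KSEG0     UINT32_C(0x80000000)  /* cached, unmapped window */
#define TOY_KSEG1     UINT32_C(0xa0000000)  /* uncached, unmapped window */
#define TOY_KSEG2     UINT32_C(0xc0000000)  /* first address past kseg1 */
#define TOY_PHYS_MASK UINT32_C(0x1fffffff)  /* low 29 bits = 512M */

/* Strip the segment bits to get the physical address. */
static uint32_t toy_physaddr(uint32_t vaddr)
{
    return vaddr & TOY_PHYS_MASK;
}

/* AND away the segment bits, then OR in the kseg1 base. */
static uint32_t toy_kseg1addr(uint32_t vaddr)
{
    assert(vaddr >= TOY_KSEG0 && vaddr < TOY_KSEG2);
    return (vaddr & TOY_PHYS_MASK) | TOY_KSEG1;
}

/* The reverse direction: cached alias of the same physical location. */
static uint32_t toy_kseg0addr(uint32_t vaddr)
{
    assert(vaddr >= TOY_KSEG0 && vaddr < TOY_KSEG2);
    return (vaddr & TOY_PHYS_MASK) | TOY_KSEG0;
}
```

With these definitions the cached address 0x80123456 and the uncached
address 0xa0123456 both refer to physical address 0x00123456.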

Great care must be taken with uncached aliases, since
the behaviour of MIPS CPUs is not well defined if uncached
and cached accesses to the same location (or cache line)
are mixed.  I recommend allocating twice the maximum
cache line size (less 1 byte if you like) of kernel memory
in addition to the size of any data structure, and forcing
the alignment of the structure to the first cache line
boundary within the allocated block.  This should ensure
that no cached allocation of memory (or cached malloc
control structure) overlaps with the data structure, and
that it is thus safe to transform the pointer to the new
data structure to the kseg1 uncached form.  Of course,
if the structure is ever to be deallocated, the original
allocation address must be recoverable somehow.
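
A minimal sketch of that allocation scheme follows, written for user
space so it can be tried out directly.  Several things here are
assumptions made for illustration: malloc()/free() stand in for the
kernel allocator, MAX_LINE is set to 32 bytes, and `struct line_block`
with its two helpers are hypothetical names.  The conversion of the
aligned pointer to its kseg1 form is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_LINE 32u  /* assumed maximum cache line size, power of two */

/* Keeps the original pointer so the block can be freed later. */
struct line_block {
    void *raw;      /* address returned by the allocator */
    void *aligned;  /* first cache line boundary inside the block */
};

/*
 * Allocate `size` bytes plus (2 * MAX_LINE - 1) bytes of slack and
 * align the payload to the first line boundary.  At most MAX_LINE - 1
 * bytes are skipped in front, so at least MAX_LINE bytes of slack
 * remain behind the payload and its last line stays inside the block.
 */
static struct line_block line_block_alloc(size_t size)
{
    struct line_block b = { NULL, NULL };
    uintptr_t p;

    assert((MAX_LINE & (MAX_LINE - 1)) == 0);
    b.raw = malloc(size + 2 * MAX_LINE - 1);
    if (b.raw == NULL)
        return b;
    p = ((uintptr_t)b.raw + MAX_LINE - 1) & ~(uintptr_t)(MAX_LINE - 1);
    b.aligned = (void *)p;
    return b;
}

/* Release the block through the recorded original address. */
static void line_block_free(struct line_block *b)
{
    free(b->raw);
    b->raw = NULL;
    b->aligned = NULL;
}
```

A caller would use b.aligned as the data structure's address and hand
the whole struct line_block back to line_block_free() when done, which
is one way of keeping the original allocation address recoverable.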

            Regards,

            Kevin K.