* PROPOSAL : multi-way cache support in Linux/MIPS
@ 2000-08-01 23:52 Jun Sun
2000-08-02 18:12 ` Dominic Sweetman
0 siblings, 1 reply; 15+ messages in thread
From: Jun Sun @ 2000-08-01 23:52 UTC (permalink / raw)
To: linux, linux-mips, ralf
Ralf,
I have got the NEC DDB5476 running stably enough that I am comfortable
checking in my code. Will you take it?
Assuming the answer is yes, there are several issues regarding the
check-in. I will bring them up one by one.
The first issue is multi-way cache support. The DDB5476 uses the R5432
CPU, which has a two-way set-associative cache. The problem is that the
index-based cache operations in r4kcache.h do not cover all ways in a
set.
I think this is a problem in general. So far I have seen MIPS processors
with 2-way, 4-way and 8-way sets. And I have seen them use either the
least-significant or the most-significant addressing bits within a
cache line to select ways.
Here is my proposal :
. introduce two variables,
    cache_num_ways - number of ways in a set
    cache_way_selection_offset - the offset added to the address to
        select the next cache line in the same set. For LSB
        addressing, it is equal to 1. For MSB addressing, it is
        equal to cache_size / cache_num_ways. (It can potentially
        take care of some future weird way-selection scheme as long
        as the offset is uniform.)
. These variables are initialized in cpu_probe().
  (Alternatively, I think we should have a cpu_info table that contains
  all this CPU information. Then a general routine can fill them in
  based on the CPU id. This can get rid of a bunch of ugly switch/case
  statements.)
. In the include/asm/r4kcache.h file, all index-based cache operations
  need to be changed like the following (for illustration only; needs
  optimization):
-----
	while (start < end) {
		cache16_unroll32(start, Index_Writeback_Inv_D);
		start += 0x200;
	}
+++++
	while (start < end) {
		for (i = 0; i < cache_num_ways; i++) {
			cache16_unroll32(start + i * cache_way_selection_offset,
					 Index_Writeback_Inv_D);
		}
		start += 0x200;
	}
=====
What do you think? If it is OK, I can do the patch. The cpu_info table
is a nice wish, but I don't think I know enough to do that job yet.
Jun
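For readers who want to see the address arithmetic of the proposal in isolation, here is a minimal user-space sketch. It is not kernel code and issues no CACHE instructions. Only cache_num_ways and cache_way_selection_offset are names from the proposal; struct way_layout, way_index_addr() and for_each_way_index() are hypothetical names made up for this illustration. The sketch assumes that MSB-style way selection steps by the size of one way, and that LSB-style selection steps by 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for this sketch. Only cache_num_ways and
 * cache_way_selection_offset are names taken from the proposal. */
struct way_layout {
	unsigned long line_size;                  /* bytes per cache line */
	unsigned long way_size;                   /* bytes indexed within one way */
	unsigned int cache_num_ways;              /* ways per set */
	unsigned long cache_way_selection_offset; /* address step to the next way */
};

/* Stand-in for a single index-type CACHE instruction. */
typedef void (*index_op_fn)(unsigned long addr, void *ctx);

/* Address that selects line number `line` in way `way`. */
unsigned long way_index_addr(const struct way_layout *c, unsigned long base,
			     unsigned long line, unsigned int way)
{
	assert(way < c->cache_num_ways);
	return base + line * c->line_size
		    + (unsigned long)way * c->cache_way_selection_offset;
}

/*
 * Visit every line of every way once. `op` may be NULL for a dry run.
 * Returns the number of index operations that were (or would be) issued.
 */
unsigned long for_each_way_index(const struct way_layout *c, unsigned long base,
				 index_op_fn op, void *ctx)
{
	unsigned long lines, line, count = 0;
	unsigned int way;

	assert(c->line_size != 0);
	lines = c->way_size / c->line_size;
	for (line = 0; line < lines; line++) {
		for (way = 0; way < c->cache_num_ways; way++) {
			if (op)
				op(way_index_addr(c, base, line, way), ctx);
			count++;
		}
	}
	return count;
}
```

With a 2-way layout of 32-byte lines and a 128-byte way, an LSB-style offset of 1 puts line 3, way 1 at base + 0x61, while an MSB-style offset of 128 puts it at base + 0xe0. Either way the loop issues 8 operations, one per line per way.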
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: PROPOSAL : multi-way cache support in Linux/MIPS
@ 2000-08-02 8:26 Kevin D. Kissell
2000-08-02 8:26 ` Kevin D. Kissell
2000-08-02 17:05 ` Jun Sun
0 siblings, 2 replies; 15+ messages in thread
From: Kevin D. Kissell @ 2000-08-02 8:26 UTC (permalink / raw)
To: Jun Sun, linux, linux-mips, ralf
[-- Attachment #1: Type: text/plain, Size: 3806 bytes --]
Rather than re-invent the wheel, please consider the
cache descriptor data structures we developed at
MIPS to deal with this problem. I believe that the
updated cache.h file, and maybe even the cpu_probe.c
file, was checked into the 2.2 repository at SGI long ago.
There is also a set of initialisation and invalidation routines
that key off the cache descriptor structure, but those aren't
in the SGI repository (though anyone can get them from
ftp.mips.com or www.paralogos.com). The CPU probe
logic (also on those sites, and already integrated
into several variants because it also supports setting
up state needed by the software FPU emulation)
is table-based, and for each PrID value, there is
a template for the cache characteristics, which
can either be taken "as is" or probed, depending
on a flag in the descriptor. Since the number of
"ways" cannot always be determined by probing,
if the number of ways is specified, it is preserved
even if a cache probe is performed. I won't attach the
full set of cache probe routines (which would only confuse
things), but here is the cache data structure definition
and the CPU descriptor template table that we use.
Regards,
Kevin K.
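As a rough illustration of the "take as is, or probe but keep the ways" rule described in this message, here is a minimal sketch. The struct and the flag value are copied from the attached cache.h. cache_desc_bytes() and cache_desc_resolve() are hypothetical helpers invented for this example, and the two probed_* parameters stand for whatever a real probe routine would measure. Treating an unspecified way count as direct-mapped, and clearing the flag afterwards, are assumptions of the sketch rather than documented behaviour of the MIPS routines.

```c
#include <assert.h>

#define MIPS_CACHE_NEEDS_CONFIG 0x00000001 /* as in the attached cache.h */

struct cache_desc {	/* as in the attached cache.h */
	int linesz;
	int sets;
	int ways;
	int flags;
};

/* Total capacity in bytes described by a descriptor. */
int cache_desc_bytes(struct cache_desc d)
{
	return d.linesz * d.sets * d.ways;
}

/*
 * Hypothetical helper: merge probe results into a template descriptor.
 * probed_linesz and probed_bytes stand for whatever a probe routine
 * measured; the probe routine itself is not shown.
 */
struct cache_desc cache_desc_resolve(struct cache_desc tmpl,
				     int probed_linesz, int probed_bytes)
{
	if (!(tmpl.flags & MIPS_CACHE_NEEDS_CONFIG))
		return tmpl;		/* template is taken "as is" */

	assert(probed_linesz > 0 && probed_bytes > 0);
	if (tmpl.ways == 0)
		tmpl.ways = 1;		/* assumption: unspecified means direct-mapped */
	tmpl.linesz = probed_linesz;
	tmpl.sets = probed_bytes / (probed_linesz * tmpl.ways);
	tmpl.flags &= ~MIPS_CACHE_NEEDS_CONFIG;	/* assumption: mark as configured */
	return tmpl;
}
```

For example, a template of {0, 0, 2, MIPS_CACHE_NEEDS_CONFIG} combined with a probed 16 KB cache of 32-byte lines resolves to 256 sets and keeps its 2 ways, while a fully specified template such as {16, 256, 4, 0} is returned unchanged.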
-----Original Message-----
From: Jun Sun <jsun@mvista.com>
To: linux@cthulhu.engr.sgi.com <linux@cthulhu.engr.sgi.com>;
linux-mips@fnet.fr <linux-mips@fnet.fr>; ralf@oss.sgi.com <ralf@oss.sgi.com>
Date: Wednesday, August 02, 2000 2:01 AM
Subject: PROPOSAL : multi-way cache support in Linux/MIPS
>Ralf,
>
>I have got NEC DDB5476 running stable enough that I am comfortable to
>check in
>my code. Will you take it?
>
>Assuming the answer is yes, there are several issues regarding checking
>in.
>I will bring them up one by one.
>
>The first issue is multi-way cache support. DDB5476 uses R5432 CPU
>which
>has two-way set-associative cache. The problematic part is the
>index-based cache operations in r4xxcache.h does not cover all ways in a
>set.
>
>I think this is a problem in general. So far I have seen MIPS
>processors with
>2-way, 4-way and 8-way sets. And I have seen them using ether least-
>significant-addressing-bits or most-significant-addressing-bits
>within a cache line to select ways.
>
>Here is my proposal :
>
>. introduce two variables,
> cache_num_ways - number of ways in a set
> cache_way_selection_offset - the offset added to the address to
>select
> next cache line in the same set. For LSBs addressing,
>it is
> equal to 1. For MSBs addressing, it is equal to
> cache_line_size / cache_num_ways. (It can potentially
>take
> care of some future weired way-selection scheme as long
>as
> the offset is uniform)
>
>. These variables are initialized in cpu_probe().
>
> (Alternatively, I think we should have cpu_info table, that contains
>all
> these cpu information. Then a general routine can fill in the based
>on
> the cpu id. This can get rid of a bunch of ugly switch/case
>statements.)
>
>. in the include/asm/r4kcache.h file, all Index-based cache operation
>needs
> to changed like the following (for illustration only; need
>optimization) :
>
>-----
> while(start < end) {
> cache16_unroll32(start,Index_Writeback_Inv_D);
> start += 0x200;
> }
>+++++
> while(start < end) {
> for (i=0; i< cache_num_ways; i++) {
> cache16_unroll32(start +
>i*cache_way_selection_offset,
> Index_Writeback_Inv_D);
> }
> start += 0x200;
> }
>=====
>
>What do you think? If it is OK, I can do the patch. The cpu_info table
>is a nice wish, but I don't think I know enough to do that job yet.
>
>Jun
[-- Attachment #2: cache.h --]
[-- Type: application/octet-stream, Size: 1212 bytes --]
/*
* include/asm-mips/cache.h
*/
/**************************************************************************
* 7 Dec, 1999.
* Added definition of cache descriptor structure.
*
* Kevin D. Kissell, kevink@mips.com
* Copyright (C) 1999 MIPS Technologies, Inc. All rights reserved.
*************************************************************************/
#ifndef __ASM_MIPS_CACHE_H
#define __ASM_MIPS_CACHE_H
#ifndef _LANGUAGE_ASSEMBLY
/*
* Descriptor for a cache
*/
struct cache_desc {
	int linesz;	/* Line size in bytes */
	int sets;	/* Number of sets */
	int ways;	/* Number of ways per set */
	int flags;	/* Details like write thru/back, coherent, etc. */
};
#endif
/*
* Flag definitions
*/
#define MIPS_CACHE_NEEDS_CONFIG 0x00000001
#define MIPS_CACHE_VIRTUAL 0x00000002 /* Virtually tagged */
/* bytes per L1 cache line */
/*
* It would be nice to make this dynamic,
* based on mips_cpu.dcache.linesz, but
* it is used for fixed-size structure allocation.
* Set to known maximum for MIPS architecture, 32 bytes.
*/
#define L1_CACHE_BYTES 32
#define L1_CACHE_ALIGN(x) (((x)+(L1_CACHE_BYTES-1))&~(L1_CACHE_BYTES-1))
#define SMP_CACHE_BYTES L1_CACHE_BYTES
#endif /* __ASM_MIPS_CACHE_H */
[-- Attachment #3: cpu.h --]
[-- Type: application/octet-stream, Size: 3206 bytes --]
/* $Id: cpu.h,v 1.5 2000/02/16 21:46:29 kevink Exp $
* cpu.h: Values of the PRId register used to match up
* various MIPS cpu types.
*
* Copyright (C) 1996 David S. Miller (dm@engr.sgi.com)
*
*/
/**************************************************************************
* 7 Dec, 1999.
* Added 4KC and 5KC PR_ID codes, and defined mips_cpu data structure
* and field encodings.
*
* Kevin D. Kissell, kevink@mips.com
* Copyright (C) 1999 MIPS Technologies, Inc. All rights reserved.
*************************************************************************/
#ifndef _MIPS_CPU_H
#define _MIPS_CPU_H
/*
* Assigned values for the product ID register. In order to detect a
* certain CPU type exactly eventually additional registers may need to
* be examined.
*/
#define PRID_IMP_R2000 0x0100
#define PRID_IMP_R3000 0x0200
#define PRID_IMP_R6000 0x0300
#define PRID_IMP_R4000 0x0400
#define PRID_IMP_R6000A 0x0600
#define PRID_IMP_R10000 0x0900
#define PRID_IMP_R4300 0x0b00
#define PRID_IMP_R8000 0x1000
#define PRID_IMP_R4600 0x2000
#define PRID_IMP_R4700 0x2100
#define PRID_IMP_R4640 0x2200
#define PRID_IMP_R4650 0x2200 /* Same as R4640 */
#define PRID_IMP_R5000 0x2300
#define PRID_IMP_SONIC 0x2400
#define PRID_IMP_MAGIC 0x2500
#define PRID_IMP_RM7000 0x2700
#define PRID_IMP_NEVADA 0x2800 /* RM5260 ??? */
#define PRID_IMP_4KC 0x8000
#define PRID_IMP_5KC 0x8100
#define PRID_IMP_UNKNOWN 0xff00
#define PRID_REV_R4400 0x0040
#define PRID_REV_R3000A 0x0030
#define PRID_REV_R3000 0x0020
#define PRID_REV_R2000A 0x0010
#include <asm/cache.h>
#ifndef _LANGUAGE_ASSEMBLY
/*
* Capability and feature descriptor structure for MIPS CPU
*/
struct mips_cpu {
	unsigned int processor_id;
	unsigned int cputype;		/* Old "mips_cputype" code */
	int isa_level;
	int options;
	int tlbsize;
	struct cache_desc icache;	/* Primary I-cache */
	struct cache_desc dcache;	/* Primary D or combined I/D cache */
	struct cache_desc scache;	/* Secondary cache */
	struct cache_desc tcache;	/* Tertiary/split secondary cache */
};
#endif
/*
* ISA Level encodings
*/
#define MIPS_CPU_ISA_I 0x00000001
#define MIPS_CPU_ISA_II 0x00000002
#define MIPS_CPU_ISA_III 0x00000003
#define MIPS_CPU_ISA_IV 0x00000004
#define MIPS_CPU_ISA_V 0x00000005
#define MIPS_CPU_ISA_M32 0x00000020
#define MIPS_CPU_ISA_M64 0x00000040
/*
* CPU Option encodings
*/
#define MIPS_CPU_TLB 0x00000001 /* CPU has TLB */
/* Leave a spare bit for variant MMU types... */
#define MIPS_CPU_4KEX 0x00000004 /* "R4K" exception model */
#define MIPS_CPU_4KTLB 0x00000008 /* "R4K" TLB handler */
#define MIPS_CPU_FPU 0x00000010 /* CPU has FPU */
#define MIPS_CPU_32FPR 0x00000020 /* 32 dbl. prec. FP registers */
#define MIPS_CPU_COUNTER 0x00000040 /* Cycle count/compare */
#define MIPS_CPU_WATCH 0x00000080 /* watchpoint registers */
#define MIPS_CPU_MIPS16 0x00000100 /* code compression */
#define MIPS_CPU_DIVEC 0x00000200 /* dedicated interrupt vector */
#define MIPS_CPU_VCE 0x00000400 /* virt. coherence conflict possible */
#define MIPS_CPU_CACHE_CDEX 0x00000800 /* Create_Dirty_Exclusive CACHE op */
#endif /* !(_MIPS_CPU_H) */
[-- Attachment #4: cpu_probe.c --]
[-- Type: application/octet-stream, Size: 7889 bytes --]
/* $Id: cpu_probe.c,v 1.11 2000/07/07 09:02:36 carstenl Exp $
*
* cpu_probe.c
*
* Kevin D. Kissell, kevink@mips.com
* Copyright (C) 1999,2000 MIPS Technologies, Inc. All rights reserved.
*
* ########################################################################
*
* This program is free software; you can distribute it and/or modify it
* under the terms of the GNU General Public License (Version 2) as
* published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
*
* ########################################################################
*
* C code, called from startup vector to decode the CPU configuration and
* set up the mips_cpu data structure, used by the kernel to abstract out
* most implementation options of MIPS CPUs.
*
*/
#include <asm/cpu.h>
#include <asm/bootinfo.h>
#include <asm/init.h>
#include <linux/config.h>
#include <linux/string.h>	/* memcpy() */
#ifdef CONFIG_CPU_MIPS32
extern void mips32_cpu_probe(unsigned int pr_id);
#endif
/* declaration of the global struct */
struct mips_cpu mips_cpu = {PRID_IMP_UNKNOWN, CPU_UNKNOWN, 0, 0, 0,
{0,0,0,0}, {0,0,0,0}, {0,0,0,0}, {0,0,0,0}};
/* Shortcut for assembler access to mips_cpu.options */
int *cpuoptions = &mips_cpu.options;
/*
 * Canned descriptors of MIPS CPUs. Note that for the code below
 * to function correctly, a generic description with a processor_id
 * value with no revision bits set must precede any descriptions
 * of distinct variant revisions, i.e. R4000 must precede R4400,
 * R3000 must precede R3000A. Many CPUs are not reflected in
 * the list. New entries require the addition of PR_ID register
 * data in asm/cpu.h and assignment of a CPU_ code in asm/bootinfo.h.
 * The mips_cpu structure is defined in asm/cpu.h and asm/cache.h.
 */
/*
* Some options are common across all R4000 derivatives
*/
#define R4K_OPTS (MIPS_CPU_TLB | MIPS_CPU_4KEX | MIPS_CPU_4KTLB \
| MIPS_CPU_COUNTER | MIPS_CPU_CACHE_CDEX )
static struct mips_cpu __initdata mips_cpu_template[] = {
/* R2000 */
{ PRID_IMP_R2000, /* PR_ID register value */
CPU_R2000, /* Kernel internal CPU identifier */
MIPS_CPU_ISA_I, /* MIPS ISA level */
MIPS_CPU_TLB, /* Flags for implementation options */
32, /* Number of TLB entries */
{0,0,0,0}, /* I-cache line size, #sets, #ways, flags */
{0,0,0,MIPS_CACHE_NEEDS_CONFIG}, /* Unified/D-cache descriptor */
{0,0,0,0}, /* S-cache descriptor */
{0,0,0,0}}, /* Tertiary cache descriptor */
/* MIPS 4Kc */
{ PRID_IMP_4KC, CPU_4KC, MIPS_CPU_ISA_M32, MIPS_CPU_TLB |
MIPS_CPU_4KEX | MIPS_CPU_4KTLB | MIPS_CPU_COUNTER |
MIPS_CPU_DIVEC | MIPS_CPU_WATCH, 16,
{16, 256, 4, 0}, {16, 256, 4, 0}, {0,0,0,0}, {0,0,0,0}},
/* MIPS 5Kc */
{ PRID_IMP_5KC, CPU_5KC, MIPS_CPU_ISA_M64, MIPS_CPU_TLB |
MIPS_CPU_4KEX | MIPS_CPU_4KTLB | MIPS_CPU_COUNTER |
MIPS_CPU_DIVEC | MIPS_CPU_WATCH, 32,
{32, 128, 4, 0}, {32, 128, 4, 0}, {0,0,0,0}, {0,0,0,0}},
/* R3000 */
{ PRID_IMP_R3000, CPU_R3000, MIPS_CPU_ISA_I, MIPS_CPU_TLB, 32,
{0, 0, 0, 0}, {0, 0, 0, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R3000A */
{ PRID_IMP_R3000 | PRID_REV_R3000A, CPU_R3000A,
MIPS_CPU_ISA_I, MIPS_CPU_TLB, 32,
{0, 0, 0, 0}, {0, 0, 0, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R6000 */
{ PRID_IMP_R6000, CPU_R6000, MIPS_CPU_ISA_II,
MIPS_CPU_TLB | MIPS_CPU_FPU, 32,
{0, 0, 0, 0}, {0, 0, 0, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R6000A */
{ PRID_IMP_R6000A, CPU_R6000A, MIPS_CPU_ISA_II,
MIPS_CPU_TLB | MIPS_CPU_FPU, 32,
{0, 0, 0, 0}, {0, 0, 0, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R4000SC */
{ PRID_IMP_R4000, CPU_R4000SC, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR
| MIPS_CPU_WATCH | MIPS_CPU_VCE, 48,
{0, 0, 1, MIPS_CACHE_NEEDS_CONFIG},
{0, 0, 1, MIPS_CACHE_NEEDS_CONFIG},
{0,0,1,MIPS_CACHE_NEEDS_CONFIG}, {0,0,0,0}},
/* R4400SC */
{ PRID_IMP_R4000 | PRID_REV_R4400, CPU_R4400SC, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR
| MIPS_CPU_WATCH | MIPS_CPU_VCE, 48,
{0, 0, 1, MIPS_CACHE_NEEDS_CONFIG},
{0, 0, 1, MIPS_CACHE_NEEDS_CONFIG},
{0,0,1, MIPS_CACHE_NEEDS_CONFIG}, {0,0,0,0}},
/* R4300 */
{ PRID_IMP_R4300, CPU_R4300, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR | MIPS_CPU_WATCH, 32,
{32, 512, 1, MIPS_CACHE_NEEDS_CONFIG},
{16, 512, 1, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R4600 */
{ PRID_IMP_R4600, CPU_R4600, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU, 48,
{32, 128, 2, MIPS_CACHE_NEEDS_CONFIG},
{32, 128, 2, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R4650 */
{ PRID_IMP_R4650, CPU_R4650, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU, 48,
{32, 128, 2, MIPS_CACHE_NEEDS_CONFIG},
{32, 128, 2, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R4700 */
{ PRID_IMP_R4700, CPU_R4700, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR, 48,
{32, 256, 2, MIPS_CACHE_NEEDS_CONFIG},
{32, 256, 2, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R5000 */
{ PRID_IMP_R5000, CPU_R5000, MIPS_CPU_ISA_IV,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR, 48,
{32, 512, 2, MIPS_CACHE_NEEDS_CONFIG},
{32, 512, 2, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,MIPS_CACHE_NEEDS_CONFIG}, {0,0,0,0}},
/* R52xx. Cache size varies with revision */
{ PRID_IMP_NEVADA, CPU_NEVADA, MIPS_CPU_ISA_IV,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR | MIPS_CPU_DIVEC, 48,
{0, 0, 2, MIPS_CACHE_NEEDS_CONFIG}, {0, 0, 2, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R8000 - has weird TLB: 3-way x 128 */
{ PRID_IMP_R8000, CPU_R8000, MIPS_CPU_ISA_IV,
MIPS_CPU_TLB | MIPS_CPU_4KEX | MIPS_CPU_FPU | MIPS_CPU_32FPR, 384,
{32, 512, 1, MIPS_CACHE_VIRTUAL}, {32, 512, 1, 0},
{0,0,0,MIPS_CACHE_NEEDS_CONFIG}, {0,0,0,0}},
/* R10000 */
{ PRID_IMP_R10000, CPU_R10000, MIPS_CPU_ISA_IV,
MIPS_CPU_TLB | MIPS_CPU_4KEX | MIPS_CPU_FPU
| MIPS_CPU_32FPR | MIPS_CPU_COUNTER | MIPS_CPU_WATCH, 64,
{32, 512, 2, 0}, {32, 512, 2, 0},
{0,0,0,MIPS_CACHE_NEEDS_CONFIG}, {0,0,0,0}},
/* Terminator: the search loops in mips_cpu_probe() stop at processor_id 0 */
{ 0 },
};
/*
* This function is running on the boot prom stack. It should
* minimize use of dynamic variables and call an absolute
* minimum of other functions.
*/
__initfunc(void mips_cpu_probe(unsigned int pr_id))
{
	int i;

#ifdef CONFIG_CPU_MIPS32
	/*
	 * If high-order halfword non-zero, use MIPS32 mechanism
	 */
	if ((pr_id >> 16) != 0) {
		mips32_cpu_probe(pr_id);
		return;
	}
#endif
	/*
	 * If old encoding scheme and CPU in table, find and copy.
	 *
	 * First try for match including revision number
	 */
	for (i = 0; mips_cpu_template[i].processor_id != 0; i++) {
		if (mips_cpu_template[i].processor_id == pr_id) {
			memcpy(&mips_cpu, &mips_cpu_template[i],
			       sizeof(struct mips_cpu));
			return;
		}
	}
	/*
	 * That failing, look for match on implementation only
	 */
	for (i = 0; mips_cpu_template[i].processor_id != 0; i++) {
		if ((mips_cpu_template[i].processor_id & PRID_IMP_UNKNOWN)
		    == (pr_id & PRID_IMP_UNKNOWN)) {
			memcpy(&mips_cpu, &mips_cpu_template[i],
			       sizeof(struct mips_cpu));
			return;
		}
	}
	/*
	 * Otherwise CPU is unknown - all bets are off
	 */
	mips_cpu.processor_id = pr_id;
	mips_cpu.cputype = CPU_UNKNOWN;
	mips_cpu.options = 0;
	mips_cpu.icache.flags = MIPS_CACHE_NEEDS_CONFIG;
	mips_cpu.dcache.flags = MIPS_CACHE_NEEDS_CONFIG;
	mips_cpu.scache.flags = MIPS_CACHE_NEEDS_CONFIG;
	return;
}
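The two-pass lookup in mips_cpu_probe() can be tried outside the kernel with a cut-down table. The following is only a sketch: struct cpu_entry, cpu_table, find_cpu() and PRID_IMP_MASK are hypothetical names for this example, and the PRId values are the ones defined in the attached cpu.h. The table ends with a zero processor_id so that both loops terminate, and the generic entries come before their revisions so that an unlisted revision falls back to the generic description.

```c
#include <stddef.h>

#define PRID_IMP_MASK 0xff00u	/* same bits as PRID_IMP_UNKNOWN in cpu.h */

/* Cut-down, hypothetical stand-in for struct mips_cpu. */
struct cpu_entry {
	unsigned int processor_id;
	const char *name;
};

/* Generic entries precede their revisions; a zero id ends the table. */
const struct cpu_entry cpu_table[] = {
	{ 0x0400,          "R4000"  },	/* PRID_IMP_R4000 */
	{ 0x0400 | 0x0040, "R4400"  },	/* ... | PRID_REV_R4400 */
	{ 0x0200,          "R3000"  },	/* PRID_IMP_R3000 */
	{ 0x0200 | 0x0030, "R3000A" },	/* ... | PRID_REV_R3000A */
	{ 0, NULL }
};

const struct cpu_entry *find_cpu(unsigned int pr_id)
{
	int i;

	/* Pass 1: exact match, revision bits included. */
	for (i = 0; cpu_table[i].processor_id != 0; i++)
		if (cpu_table[i].processor_id == pr_id)
			return &cpu_table[i];

	/* Pass 2: implementation field only; the first hit wins. */
	for (i = 0; cpu_table[i].processor_id != 0; i++)
		if ((cpu_table[i].processor_id & PRID_IMP_MASK)
		    == (pr_id & PRID_IMP_MASK))
			return &cpu_table[i];

	return NULL;	/* unknown CPU */
}
```

Here find_cpu(0x0440) returns the R4400 entry through the exact-match pass, find_cpu(0x0450) falls back to the generic R4000 entry through the implementation-only pass, and find_cpu(0x7700) returns NULL.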
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: PROPOSAL : multi-way cache support in Linux/MIPS
2000-08-02 8:26 PROPOSAL : multi-way cache support in Linux/MIPS Kevin D. Kissell
@ 2000-08-02 8:26 ` Kevin D. Kissell
2000-08-02 17:05 ` Jun Sun
1 sibling, 0 replies; 15+ messages in thread
From: Kevin D. Kissell @ 2000-08-02 8:26 UTC (permalink / raw)
To: Jun Sun, linux, linux-mips, ralf
[-- Attachment #1: Type: text/plain, Size: 3806 bytes --]
Rather than re-invent the wheel, please consider the
cache descriptor data structures we developed at
MIPS to deal with this problem. I believe that the
updated cache.h file, and maybe even the cpu_probe.c
file, was checked into the 2.2 repository at SGI long ago.
There are also a set of initialisation and invalidation routines
that key off the cache descriptor structure, but those aren't
in the SGI repository (though anyone can get them from
ftp.mips.com or www.paralogos.com). The CPU probe
logic (also on those sites, and already integrated
into several variants because it also supports setting
up state needed by the software FPU emulation)
is table-based, and for each PrID value, there is
a template for the cache characteristics, which
can either be taken "as is" or probed, depending
on a flag in the descriptor. Since the number of
"ways" cannot always be determined by probing,
if the number of ways is specified, it is preserved
even if a cache probe is performed. I won't attach the
full set of cache probe routines (which would only confuse
things), but here is the cache data structure definition
and the CPU descriptor template table that we use.
Regads,
Kevin K.
-----Original Message-----
From: Jun Sun <jsun@mvista.com>
To: linux@cthulhu.engr.sgi.com <linux@cthulhu.engr.sgi.com>;
linux-mips@fnet.fr <linux-mips@fnet.fr>; ralf@oss.sgi.com <ralf@oss.sgi.com>
Date: Wednesday, August 02, 2000 2:01 AM
Subject: PROPOSAL : multi-way cache support in Linux/MIPS
>Ralf,
>
>I have got NEC DDB5476 running stable enough that I am comfortable to
>check in
>my code. Will you take it?
>
>Assuming the answer is yes, there are several issues regarding checking
>in.
>I will bring them up one by one.
>
>The first issue is multi-way cache support. DDB5476 uses R5432 CPU
>which
>has two-way set-associative cache. The problematic part is the
>index-based cache operations in r4xxcache.h does not cover all ways in a
>set.
>
>I think this is a problem in general. So far I have seen MIPS
>processors with
>2-way, 4-way and 8-way sets. And I have seen them using ether least-
>significant-addressing-bits or most-significant-addressing-bits
>within a cache line to select ways.
>
>Here is my proposal :
>
>. introduce two variables,
> cache_num_ways - number of ways in a set
> cache_way_selection_offset - the offset added to the address to
>select
> next cache line in the same set. For LSBs addressing,
>it is
> equal to 1. For MSBs addressing, it is equal to
> cache_line_size / cache_num_ways. (It can potentially
>take
> care of some future weired way-selection scheme as long
>as
> the offset is uniform)
>
>. These variables are initialized in cpu_probe().
>
> (Alternatively, I think we should have cpu_info table, that contains
>all
> these cpu information. Then a general routine can fill in the based
>on
> the cpu id. This can get rid of a bunch of ugly switch/case
>statements.)
>
>. in the include/asm/r4kcache.h file, all Index-based cache operation
>needs
> to changed like the following (for illustration only; need
>optimization) :
>
>-----
> while(start < end) {
> cache16_unroll32(start,Index_Writeback_Inv_D);
> start += 0x200;
> }
>+++++
> while(start < end) {
> for (i=0; i< cache_num_ways; i++) {
> cache16_unroll32(start +
>i*cache_way_selection_offset,
> Index_Writeback_Inv_D);
> }
> start += 0x200;
> }
>=====
>
>What do you think? If it is OK, I can do the patch. The cpu_info table
>is a nice wish, but I don't think I know enough to do that job yet.
>
>Jun
[-- Attachment #2: cache.h --]
[-- Type: application/octet-stream, Size: 1212 bytes --]
/*
* include/asm-mips/cache.h
*/
/**************************************************************************
* 7 Dec, 1999.
* Added definition of cache descriptor structure.
*
* Kevin D. Kissell, kevink@mips.com
* Copyright (C) 1999 MIPS Technologies, Inc. All rights reserved.
*************************************************************************/
#ifndef __ASM_MIPS_CACHE_H
#define __ASM_MIPS_CACHE_H
#ifndef _LANGUAGE_ASSEMBLY
/*
* Descriptor for a cache
*/
struct cache_desc {
int linesz;
int sets;
int ways;
int flags; /* Details like write thru/back, coherent, etc. */
};
#endif
/*
* Flag definitions
*/
#define MIPS_CACHE_NEEDS_CONFIG 0x00000001
#define MIPS_CACHE_VIRTUAL 0x00000002 /* Virtually tagged */
/* bytes per L1 cache line */
/*
* It would be nice to make this dynamic,
* based on mips_cpu.dcache.linesz, but
* it is used for fixed-size structure allocation.
* Set to known maximum for MIPS architecture, 32 bytes.
*/
#define L1_CACHE_BYTES 32
#define L1_CACHE_ALIGN(x) (((x)+(L1_CACHE_BYTES-1))&~(L1_CACHE_BYTES-1))
#define SMP_CACHE_BYTES L1_CACHE_BYTES
#endif /* __ASM_MIPS_CACHE_H */
[-- Attachment #3: cpu.h --]
[-- Type: application/octet-stream, Size: 3206 bytes --]
/* $Id: cpu.h,v 1.5 2000/02/16 21:46:29 kevink Exp $
* cpu.h: Values of the PRId register used to match up
* various MIPS cpu types.
*
* Copyright (C) 1996 David S. Miller (dm@engr.sgi.com)
*
*/
/**************************************************************************
* 7 Dec, 1999.
* Added 4KC and 5KC PR_ID codes, and defined mips_cpu data structure
* and field encodings.
*
* Kevin D. Kissell, kevink@mips.com
* Copyright (C) 1999 MIPS Technologies, Inc. All rights reserved.
*************************************************************************/
#ifndef _MIPS_CPU_H
#define _MIPS_CPU_H
/*
* Assigned values for the product ID register. In order to detect a
* certain CPU type exactly eventually additional registers may need to
* be examined.
*/
#define PRID_IMP_R2000 0x0100
#define PRID_IMP_R3000 0x0200
#define PRID_IMP_R6000 0x0300
#define PRID_IMP_R4000 0x0400
#define PRID_IMP_R6000A 0x0600
#define PRID_IMP_R10000 0x0900
#define PRID_IMP_R4300 0x0b00
#define PRID_IMP_R8000 0x1000
#define PRID_IMP_R4600 0x2000
#define PRID_IMP_R4700 0x2100
#define PRID_IMP_R4640 0x2200
#define PRID_IMP_R4650 0x2200 /* Same as R4640 */
#define PRID_IMP_R5000 0x2300
#define PRID_IMP_SONIC 0x2400
#define PRID_IMP_MAGIC 0x2500
#define PRID_IMP_RM7000 0x2700
#define PRID_IMP_NEVADA 0x2800 /* RM5260 ??? */
#define PRID_IMP_4KC 0x8000
#define PRID_IMP_5KC 0x8100
#define PRID_IMP_UNKNOWN 0xff00
#define PRID_REV_R4400 0x0040
#define PRID_REV_R3000A 0x0030
#define PRID_REV_R3000 0x0020
#define PRID_REV_R2000A 0x0010
#include <asm/cache.h>
#ifndef _LANGUAGE_ASSEMBLY
/*
* Capability and feature descriptor structure for MIPS CPU
*/
struct mips_cpu {
unsigned int processor_id;
unsigned int cputype; /* Old "mips_cputype" code */
int isa_level;
int options;
int tlbsize;
struct cache_desc icache; /* Primary I-cache */
struct cache_desc dcache; /* Primary D or combined I/D cache */
struct cache_desc scache; /* Secondary cache */
struct cache_desc tcache; /* Tertiary/split secondary cache */
};
#endif
/*
* ISA Level encodings
*/
#define MIPS_CPU_ISA_I 0x00000001
#define MIPS_CPU_ISA_II 0x00000002
#define MIPS_CPU_ISA_III 0x00000003
#define MIPS_CPU_ISA_IV 0x00000004
#define MIPS_CPU_ISA_V 0x00000005
#define MIPS_CPU_ISA_M32 0x00000020
#define MIPS_CPU_ISA_M64 0x00000040
/*
* CPU Option encodings
*/
#define MIPS_CPU_TLB 0x00000001 /* CPU has TLB */
/* Leave a spare bit for variant MMU types... */
#define MIPS_CPU_4KEX 0x00000004 /* "R4K" exception model */
#define MIPS_CPU_4KTLB 0x00000008 /* "R4K" TLB handler */
#define MIPS_CPU_FPU 0x00000010 /* CPU has FPU */
#define MIPS_CPU_32FPR 0x00000020 /* 32 dbl. prec. FP registers */
#define MIPS_CPU_COUNTER 0x00000040 /* Cycle count/compare */
#define MIPS_CPU_WATCH 0x00000080 /* watchpoint registers */
#define MIPS_CPU_MIPS16 0x00000100 /* code compression */
#define MIPS_CPU_DIVEC 0x00000200 /* dedicated interrupt vector */
#define MIPS_CPU_VCE 0x00000400 /* virt. coherence conflict possible */
#define MIPS_CPU_CACHE_CDEX 0x00000800 /* Create_Dirty_Exclusive CACHE op */
#endif /* !(_MIPS_CPU_H) */
[-- Attachment #4: cpu_probe.c --]
[-- Type: application/octet-stream, Size: 7889 bytes --]
/* $Id: cpu_probe.c,v 1.11 2000/07/07 09:02:36 carstenl Exp $
*
* cpu_probe.c
*
* Kevin D. Kissell, kevink@mips.com
* Copyright (C) 1999,2000 MIPS Technologies, Inc. All rights reserved.
*
* ########################################################################
*
* This program is free software; you can distribute it and/or modify it
* under the terms of the GNU General Public License (Version 2) as
* published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
*
* ########################################################################
*
* C code, called from startup vector to decode the CPU configuration and
* set up the mips_cpu data structure, used by the kernel to abstract out
* most implementation options of MIPS CPUs.
*
*/
#include <asm/cpu.h>
#include <asm/bootinfo.h>
#include <asm/init.h>
#include <linux/config.h>
#ifdef CONFIG_CPU_MIPS32
extern void mips32_cpu_probe(unsigned int pr_id);
#endif
/* declaration of the global struct */
struct mips_cpu mips_cpu = {PRID_IMP_UNKNOWN, CPU_UNKNOWN, 0, 0, 0,
{0,0,0,0}, {0,0,0,0}, {0,0,0,0}, {0,0,0,0}};
/* Shortcut for assembler access to mips_cpu.options */
int *cpuoptions = &mips_cpu.options;
/*
* Canned descriptors of MIPS CPUs. Note that for the code below
* to function correctly, a generic description with a processor_id
* value with no implementation bits set must follow any descriptions
* of distinct variant revistions, i.e. R4000 must precede R4400,
* R3000 must precede R3000A. Many CPUs are not reflected in
* the list. New entries require the addtion of PR_ID register
* data in asm/cpu.h and assignment of a CPU_ code in asm/bootinfo.h.
* The mips_cpu structure is defined in asm/cpu.h and asm/cache.h.
*/
/*
* Some options are common across all R4000 derivatives
*/
#define R4K_OPTS (MIPS_CPU_TLB | MIPS_CPU_4KEX | MIPS_CPU_4KTLB \
| MIPS_CPU_COUNTER | MIPS_CPU_CACHE_CDEX )
static struct mips_cpu __initdata mips_cpu_template[] = {
/* R2000 */
{ PRID_IMP_R2000, /* PR_ID register value */
CPU_R2000, /* Kernel internal CPU identifier */
MIPS_CPU_ISA_I, /* MIPS ISA level */
MIPS_CPU_TLB, /* Flags for implementation options */
32, /* Number of TLB entries */
{0,0,0,0}, /* I-cache line size, #sets, #ways, flags */
{0,0,0,MIPS_CACHE_NEEDS_CONFIG}, /* Unified/D-cache descriptor */
{0,0,0,0}, /* S-cache descriptor */
{0,0,0,0}}, /* Tertiary cache descriptor */
/* MIPS 4Kc */
{ PRID_IMP_4KC, CPU_4KC, MIPS_CPU_ISA_M32, MIPS_CPU_TLB |
MIPS_CPU_4KEX | MIPS_CPU_4KTLB | MIPS_CPU_COUNTER |
MIPS_CPU_DIVEC | MIPS_CPU_WATCH, 16,
{16, 256, 4, 0}, {16, 256, 4, 0}, {0,0,0,0}, {0,0,0,0}},
/* MIPS 5Kc */
{ PRID_IMP_5KC, CPU_5KC, MIPS_CPU_ISA_M64, MIPS_CPU_TLB |
MIPS_CPU_4KEX | MIPS_CPU_4KTLB | MIPS_CPU_COUNTER |
MIPS_CPU_DIVEC | MIPS_CPU_WATCH, 32,
{32, 128, 4, 0}, {32, 128, 4, 0}, {0,0,0,0}, {0,0,0,0}},
/* R3000 */
{ PRID_IMP_R3000, CPU_R3000, MIPS_CPU_ISA_I, MIPS_CPU_TLB, 32,
{0, 0, 0, 0}, {0, 0, 0, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R3000A */
{ PRID_IMP_R3000 | PRID_REV_R3000A, CPU_R3000A,
MIPS_CPU_ISA_I, MIPS_CPU_TLB, 32,
{0, 0, 0, 0}, {0, 0, 0, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R6000 */
{ PRID_IMP_R6000, CPU_R6000, MIPS_CPU_ISA_II,
MIPS_CPU_TLB | MIPS_CPU_FPU, 32,
{0, 0, 0, 0}, {0, 0, 0, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R6000A */
{ PRID_IMP_R6000A, CPU_R6000A, MIPS_CPU_ISA_II,
MIPS_CPU_TLB | MIPS_CPU_FPU, 32,
{0, 0, 0, 0}, {0, 0, 0, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R4000SC */
{ PRID_IMP_R4000, CPU_R4000SC, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR
| MIPS_CPU_WATCH | MIPS_CPU_VCE, 48,
{0, 0, 1, MIPS_CACHE_NEEDS_CONFIG},
{0, 0, 1, MIPS_CACHE_NEEDS_CONFIG},
{0,0,1,MIPS_CACHE_NEEDS_CONFIG}, {0,0,0,0}},
/* R4400SC */
{ PRID_IMP_R4000 | PRID_REV_R4400, CPU_R4400SC, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR
| MIPS_CPU_WATCH | MIPS_CPU_VCE, 48,
{0, 0, 1, MIPS_CACHE_NEEDS_CONFIG},
{0, 0, 1, MIPS_CACHE_NEEDS_CONFIG},
{0,0,1, MIPS_CACHE_NEEDS_CONFIG}, {0,0,0,0}},
/* R4300 */
{ PRID_IMP_R4300, CPU_R4300, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR | MIPS_CPU_WATCH, 32,
{32, 512, 1, MIPS_CACHE_NEEDS_CONFIG},
{16, 512, 1, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R4600 */
{ PRID_IMP_R4600, CPU_R4600, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU, 48,
{32, 128, 2, MIPS_CACHE_NEEDS_CONFIG},
{32, 128, 2, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R4650 */
{ PRID_IMP_R4650, CPU_R4650, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU, 48,
{32, 128, 2, MIPS_CACHE_NEEDS_CONFIG},
{32, 128, 2, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R4700 */
{ PRID_IMP_R4700, CPU_R4700, MIPS_CPU_ISA_III,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR, 48,
{32, 256, 2, MIPS_CACHE_NEEDS_CONFIG},
{32, 256, 2, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R5000 */
{ PRID_IMP_R5000, CPU_R5000, MIPS_CPU_ISA_IV,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR, 48,
{32, 512, 2, MIPS_CACHE_NEEDS_CONFIG},
{32, 512, 2, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,MIPS_CACHE_NEEDS_CONFIG}, {0,0,0,0}},
/* R52xx. Cache size varies with revision */
{ PRID_IMP_NEVADA, CPU_NEVADA, MIPS_CPU_ISA_IV,
R4K_OPTS | MIPS_CPU_FPU | MIPS_CPU_32FPR | MIPS_CPU_DIVEC, 48,
{0, 0, 2, MIPS_CACHE_NEEDS_CONFIG}, {0, 0, 2, MIPS_CACHE_NEEDS_CONFIG},
{0,0,0,0}, {0,0,0,0}},
/* R8000 - has weird TLB: 3-way x 128 */
{ PRID_IMP_R8000, CPU_R8000, MIPS_CPU_ISA_IV,
MIPS_CPU_TLB | MIPS_CPU_4KEX | MIPS_CPU_FPU | MIPS_CPU_32FPR, 384,
{32, 512, 1, MIPS_CACHE_VIRTUAL}, {32, 512, 1, 0},
{0,0,0,MIPS_CACHE_NEEDS_CONFIG}, {0,0,0,0}},
/* R10000 */
{ PRID_IMP_R10000, CPU_R10000, MIPS_CPU_ISA_IV,
MIPS_CPU_TLB | MIPS_CPU_4KEX | MIPS_CPU_FPU
| MIPS_CPU_32FPR | MIPS_CPU_COUNTER | MIPS_CPU_WATCH, 64,
{32, 512, 2, 0}, {32, 512, 2, 0},
{0,0,0,MIPS_CACHE_NEEDS_CONFIG}, {0,0,0,0}},
};
/*
* This function is running on the boot prom stack. It should
* minimize use of dynamic variables and call an absolute
* minimum of other functions.
*/
__initfunc(void mips_cpu_probe(unsigned int pr_id))
{
int i;
#ifdef CONFIG_CPU_MIPS32
/*
* If high-order halfword non-zero, use MIPS32 mechanism
*/
if((pr_id >> 16) != 0) {
mips32_cpu_probe(pr_id);
return;
}
#endif
/*
* If old encoding scheme and CPU in table, find and copy.
*
* First try for match including revision number
*/
for(i=0; mips_cpu_template[i].processor_id != 0; i++) {
if(mips_cpu_template[i].processor_id == pr_id) {
memcpy(&mips_cpu, &mips_cpu_template[i],
sizeof(struct mips_cpu));
return;
}
}
/*
* That failing, look for match on implementation only
*/
for(i=0; mips_cpu_template[i].processor_id != 0; i++) {
if((mips_cpu_template[i].processor_id & PRID_IMP_UNKNOWN)
== (pr_id & PRID_IMP_UNKNOWN)) {
memcpy(&mips_cpu, &mips_cpu_template[i],
sizeof(struct mips_cpu));
return;
}
}
/*
* Otherwise CPU is unknown - all bets are off
*/
mips_cpu.processor_id = pr_id;
mips_cpu.cputype = CPU_UNKNOWN;
mips_cpu.options = 0;
mips_cpu.icache.flags = MIPS_CACHE_NEEDS_CONFIG;
mips_cpu.dcache.flags = MIPS_CACHE_NEEDS_CONFIG;
mips_cpu.scache.flags = MIPS_CACHE_NEEDS_CONFIG;
return;
}
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: PROPOSAL : multi-way cache support in Linux/MIPS
2000-08-02 8:26 PROPOSAL : multi-way cache support in Linux/MIPS Kevin D. Kissell
2000-08-02 8:26 ` Kevin D. Kissell
@ 2000-08-02 17:05 ` Jun Sun
1 sibling, 0 replies; 15+ messages in thread
From: Jun Sun @ 2000-08-02 17:05 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: linux, linux-mips, ralf
Kevin,
This looks great, exactly what I was hoping for!
A couple of questions :
. What about the actual cache operation routines (flush_cache_page,
...)? Are they divided into R4xxx, R3xx, etc? I guess I am curious how
the code is organized.
. Your structure gives the number of ways, but no info about how to
select a way. How would one do an index-based cache operation? It seems to
me you probably need something like cache_way_selection_offset in the
cpu table.
Jun
"Kevin D. Kissell" wrote:
>
> Rather than re-invent the wheel, please consider the
> cache descriptor data structures we developed at
> MIPS to deal with this problem. I believe that the
> updated cache.h file, and maybe even the cpu_probe.c
> file, was checked into the 2.2 repository at SGI long ago.
> There are also a set of initialisation and invalidation routines
> that key off the cache descriptor structure, but those aren't
> in the SGI repository (though anyone can get them from
> ftp.mips.com or www.paralogos.com). The CPU probe
> logic (also on those sites, and already integrated
> into several variants because it also supports setting
> up state needed by the software FPU emulation)
> is table-based, and for each PrID value, there is
> a template for the cache characteristics, which
> can either be taken "as is" or probed, depending
> on a flag in the descriptor. Since the number of
> "ways" cannot always be determined by probing,
> if the number of ways is specified, it is preserved
> even if a cache probe is performed. I won't attach the
> full set of cache probe routines (which would only confuse
> things), but here is the cache data structure definition
> and the CPU descriptor template table that we use.
>
...
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: PROPOSAL : multi-way cache support in Linux/MIPS
2000-08-01 23:52 Jun Sun
@ 2000-08-02 18:12 ` Dominic Sweetman
2000-08-02 21:38 ` Jun Sun
0 siblings, 1 reply; 15+ messages in thread
From: Dominic Sweetman @ 2000-08-02 18:12 UTC (permalink / raw)
To: Jun Sun; +Cc: linux, linux-mips, ralf
Jun Sun (jsun@mvista.com) writes:
> The first issue is multi-way cache support. The DDB5476 uses the R5432
> CPU, which has a two-way set-associative cache. The problem is that
> the index-based cache operations in r4xxcache.h do not cover all
> ways in a set.
>
> I think this is a problem in general. So far I have seen MIPS
> processors with 2-way, 4-way and 8-way sets. And I have seen them
> using either least-significant-addressing-bits or
> most-significant-addressing-bits within a cache line to select ways.
So far as I know the Vr5432 is the only CPU to do anything so silly as
using the lowest index bits to select the "way". Certainly most CPUs
put the "way" bits above the cache-store-index; and MIPS now require
it to be done like that for MIPS32 and MIPS64 compatible parts.
The MS-selects-way organisation permits the cache to be initialised
without the software ever needing to know how many ways it has (just
crank the index up, but being careful about the tendency to recycle
the same way when pre-filling cache data).
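A minimal sketch of the "crank the index up" approach, assuming the way-select bits sit directly above the set-index bits. The callback stands in for a real index-type CACHE instruction, and every name here is hypothetical rather than taken from the kernel:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one index-type CACHE instruction. */
typedef void (*index_op_fn)(unsigned long index, void *ctx);

/*
 * Walk every line of a cache whose way-select bits sit directly above
 * the set-index bits. Only the total size and the line size are needed;
 * the number of ways never appears.
 */
static unsigned long walk_all_indices(unsigned long total_size,
                                      unsigned long line_size,
                                      index_op_fn op, void *ctx)
{
    unsigned long index, count = 0;

    assert(line_size != 0 && total_size % line_size == 0);

    for (index = 0; index < total_size; index += line_size) {
        op(index, ctx);
        count++;
    }
    return count;
}

/* Example callback: remember the highest index that was touched. */
static void record_max(unsigned long index, void *ctx)
{
    unsigned long *max = ctx;

    if (index > *max)
        *max = index;
}
```

In real code the index would be added to an unmapped base address before being handed to the CACHE instruction; that step is left out of the sketch.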
Cache maintenance should always use "hit" type instructions, and you
don't need to know the cache organisation for those, even with the
Vr5432.
I suggest you should implement the don't-care method, and then have a
cpu_info-driven special case for the unique and deprecated Vr5432.
Dominic Sweetman
Algorithmics Ltd
dom@algor.co.uk
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: PROPOSAL : multi-way cache support in Linux/MIPS
@ 2000-08-02 18:15 Kevin D. Kissell
2000-08-02 18:15 ` Kevin D. Kissell
2000-08-02 21:50 ` Jun Sun
0 siblings, 2 replies; 15+ messages in thread
From: Kevin D. Kissell @ 2000-08-02 18:15 UTC (permalink / raw)
To: Jun Sun; +Cc: linux, linux-mips, ralf
>A couple of questions :
>
>. What about the actual cache operation routines (flush_cache_page,
>...)? Are they divided into R4xxx, R3xx, etc? I guess I am curious how
>the code is organized.
We kept pretty much the existing 2.2 structure for these things.
We created the module "r4k.c" in arch/mips/mm which was
essentially parallel to the old r4xx0.c module, but which implemented
the various TLB and cache functions (a) using the information in
the mips_cpu structure wherever it made sense and (b) in ways
that are fully compatible with the "MIPS32" ISA+CP0 model
as well as with the original R4000 family and its descendants.
It's possible to write code that is compatible with an R4000 but
not MIPS32, and vice versa, but they are 99% identical.
>. Your structure gives the number of ways, but no info about how to
>select a way. How would one do an index-based cache operation? It seems to
>me you probably need something like cache_way_selection_offset in the
>cpu table.
The MIPS32 spec for the CACHE instruction gives a trivial
mapping from sets/ways/linesize into CACHE instruction
operands. In fact, the same technique works for most pre
MIPS32 multi-way caches as well. The only exception that
comes to mind is the R10000. If one wanted to support the
R10K or other oddball CACHE-implementations in this
system, I would suggest adding a MIPS_CACHE_R10KWAYSEL
or some flag to the flags field of the cache descriptor,
and tweaking any routines that need to select indices
(such as a routine to hunt down and kill all possible virtual
aliases of an address) to handle the special case.
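As a rough illustration of that mapping, here is a sketch that builds the index operand from a set number and a way number. The struct, its field names, and the `way_in_low_bits` flag are assumptions made for the example (they are not the real cache descriptor), and the low-bits branch follows the scheme as it is described in this thread rather than any particular data sheet:

```c
#include <assert.h>

/* Hypothetical cache geometry; field names are assumptions. */
struct cache_geom {
    unsigned int line_size;   /* bytes per line, power of two */
    unsigned int sets;        /* lines per way, power of two */
    unsigned int ways;        /* associativity */
    int way_in_low_bits;      /* nonzero for the low-bits way-select scheme */
};

/* Integer log2 for powers of two. */
static unsigned int log2u(unsigned int v)
{
    unsigned int n = 0;

    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Build the index operand for an index-type cache operation. */
static unsigned long index_operand(const struct cache_geom *g,
                                   unsigned int set, unsigned int way)
{
    unsigned int line_shift = log2u(g->line_size);
    unsigned int way_shift = line_shift + log2u(g->sets);

    assert(set < g->sets && way < g->ways);

    if (g->way_in_low_bits)   /* way number in the lowest address bits */
        return ((unsigned long)set << line_shift) | way;

    /* way number directly above the set index */
    return ((unsigned long)way << way_shift) |
           ((unsigned long)set << line_shift);
}
```

With the way bits on top, stepping through all operands in order is the same as a flat walk over the whole cache, which is why the common case needs no per-way loop.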
The primitives in Linux 2.2 did not require much knowledge
of multi-way caches as such - they could all be implemented
either using hit-based CACHE operations, or by cycling
through all possible indices using knowledge of the total
size and the line size. But the newer synthesizable MIPS
cores allow cache configurations to be "dialed in" in ways
that the old code could not handle. The CPUs themselves
can be interrogated to determine the line size/nways/nsets
geometry, so we mirror that in the Linux code and use those
parameters to compute total size, way size, etc. The
PrID-based lookup table and the dynamic probe routines
are there to allow older parts to use the same mechanisms.
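The derived quantities are simple products. A sketch, assuming a descriptor with line size, sets per way, and ways (the struct and function names are made up for the example), plus the usual rule of thumb that a virtually indexed cache can hold aliases when one way is larger than a page:

```c
#include <assert.h>

/* Hypothetical descriptor; the real structure may differ. */
struct cache_desc_sketch {
    unsigned int line_size;   /* bytes per line */
    unsigned int sets;        /* lines per way */
    unsigned int ways;        /* associativity */
};

/* Bytes covered by a single way. */
static unsigned long way_size(const struct cache_desc_sketch *c)
{
    return (unsigned long)c->line_size * c->sets;
}

/* Bytes in the whole cache. */
static unsigned long total_size(const struct cache_desc_sketch *c)
{
    assert(c->ways != 0);
    return way_size(c) * c->ways;
}

/*
 * For a virtually indexed cache, aliases become possible when index
 * bits extend above the page offset, i.e. a way is larger than a page.
 */
static int may_alias(const struct cache_desc_sketch *c,
                     unsigned long page_size)
{
    return way_size(c) > page_size;
}
```

For example, 32-byte lines, 512 sets and 2 ways give 16 KB per way and 32 KB in total.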
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: PROPOSAL : multi-way cache support in Linux/MIPS
@ 2000-08-02 18:36 Kevin D. Kissell
2000-08-02 18:36 ` Kevin D. Kissell
0 siblings, 1 reply; 15+ messages in thread
From: Kevin D. Kissell @ 2000-08-02 18:36 UTC (permalink / raw)
To: Dominic Sweetman, Jun Sun; +Cc: linux, linux-mips, ralf
Dom Sweetman writes:
>So far as I know the Vr5432 is the only CPU to do anything so silly as
>using the lowest index bits to select the "way".
Alas, the R10000 does the same silly thing, and while you
and I might not consider such a venerable processor interesting
for new embedded MIPS/Linux designs, our friends who
are trying to replace IRIX with Linux on their SGI boxes
are going to have to deal with them for a little while longer.
>The MS-selects-way organisation permits the cache to be initialised
>without the software ever needing to know how many ways it has (just
>crank the index up, but being careful about the tendency to recycle
>the same way when pre-filling cache data).
Which is why MIPS belatedly documented it as the "correct"
way to design a multiway cache...
>Cache maintenance should always use "hit" type instructions, and you
>don't need to know the cache organisation for those, even with the
>Vr5432.
The counterargument to *always* using "hit" ops is that they
generate TLB traffic and TLB refills, which some people
find annoying to allow for and in any case time consuming.
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: PROPOSAL : multi-way cache support in Linux/MIPS
2000-08-02 18:12 ` Dominic Sweetman
@ 2000-08-02 21:38 ` Jun Sun
0 siblings, 0 replies; 15+ messages in thread
From: Jun Sun @ 2000-08-02 21:38 UTC (permalink / raw)
To: Dominic Sweetman; +Cc: linux, linux-mips, ralf
Dominic Sweetman wrote:
>
> Jun Sun (jsun@mvista.com) writes:
>
> > The first issue is multi-way cache support. The DDB5476 uses the R5432
> > CPU, which has a two-way set-associative cache. The problem is that
> > the index-based cache operations in r4xxcache.h do not cover all
> > ways in a set.
> >
> > I think this is a problem in general. So far I have seen MIPS
> > processors with 2-way, 4-way and 8-way sets. And I have seen them
> > using either least-significant-addressing-bits or
> > most-significant-addressing-bits within a cache line to select ways.
>
> So far as I know the Vr5432 is the only CPU to do anything so silly as
> using the lowest index bits to select the "way".
Actually Sony's R4500 uses the same low bits mechanism.
> Cache maintenance should always use "hit" type instructions, and you
> don't need to know the cache organisation for those, even with the
> Vr5432.
>
Ideally, but not in reality. Linux still uses index operations a lot.
In theory, an indexed flush is faster if the area being flushed is bigger
than the cache size.
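The reasoning is that a hit-type flush costs one operation per line in the range, while an indexed flush costs one operation per line in the cache, however large the range is. A sketch of that decision, with made-up names and ignoring the TLB cost of hit operations:

```c
#include <assert.h>
#include <stddef.h>

enum flush_method { FLUSH_BY_HIT, FLUSH_BY_INDEX };

/*
 * Pick the cheaper method by operation count alone: once the range is
 * at least as large as the cache, walking the whole cache by index
 * needs no more operations than walking the range by address.
 */
static enum flush_method choose_flush(size_t range_len, size_t cache_size)
{
    assert(cache_size != 0);
    return range_len >= cache_size ? FLUSH_BY_INDEX : FLUSH_BY_HIT;
}
```

An indexed flush also evicts lines outside the range, so the real break-even point may differ from this simple count.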
> I suggest you should implement the don't-care method, and then have a
> cpu_info-driven special case for the unique and deprecated Vr5432.
>
If Vr5432 is really that unique, I think that is probably the best way, at
least for now.
Jun
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: PROPOSAL : multi-way cache support in Linux/MIPS
2000-08-02 18:15 Kevin D. Kissell
2000-08-02 18:15 ` Kevin D. Kissell
@ 2000-08-02 21:50 ` Jun Sun
1 sibling, 0 replies; 15+ messages in thread
From: Jun Sun @ 2000-08-02 21:50 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: linux, linux-mips, ralf
"Kevin D. Kissell" wrote:
> It's possible to write code that is compatible with an R4000 but
> not MIPS32, and vice versa, but they are 99% identical.
>
Kevin,
Is it possible for you to list the 1% difference here?
I have always been confused by MIPS32/MIPS64 vs R3000/R4000/etc. (And
on top of it, there is also MIPS I, II, III, IV, etc...). I am sure I
am not the only one.
If you can give a pointer that clarifies the names, that would be good
too.
Jun
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: PROPOSAL : multi-way cache support in Linux/MIPS
@ 2000-08-02 22:44 Kevin D. Kissell
2000-08-02 22:44 ` Kevin D. Kissell
2000-08-02 23:10 ` Jun Sun
0 siblings, 2 replies; 15+ messages in thread
From: Kevin D. Kissell @ 2000-08-02 22:44 UTC (permalink / raw)
To: Jun Sun; +Cc: linux, linux-mips, ralf
>> It's possible to write code that is compatible with an R4000 but
>> not MIPS32, and vice versa, but they are 99% identical.
>
>Is it possible for you to list the 1% difference here?
>
>I have always been confused by MIPS32/MIPS64 vs R3000/R4000/etc. (And
>on top of it, there is also MIPS I, II, III, IV, etc...). I am sure I
>am not the only one.
MIPS I, II, III, IV, and V are ISAs, instruction-set architectures.
R3000, R4000, R5000, R6000, R7000, R8000 et al.
are microprocessor designs that conform (more-or-less)
to one of those ISAs. The ISAs were defined in such a
way as to be strict supersets of one another. Any MIPS
processor can run MIPS I code. Any MIPS IV processor
can run MIPS I, II, III, and IV code, etc. To oversimplify
slightly:
MIPS I - basic 32-bit RISC ISA
MIPS II - add branch-likely and Test instructions
MIPS III - add 64-bit address and 64-bit data support
MIPS IV - add FP MAC, Prefetch
MIPS V - add "paired single" SIMD FP instructions
As defined by MIPS in the beginning, the ISA - i.e. MIPS I,
MIPS II, etc. - described the machine as seen from the
standpoint of a user-mode application program. The
CP0 instructions and registers weren't considered a part
of it. This gave chip designers a lot of freedom and OS
writers a lot of headaches. The R8000, for example, was
the first MIPS IV CPU, and is 100% (well, maybe 99.99%)
compatible with the MIPS IV R10000 at the user binary level.
But while the R10000 has a CP0 organization that is
a straightforward extrapolation of the R4000 - they
wanted it to run NT, after all - the R8000 is just bizarre.
Another problem with the way things had been done
in the MIPS I/II/III days was that, due to the strict supersetting
rules, any new feature had to ride on the back of all the
other cool new features that came before it. As a specific
example, PREFetch is a MIPS IV instruction. But MIPS IV
implies MIPS III, and MIPS III implies a 64-bit CPU. So a
32-bit CPU supporting prefetch, which is a fairly obviously
useful thing, does not fit neatly into the model. So...
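To make the contrast concrete, here is a small sketch of the two ways of describing a CPU: an ordered ISA level, where each level implies everything below it, and independent feature bits, where prefetch can exist without 64-bit support. The enum and macro names are invented for the example:

```c
#include <assert.h>

/* Ordered ISA levels: each level implies all lower ones. */
enum isa_level { ISA_I = 1, ISA_II, ISA_III, ISA_IV, ISA_V };

/* Under strict supersetting, a CPU runs code at or below its level. */
static int can_run(enum isa_level cpu, enum isa_level code)
{
    assert(cpu >= ISA_I && cpu <= ISA_V);
    return code <= cpu;
}

/* Hypothetical independent feature bits, for contrast. */
#define FEAT_64BIT    0x1u
#define FEAT_PREFETCH 0x2u

/* A CPU qualifies when it has every feature the code needs. */
static int has_features(unsigned int cpu_feats, unsigned int needed)
{
    return (cpu_feats & needed) == needed;
}
```

With levels alone, a 32-bit CPU offering prefetch cannot be expressed, since ISA_IV implies ISA_III. With feature bits it is simply a CPU that has FEAT_PREFETCH and lacks FEAT_64BIT.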
When MIPS Technologies spun back out of SGI, one of
the first things that was done was to set about defining
standard architectures for 32 and 64-bit CPUs that
solved these problems. These new standard architectures
are "MIPS32" and "MIPS64". These architectures include
both the ISA and the privileged resource architecture, or
PRA, so that CP0 is finally standardised - with some amount
of permitted subsetting and implementation-specific details
allowed, just the same. The MIPS32 ISA includes features
from MIPS I, II, III, IV, and V, as well as some stuff like
integer MADD, MSUB, CLZ, and CLO that had never
made it into the standard user mode ISA. But MIPS32
has no 64-bit operations. MIPS64 is the full-blown 64-bit
MIPS I-V+ ISA plus a PRA that is a strict superset of the
MIPS32 PRA.
So, to get back to Linux, a MIPS32 part can *almost*
run the standard MIPS R4K kernel. Almost. What had
to be fixed was essentially:
- ensuring that TLB initialization and invalidation never
write identical (even though invalid) entries to the TLB.
MIPS32 parts are allowed to complain about that, and
some of them do.
- ensuring that no 64-bit instructions are ever used. This
necessitated my rewriting the semaphore support code.
- eliminating certain assumptions about the relationship
between cache size, line size, and associativity.
Note that none of this stuff is incompatible with an R4xxx
or an R5xxx, it's just a matter of being a little more generic.
And of course the flip side is that we don't use prefetch,
MADD, or CLZ in the kernel either, because the MIPS III-IV
parts can't handle them (well, OK, some of them can).
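For the TLB point, one way to avoid writing identical entries is to give every TLB index its own dummy virtual page number in an unmapped segment. This is only a sketch of that idea, not necessarily what our code does; the constants assume 4 KB pages and the usual 32-bit KSEG0 base, and the names are made up:

```c
#include <assert.h>

#define SKETCH_KSEG0_BASE 0x80000000UL  /* assumed unmapped segment base */
#define SKETCH_PAGE_SHIFT 12            /* assumed 4 KB pages */

/*
 * Return a distinct EntryHi value for each TLB index. Each entry maps
 * a pair of pages, so consecutive values are two pages apart. KSEG0
 * addresses are never translated, so these entries cannot match a
 * real mapped access.
 */
static unsigned long unique_entryhi(unsigned int idx)
{
    assert(idx < 1024);   /* arbitrary bound, keeps values inside KSEG0 */
    return SKETCH_KSEG0_BASE +
           ((unsigned long)idx << (SKETCH_PAGE_SHIFT + 1));
}
```

An initialization loop would write `unique_entryhi(i)` to EntryHi for entry `i` instead of writing the same invalid value everywhere.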
Hope this helps,
Kevin K.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: PROPOSAL : multi-way cache support in Linux/MIPS
2000-08-02 22:44 Kevin D. Kissell
2000-08-02 22:44 ` Kevin D. Kissell
@ 2000-08-02 23:10 ` Jun Sun
2000-08-02 23:31 ` Ralf Baechle
1 sibling, 1 reply; 15+ messages in thread
From: Jun Sun @ 2000-08-02 23:10 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: linux, linux-mips, ralf
That certainly helps a lot. Thanks, Kevin.
"Kevin D. Kissell" wrote:
...
>
> So, to get back to Linux, a MIPS32 part can *almost*
> run the standard MIPS R4K kernel. Almost. What had
Still one more question. If I understand correctly, the 4Km and 4Kp are
MIPS32 CPUs. However, they don't have TLBs. Right? Without TLBs, I
don't suppose Linux will run ...
Jun
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: PROPOSAL : multi-way cache support in Linux/MIPS
2000-08-02 23:10 ` Jun Sun
@ 2000-08-02 23:31 ` Ralf Baechle
0 siblings, 0 replies; 15+ messages in thread
From: Ralf Baechle @ 2000-08-02 23:31 UTC (permalink / raw)
To: Jun Sun; +Cc: Kevin D. Kissell, linux, linux-mips
On Wed, Aug 02, 2000 at 04:10:43PM -0700, Jun Sun wrote:
> > So, to get back to Linux, a MIPS32 part can *almost*
> > run the standard MIPS R4K kernel. Almost. What had
>
> Still one more question. If I understand correctly, the 4Km and 4Kp are
> MIPS32 CPUs. However, they don't have TLBs. Right? Without TLBs, I
> don't suppose Linux will run ...
There is ``Microcontroller Linux'' aka uclinux available at www.uclinux.org.
It could be ported to TLB-less processors. You'd lose some of the
important functionality of the standard Linux, including some source
compatibility.
Ralf
^ permalink raw reply [flat|nested] 15+ messages in thread