Re: [UPDATED PATCH] IP28 support

Linux MIPS Architecture development
 help / color / mirror / Atom feed

* Re: [UPDATED PATCH] IP28 support
@ 2007-12-23  1:44 peter fuerst
  2007-12-23  9:39 ` Richard Sandiford
  0 siblings, 1 reply; 26+ messages in thread
From: peter fuerst @ 2007-12-23  1:44 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Ralf Baechle, Thomas Bogendoerfer, Kumba, linux-mips



On Wed, 12 Dec 2007, Richard Sandiford wrote:

> Date: Wed, 12 Dec 2007 18:09:31 +0000
> From: Richard Sandiford <rsandifo@nildram.co.uk>
> To: peter fuerst <pf@pfrst.de>
> Cc: Ralf Baechle <ralf@linux-mips.org>,
>      Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
>      Kumba <kumba@gentoo.org>, linux-mips@linux-mips.org
> Subject: Re: [UPDATED PATCH] IP28 support
>
> peter fuerst <pf@pfrst.de> writes:
> >> Ralf Baechle <ralf@linux-mips.org> writes:
> >> >> Then there's the language-lawyerly code I gave to Peter on gcc-patches@:
> >> >>
> >> >>      void foo (int x)
> >> >>      {
> >> >>        int array[1];
> >> >>        if (x)
> >> >>          bar (array[0x1fff]);
> >> >>      }
> >> >>
> >
> > A strange method to pass data... Of course, cooking up such an "ABI",
> > where local variables are accessed with a const offset that is not known at
> > compile-time to be valid, would subvert the test for $sp-based accesses...
>
> Well, as I said when I gave that example originally, it's unlikely that
> the example would be written in that form.  But hide the constants and
> checks in configurable macros, and the general idea becomes a little
> more feasible.
>
> >> FWIW, my first cut at the option restrictions were based on what
> >> the patch exempts (and doesn't exempt).  We could instead get gcc
> >> to only exempt accesses that it can prove are either to the current
> >> function's stack frame or to its stack arguments.  I.e. rather than
> >> consider every $sp-based access to be safe, we'd instead do some
> >
> > "every $sp-based access" (set(mem(plus(sp)(const_int)))) is restricted
> > to local variables too, with the constant offset being either
> > - compiler-generated or
> > - deliberately put in the source (however including the above example)
>
> That's not literally true.  SP+INT addresses can be used to access
> stack arguments too, and 4.x can optimise some varargs accesses to

"local variables" was meant to include (var-)argument-slots too, which are
allright, so far.

> compile-time base+offset addresses.  And as I said, the compiler is
> free to make up accesses that aren't in fact valid for cases where
> the access isn't made.  E.g. if you had a loop with a stride of 128,
> the compiler could unroll the loop as many times as it likes.  Some
> of the unrolled iterations might access areas outside the stack frame.

You mean, the compiler would generate $sp+const_int accesses here, whose
validity is not known at compile-time - similar to foo() above ?

> (You would hope that the compiler would be intelligent enough to crop
> the iteration count in such cases, because the extra iterations should
> never be used in valid code.  But that isn't the point.  The compiler
> doesn't _need_ to crop the iteration count for correctness, and we're
> talking about something we _do_ need for correctness.)
>
> >> bounds checking on the value.
> > Fine, if that is possible.
>
> FWIW, the frame info is available in cfun->machine->frame at the time
> your code runs.
>
> >>                                (We could also use MEM_ATTRS to
> >> pick up cases where a stack variable is acceesed via something
> >> other than the stack or frame pointers, as happens for large frames.)
> >
> > Aren't these always accesses with non-constant offset, where a CB can't be
> > avoided, even if they are recognized as being actually relative to $sp ?
>
> The MEM_ATTRS I meant were MEM_EXPR + MEM_OFFSET, which only apply where
> there is a known constant offset.
>
> >> > In case of a hypothetic multi-platform kernel of which at least one needs
> >> > the R10000 workarounds, all code would be uniformly compiled with the
> >> > magic -mr10k-cache-barrier option and all source level workaround would
> >> > be enabled.
> >>
> >> Hmm.  This probably shows I am misunderstanding the problem, but I was
> >> thinking about the IO-mapped case.  I thought one of the problems was
> >> that if you had a cached speculative load or store to an access-sensitive
> >> IO-mapped address, the IO-mapped device might "see" that access even if it
> >> doesn't take place.  Could you not have a situation where a KSEG0 or
> >> XKSEG0 access is access-sensitive on one machine and not another?
> >> The patch won't insert countermeasures before symbolic and constant
> >> addresses, because it believes all such addresses to be safe.
> >>
> >
> > The threat to IO-addresses comes from the addressing register in the speculated
> > mem-instruction (set(mem(plus(reg)...), containing one of the addresses as
> > "garbage".
> >
> > Symbolic addresses are well defined from link-time on, no matter what history
> > before the access.  They either point (set(mem(plus(symbol_ref)...) to
> > - some variable in the cached area, what is harmless (unless DMA-related),
> >   or to
> > - IO-devices, accessed uncached, i.e. non-speculative,
> > unless there is a programming-error ;)
> > The same holds for const_int used as address.
>
> I think you're missing my point.  If you access an I/O-mapped device
> through KSEG2 or an uncached XKPHYS address, is it not also physically
> possible (though clearly unwise) to access it through KSEG0 or a cached
> XKPHYS address too?  So can you guarantee that every const_int cached
> address in a multi-platform kernel is not I/O-mapped on any of the r10k
> platforms?  Or can you guarantee that the compiler will not manufacture
> such an address from an otherwise harmless address?
Hmm, it's not quite clear, how it could be manufactured.
>                                                     Again, the key thing
> is to think about what the compiler can validly do on non-r10k platforms,
> however silly it might seem, and then make sure the workarounds cope
> with it.

Do you think of accesses that essentially look like this ?

  if (machine_x)
     *uncached(const_addr) = val;
  else
     *cached(const_addr) = val;

Fortunately (at least? even?) on IP28 cached access (hence a block read
request) to an I/O-device address is a non-issue. In this respect the
hardware design seems to follow the recommendations from the R10000 manual
(NACK from external agent?):
- if such an access graduates (i.e. a "real" access), a bus-error will occur.
- if not (i.e. mis-speculated), nothing happens.

However, i don't yet know, how O2 behaves, or, if there exists any other
R10k-machine, which would need the software-workaround.

>
> >> I'm also a little worried that the compiler is free to make up accesses
> >> that didn't exist in the original program, provided that those accesses
> > The cache-barrier itself ?
>
> No, in general.  Optimisers (particularly loop optimisers) can invent
> accesses that didn't exist in the original source code.  Normally they
> would only be executed in correct circumstances, but with this
> speculative execution, that might not be true.
>
> >> are never actually performed in cases where they'd be wrong.  So how about:
> >>
> >> -mr10k-cache-barrier=load-store
> >>   Insert a cache barrier at the beginning of any sequentially-executed
> >>   series of instructions that contains a load or store.  For the purposes
> >>   of this option, GCC can ignore loads and stores that it can prove
> >>   are an in-range access to:
> >>
> >>   (a) the current function's stack frame;
> >>   (b) an incoming stack argument;
> >>   (b) an object with a link-time-constant address; or
> >>   (c) a block of uncached memory
> > Can we recognize uncached memory in the instruction ?
>
> Well, I was just thinking about teaching the compiler about KSEG2,
> the always-uncached XKPHYS addresses, etc.  (Sorry for messing up
> the bullet letters there!)  The idea is that we have a correlation
> between symbolic constants and C objects, so we can check whether
> an offset in a symbolic constant is within the object.  We already

No doubt, this would be very helpfull.

> have code to do this in other situations.  But there is no correlation
> between const_int addresses and C objects, and we cannot be sure that
> a given const_int address existed in the original source code, so
> I think the only safe thing is to check its uncached properties instead.
>
> I know all this must be frustrating.  I'm sure your patches work great
> as they are with current and past kernels, and current and past compilers.
> The problem is that, if it becomes a mainline gcc feature, it needs to be
> defined from first principles.

Agreed, the design of any feature advantageously should be based on a clear
(more or less formal) specification of what the compiler can do.

>                                 And we need to do that without assuming
> that the accesses we're looking at existed in the original source code.
>
> FWIW, I'm happy to help update the patch once we've agreed on an

This would be appreciated (of course :) Many thanks in advance!

> option spec.

Well, the option spec could be as listed above. With "store" as default
for an empty option-string ("none" as default if the option isn't given
at all).

>
> Richard
>
>

kind regards

peter


PS: apologies for delaying the answer, i just couldn't concentrate on
this topic recently.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-23  1:44 [UPDATED PATCH] IP28 support peter fuerst
@ 2007-12-23  9:39 ` Richard Sandiford
  2007-12-24  0:39   ` post
  2008-01-16 19:32   ` peter fuerst
  0 siblings, 2 replies; 26+ messages in thread
From: Richard Sandiford @ 2007-12-23  9:39 UTC (permalink / raw)
  To: peter fuerst; +Cc: Ralf Baechle, Thomas Bogendoerfer, Kumba, linux-mips

peter fuerst <post@pfrst.de> writes:
>> compile-time base+offset addresses.  And as I said, the compiler is
>> free to make up accesses that aren't in fact valid for cases where
>> the access isn't made.  E.g. if you had a loop with a stride of 128,
>> the compiler could unroll the loop as many times as it likes.  Some
>> of the unrolled iterations might access areas outside the stack frame.
>
> You mean, the compiler would generate $sp+const_int accesses here, whose
> validity is not known at compile-time - similar to foo() above ?

Right.

>> I think you're missing my point.  If you access an I/O-mapped device
>> through KSEG2 or an uncached XKPHYS address, is it not also physically
>> possible (though clearly unwise) to access it through KSEG0 or a cached
>> XKPHYS address too?  So can you guarantee that every const_int cached
>> address in a multi-platform kernel is not I/O-mapped on any of the r10k
>> platforms?  Or can you guarantee that the compiler will not manufacture
>> such an address from an otherwise harmless address?
> Hmm, it's not quite clear, how it could be manufactured.

Similar to the above really, for combinations of suitably bizarre input
code and compiler behaviour.  Again, the problem isn't that such a thing
is _likely_ to happen, just that it wouldn't be wrong for it to happen in
non-r10k situations (and thus not likely to be treated as a "wrong-code"
bug by gcc developers).

>>                                                     Again, the key thing
>> is to think about what the compiler can validly do on non-r10k platforms,
>> however silly it might seem, and then make sure the workarounds cope
>> with it.
>
> Do you think of accesses that essentially look like this ?
>
>   if (machine_x)
>      *uncached(const_addr) = val;
>   else
>      *cached(const_addr) = val;

Well, more generally, I was thinking of something like:


    if (machine_x)
      *cached(const_addr1) = ...;
    else
      ...blah...

where const_addr1 might be harmful if !machine_x.

> Fortunately (at least? even?) on IP28 cached access (hence a block read
> request) to an I/O-device address is a non-issue. In this respect the
> hardware design seems to follow the recommendations from the R10000 manual
> (NACK from external agent?):
> - if such an access graduates (i.e. a "real" access), a bus-error will occur.
> - if not (i.e. mis-speculated), nothing happens.

Ah, OK.  That's what I was missing, thanks.  (I suspect you and Ralf
have explained that to me before, but it hadn't sunk in.  Sorry!)

> However, i don't yet know, how O2 behaves, or, if there exists any other
> R10k-machine, which would need the software-workaround.

OK.

In that case, for the IP28 at least, I think the only issue with excluding
cachable const_int addresses is that the compiler might somehow conspire to
create an address that turns out to be, for some runs at least, an address
in a DMA buffer.

> Well, the option spec could be as listed above. With "store" as default
> for an empty option-string ("none" as default if the option isn't given
> at all).

Sounds good.

Thanks, it seems we have a plan ;)

Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-23  9:39 ` Richard Sandiford
@ 2007-12-24  0:39   ` post
  2008-01-16 19:32   ` peter fuerst
  1 sibling, 0 replies; 26+ messages in thread
From: post @ 2007-12-24  0:39 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Ralf Baechle, Thomas Bogendoerfer, Kumba, linux-mips



On Sun, 23 Dec 2007, Richard Sandiford wrote:

> Date: Sun, 23 Dec 2007 09:39:28 +0000
> From: Richard Sandiford <rsandifo@nildram.co.uk>
> To: peter fuerst <post@pfrst.de>
> Cc: Ralf Baechle <ralf@linux-mips.org>,
>      Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
>      Kumba <kumba@gentoo.org>, linux-mips@linux-mips.org
> Subject: Re: [UPDATED PATCH] IP28 support
> 
> ...
> Ah, OK.  That's what I was missing, thanks.  (I suspect you and Ralf
> have explained that to me before, but it hadn't sunk in.  Sorry!)

Missed to explain that in time... Sorry!

> ...

kind regards

peter

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-23  9:39 ` Richard Sandiford
  2007-12-24  0:39   ` post
@ 2008-01-16 19:32   ` peter fuerst
  2008-01-19 14:14     ` Richard Sandiford
  1 sibling, 1 reply; 26+ messages in thread
From: peter fuerst @ 2008-01-16 19:32 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Ralf Baechle, Thomas Bogendoerfer, Kumba, linux-mips



Hi,

what next step do you suggest ?

kind regards

peter


On Sun, 23 Dec 2007, Richard Sandiford wrote:

> Date: Sun, 23 Dec 2007 09:39:28 +0000
> From: Richard Sandiford <rsandifo@nildram.co.uk>
> To: peter fuerst <post@pfrst.de>
> Cc: Ralf Baechle <ralf@linux-mips.org>,
>      Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
>      Kumba <kumba@gentoo.org>, linux-mips@linux-mips.org
> Subject: Re: [UPDATED PATCH] IP28 support
>
>
> ...
>
> Thanks, it seems we have a plan ;)
>
> Richard
>
>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2008-01-16 19:32   ` peter fuerst
@ 2008-01-19 14:14     ` Richard Sandiford
  2008-01-19 23:56       ` post
  0 siblings, 1 reply; 26+ messages in thread
From: Richard Sandiford @ 2008-01-19 14:14 UTC (permalink / raw)
  To: post; +Cc: Ralf Baechle, Thomas Bogendoerfer, Kumba, linux-mips

peter fuerst <post@pfrst.de> writes:
> what next step do you suggest ?

Sorry, I've been busy with gcc stage 3 stuff recently, so haven't had
chance to get to this.  Have you done any more on the patch since the
version you last posted to gcc-patches?  If not, I can take that and
convert it to what we discussed.

Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2008-01-19 14:14     ` Richard Sandiford
@ 2008-01-19 23:56       ` post
  0 siblings, 0 replies; 26+ messages in thread
From: post @ 2008-01-19 23:56 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Ralf Baechle, Thomas Bogendoerfer, Kumba, linux-mips



Ok,

i'll apply proposed changes as of 2006-05.msg01446 to the last sent
patch (2006-04.msg00084), so we have a slightly better starting point.

kind regards

peter


On Sat, 19 Jan 2008, Richard Sandiford wrote:

> Date: Sat, 19 Jan 2008 14:14:34 +0000
> From: Richard Sandiford <rsandifo@nildram.co.uk>
> To: post@pfrst.de
> Cc: Ralf Baechle <ralf@linux-mips.org>,
>      Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
>      Kumba <kumba@gentoo.org>, linux-mips@linux-mips.org
> Subject: Re: [UPDATED PATCH] IP28 support
> 
> peter fuerst <post@pfrst.de> writes:
> > what next step do you suggest ?
> 
> Sorry, I've been busy with gcc stage 3 stuff recently, so haven't had
> chance to get to this.  Have you done any more on the patch since the
> version you last posted to gcc-patches?  If not, I can take that and
> convert it to what we discussed.
> 
> Richard
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [UPDATED PATCH] IP28 support
@ 2007-12-02 12:00 Thomas Bogendoerfer
  0 siblings, 0 replies; 26+ messages in thread
From: Thomas Bogendoerfer @ 2007-12-02 12:00 UTC (permalink / raw)
  To: linux-mips; +Cc: ralf

Add support for SGI IP28 machines (Indigo 2 with R10k CPUs)
This work is mainly based on Peter Fuersts work.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
---

Changes to last version:

- restructered Kconfig to make device/feature selection easier

 arch/mips/Kconfig                                  |   66 ++-
 arch/mips/Makefile                                 |   14 +
 arch/mips/sgi-ip22/Makefile                        |    8 +-
 arch/mips/sgi-ip22/ip22-mc.c                       |    4 +
 arch/mips/sgi-ip22/ip28-berr.c                     |  700 ++++++++++++++++++++
 include/asm-mips/dma.h                             |    7 +-
 include/asm-mips/mach-ip28/cpu-feature-overrides.h |   50 ++
 include/asm-mips/mach-ip28/ds1286.h                |    4 +
 include/asm-mips/mach-ip28/spaces.h                |   22 +
 include/asm-mips/mach-ip28/war.h                   |   25 +
 10 files changed, 890 insertions(+), 10 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 455bd1f..aae317d 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -122,6 +122,7 @@ config MACH_JAZZ
 	select ARCH_MAY_HAVE_PC_FDC
 	select CEVT_R4K
 	select CSRC_R4K
+	select DEFAULT_SGI_PARTITION if CPU_BIG_ENDIAN
 	select GENERIC_ISA_DMA
 	select IRQ_CPU
 	select I8253
@@ -398,6 +399,7 @@ config SGI_IP22
 	select BOOT_ELF32
 	select CEVT_R4K
 	select CSRC_R4K
+	select DEFAULT_SGI_PARTITION
 	select DMA_NONCOHERENT
 	select HW_HAS_EISA
 	select I8253
@@ -405,6 +407,12 @@ config SGI_IP22
 	select IP22_CPU_SCACHE
 	select IRQ_CPU
 	select GENERIC_ISA_DMA_SUPPORT_BROKEN
+	select SGI_HAS_DS1286
+	select SGI_HAS_I8042
+	select SGI_HAS_INDYDOG
+	select SGI_HAS_SEEQ
+	select SGI_HAS_WD93
+	select SGI_HAS_ZILOG
 	select SWAP_IO_SPACE
 	select SYS_HAS_CPU_R4X00
 	select SYS_HAS_CPU_R5000
@@ -422,6 +430,7 @@ config SGI_IP27
 	select ARC
 	select ARC64
 	select BOOT_ELF64
+	select DEFAULT_SGI_PARTITION
 	select DMA_IP27
 	select SYS_HAS_EARLY_PRINTK
 	select HW_HAS_PCI
@@ -438,6 +447,35 @@ config SGI_IP27
 	  workstations.  To compile a Linux kernel that runs on these, say Y
 	  here.
 
+config SGI_IP28
+	bool "SGI IP28 (Indigo2 R10k) (EXPERIMENTAL)"
+	depends on EXPERIMENTAL
+	select ARC
+	select ARC64
+	select BOOT_ELF64
+	select CEVT_R4K
+	select CSRC_R4K
+	select DEFAULT_SGI_PARTITION
+	select DMA_NONCOHERENT
+	select IRQ_CPU
+	select HW_HAS_EISA
+	select I8253
+	select I8259
+	select SGI_HAS_DS1286
+	select SGI_HAS_I8042
+	select SGI_HAS_INDYDOG
+	select SGI_HAS_SEEQ
+	select SGI_HAS_WD93
+	select SGI_HAS_ZILOG
+	select SWAP_IO_SPACE
+	select SYS_HAS_CPU_R10000
+	select SYS_HAS_EARLY_PRINTK
+	select SYS_SUPPORTS_64BIT_KERNEL
+	select SYS_SUPPORTS_BIG_ENDIAN
+      help
+        This is the SGI Indigo2 with R10000 processor.  To compile a Linux
+        kernel that runs on these, say Y here.
+
 config SGI_IP32
 	bool "SGI IP32 (O2)"
 	select ARC
@@ -577,6 +615,7 @@ config SNI_RM
 	select BOOT_ELF32
 	select CEVT_R4K
 	select CSRC_R4K
+	select DEFAULT_SGI_PARTITION if CPU_BIG_ENDIAN
 	select DMA_NONCOHERENT
 	select GENERIC_ISA_DMA
 	select HW_HAS_EISA
@@ -950,6 +989,27 @@ config EMMA2RH
 config SERIAL_RM9000
 	bool
 
+config SGI_HAS_DS1286
+	bool
+
+config SGI_HAS_INDYDOG
+	bool
+
+config SGI_HAS_SEEQ
+	bool
+
+config SGI_HAS_WD93
+	bool
+
+config SGI_HAS_ZILOG
+	bool
+
+config SGI_HAS_I8042
+	bool
+
+config DEFAULT_SGI_PARTITION
+	bool
+
 config ARC32
 	bool
 
@@ -959,7 +1019,7 @@ config BOOT_ELF32
 config MIPS_L1_CACHE_SHIFT
 	int
 	default "4" if MACH_DECSTATION
-	default "7" if SGI_IP27 || SNI_RM
+	default "7" if SGI_IP27 || SGI_IP28 || SNI_RM
 	default "4" if PMC_MSP4200_EVAL
 	default "5"
 
@@ -968,7 +1028,7 @@ config HAVE_STD_PC_SERIAL_PORT
 
 config ARC_CONSOLE
 	bool "ARC console support"
-	depends on SGI_IP22 || (SNI_RM && CPU_LITTLE_ENDIAN)
+	depends on SGI_IP22 || SGI_IP28 || (SNI_RM && CPU_LITTLE_ENDIAN)
 
 config ARC_MEMORY
 	bool
@@ -977,7 +1037,7 @@ config ARC_MEMORY
 
 config ARC_PROMLIB
 	bool
-	depends on MACH_JAZZ || SNI_RM || SGI_IP22 || SGI_IP32
+	depends on MACH_JAZZ || SNI_RM || SGI_IP22 || SGI_IP28 || SGI_IP32
 	default y
 
 config ARC64
diff --git a/arch/mips/Makefile b/arch/mips/Makefile
index a1f8d8b..d91fbca 100644
--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -475,6 +475,20 @@ endif
 endif
 
 #
+# SGI IP28 (Indigo2 R10k)
+#
+# Set the load address to >= 0xa800000020080000 if you want to leave space for
+# symmon, 0xa800000020004000 for production kernels ?  Note that the value must
+# be 16kb aligned or the handling of the current variable will break.
+# Simplified: what IP22 does at 128MB+ in ksegN, IP28 does at 512MB+ in xkphys
+#
+#core-$(CONFIG_SGI_IP28)		+= arch/mips/sgi-ip22/ arch/mips/arc/arc_con.o
+core-$(CONFIG_SGI_IP28)		+= arch/mips/sgi-ip22/
+cflags-$(CONFIG_SGI_IP28)	+= -mr10k-cache-barrier=1 -Iinclude/asm-mips/mach-ip28
+#cflags-$(CONFIG_SGI_IP28)	+= -Iinclude/asm-mips/mach-ip28
+load-$(CONFIG_SGI_IP28)		+= 0xa800000020004000
+
+#
 # SGI-IP32 (O2)
 #
 # Set the load address to >= 80069000 if you want to leave space for symmon,
diff --git a/arch/mips/sgi-ip22/Makefile b/arch/mips/sgi-ip22/Makefile
index e3acb51..ef1564e 100644
--- a/arch/mips/sgi-ip22/Makefile
+++ b/arch/mips/sgi-ip22/Makefile
@@ -3,9 +3,11 @@
 # under Linux.
 #
 
-obj-y	+= ip22-mc.o ip22-hpc.o ip22-int.o ip22-berr.o \
-	   ip22-time.o ip22-nvram.o ip22-platform.o ip22-reset.o ip22-setup.o
+obj-y	+= ip22-mc.o ip22-hpc.o ip22-int.o ip22-time.o ip22-nvram.o \
+	   ip22-platform.o ip22-reset.o ip22-setup.o
 
+obj-$(CONFIG_SGI_IP22) += ip22-berr.o
+obj-$(CONFIG_SGI_IP28) += ip28-berr.o
 obj-$(CONFIG_EISA)	+= ip22-eisa.o
 
-EXTRA_CFLAGS += -Werror
+# EXTRA_CFLAGS += -Werror
diff --git a/arch/mips/sgi-ip22/ip22-mc.c b/arch/mips/sgi-ip22/ip22-mc.c
index 01a805d..3f35d63 100644
--- a/arch/mips/sgi-ip22/ip22-mc.c
+++ b/arch/mips/sgi-ip22/ip22-mc.c
@@ -4,6 +4,7 @@
  * Copyright (C) 1996 David S. Miller (dm@engr.sgi.com)
  * Copyright (C) 1999 Andrew R. Baker (andrewb@uab.edu) - Indigo2 changes
  * Copyright (C) 2003 Ladislav Michl  (ladis@linux-mips.org)
+ * Copyright (C) 2004 Peter Fuerst    (pf@net.alphadv.de) - IP28
  */
 
 #include <linux/init.h>
@@ -137,9 +138,12 @@ void __init sgimc_init(void)
 	/* Step 2: Enable all parity checking in cpu control register
 	 *         zero.
 	 */
+	/* don't touch parity settings for IP28 */
+#ifndef CONFIG_SGI_IP28
 	tmp = sgimc->cpuctrl0;
 	tmp |= (SGIMC_CCTRL0_EPERRGIO | SGIMC_CCTRL0_EPERRMEM |
 		SGIMC_CCTRL0_R4KNOCHKPARR);
+#endif
 	sgimc->cpuctrl0 = tmp;
 
 	/* Step 3: Setup the MC write buffer depth, this is controlled
diff --git a/arch/mips/sgi-ip22/ip28-berr.c b/arch/mips/sgi-ip22/ip28-berr.c
new file mode 100644
index 0000000..0ee5be8
--- /dev/null
+++ b/arch/mips/sgi-ip22/ip28-berr.c
@@ -0,0 +1,700 @@
+/*
+ * ip28-berr.c: Bus error handling.
+ *
+ * Copyright (C) 2002, 2003 Ladislav Michl (ladis@linux-mips.org)
+ * Copyright (C) 2005 Peter Fuerst (pf@net.alphadv.de) - IP28
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+
+#include <asm/addrspace.h>
+#include <asm/system.h>
+#include <asm/traps.h>
+#include <asm/branch.h>
+#include <asm/irq_regs.h>
+#include <asm/sgi/mc.h>
+#include <asm/sgi/hpc3.h>
+#include <asm/sgi/ioc.h>
+#include <asm/sgi/ip22.h>
+#include <asm/r4kcache.h>
+#include <asm/uaccess.h>
+#include <asm/bootinfo.h>
+
+static unsigned int count_be_is_fixup;
+static unsigned int count_be_handler;
+static unsigned int count_be_interrupt;
+static int debug_be_interrupt;
+
+static unsigned int cpu_err_stat;	/* Status reg for CPU */
+static unsigned int gio_err_stat;	/* Status reg for GIO */
+static unsigned int cpu_err_addr;	/* Error address reg for CPU */
+static unsigned int gio_err_addr;	/* Error address reg for GIO */
+static unsigned int extio_stat;
+static unsigned int hpc3_berr_stat;	/* Bus error interrupt status */
+
+struct hpc3_stat {
+	unsigned long addr;
+	unsigned int ctrl;
+	unsigned int cbp;
+	unsigned int ndptr;
+};
+
+static struct {
+	struct hpc3_stat pbdma[8];
+	struct hpc3_stat scsi[2];
+	struct hpc3_stat ethrx, ethtx;
+} hpc3;
+
+static struct {
+	unsigned long err_addr;
+	struct {
+		u32 lo;
+		u32 hi;
+	} tags[1][2], tagd[4][2], tagi[4][2]; /* Way 0/1 */
+} cache_tags;
+
+static inline void save_cache_tags(unsigned busaddr)
+{
+	unsigned long addr = CAC_BASE | busaddr;
+	int i;
+	cache_tags.err_addr = addr;
+
+	/*
+	 * Starting with a bus-address, save secondary cache (indexed by
+	 * PA[23..18:7..6]) tags first.
+	 */
+	addr &= ~1L;
+#define tag cache_tags.tags[0]
+	cache_op(Index_Load_Tag_S, addr);
+	tag[0].lo = read_c0_taglo();	/* PA[35:18], VA[13:12] */
+	tag[0].hi = read_c0_taghi();	/* PA[39:36] */
+	cache_op(Index_Load_Tag_S, addr | 1L);
+	tag[1].lo = read_c0_taglo();	/* PA[35:18], VA[13:12] */
+	tag[1].hi = read_c0_taghi();	/* PA[39:36] */
+#undef tag
+
+	/*
+	 * Save all primary data cache (indexed by VA[13:5]) tags which
+	 * might fit to this bus-address, knowing that VA[11:0] == PA[11:0].
+	 * Saving all tags and evaluating them later is easier and safer
+	 * than relying on VA[13:12] from the secondary cache tags to pick
+	 * matching primary tags here already.
+	 */
+	addr &= (0xffL << 56) | ((1 << 12) - 1);
+#define tag cache_tags.tagd[i]
+	for (i = 0; i < 4; ++i, addr += (1 << 12)) {
+		cache_op(Index_Load_Tag_D, addr);
+		tag[0].lo = read_c0_taglo();	/* PA[35:12] */
+		tag[0].hi = read_c0_taghi();	/* PA[39:36] */
+		cache_op(Index_Load_Tag_D, addr | 1L);
+		tag[1].lo = read_c0_taglo();	/* PA[35:12] */
+		tag[1].hi = read_c0_taghi();	/* PA[39:36] */
+	}
+#undef tag
+
+	/*
+	 * Save primary instruction cache (indexed by VA[13:6]) tags
+	 * the same way.
+	 */
+	addr &= (0xffL << 56) | ((1 << 12) - 1);
+#define tag cache_tags.tagi[i]
+	for (i = 0; i < 4; ++i, addr += (1 << 12)) {
+		cache_op(Index_Load_Tag_I, addr);
+		tag[0].lo = read_c0_taglo();	/* PA[35:12] */
+		tag[0].hi = read_c0_taghi();	/* PA[39:36] */
+		cache_op(Index_Load_Tag_I, addr | 1L);
+		tag[1].lo = read_c0_taglo();	/* PA[35:12] */
+		tag[1].hi = read_c0_taghi();	/* PA[39:36] */
+	}
+#undef tag
+}
+
+#define GIO_ERRMASK	0xff00
+#define CPU_ERRMASK	0x3f00
+
+static void save_and_clear_buserr(void)
+{
+	int i;
+
+	/* save status registers */
+	cpu_err_addr = sgimc->cerr;
+	cpu_err_stat = sgimc->cstat;
+	gio_err_addr = sgimc->gerr;
+	gio_err_stat = sgimc->gstat;
+	extio_stat = sgioc->extio;
+	hpc3_berr_stat = hpc3c0->bestat;
+
+	hpc3.scsi[0].addr  = (unsigned long)&hpc3c0->scsi_chan0;
+	hpc3.scsi[0].ctrl  = hpc3c0->scsi_chan0.ctrl; /* HPC3_SCTRL_ACTIVE ? */
+	hpc3.scsi[0].cbp   = hpc3c0->scsi_chan0.cbptr;
+	hpc3.scsi[0].ndptr = hpc3c0->scsi_chan0.ndptr;
+
+	hpc3.scsi[1].addr  = (unsigned long)&hpc3c0->scsi_chan1;
+	hpc3.scsi[1].ctrl  = hpc3c0->scsi_chan1.ctrl; /* HPC3_SCTRL_ACTIVE ? */
+	hpc3.scsi[1].cbp   = hpc3c0->scsi_chan1.cbptr;
+	hpc3.scsi[1].ndptr = hpc3c0->scsi_chan1.ndptr;
+
+	hpc3.ethrx.addr  = (unsigned long)&hpc3c0->ethregs.rx_cbptr;
+	hpc3.ethrx.ctrl  = hpc3c0->ethregs.rx_ctrl; /* HPC3_ERXCTRL_ACTIVE ? */
+	hpc3.ethrx.cbp   = hpc3c0->ethregs.rx_cbptr;
+	hpc3.ethrx.ndptr = hpc3c0->ethregs.rx_ndptr;
+
+	hpc3.ethtx.addr  = (unsigned long)&hpc3c0->ethregs.tx_cbptr;
+	hpc3.ethtx.ctrl  = hpc3c0->ethregs.tx_ctrl; /* HPC3_ETXCTRL_ACTIVE ? */
+	hpc3.ethtx.cbp   = hpc3c0->ethregs.tx_cbptr;
+	hpc3.ethtx.ndptr = hpc3c0->ethregs.tx_ndptr;
+
+	for (i = 0; i < 8; ++i) {
+		/* HPC3_PDMACTRL_ISACT ? */
+		hpc3.pbdma[i].addr  = (unsigned long)&hpc3c0->pbdma[i];
+		hpc3.pbdma[i].ctrl  = hpc3c0->pbdma[i].pbdma_ctrl;
+		hpc3.pbdma[i].cbp   = hpc3c0->pbdma[i].pbdma_bptr;
+		hpc3.pbdma[i].ndptr = hpc3c0->pbdma[i].pbdma_dptr;
+	}
+	i = 0;
+	if (gio_err_stat & CPU_ERRMASK)
+		i = gio_err_addr;
+	if (cpu_err_stat & CPU_ERRMASK)
+		i = cpu_err_addr;
+	save_cache_tags(i);
+
+	sgimc->cstat = sgimc->gstat = 0;
+}
+
+static void print_cache_tags(void)
+{
+	u32 scb, scw;
+	int i;
+
+	printk(KERN_ERR "Cache tags @ %08x:\n", (unsigned)cache_tags.err_addr);
+
+	/* PA[31:12] shifted to PTag0 (PA[35:12]) format */
+	scw = (cache_tags.err_addr >> 4) & 0x0fffff00;
+
+	scb = cache_tags.err_addr & ((1 << 12) - 1) & ~((1 << 5) - 1);
+	for (i = 0; i < 4; ++i) { /* for each possible VA[13:12] value */
+		if ((cache_tags.tagd[i][0].lo & 0x0fffff00) != scw &&
+		    (cache_tags.tagd[i][1].lo & 0x0fffff00) != scw)
+		    continue;
+		printk(KERN_ERR
+		       "D: 0: %08x %08x, 1: %08x %08x  (VA[13:5]  %04x)\n",
+			cache_tags.tagd[i][0].hi, cache_tags.tagd[i][0].lo,
+			cache_tags.tagd[i][1].hi, cache_tags.tagd[i][1].lo,
+			scb | (1 << 12)*i);
+	}
+	scb = cache_tags.err_addr & ((1 << 12) - 1) & ~((1 << 6) - 1);
+	for (i = 0; i < 4; ++i) { /* for each possible VA[13:12] value */
+		if ((cache_tags.tagi[i][0].lo & 0x0fffff00) != scw &&
+		    (cache_tags.tagi[i][1].lo & 0x0fffff00) != scw)
+		    continue;
+		printk(KERN_ERR
+		       "I: 0: %08x %08x, 1: %08x %08x  (VA[13:6]  %04x)\n",
+			cache_tags.tagi[i][0].hi, cache_tags.tagi[i][0].lo,
+			cache_tags.tagi[i][1].hi, cache_tags.tagi[i][1].lo,
+			scb | (1 << 12)*i);
+	}
+	i = read_c0_config();
+	scb = i & (1 << 13) ? 7:6;      /* scblksize = 2^[7..6] */
+	scw = ((i >> 16) & 7) + 19 - 1; /* scwaysize = 2^[24..19] / 2 */
+
+	i = ((1 << scw) - 1) & ~((1 << scb) - 1);
+	printk(KERN_ERR "S: 0: %08x %08x, 1: %08x %08x  (PA[%u:%u] %05x)\n",
+		cache_tags.tags[0][0].hi, cache_tags.tags[0][0].lo,
+		cache_tags.tags[0][1].hi, cache_tags.tags[0][1].lo,
+		scw-1, scb, i & (unsigned)cache_tags.err_addr);
+}
+
+static inline const char *cause_excode_text(int cause)
+{
+	static const char *txt[32] =
+	{	"Interrupt",
+		"TLB modification",
+		"TLB (load or instruction fetch)",
+		"TLB (store)",
+		"Address error (load or instruction fetch)",
+		"Address error (store)",
+		"Bus error (instruction fetch)",
+		"Bus error (data: load or store)",
+		"Syscall",
+		"Breakpoint",
+		"Reserved instruction",
+		"Coprocessor unusable",
+		"Arithmetic Overflow",
+		"Trap",
+		"14",
+		"Floating-Point",
+		"16", "17", "18", "19", "20", "21", "22",
+		"Watch Hi/Lo",
+		"24", "25", "26", "27", "28", "29", "30", "31",
+	};
+	return txt[(cause & 0x7c) >> 2];
+}
+
+static void print_buserr(const struct pt_regs *regs)
+{
+	const int field = 2 * sizeof(unsigned long);
+	int error = 0;
+
+	if (extio_stat & EXTIO_MC_BUSERR) {
+		printk(KERN_ERR "MC Bus Error\n");
+		error |= 1;
+	}
+	if (extio_stat & EXTIO_HPC3_BUSERR) {
+		printk(KERN_ERR "HPC3 Bus Error 0x%x:<id=0x%x,%s,lane=0x%x>\n",
+			hpc3_berr_stat,
+			(hpc3_berr_stat & HPC3_BESTAT_PIDMASK) >>
+					  HPC3_BESTAT_PIDSHIFT,
+			(hpc3_berr_stat & HPC3_BESTAT_CTYPE) ? "PIO" : "DMA",
+			hpc3_berr_stat & HPC3_BESTAT_BLMASK);
+		error |= 2;
+	}
+	if (extio_stat & EXTIO_EISA_BUSERR) {
+		printk(KERN_ERR "EISA Bus Error\n");
+		error |= 4;
+	}
+	if (cpu_err_stat & CPU_ERRMASK) {
+		printk(KERN_ERR "CPU error 0x%x<%s%s%s%s%s%s> @ 0x%08x\n",
+			cpu_err_stat,
+			cpu_err_stat & SGIMC_CSTAT_RD ? "RD " : "",
+			cpu_err_stat & SGIMC_CSTAT_PAR ? "PAR " : "",
+			cpu_err_stat & SGIMC_CSTAT_ADDR ? "ADDR " : "",
+			cpu_err_stat & SGIMC_CSTAT_SYSAD_PAR ? "SYSAD " : "",
+			cpu_err_stat & SGIMC_CSTAT_SYSCMD_PAR ? "SYSCMD " : "",
+			cpu_err_stat & SGIMC_CSTAT_BAD_DATA ? "BAD_DATA " : "",
+			cpu_err_addr);
+		error |= 8;
+	}
+	if (gio_err_stat & GIO_ERRMASK) {
+		printk(KERN_ERR "GIO error 0x%x:<%s%s%s%s%s%s%s%s> @ 0x%08x\n",
+			gio_err_stat,
+			gio_err_stat & SGIMC_GSTAT_RD ? "RD " : "",
+			gio_err_stat & SGIMC_GSTAT_WR ? "WR " : "",
+			gio_err_stat & SGIMC_GSTAT_TIME ? "TIME " : "",
+			gio_err_stat & SGIMC_GSTAT_PROM ? "PROM " : "",
+			gio_err_stat & SGIMC_GSTAT_ADDR ? "ADDR " : "",
+			gio_err_stat & SGIMC_GSTAT_BC ? "BC " : "",
+			gio_err_stat & SGIMC_GSTAT_PIO_RD ? "PIO_RD " : "",
+			gio_err_stat & SGIMC_GSTAT_PIO_WR ? "PIO_WR " : "",
+			gio_err_addr);
+		error |= 16;
+	}
+	if (!error)
+		printk(KERN_ERR "MC: Hmm, didn't find any error condition.\n");
+	else {
+		printk(KERN_ERR "CP0: config %08x,  "
+			"MC: cpuctrl0/1: %08x/%05x, giopar: %04x\n"
+			"MC: cpu/gio_memacc: %08x/%05x, memcfg0/1: %08x/%08x\n",
+			read_c0_config(),
+			sgimc->cpuctrl0, sgimc->cpuctrl0, sgimc->giopar,
+			sgimc->cmacc, sgimc->gmacc,
+			sgimc->mconfig0, sgimc->mconfig1);
+		print_cache_tags();
+	}
+	printk(KERN_ALERT "%s, epc == %0*lx, ra == %0*lx\n",
+	       cause_excode_text(regs->cp0_cause),
+	       field, regs->cp0_epc, field, regs->regs[31]);
+}
+
+/*
+ * Try to find out, whether the bus error is caused by the instruction
+ * at EPC, otherwise we have an asynchronous error.
+ *
+ * Doc1: "MIPS IV Instruction Set", Rev 3.2 (SGI 007-2597-001)
+ * Doc2: "MIPS R10000 Microporcessor User's Manual", Ver 2.0 (SGI 007-2490-001)
+ * Doc3: "MIPS R4000 Microporcessor User's Manual", 2nd Ed. (SGI 007-2489-001)
+ */
+
+#define JMP_INDEX26_OP 1
+#define JMP_REGISTER_OP 2
+#define JMP_PCREL16_OP 3
+#define BASE_OFFSET_OP 4
+#define BASE_IDXREG_OP 5
+
+/* Match virtual address in an insn with physical error address */
+
+static int match_addr(unsigned paddr, unsigned long vaddr)
+{
+	unsigned long uaddr;
+
+	if ((vaddr & 0xffffffff80000000L) == 0xffffffff80000000L)
+		uaddr = (unsigned) CPHYSADDR(vaddr);
+	else if ((vaddr >> 62) == 2)
+		uaddr = (unsigned) XPHYSADDR(vaddr);
+	else {
+		unsigned long eh = vaddr & ~0x1fffL;
+
+		eh |= read_c0_entryhi() & 0xff;
+		write_c0_entryhi(eh);
+		tlb_probe();
+		if (read_c0_index() & 0x80000000)
+			return 0;
+		tlb_read();
+		if (vaddr & (1L << PAGE_SHIFT))
+			uaddr = (unsigned) read_c0_entrylo1();
+		else
+			uaddr = (unsigned) read_c0_entrylo0();
+		uaddr <<= 6;
+		uaddr &= ~PAGE_MASK;
+		uaddr |= vaddr & PAGE_MASK;
+	}
+	return ((uaddr & ~0x7f) == (paddr & ~0x7f));
+}
+
+/* Check, which kind of memory reference is triggered by `insn' */
+
+static int check_special(unsigned insn)
+{
+	/* See Doc1, page A-180 */
+	unsigned func = insn & 0x3f;
+
+	if (8 == func || 8+1 == func) /* JR, JALR */
+		return JMP_REGISTER_OP;
+
+	return 0;
+}
+
+static int check_regimm(unsigned insn)
+{
+	/* See Doc1, page A-180 */
+	unsigned rt = (insn >> 19) & 3; /* bits 20..19[..16] */
+
+	/* BLTZ, BGEZ, BLTZL, BBGEZL || BLTZAL, BGEZAL, BLTZALL, BBGEZALL */
+	if (!rt || 2 == rt)
+		return JMP_PCREL16_OP;
+
+	return 0;
+}
+
+static int check_cop0(unsigned insn)
+{
+	/* See Doc2, pages 287 ff., 187 ff. */
+	if ((insn >> 26) == 5*8+7) /* CACHE */
+		switch ((insn >> 16) & 0x1f) {
+		case Index_Writeback_Inv_D:
+		case Hit_Writeback_Inv_D:
+		case Index_Writeback_Inv_S:
+		case Hit_Writeback_Inv_S:
+			return BASE_OFFSET_OP;
+		}
+	return 0;
+}
+
+static int check_cop1(unsigned insn)
+{
+	/* See Doc1, pages B-108 ff. */
+	unsigned fmt = (insn >> 21) & 0x1f; /* bits 25..21 */
+
+	if (8 == fmt) /* BC1* */
+		return JMP_PCREL16_OP;
+
+	return 0;
+}
+
+static int check_cop1x(unsigned insn)
+{
+	/* See Doc1, pages B-108 ff. */
+	switch (insn & 0x3f) {
+	case 0:   /* LWXC1 */
+	case 1:   /* LDXC1 */
+	case 8:   /* SWXC1 */
+	case 8+1: /* SDXC1 */
+		return BASE_IDXREG_OP;
+	}
+	return 0;
+}
+
+static int check_plain(unsigned insn)
+{
+	/* See Doc1, page A-180 */
+	unsigned opcode = insn >> 26;
+
+	if (2 == opcode || 3 == opcode) /* J, JAL */
+		return JMP_INDEX26_OP;
+
+	if ((4     <= opcode && opcode <= 7) ||   /* BEQ, BNE, BLEZ, BGTZ */
+	    (4+2*8 <= opcode && opcode <= 7+2*8)) /* BEQL, BNEL, BLEZL, BGTZL */
+		return JMP_PCREL16_OP;
+
+	if (6*8+3 == opcode) /* PREF */
+		return 0;
+
+	if (3*8+2 == opcode || 3*8+3 == opcode || /* LDL, LDR */
+	    4*8 <= opcode) /* misc. LOAD, STORE */
+		return BASE_OFFSET_OP;
+
+	return 0;
+}
+
+/* Check, whether the insn at EPC causes a memory access at `paddr' */
+
+static int check_addr_in_insn(unsigned paddr, const struct pt_regs *regs)
+{
+	unsigned long epc;
+	unsigned insn;
+	unsigned long a;
+	int typ;
+
+	epc = regs->cp0_cause & CAUSEF_BD ? regs->cp0_epc:regs->cp0_epc+4;
+
+	/* show_code() from kernel/traps.c */
+	if (__get_user(insn, (u32 *)epc))
+		return 1;
+
+	/* See Doc1, pages A-180, B-108 ff. */
+	switch (insn >> 26) {
+	case 0:
+		typ = check_special(insn);
+		break;
+	case 1:
+		typ = check_regimm(insn);
+		break;
+	case 2*8:   /* COP0 */
+	case 5*8+7: /* CACHE */
+		typ = check_cop0(insn);
+		break;
+	case 2*8+1:
+		typ = check_cop1(insn);
+		break;
+	case 2*8+3:
+		typ = check_cop1x(insn);
+		break;
+	default:
+		typ = check_plain(insn);
+		break;
+	}
+	switch (typ) {
+	case JMP_INDEX26_OP:
+		a = (regs->cp0_epc + 4) & ~0xfffffff;
+		a |= (insn & 0x3ffffff) << 2;
+		return match_addr(paddr, a);
+	case JMP_REGISTER_OP:
+		a = regs->regs[(insn >> 21) & 0x1f];
+		return match_addr(paddr, a);
+	case JMP_PCREL16_OP:
+		a = regs->cp0_epc + 4 + ((insn & 0xffff) << 2);
+		return match_addr(paddr, a);
+	case BASE_OFFSET_OP:
+		a = regs->regs[(insn >> 21) & 0x1f] + (insn & 0xffff);
+		return match_addr(paddr, a);
+	case BASE_IDXREG_OP:
+		a = regs->regs[(insn >> 21) & 0x1f];
+		a += regs->regs[(insn >> 16) & 0x1f];
+		return match_addr(paddr, a);
+	case 0:
+		return 0;
+	}
+	/* Assume it would be too dangerous to continue ... */
+	return 1;
+}
+
+/*
+ * Check, whether MC's (virtual) DMA address caused the bus error.
+ * See "Virtual DMA Specification", Draft 1.5, Feb 13 1992, SGI
+ */
+
+static int addr_is_ram(unsigned long addr, unsigned sz)
+{
+	int i;
+
+	for (i = 0; i < boot_mem_map.nr_map; i++) {
+		unsigned long a = boot_mem_map.map[i].addr;
+		if (a <= addr && addr+sz <= a+boot_mem_map.map[i].size)
+			return 1;
+	}
+	return 0;
+}
+
+static int check_microtlb(u32 hi, u32 lo, unsigned long vaddr)
+{
+	/* This is likely rather similar to correct code ;-) */
+
+	vaddr &= 0x7fffffff; /* Doc. states that top bit is ignored */
+
+	/* If tlb-entry is valid and VPN-high (bits [30:21] ?) matches... */
+	if ((lo & 2) && (vaddr >> 21) == ((hi<<1) >> 22)) {
+		u32 ctl = sgimc->dma_ctrl;
+		if (ctl & 1) {
+			unsigned int pgsz = (ctl & 2) ? 14:12; /* 16k:4k */
+			/* PTEIndex is VPN-low (bits [22:14]/[20:12] ?) */
+			unsigned long pte = (lo >> 6) << 12; /* PTEBase */
+			pte += 8*((vaddr >> pgsz) & 0x1ff);
+			if (addr_is_ram(pte, 8)) {
+				/*
+				 * Note: Since DMA hardware does look up
+				 * translation on its own, this PTE *must*
+				 * match the TLB/EntryLo-register format !
+				 */
+				unsigned long a = *(unsigned long *)
+						PHYS_TO_XKSEG_UNCACHED(pte);
+				a = (a & 0x3f) << 6; /* PFN */
+				a += vaddr & ((1 << pgsz) - 1);
+				return (cpu_err_addr == a);
+			}
+		}
+	}
+	return 0;
+}
+
+static int check_vdma_memaddr(void)
+{
+	if (cpu_err_stat & CPU_ERRMASK) {
+		u32 a = sgimc->maddronly;
+
+		if (!(sgimc->dma_ctrl & 0x100)) /* Xlate-bit clear ? */
+			return (cpu_err_addr == a);
+
+		if (check_microtlb(sgimc->dtlb_hi0, sgimc->dtlb_lo0, a) ||
+		    check_microtlb(sgimc->dtlb_hi1, sgimc->dtlb_lo1, a) ||
+		    check_microtlb(sgimc->dtlb_hi2, sgimc->dtlb_lo2, a) ||
+		    check_microtlb(sgimc->dtlb_hi3, sgimc->dtlb_lo3, a))
+			return 1;
+	}
+	return 0;
+}
+
+static int check_vdma_gioaddr(void)
+{
+	if (gio_err_stat & GIO_ERRMASK) {
+		u32 a = sgimc->gio_dma_trans;
+		a = (sgimc->gmaddronly & ~a) | (sgimc->gio_dma_sbits & a);
+		return (gio_err_addr == a);
+	}
+	return 0;
+}
+
+/*
+ * MC sends an interrupt whenever bus or parity errors occur. In addition,
+ * if the error happened during a CPU read, it also asserts the bus error
+ * pin on the R4K. Code in bus error handler save the MC bus error registers
+ * and then clear the interrupt when this happens.
+ */
+
+static int ip28_be_interrupt(const struct pt_regs *regs)
+{
+	int i;
+
+	save_and_clear_buserr();
+	/*
+	 * Try to find out, whether we got here by a mispredicted speculative
+	 * load/store operation.  If so, it's not fatal, we can go on.
+	 */
+	/* Any cause other than "Interrupt" (ExcCode 0) is fatal. */
+	if (regs->cp0_cause & CAUSEF_EXCCODE)
+		goto mips_be_fatal;
+
+	/* Any cause other than "Bus error interrupt" (IP6) is weird. */
+	if ((regs->cp0_cause & CAUSEF_IP6) != CAUSEF_IP6)
+		goto mips_be_fatal;
+
+	if (extio_stat & (EXTIO_HPC3_BUSERR | EXTIO_EISA_BUSERR))
+		goto mips_be_fatal;
+
+	/* Any state other than "Memory bus error" is fatal. */
+	if (cpu_err_stat & CPU_ERRMASK & ~SGIMC_CSTAT_ADDR)
+			goto mips_be_fatal;
+
+	/* GIO errors are fatal */
+	if (gio_err_stat & GIO_ERRMASK)
+		goto mips_be_fatal;
+
+	/* Finding `cpu_err_addr' in the insn at EPC is fatal. */
+	if ((cpu_err_stat & CPU_ERRMASK) &&
+	     check_addr_in_insn(cpu_err_addr, regs))
+			goto mips_be_fatal;
+
+	/*
+	 * Now we have an asynchronous bus error, speculatively or DMA caused.
+	 * Need to search all DMA descriptors for the error address.
+	 */
+	for (i = 0; i < sizeof(hpc3)/sizeof(struct hpc3_stat); ++i) {
+		struct hpc3_stat *hp = (struct hpc3_stat *)&hpc3 + i;
+		if ((cpu_err_stat & CPU_ERRMASK) &&
+		    (cpu_err_addr == hp->ndptr || cpu_err_addr == hp->cbp))
+			break;
+		if ((gio_err_stat & GIO_ERRMASK) &&
+		    (gio_err_addr == hp->ndptr || gio_err_addr == hp->cbp))
+			break;
+	}
+	if (i < sizeof(hpc3)/sizeof(struct hpc3_stat)) {
+		struct hpc3_stat *hp = (struct hpc3_stat *)&hpc3 + i;
+		printk(KERN_ERR "at DMA addresses: HPC3 @ %08lx:"
+		       " ctl %08x, ndp %08x, cbp %08x\n",
+		       CPHYSADDR(hp->addr), hp->ctrl, hp->ndptr, hp->cbp);
+		goto mips_be_fatal;
+	}
+	/* Check MC's virtual DMA stuff. */
+	if (check_vdma_memaddr()) {
+		printk(KERN_ERR "at GIO DMA: mem address 0x%08x.\n",
+			sgimc->maddronly);
+		goto mips_be_fatal;
+	}
+	if (check_vdma_gioaddr()) {
+		printk(KERN_ERR "at GIO DMA: gio address 0x%08x.\n",
+			sgimc->gmaddronly);
+		goto mips_be_fatal;
+	}
+	/* A speculative bus error... */
+	if (debug_be_interrupt) {
+		print_buserr(regs);
+		printk(KERN_ERR "discarded!\n");
+	}
+	return MIPS_BE_DISCARD;
+
+mips_be_fatal:
+	print_buserr(regs);
+	return MIPS_BE_FATAL;
+}
+
+void ip22_be_interrupt(int irq)
+{
+	const struct pt_regs *regs = get_irq_regs();
+
+	count_be_interrupt++;
+
+	if (ip28_be_interrupt(regs) != MIPS_BE_DISCARD) {
+		/* Assume it would be too dangerous to continue ... */
+		die_if_kernel("Oops", regs);
+		force_sig(SIGBUS, current);
+	} else if (debug_be_interrupt)
+		show_regs((struct pt_regs *)regs);
+}
+
+static int ip28_be_handler(struct pt_regs *regs, int is_fixup)
+{
+	/*
+	 * We arrive here only in the unusual case of do_be() invocation,
+	 * i.e. by a bus error exception without a bus error interrupt.
+	 */
+	if (is_fixup) {
+		count_be_is_fixup++;
+		save_and_clear_buserr();
+		return MIPS_BE_FIXUP;
+	}
+	count_be_handler++;
+	return ip28_be_interrupt(regs);
+}
+
+void __init ip22_be_init(void)
+{
+	board_be_handler = ip28_be_handler;
+}
+
+int ip28_show_be_info(struct seq_file *m)
+{
+	seq_printf(m, "IP28 be fixups\t\t: %u\n", count_be_is_fixup);
+	seq_printf(m, "IP28 be interrupts\t: %u\n", count_be_interrupt);
+	seq_printf(m, "IP28 be handler\t\t: %u\n", count_be_handler);
+
+	return 0;
+}
+
+static int __init debug_be_setup(char *str)
+{
+	debug_be_interrupt++;
+	return 1;
+}
+__setup("ip28_debug_be", debug_be_setup);
+
diff --git a/include/asm-mips/dma.h b/include/asm-mips/dma.h
index d6a6c21..1353c81 100644
--- a/include/asm-mips/dma.h
+++ b/include/asm-mips/dma.h
@@ -84,10 +84,9 @@
  * Deskstations or Acer PICA but not the much more versatile DMA logic used
  * for the local devices on Acer PICA or Magnums.
  */
-#ifdef CONFIG_SGI_IP22
-/* Horrible hack to have a correct DMA window on IP22 */
-#include <asm/sgi/mc.h>
-#define MAX_DMA_ADDRESS		(PAGE_OFFSET + SGIMC_SEG0_BADDR + 0x01000000)
+#if defined(CONFIG_SGI_IP22) || defined(CONFIG_SGI_IP28)
+/* don't care; ISA bus master won't work, ISA slave DMA supports 32bit addr */
+#define MAX_DMA_ADDRESS		PAGE_OFFSET
 #else
 #define MAX_DMA_ADDRESS		(PAGE_OFFSET + 0x01000000)
 #endif
diff --git a/include/asm-mips/mach-ip28/cpu-feature-overrides.h b/include/asm-mips/mach-ip28/cpu-feature-overrides.h
new file mode 100644
index 0000000..9a53b32
--- /dev/null
+++ b/include/asm-mips/mach-ip28/cpu-feature-overrides.h
@@ -0,0 +1,50 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 2003 Ralf Baechle
+ * 6/2004	pf
+ */
+#ifndef __ASM_MACH_IP28_CPU_FEATURE_OVERRIDES_H
+#define __ASM_MACH_IP28_CPU_FEATURE_OVERRIDES_H
+
+/*
+ * IP28 only comes with R10000 family processors all using the same config
+ */
+#define cpu_has_watch		1
+#define cpu_has_mips16		0
+#define cpu_has_divec		0
+#define cpu_has_vce		0
+#define cpu_has_cache_cdex_p	0
+#define cpu_has_cache_cdex_s	0
+#define cpu_has_prefetch	1
+#define cpu_has_mcheck		0
+#define cpu_has_ejtag		0
+
+#define cpu_has_llsc		1
+#define cpu_has_vtag_icache	0
+#define cpu_has_dc_aliases	0 /* see probe_pcache() */
+#define cpu_has_ic_fills_f_dc	0
+#define cpu_has_dsp		0
+#define cpu_icache_snoops_remote_store  1
+#define cpu_has_mipsmt		0
+#define cpu_has_userlocal	0
+
+#define cpu_has_nofpuex		0
+#define cpu_has_64bits		1
+
+#define cpu_has_4kex		1
+#define cpu_has_4k_cache	1
+
+#define cpu_has_inclusive_pcaches	1
+
+#define cpu_dcache_line_size()	32
+#define cpu_icache_line_size()	64
+
+#define cpu_has_mips32r1	0
+#define cpu_has_mips32r2	0
+#define cpu_has_mips64r1	0
+#define cpu_has_mips64r2	0
+
+#endif /* __ASM_MACH_IP28_CPU_FEATURE_OVERRIDES_H */
diff --git a/include/asm-mips/mach-ip28/ds1286.h b/include/asm-mips/mach-ip28/ds1286.h
new file mode 100644
index 0000000..471bb9a
--- /dev/null
+++ b/include/asm-mips/mach-ip28/ds1286.h
@@ -0,0 +1,4 @@
+#ifndef __ASM_MACH_IP28_DS1286_H
+#define __ASM_MACH_IP28_DS1286_H
+#include <asm/mach-ip22/ds1286.h>
+#endif /* __ASM_MACH_IP28_DS1286_H */
diff --git a/include/asm-mips/mach-ip28/spaces.h b/include/asm-mips/mach-ip28/spaces.h
new file mode 100644
index 0000000..05aabb2
--- /dev/null
+++ b/include/asm-mips/mach-ip28/spaces.h
@@ -0,0 +1,22 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 1994 - 1999, 2000, 03, 04 Ralf Baechle
+ * Copyright (C) 2000, 2002  Maciej W. Rozycki
+ * Copyright (C) 1990, 1999, 2000 Silicon Graphics, Inc.
+ * 2004	pf
+ */
+#ifndef _ASM_MACH_IP28_SPACES_H
+#define _ASM_MACH_IP28_SPACES_H
+
+#define CAC_BASE		0xa800000000000000
+
+#define HIGHMEM_START		(~0UL)
+
+#define PHYS_OFFSET		_AC(0x20000000, UL)
+
+#include <asm/mach-generic/spaces.h>
+
+#endif /* _ASM_MACH_IP28_SPACES_H */
diff --git a/include/asm-mips/mach-ip28/war.h b/include/asm-mips/mach-ip28/war.h
new file mode 100644
index 0000000..a1baafa
--- /dev/null
+++ b/include/asm-mips/mach-ip28/war.h
@@ -0,0 +1,25 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 2002, 2004, 2007 by Ralf Baechle <ralf@linux-mips.org>
+ */
+#ifndef __ASM_MIPS_MACH_IP28_WAR_H
+#define __ASM_MIPS_MACH_IP28_WAR_H
+
+#define R4600_V1_INDEX_ICACHEOP_WAR	0
+#define R4600_V1_HIT_CACHEOP_WAR	0
+#define R4600_V2_HIT_CACHEOP_WAR	0
+#define R5432_CP0_INTERRUPT_WAR		0
+#define BCM1250_M3_WAR			0
+#define SIBYTE_1956_WAR			0
+#define MIPS4K_ICACHE_REFILL_WAR	0
+#define MIPS_CACHE_SYNC_WAR		0
+#define TX49XX_ICACHE_INDEX_INV_WAR	0
+#define RM9000_CDEX_SMP_WAR		0
+#define ICACHE_REFILLS_WORKAROUND_WAR	0
+#define R10000_LLSC_WAR			1
+#define MIPS34K_MISSED_ITLB_WAR		0
+
+#endif /* __ASM_MIPS_MACH_IP28_WAR_H */

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [UPDATED PATCH] IP28 support
@ 2007-11-29  9:54 Thomas Bogendoerfer
  2007-11-29 13:01 ` Ralf Baechle
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Bogendoerfer @ 2007-11-29  9:54 UTC (permalink / raw)
  To: linux-mips; +Cc: ralf

Add support for SGI IP28 machines (Indigo 2 with R10k CPUs)
This work is mainly based on Peter Fuersts work.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
---

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 2f2ce0c..21649e4 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -421,6 +421,27 @@ config SGI_IP27
 	  workstations.  To compile a Linux kernel that runs on these, say Y
 	  here.
 
+config SGI_IP28
+	bool "SGI IP28 (Indigo2 R10k) (EXPERIMENTAL)"
+	depends on EXPERIMENTAL
+	select ARC
+	select ARC64
+	select CEVT_R4K
+	select DMA_NONCOHERENT
+	select IRQ_CPU
+	select SWAP_IO_SPACE
+	select HW_HAS_EISA
+	select I8253
+	select I8259
+	select SYS_HAS_CPU_R10000
+	select SYS_HAS_EARLY_PRINTK
+	select BOOT_ELF64
+	select SYS_SUPPORTS_64BIT_KERNEL
+	select SYS_SUPPORTS_BIG_ENDIAN
+      help
+        This is the SGI Indigo2 with R10000 processor.  To compile a Linux
+        kernel that runs on these, say Y here.
+
 config SGI_IP32
 	bool "SGI IP32 (O2)"
 	select ARC
@@ -932,7 +953,7 @@ config BOOT_ELF32
 config MIPS_L1_CACHE_SHIFT
 	int
 	default "4" if MACH_DECSTATION
-	default "7" if SGI_IP27 || SNI_RM
+	default "7" if SGI_IP27 || SGI_IP28 || SNI_RM
 	default "4" if PMC_MSP4200_EVAL
 	default "5"
 
@@ -941,7 +962,7 @@ config HAVE_STD_PC_SERIAL_PORT
 
 config ARC_CONSOLE
 	bool "ARC console support"
-	depends on SGI_IP22 || (SNI_RM && CPU_LITTLE_ENDIAN)
+	depends on SGI_IP22 || SGI_IP28 || (SNI_RM && CPU_LITTLE_ENDIAN)
 
 config ARC_MEMORY
 	bool
@@ -950,7 +971,7 @@ config ARC_MEMORY
 
 config ARC_PROMLIB
 	bool
-	depends on MACH_JAZZ || SNI_RM || SGI_IP22 || SGI_IP32
+	depends on MACH_JAZZ || SNI_RM || SGI_IP22 || SGI_IP28 || SGI_IP32
 	default y
 
 config ARC64
diff --git a/arch/mips/Makefile b/arch/mips/Makefile
index a1f8d8b..d91fbca 100644
--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -475,6 +475,20 @@ endif
 endif
 
 #
+# SGI IP28 (Indigo2 R10k)
+#
+# Set the load address to >= 0xa800000020080000 if you want to leave space for
+# symmon, 0xa800000020004000 for production kernels ?  Note that the value must
+# be 16kb aligned or the handling of the current variable will break.
+# Simplified: what IP22 does at 128MB+ in ksegN, IP28 does at 512MB+ in xkphys
+#
+#core-$(CONFIG_SGI_IP28)		+= arch/mips/sgi-ip22/ arch/mips/arc/arc_con.o
+core-$(CONFIG_SGI_IP28)		+= arch/mips/sgi-ip22/
+cflags-$(CONFIG_SGI_IP28)	+= -mr10k-cache-barrier=1 -Iinclude/asm-mips/mach-ip28
+#cflags-$(CONFIG_SGI_IP28)	+= -Iinclude/asm-mips/mach-ip28
+load-$(CONFIG_SGI_IP28)		+= 0xa800000020004000
+
+#
 # SGI-IP32 (O2)
 #
 # Set the load address to >= 80069000 if you want to leave space for symmon,
diff --git a/arch/mips/sgi-ip22/Makefile b/arch/mips/sgi-ip22/Makefile
index e3acb51..ef1564e 100644
--- a/arch/mips/sgi-ip22/Makefile
+++ b/arch/mips/sgi-ip22/Makefile
@@ -3,9 +3,11 @@
 # under Linux.
 #
 
-obj-y	+= ip22-mc.o ip22-hpc.o ip22-int.o ip22-berr.o \
-	   ip22-time.o ip22-nvram.o ip22-platform.o ip22-reset.o ip22-setup.o
+obj-y	+= ip22-mc.o ip22-hpc.o ip22-int.o ip22-time.o ip22-nvram.o \
+	   ip22-platform.o ip22-reset.o ip22-setup.o
 
+obj-$(CONFIG_SGI_IP22) += ip22-berr.o
+obj-$(CONFIG_SGI_IP28) += ip28-berr.o
 obj-$(CONFIG_EISA)	+= ip22-eisa.o
 
-EXTRA_CFLAGS += -Werror
+# EXTRA_CFLAGS += -Werror
diff --git a/arch/mips/sgi-ip22/ip22-mc.c b/arch/mips/sgi-ip22/ip22-mc.c
index 01a805d..3f35d63 100644
--- a/arch/mips/sgi-ip22/ip22-mc.c
+++ b/arch/mips/sgi-ip22/ip22-mc.c
@@ -4,6 +4,7 @@
  * Copyright (C) 1996 David S. Miller (dm@engr.sgi.com)
  * Copyright (C) 1999 Andrew R. Baker (andrewb@uab.edu) - Indigo2 changes
  * Copyright (C) 2003 Ladislav Michl  (ladis@linux-mips.org)
+ * Copyright (C) 2004 Peter Fuerst    (pf@net.alphadv.de) - IP28
  */
 
 #include <linux/init.h>
@@ -137,9 +138,12 @@ void __init sgimc_init(void)
 	/* Step 2: Enable all parity checking in cpu control register
 	 *         zero.
 	 */
+	/* don't touch parity settings for IP28 */
+#ifndef CONFIG_SGI_IP28
 	tmp = sgimc->cpuctrl0;
 	tmp |= (SGIMC_CCTRL0_EPERRGIO | SGIMC_CCTRL0_EPERRMEM |
 		SGIMC_CCTRL0_R4KNOCHKPARR);
+#endif
 	sgimc->cpuctrl0 = tmp;
 
 	/* Step 3: Setup the MC write buffer depth, this is controlled
diff --git a/arch/mips/sgi-ip22/ip28-berr.c b/arch/mips/sgi-ip22/ip28-berr.c
new file mode 100644
index 0000000..0ee5be8
--- /dev/null
+++ b/arch/mips/sgi-ip22/ip28-berr.c
@@ -0,0 +1,700 @@
+/*
+ * ip28-berr.c: Bus error handling.
+ *
+ * Copyright (C) 2002, 2003 Ladislav Michl (ladis@linux-mips.org)
+ * Copyright (C) 2005 Peter Fuerst (pf@net.alphadv.de) - IP28
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+
+#include <asm/addrspace.h>
+#include <asm/system.h>
+#include <asm/traps.h>
+#include <asm/branch.h>
+#include <asm/irq_regs.h>
+#include <asm/sgi/mc.h>
+#include <asm/sgi/hpc3.h>
+#include <asm/sgi/ioc.h>
+#include <asm/sgi/ip22.h>
+#include <asm/r4kcache.h>
+#include <asm/uaccess.h>
+#include <asm/bootinfo.h>
+
+static unsigned int count_be_is_fixup;
+static unsigned int count_be_handler;
+static unsigned int count_be_interrupt;
+static int debug_be_interrupt;
+
+static unsigned int cpu_err_stat;	/* Status reg for CPU */
+static unsigned int gio_err_stat;	/* Status reg for GIO */
+static unsigned int cpu_err_addr;	/* Error address reg for CPU */
+static unsigned int gio_err_addr;	/* Error address reg for GIO */
+static unsigned int extio_stat;
+static unsigned int hpc3_berr_stat;	/* Bus error interrupt status */
+
+struct hpc3_stat {
+	unsigned long addr;
+	unsigned int ctrl;
+	unsigned int cbp;
+	unsigned int ndptr;
+};
+
+static struct {
+	struct hpc3_stat pbdma[8];
+	struct hpc3_stat scsi[2];
+	struct hpc3_stat ethrx, ethtx;
+} hpc3;
+
+static struct {
+	unsigned long err_addr;
+	struct {
+		u32 lo;
+		u32 hi;
+	} tags[1][2], tagd[4][2], tagi[4][2]; /* Way 0/1 */
+} cache_tags;
+
+static inline void save_cache_tags(unsigned busaddr)
+{
+	unsigned long addr = CAC_BASE | busaddr;
+	int i;
+	cache_tags.err_addr = addr;
+
+	/*
+	 * Starting with a bus-address, save secondary cache (indexed by
+	 * PA[23..18:7..6]) tags first.
+	 */
+	addr &= ~1L;
+#define tag cache_tags.tags[0]
+	cache_op(Index_Load_Tag_S, addr);
+	tag[0].lo = read_c0_taglo();	/* PA[35:18], VA[13:12] */
+	tag[0].hi = read_c0_taghi();	/* PA[39:36] */
+	cache_op(Index_Load_Tag_S, addr | 1L);
+	tag[1].lo = read_c0_taglo();	/* PA[35:18], VA[13:12] */
+	tag[1].hi = read_c0_taghi();	/* PA[39:36] */
+#undef tag
+
+	/*
+	 * Save all primary data cache (indexed by VA[13:5]) tags which
+	 * might fit to this bus-address, knowing that VA[11:0] == PA[11:0].
+	 * Saving all tags and evaluating them later is easier and safer
+	 * than relying on VA[13:12] from the secondary cache tags to pick
+	 * matching primary tags here already.
+	 */
+	addr &= (0xffL << 56) | ((1 << 12) - 1);
+#define tag cache_tags.tagd[i]
+	for (i = 0; i < 4; ++i, addr += (1 << 12)) {
+		cache_op(Index_Load_Tag_D, addr);
+		tag[0].lo = read_c0_taglo();	/* PA[35:12] */
+		tag[0].hi = read_c0_taghi();	/* PA[39:36] */
+		cache_op(Index_Load_Tag_D, addr | 1L);
+		tag[1].lo = read_c0_taglo();	/* PA[35:12] */
+		tag[1].hi = read_c0_taghi();	/* PA[39:36] */
+	}
+#undef tag
+
+	/*
+	 * Save primary instruction cache (indexed by VA[13:6]) tags
+	 * the same way.
+	 */
+	addr &= (0xffL << 56) | ((1 << 12) - 1);
+#define tag cache_tags.tagi[i]
+	for (i = 0; i < 4; ++i, addr += (1 << 12)) {
+		cache_op(Index_Load_Tag_I, addr);
+		tag[0].lo = read_c0_taglo();	/* PA[35:12] */
+		tag[0].hi = read_c0_taghi();	/* PA[39:36] */
+		cache_op(Index_Load_Tag_I, addr | 1L);
+		tag[1].lo = read_c0_taglo();	/* PA[35:12] */
+		tag[1].hi = read_c0_taghi();	/* PA[39:36] */
+	}
+#undef tag
+}
+
+#define GIO_ERRMASK	0xff00
+#define CPU_ERRMASK	0x3f00
+
+static void save_and_clear_buserr(void)
+{
+	int i;
+
+	/* save status registers */
+	cpu_err_addr = sgimc->cerr;
+	cpu_err_stat = sgimc->cstat;
+	gio_err_addr = sgimc->gerr;
+	gio_err_stat = sgimc->gstat;
+	extio_stat = sgioc->extio;
+	hpc3_berr_stat = hpc3c0->bestat;
+
+	hpc3.scsi[0].addr  = (unsigned long)&hpc3c0->scsi_chan0;
+	hpc3.scsi[0].ctrl  = hpc3c0->scsi_chan0.ctrl; /* HPC3_SCTRL_ACTIVE ? */
+	hpc3.scsi[0].cbp   = hpc3c0->scsi_chan0.cbptr;
+	hpc3.scsi[0].ndptr = hpc3c0->scsi_chan0.ndptr;
+
+	hpc3.scsi[1].addr  = (unsigned long)&hpc3c0->scsi_chan1;
+	hpc3.scsi[1].ctrl  = hpc3c0->scsi_chan1.ctrl; /* HPC3_SCTRL_ACTIVE ? */
+	hpc3.scsi[1].cbp   = hpc3c0->scsi_chan1.cbptr;
+	hpc3.scsi[1].ndptr = hpc3c0->scsi_chan1.ndptr;
+
+	hpc3.ethrx.addr  = (unsigned long)&hpc3c0->ethregs.rx_cbptr;
+	hpc3.ethrx.ctrl  = hpc3c0->ethregs.rx_ctrl; /* HPC3_ERXCTRL_ACTIVE ? */
+	hpc3.ethrx.cbp   = hpc3c0->ethregs.rx_cbptr;
+	hpc3.ethrx.ndptr = hpc3c0->ethregs.rx_ndptr;
+
+	hpc3.ethtx.addr  = (unsigned long)&hpc3c0->ethregs.tx_cbptr;
+	hpc3.ethtx.ctrl  = hpc3c0->ethregs.tx_ctrl; /* HPC3_ETXCTRL_ACTIVE ? */
+	hpc3.ethtx.cbp   = hpc3c0->ethregs.tx_cbptr;
+	hpc3.ethtx.ndptr = hpc3c0->ethregs.tx_ndptr;
+
+	for (i = 0; i < 8; ++i) {
+		/* HPC3_PDMACTRL_ISACT ? */
+		hpc3.pbdma[i].addr  = (unsigned long)&hpc3c0->pbdma[i];
+		hpc3.pbdma[i].ctrl  = hpc3c0->pbdma[i].pbdma_ctrl;
+		hpc3.pbdma[i].cbp   = hpc3c0->pbdma[i].pbdma_bptr;
+		hpc3.pbdma[i].ndptr = hpc3c0->pbdma[i].pbdma_dptr;
+	}
+	i = 0;
+	if (gio_err_stat & CPU_ERRMASK)
+		i = gio_err_addr;
+	if (cpu_err_stat & CPU_ERRMASK)
+		i = cpu_err_addr;
+	save_cache_tags(i);
+
+	sgimc->cstat = sgimc->gstat = 0;
+}
+
+static void print_cache_tags(void)
+{
+	u32 scb, scw;
+	int i;
+
+	printk(KERN_ERR "Cache tags @ %08x:\n", (unsigned)cache_tags.err_addr);
+
+	/* PA[31:12] shifted to PTag0 (PA[35:12]) format */
+	scw = (cache_tags.err_addr >> 4) & 0x0fffff00;
+
+	scb = cache_tags.err_addr & ((1 << 12) - 1) & ~((1 << 5) - 1);
+	for (i = 0; i < 4; ++i) { /* for each possible VA[13:12] value */
+		if ((cache_tags.tagd[i][0].lo & 0x0fffff00) != scw &&
+		    (cache_tags.tagd[i][1].lo & 0x0fffff00) != scw)
+		    continue;
+		printk(KERN_ERR
+		       "D: 0: %08x %08x, 1: %08x %08x  (VA[13:5]  %04x)\n",
+			cache_tags.tagd[i][0].hi, cache_tags.tagd[i][0].lo,
+			cache_tags.tagd[i][1].hi, cache_tags.tagd[i][1].lo,
+			scb | (1 << 12)*i);
+	}
+	scb = cache_tags.err_addr & ((1 << 12) - 1) & ~((1 << 6) - 1);
+	for (i = 0; i < 4; ++i) { /* for each possible VA[13:12] value */
+		if ((cache_tags.tagi[i][0].lo & 0x0fffff00) != scw &&
+		    (cache_tags.tagi[i][1].lo & 0x0fffff00) != scw)
+		    continue;
+		printk(KERN_ERR
+		       "I: 0: %08x %08x, 1: %08x %08x  (VA[13:6]  %04x)\n",
+			cache_tags.tagi[i][0].hi, cache_tags.tagi[i][0].lo,
+			cache_tags.tagi[i][1].hi, cache_tags.tagi[i][1].lo,
+			scb | (1 << 12)*i);
+	}
+	i = read_c0_config();
+	scb = i & (1 << 13) ? 7:6;      /* scblksize = 2^[7..6] */
+	scw = ((i >> 16) & 7) + 19 - 1; /* scwaysize = 2^[24..19] / 2 */
+
+	i = ((1 << scw) - 1) & ~((1 << scb) - 1);
+	printk(KERN_ERR "S: 0: %08x %08x, 1: %08x %08x  (PA[%u:%u] %05x)\n",
+		cache_tags.tags[0][0].hi, cache_tags.tags[0][0].lo,
+		cache_tags.tags[0][1].hi, cache_tags.tags[0][1].lo,
+		scw-1, scb, i & (unsigned)cache_tags.err_addr);
+}
+
+static inline const char *cause_excode_text(int cause)
+{
+	static const char *txt[32] =
+	{	"Interrupt",
+		"TLB modification",
+		"TLB (load or instruction fetch)",
+		"TLB (store)",
+		"Address error (load or instruction fetch)",
+		"Address error (store)",
+		"Bus error (instruction fetch)",
+		"Bus error (data: load or store)",
+		"Syscall",
+		"Breakpoint",
+		"Reserved instruction",
+		"Coprocessor unusable",
+		"Arithmetic Overflow",
+		"Trap",
+		"14",
+		"Floating-Point",
+		"16", "17", "18", "19", "20", "21", "22",
+		"Watch Hi/Lo",
+		"24", "25", "26", "27", "28", "29", "30", "31",
+	};
+	return txt[(cause & 0x7c) >> 2];
+}
+
+static void print_buserr(const struct pt_regs *regs)
+{
+	const int field = 2 * sizeof(unsigned long);
+	int error = 0;
+
+	if (extio_stat & EXTIO_MC_BUSERR) {
+		printk(KERN_ERR "MC Bus Error\n");
+		error |= 1;
+	}
+	if (extio_stat & EXTIO_HPC3_BUSERR) {
+		printk(KERN_ERR "HPC3 Bus Error 0x%x:<id=0x%x,%s,lane=0x%x>\n",
+			hpc3_berr_stat,
+			(hpc3_berr_stat & HPC3_BESTAT_PIDMASK) >>
+					  HPC3_BESTAT_PIDSHIFT,
+			(hpc3_berr_stat & HPC3_BESTAT_CTYPE) ? "PIO" : "DMA",
+			hpc3_berr_stat & HPC3_BESTAT_BLMASK);
+		error |= 2;
+	}
+	if (extio_stat & EXTIO_EISA_BUSERR) {
+		printk(KERN_ERR "EISA Bus Error\n");
+		error |= 4;
+	}
+	if (cpu_err_stat & CPU_ERRMASK) {
+		printk(KERN_ERR "CPU error 0x%x<%s%s%s%s%s%s> @ 0x%08x\n",
+			cpu_err_stat,
+			cpu_err_stat & SGIMC_CSTAT_RD ? "RD " : "",
+			cpu_err_stat & SGIMC_CSTAT_PAR ? "PAR " : "",
+			cpu_err_stat & SGIMC_CSTAT_ADDR ? "ADDR " : "",
+			cpu_err_stat & SGIMC_CSTAT_SYSAD_PAR ? "SYSAD " : "",
+			cpu_err_stat & SGIMC_CSTAT_SYSCMD_PAR ? "SYSCMD " : "",
+			cpu_err_stat & SGIMC_CSTAT_BAD_DATA ? "BAD_DATA " : "",
+			cpu_err_addr);
+		error |= 8;
+	}
+	if (gio_err_stat & GIO_ERRMASK) {
+		printk(KERN_ERR "GIO error 0x%x:<%s%s%s%s%s%s%s%s> @ 0x%08x\n",
+			gio_err_stat,
+			gio_err_stat & SGIMC_GSTAT_RD ? "RD " : "",
+			gio_err_stat & SGIMC_GSTAT_WR ? "WR " : "",
+			gio_err_stat & SGIMC_GSTAT_TIME ? "TIME " : "",
+			gio_err_stat & SGIMC_GSTAT_PROM ? "PROM " : "",
+			gio_err_stat & SGIMC_GSTAT_ADDR ? "ADDR " : "",
+			gio_err_stat & SGIMC_GSTAT_BC ? "BC " : "",
+			gio_err_stat & SGIMC_GSTAT_PIO_RD ? "PIO_RD " : "",
+			gio_err_stat & SGIMC_GSTAT_PIO_WR ? "PIO_WR " : "",
+			gio_err_addr);
+		error |= 16;
+	}
+	if (!error)
+		printk(KERN_ERR "MC: Hmm, didn't find any error condition.\n");
+	else {
+		printk(KERN_ERR "CP0: config %08x,  "
+			"MC: cpuctrl0/1: %08x/%05x, giopar: %04x\n"
+			"MC: cpu/gio_memacc: %08x/%05x, memcfg0/1: %08x/%08x\n",
+			read_c0_config(),
+			sgimc->cpuctrl0, sgimc->cpuctrl0, sgimc->giopar,
+			sgimc->cmacc, sgimc->gmacc,
+			sgimc->mconfig0, sgimc->mconfig1);
+		print_cache_tags();
+	}
+	printk(KERN_ALERT "%s, epc == %0*lx, ra == %0*lx\n",
+	       cause_excode_text(regs->cp0_cause),
+	       field, regs->cp0_epc, field, regs->regs[31]);
+}
+
+/*
+ * Try to find out, whether the bus error is caused by the instruction
+ * at EPC, otherwise we have an asynchronous error.
+ *
+ * Doc1: "MIPS IV Instruction Set", Rev 3.2 (SGI 007-2597-001)
+ * Doc2: "MIPS R10000 Microporcessor User's Manual", Ver 2.0 (SGI 007-2490-001)
+ * Doc3: "MIPS R4000 Microporcessor User's Manual", 2nd Ed. (SGI 007-2489-001)
+ */
+
+#define JMP_INDEX26_OP 1
+#define JMP_REGISTER_OP 2
+#define JMP_PCREL16_OP 3
+#define BASE_OFFSET_OP 4
+#define BASE_IDXREG_OP 5
+
+/* Match virtual address in an insn with physical error address */
+
+static int match_addr(unsigned paddr, unsigned long vaddr)
+{
+	unsigned long uaddr;
+
+	if ((vaddr & 0xffffffff80000000L) == 0xffffffff80000000L)
+		uaddr = (unsigned) CPHYSADDR(vaddr);
+	else if ((vaddr >> 62) == 2)
+		uaddr = (unsigned) XPHYSADDR(vaddr);
+	else {
+		unsigned long eh = vaddr & ~0x1fffL;
+
+		eh |= read_c0_entryhi() & 0xff;
+		write_c0_entryhi(eh);
+		tlb_probe();
+		if (read_c0_index() & 0x80000000)
+			return 0;
+		tlb_read();
+		if (vaddr & (1L << PAGE_SHIFT))
+			uaddr = (unsigned) read_c0_entrylo1();
+		else
+			uaddr = (unsigned) read_c0_entrylo0();
+		uaddr <<= 6;
+		uaddr &= ~PAGE_MASK;
+		uaddr |= vaddr & PAGE_MASK;
+	}
+	return ((uaddr & ~0x7f) == (paddr & ~0x7f));
+}
+
+/* Check, which kind of memory reference is triggered by `insn' */
+
+static int check_special(unsigned insn)
+{
+	/* See Doc1, page A-180 */
+	unsigned func = insn & 0x3f;
+
+	if (8 == func || 8+1 == func) /* JR, JALR */
+		return JMP_REGISTER_OP;
+
+	return 0;
+}
+
+static int check_regimm(unsigned insn)
+{
+	/* See Doc1, page A-180 */
+	unsigned rt = (insn >> 19) & 3; /* bits 20..19[..16] */
+
+	/* BLTZ, BGEZ, BLTZL, BBGEZL || BLTZAL, BGEZAL, BLTZALL, BBGEZALL */
+	if (!rt || 2 == rt)
+		return JMP_PCREL16_OP;
+
+	return 0;
+}
+
+static int check_cop0(unsigned insn)
+{
+	/* See Doc2, pages 287 ff., 187 ff. */
+	if ((insn >> 26) == 5*8+7) /* CACHE */
+		switch ((insn >> 16) & 0x1f) {
+		case Index_Writeback_Inv_D:
+		case Hit_Writeback_Inv_D:
+		case Index_Writeback_Inv_S:
+		case Hit_Writeback_Inv_S:
+			return BASE_OFFSET_OP;
+		}
+	return 0;
+}
+
+static int check_cop1(unsigned insn)
+{
+	/* See Doc1, pages B-108 ff. */
+	unsigned fmt = (insn >> 21) & 0x1f; /* bits 25..21 */
+
+	if (8 == fmt) /* BC1* */
+		return JMP_PCREL16_OP;
+
+	return 0;
+}
+
+static int check_cop1x(unsigned insn)
+{
+	/* See Doc1, pages B-108 ff. */
+	switch (insn & 0x3f) {
+	case 0:   /* LWXC1 */
+	case 1:   /* LDXC1 */
+	case 8:   /* SWXC1 */
+	case 8+1: /* SDXC1 */
+		return BASE_IDXREG_OP;
+	}
+	return 0;
+}
+
+static int check_plain(unsigned insn)
+{
+	/* See Doc1, page A-180 */
+	unsigned opcode = insn >> 26;
+
+	if (2 == opcode || 3 == opcode) /* J, JAL */
+		return JMP_INDEX26_OP;
+
+	if ((4     <= opcode && opcode <= 7) ||   /* BEQ, BNE, BLEZ, BGTZ */
+	    (4+2*8 <= opcode && opcode <= 7+2*8)) /* BEQL, BNEL, BLEZL, BGTZL */
+		return JMP_PCREL16_OP;
+
+	if (6*8+3 == opcode) /* PREF */
+		return 0;
+
+	if (3*8+2 == opcode || 3*8+3 == opcode || /* LDL, LDR */
+	    4*8 <= opcode) /* misc. LOAD, STORE */
+		return BASE_OFFSET_OP;
+
+	return 0;
+}
+
+/* Check, whether the insn at EPC causes a memory access at `paddr' */
+
+static int check_addr_in_insn(unsigned paddr, const struct pt_regs *regs)
+{
+	unsigned long epc;
+	unsigned insn;
+	unsigned long a;
+	int typ;
+
+	epc = regs->cp0_cause & CAUSEF_BD ? regs->cp0_epc:regs->cp0_epc+4;
+
+	/* show_code() from kernel/traps.c */
+	if (__get_user(insn, (u32 *)epc))
+		return 1;
+
+	/* See Doc1, pages A-180, B-108 ff. */
+	switch (insn >> 26) {
+	case 0:
+		typ = check_special(insn);
+		break;
+	case 1:
+		typ = check_regimm(insn);
+		break;
+	case 2*8:   /* COP0 */
+	case 5*8+7: /* CACHE */
+		typ = check_cop0(insn);
+		break;
+	case 2*8+1:
+		typ = check_cop1(insn);
+		break;
+	case 2*8+3:
+		typ = check_cop1x(insn);
+		break;
+	default:
+		typ = check_plain(insn);
+		break;
+	}
+	switch (typ) {
+	case JMP_INDEX26_OP:
+		a = (regs->cp0_epc + 4) & ~0xfffffff;
+		a |= (insn & 0x3ffffff) << 2;
+		return match_addr(paddr, a);
+	case JMP_REGISTER_OP:
+		a = regs->regs[(insn >> 21) & 0x1f];
+		return match_addr(paddr, a);
+	case JMP_PCREL16_OP:
+		a = regs->cp0_epc + 4 + ((insn & 0xffff) << 2);
+		return match_addr(paddr, a);
+	case BASE_OFFSET_OP:
+		a = regs->regs[(insn >> 21) & 0x1f] + (insn & 0xffff);
+		return match_addr(paddr, a);
+	case BASE_IDXREG_OP:
+		a = regs->regs[(insn >> 21) & 0x1f];
+		a += regs->regs[(insn >> 16) & 0x1f];
+		return match_addr(paddr, a);
+	case 0:
+		return 0;
+	}
+	/* Assume it would be too dangerous to continue ... */
+	return 1;
+}
+
+/*
+ * Check, whether MC's (virtual) DMA address caused the bus error.
+ * See "Virtual DMA Specification", Draft 1.5, Feb 13 1992, SGI
+ */
+
+static int addr_is_ram(unsigned long addr, unsigned sz)
+{
+	int i;
+
+	for (i = 0; i < boot_mem_map.nr_map; i++) {
+		unsigned long a = boot_mem_map.map[i].addr;
+		if (a <= addr && addr+sz <= a+boot_mem_map.map[i].size)
+			return 1;
+	}
+	return 0;
+}
+
+static int check_microtlb(u32 hi, u32 lo, unsigned long vaddr)
+{
+	/* This is likely rather similar to correct code ;-) */
+
+	vaddr &= 0x7fffffff; /* Doc. states that top bit is ignored */
+
+	/* If tlb-entry is valid and VPN-high (bits [30:21] ?) matches... */
+	if ((lo & 2) && (vaddr >> 21) == ((hi<<1) >> 22)) {
+		u32 ctl = sgimc->dma_ctrl;
+		if (ctl & 1) {
+			unsigned int pgsz = (ctl & 2) ? 14:12; /* 16k:4k */
+			/* PTEIndex is VPN-low (bits [22:14]/[20:12] ?) */
+			unsigned long pte = (lo >> 6) << 12; /* PTEBase */
+			pte += 8*((vaddr >> pgsz) & 0x1ff);
+			if (addr_is_ram(pte, 8)) {
+				/*
+				 * Note: Since DMA hardware does look up
+				 * translation on its own, this PTE *must*
+				 * match the TLB/EntryLo-register format !
+				 */
+				unsigned long a = *(unsigned long *)
+						PHYS_TO_XKSEG_UNCACHED(pte);
+				a = (a & 0x3f) << 6; /* PFN */
+				a += vaddr & ((1 << pgsz) - 1);
+				return (cpu_err_addr == a);
+			}
+		}
+	}
+	return 0;
+}
+
+static int check_vdma_memaddr(void)
+{
+	if (cpu_err_stat & CPU_ERRMASK) {
+		u32 a = sgimc->maddronly;
+
+		if (!(sgimc->dma_ctrl & 0x100)) /* Xlate-bit clear ? */
+			return (cpu_err_addr == a);
+
+		if (check_microtlb(sgimc->dtlb_hi0, sgimc->dtlb_lo0, a) ||
+		    check_microtlb(sgimc->dtlb_hi1, sgimc->dtlb_lo1, a) ||
+		    check_microtlb(sgimc->dtlb_hi2, sgimc->dtlb_lo2, a) ||
+		    check_microtlb(sgimc->dtlb_hi3, sgimc->dtlb_lo3, a))
+			return 1;
+	}
+	return 0;
+}
+
+static int check_vdma_gioaddr(void)
+{
+	if (gio_err_stat & GIO_ERRMASK) {
+		u32 a = sgimc->gio_dma_trans;
+		a = (sgimc->gmaddronly & ~a) | (sgimc->gio_dma_sbits & a);
+		return (gio_err_addr == a);
+	}
+	return 0;
+}
+
+/*
+ * MC sends an interrupt whenever bus or parity errors occur. In addition,
+ * if the error happened during a CPU read, it also asserts the bus error
+ * pin on the R4K. Code in bus error handler save the MC bus error registers
+ * and then clear the interrupt when this happens.
+ */
+
+static int ip28_be_interrupt(const struct pt_regs *regs)
+{
+	int i;
+
+	save_and_clear_buserr();
+	/*
+	 * Try to find out, whether we got here by a mispredicted speculative
+	 * load/store operation.  If so, it's not fatal, we can go on.
+	 */
+	/* Any cause other than "Interrupt" (ExcCode 0) is fatal. */
+	if (regs->cp0_cause & CAUSEF_EXCCODE)
+		goto mips_be_fatal;
+
+	/* Any cause other than "Bus error interrupt" (IP6) is weird. */
+	if ((regs->cp0_cause & CAUSEF_IP6) != CAUSEF_IP6)
+		goto mips_be_fatal;
+
+	if (extio_stat & (EXTIO_HPC3_BUSERR | EXTIO_EISA_BUSERR))
+		goto mips_be_fatal;
+
+	/* Any state other than "Memory bus error" is fatal. */
+	if (cpu_err_stat & CPU_ERRMASK & ~SGIMC_CSTAT_ADDR)
+			goto mips_be_fatal;
+
+	/* GIO errors are fatal */
+	if (gio_err_stat & GIO_ERRMASK)
+		goto mips_be_fatal;
+
+	/* Finding `cpu_err_addr' in the insn at EPC is fatal. */
+	if ((cpu_err_stat & CPU_ERRMASK) &&
+	     check_addr_in_insn(cpu_err_addr, regs))
+			goto mips_be_fatal;
+
+	/*
+	 * Now we have an asynchronous bus error, speculatively or DMA caused.
+	 * Need to search all DMA descriptors for the error address.
+	 */
+	for (i = 0; i < sizeof(hpc3)/sizeof(struct hpc3_stat); ++i) {
+		struct hpc3_stat *hp = (struct hpc3_stat *)&hpc3 + i;
+		if ((cpu_err_stat & CPU_ERRMASK) &&
+		    (cpu_err_addr == hp->ndptr || cpu_err_addr == hp->cbp))
+			break;
+		if ((gio_err_stat & GIO_ERRMASK) &&
+		    (gio_err_addr == hp->ndptr || gio_err_addr == hp->cbp))
+			break;
+	}
+	if (i < sizeof(hpc3)/sizeof(struct hpc3_stat)) {
+		struct hpc3_stat *hp = (struct hpc3_stat *)&hpc3 + i;
+		printk(KERN_ERR "at DMA addresses: HPC3 @ %08lx:"
+		       " ctl %08x, ndp %08x, cbp %08x\n",
+		       CPHYSADDR(hp->addr), hp->ctrl, hp->ndptr, hp->cbp);
+		goto mips_be_fatal;
+	}
+	/* Check MC's virtual DMA stuff. */
+	if (check_vdma_memaddr()) {
+		printk(KERN_ERR "at GIO DMA: mem address 0x%08x.\n",
+			sgimc->maddronly);
+		goto mips_be_fatal;
+	}
+	if (check_vdma_gioaddr()) {
+		printk(KERN_ERR "at GIO DMA: gio address 0x%08x.\n",
+			sgimc->gmaddronly);
+		goto mips_be_fatal;
+	}
+	/* A speculative bus error... */
+	if (debug_be_interrupt) {
+		print_buserr(regs);
+		printk(KERN_ERR "discarded!\n");
+	}
+	return MIPS_BE_DISCARD;
+
+mips_be_fatal:
+	print_buserr(regs);
+	return MIPS_BE_FATAL;
+}
+
+void ip22_be_interrupt(int irq)
+{
+	const struct pt_regs *regs = get_irq_regs();
+
+	count_be_interrupt++;
+
+	if (ip28_be_interrupt(regs) != MIPS_BE_DISCARD) {
+		/* Assume it would be too dangerous to continue ... */
+		die_if_kernel("Oops", regs);
+		force_sig(SIGBUS, current);
+	} else if (debug_be_interrupt)
+		show_regs((struct pt_regs *)regs);
+}
+
+static int ip28_be_handler(struct pt_regs *regs, int is_fixup)
+{
+	/*
+	 * We arrive here only in the unusual case of do_be() invocation,
+	 * i.e. by a bus error exception without a bus error interrupt.
+	 */
+	if (is_fixup) {
+		count_be_is_fixup++;
+		save_and_clear_buserr();
+		return MIPS_BE_FIXUP;
+	}
+	count_be_handler++;
+	return ip28_be_interrupt(regs);
+}
+
+void __init ip22_be_init(void)
+{
+	board_be_handler = ip28_be_handler;
+}
+
+int ip28_show_be_info(struct seq_file *m)
+{
+	seq_printf(m, "IP28 be fixups\t\t: %u\n", count_be_is_fixup);
+	seq_printf(m, "IP28 be interrupts\t: %u\n", count_be_interrupt);
+	seq_printf(m, "IP28 be handler\t\t: %u\n", count_be_handler);
+
+	return 0;
+}
+
+static int __init debug_be_setup(char *str)
+{
+	debug_be_interrupt++;
+	return 1;
+}
+__setup("ip28_debug_be", debug_be_setup);
+
diff --git a/include/asm-mips/dma.h b/include/asm-mips/dma.h
index 833437d..27b5c91 100644
--- a/include/asm-mips/dma.h
+++ b/include/asm-mips/dma.h
@@ -84,10 +84,9 @@
  * Deskstations or Acer PICA but not the much more versatile DMA logic used
  * for the local devices on Acer PICA or Magnums.
  */
-#ifdef CONFIG_SGI_IP22
-/* Horrible hack to have a correct DMA window on IP22 */
-#include <asm/sgi/mc.h>
-#define MAX_DMA_ADDRESS		(PAGE_OFFSET + SGIMC_SEG0_BADDR + 0x01000000)
+#if defined(CONFIG_SGI_IP22) || defined(CONFIG_SGI_IP28)
+/* don't care; ISA bus master won't work, ISA slave DMA supports 32bit addr */
+#define MAX_DMA_ADDRESS		PAGE_OFFSET
 #else
 #define MAX_DMA_ADDRESS		(PAGE_OFFSET + 0x01000000)
 #endif
diff --git a/include/asm-mips/mach-ip28/cpu-feature-overrides.h b/include/asm-mips/mach-ip28/cpu-feature-overrides.h
new file mode 100644
index 0000000..9a53b32
--- /dev/null
+++ b/include/asm-mips/mach-ip28/cpu-feature-overrides.h
@@ -0,0 +1,50 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 2003 Ralf Baechle
+ * 6/2004	pf
+ */
+#ifndef __ASM_MACH_IP28_CPU_FEATURE_OVERRIDES_H
+#define __ASM_MACH_IP28_CPU_FEATURE_OVERRIDES_H
+
+/*
+ * IP28 only comes with R10000 family processors all using the same config
+ */
+#define cpu_has_watch		1
+#define cpu_has_mips16		0
+#define cpu_has_divec		0
+#define cpu_has_vce		0
+#define cpu_has_cache_cdex_p	0
+#define cpu_has_cache_cdex_s	0
+#define cpu_has_prefetch	1
+#define cpu_has_mcheck		0
+#define cpu_has_ejtag		0
+
+#define cpu_has_llsc		1
+#define cpu_has_vtag_icache	0
+#define cpu_has_dc_aliases	0 /* see probe_pcache() */
+#define cpu_has_ic_fills_f_dc	0
+#define cpu_has_dsp		0
+#define cpu_icache_snoops_remote_store  1
+#define cpu_has_mipsmt		0
+#define cpu_has_userlocal	0
+
+#define cpu_has_nofpuex		0
+#define cpu_has_64bits		1
+
+#define cpu_has_4kex		1
+#define cpu_has_4k_cache	1
+
+#define cpu_has_inclusive_pcaches	1
+
+#define cpu_dcache_line_size()	32
+#define cpu_icache_line_size()	64
+
+#define cpu_has_mips32r1	0
+#define cpu_has_mips32r2	0
+#define cpu_has_mips64r1	0
+#define cpu_has_mips64r2	0
+
+#endif /* __ASM_MACH_IP28_CPU_FEATURE_OVERRIDES_H */
diff --git a/include/asm-mips/mach-ip28/ds1286.h b/include/asm-mips/mach-ip28/ds1286.h
new file mode 100644
index 0000000..471bb9a
--- /dev/null
+++ b/include/asm-mips/mach-ip28/ds1286.h
@@ -0,0 +1,4 @@
+#ifndef __ASM_MACH_IP28_DS1286_H
+#define __ASM_MACH_IP28_DS1286_H
+#include <asm/mach-ip22/ds1286.h>
+#endif /* __ASM_MACH_IP28_DS1286_H */
diff --git a/include/asm-mips/mach-ip28/spaces.h b/include/asm-mips/mach-ip28/spaces.h
new file mode 100644
index 0000000..05aabb2
--- /dev/null
+++ b/include/asm-mips/mach-ip28/spaces.h
@@ -0,0 +1,22 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 1994 - 1999, 2000, 03, 04 Ralf Baechle
+ * Copyright (C) 2000, 2002  Maciej W. Rozycki
+ * Copyright (C) 1990, 1999, 2000 Silicon Graphics, Inc.
+ * 2004	pf
+ */
+#ifndef _ASM_MACH_IP28_SPACES_H
+#define _ASM_MACH_IP28_SPACES_H
+
+#define CAC_BASE		0xa800000000000000
+
+#define HIGHMEM_START		(~0UL)
+
+#define PHYS_OFFSET		_AC(0x20000000, UL)
+
+#include <asm/mach-generic/spaces.h>
+
+#endif /* _ASM_MACH_IP28_SPACES_H */
diff --git a/include/asm-mips/mach-ip28/war.h b/include/asm-mips/mach-ip28/war.h
new file mode 100644
index 0000000..a1baafa
--- /dev/null
+++ b/include/asm-mips/mach-ip28/war.h
@@ -0,0 +1,25 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 2002, 2004, 2007 by Ralf Baechle <ralf@linux-mips.org>
+ */
+#ifndef __ASM_MIPS_MACH_IP28_WAR_H
+#define __ASM_MIPS_MACH_IP28_WAR_H
+
+#define R4600_V1_INDEX_ICACHEOP_WAR	0
+#define R4600_V1_HIT_CACHEOP_WAR	0
+#define R4600_V2_HIT_CACHEOP_WAR	0
+#define R5432_CP0_INTERRUPT_WAR		0
+#define BCM1250_M3_WAR			0
+#define SIBYTE_1956_WAR			0
+#define MIPS4K_ICACHE_REFILL_WAR	0
+#define MIPS_CACHE_SYNC_WAR		0
+#define TX49XX_ICACHE_INDEX_INV_WAR	0
+#define RM9000_CDEX_SMP_WAR		0
+#define ICACHE_REFILLS_WORKAROUND_WAR	0
+#define R10000_LLSC_WAR			1
+#define MIPS34K_MISSED_ITLB_WAR		0
+
+#endif /* __ASM_MIPS_MACH_IP28_WAR_H */

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-11-29  9:54 Thomas Bogendoerfer
@ 2007-11-29 13:01 ` Ralf Baechle
  2007-12-05  6:16   ` Kumba
  0 siblings, 1 reply; 26+ messages in thread
From: Ralf Baechle @ 2007-11-29 13:01 UTC (permalink / raw)
  To: Thomas Bogendoerfer; +Cc: linux-mips

On Thu, Nov 29, 2007 at 10:54:42AM +0100, Thomas Bogendoerfer wrote:

> Add support for SGI IP28 machines (Indigo 2 with R10k CPUs)
> This work is mainly based on Peter Fuersts work.

Queued for 2.6.25.  There clearly is work remaining to be done but the
code is now in an acceptable shape and the best way to push it forward
is integrating it.  Thanks for all the work and especially to Peter
Fürst for the initial heavyweight lifting!

  Ralf

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-11-29 13:01 ` Ralf Baechle
@ 2007-12-05  6:16   ` Kumba
  2007-12-05  9:39     ` Thomas Bogendoerfer
  0 siblings, 1 reply; 26+ messages in thread
From: Kumba @ 2007-12-05  6:16 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thomas Bogendoerfer, linux-mips

Ralf Baechle wrote:
> On Thu, Nov 29, 2007 at 10:54:42AM +0100, Thomas Bogendoerfer wrote:
> 
>> Add support for SGI IP28 machines (Indigo 2 with R10k CPUs)
>> This work is mainly based on Peter Fuersts work.
> 
> Queued for 2.6.25.  There clearly is work remaining to be done but the
> code is now in an acceptable shape and the best way to push it forward
> is integrating it.  Thanks for all the work and especially to Peter
> Fürst for the initial heavyweight lifting!
> 
>   Ralf

Seconded.  Peter is made of Win.

I've been out of it lately -- did the gcc side of things ever make it in, or do 
we need to go push on that some more?


--Kumba

-- 
Gentoo/MIPS Team Lead

"Such is oft the course of deeds that move the wheels of the world: small hands 
do them because they must, while the eyes of the great are elsewhere."  --Elrond

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-05  6:16   ` Kumba
@ 2007-12-05  9:39     ` Thomas Bogendoerfer
  2007-12-05 19:49       ` peter fuerst
  2007-12-08 17:52       ` Richard Sandiford
  0 siblings, 2 replies; 26+ messages in thread
From: Thomas Bogendoerfer @ 2007-12-05  9:39 UTC (permalink / raw)
  To: Kumba; +Cc: Ralf Baechle, linux-mips

On Wed, Dec 05, 2007 at 01:16:13AM -0500, Kumba wrote:
> I've been out of it lately -- did the gcc side of things ever make it in, 
> or do we need to go push on that some more?

We need push on that. Looking at 

http://gcc.gnu.org/ml/gcc-patches/2006-04/msg00291.html

there seems to be a missing understanding, why the cache
barriers are needed. I guess the patch could be improved
by pointing directly to the errata section of the R10k
user manual. Or even better copy the text out of the user
manual. That should make clear why this patch is needed.

Peter did you do the copyright assigment ? That's probably
the second part, which needs to be done.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessary a
good idea.                                                [ RFC1925, 2.3 ]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-05  9:39     ` Thomas Bogendoerfer
@ 2007-12-05 19:49       ` peter fuerst
  2007-12-05 20:37         ` David Daney
  2007-12-06 11:41         ` Ralf Baechle
  2007-12-08 17:52       ` Richard Sandiford
  1 sibling, 2 replies; 26+ messages in thread
From: peter fuerst @ 2007-12-05 19:49 UTC (permalink / raw)
  To: Thomas Bogendoerfer; +Cc: Kumba, Ralf Baechle, linux-mips

On Wed, 5 Dec 2007, Thomas Bogendoerfer wrote:

> Date: Wed, 5 Dec 2007 10:39:38 +0100
> From: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> To: Kumba <kumba@gentoo.org>
> Cc: Ralf Baechle <ralf@linux-mips.org>, linux-mips@linux-mips.org
> Subject: Re: [UPDATED PATCH] IP28 support
>
> On Wed, Dec 05, 2007 at 01:16:13AM -0500, Kumba wrote:
> > I've been out of it lately -- did the gcc side of things ever make it in,
> > or do we need to go push on that some more?
>
> We need push on that. ...

There was no answer to .../2006-05/msg01446.html. Perhaps i should just
put together an updated patch, that incorporates the changes proposed in
msg01446.html, and submit it (with the longer "Cc:" line and a hint to
the increasing demand for it ;-) to revive at least the discussion at
gcc-patches.
What could be changed beyond the proposed changes without either omitting
necessary cache-barriers or crippling the R10k, i can't see yet.

> We need push on that. Looking at
>
> http://gcc.gnu.org/ml/gcc-patches/2006-04/msg00291.html
>
> there seems to be a missing understanding, why the cache
> barriers are needed. I guess the patch could be improved
> by pointing directly to the errata section of the R10k
> user manual. Or even better copy the text out of the user
> manual. That should make clear why this patch is needed.

Better copy, i guess. (Assuming copying whole paragraphs is still proper
citation ;-) Along with the initial patch (.../2006-03.msg00090.html) as
well as in the last letter so far (.../2006-05/msg01446.html) i pointed
to the corresponding chapter in the R10k User's Manual and to the entry
in the NetBSD eMail archive. In the last letter i tried to augment these
by a summarizing explanation, but it seems i'm not very good at that...

>
> Peter did you do the copyright assigment ? That's probably
> the second part, which needs to be done.

Yes, the assignment process became complete on May 22 2006
(though apparently i missed to notify Richard Sandiford about it)

>
> Thomas.
>
> --
> Crap can work. Given enough thrust pigs will fly, but it's not necessary a
> good idea.                                                [ RFC1925, 2.3 ]
>
>
>

kind regards

peter

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-05 19:49       ` peter fuerst
@ 2007-12-05 20:37         ` David Daney
  2007-12-06 11:44           ` Ralf Baechle
  2007-12-06 11:41         ` Ralf Baechle
  1 sibling, 1 reply; 26+ messages in thread
From: David Daney @ 2007-12-05 20:37 UTC (permalink / raw)
  To: peter fuerst; +Cc: Thomas Bogendoerfer, Kumba, Ralf Baechle, linux-mips

peter fuerst wrote:
> 
> On Wed, 5 Dec 2007, Thomas Bogendoerfer wrote:
> 
>> Date: Wed, 5 Dec 2007 10:39:38 +0100
>> From: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
>> To: Kumba <kumba@gentoo.org>
>> Cc: Ralf Baechle <ralf@linux-mips.org>, linux-mips@linux-mips.org
>> Subject: Re: [UPDATED PATCH] IP28 support
>>
>> On Wed, Dec 05, 2007 at 01:16:13AM -0500, Kumba wrote:
>>> I've been out of it lately -- did the gcc side of things ever make it in,
>>> or do we need to go push on that some more?
>> We need push on that. ...
> 
> There was no answer to .../2006-05/msg01446.html. Perhaps i should just
> put together an updated patch,

That would be helpful.  It would have to be against GCC's svn trunk. 
Currently 4.3 is in regression fix only mode.  The earliest the patch 
could appear in an official GCC release would probably be version 4.4


> that incorporates the changes proposed in
> msg01446.html, and submit it (with the longer "Cc:" line and a hint to
> the increasing demand for it ;-) to revive at least the discussion at
> gcc-patches.

Just sent it to gcc-patches@   I think it will be noticed.


> What could be changed beyond the proposed changes without either omitting
> necessary cache-barriers or crippling the R10k, i can't see yet.
> 
>> We need push on that. Looking at
>>
>> http://gcc.gnu.org/ml/gcc-patches/2006-04/msg00291.html
>>
>> there seems to be a missing understanding, why the cache
>> barriers are needed. I guess the patch could be improved
>> by pointing directly to the errata section of the R10k
>> user manual. Or even better copy the text out of the user
>> manual. That should make clear why this patch is needed.
> 
> Better copy, i guess. (Assuming copying whole paragraphs is still proper
> citation ;-) Along with the initial patch (.../2006-03.msg00090.html) as
> well as in the last letter so far (.../2006-05/msg01446.html) i pointed
> to the corresponding chapter in the R10k User's Manual and to the entry
> in the NetBSD eMail archive. In the last letter i tried to augment these
> by a summarizing explanation, but it seems i'm not very good at that...
> 
>> Peter did you do the copyright assigment ? That's probably
>> the second part, which needs to be done.
> 
> Yes, the assignment process became complete on May 22 2006
> (though apparently i missed to notify Richard Sandiford about it)
> 

Good.  Richard is generally quite responsive to patches.  Perhaps CC him 
on your patch.

David Daney

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-05 20:37         ` David Daney
@ 2007-12-06 11:44           ` Ralf Baechle
  0 siblings, 0 replies; 26+ messages in thread
From: Ralf Baechle @ 2007-12-06 11:44 UTC (permalink / raw)
  To: David Daney; +Cc: peter fuerst, Thomas Bogendoerfer, Kumba, linux-mips

On Wed, Dec 05, 2007 at 12:37:59PM -0800, David Daney wrote:

>> There was no answer to .../2006-05/msg01446.html. Perhaps i should just
>> put together an updated patch,
>
> That would be helpful.  It would have to be against GCC's svn trunk. 
> Currently 4.3 is in regression fix only mode.  The earliest the patch could 
> appear in an official GCC release would probably be version 4.4

Many distributions have the policy of applying only patches that are
upstream, even if they're upstream only for a newer version.  As such
getting them into the FSF 4.4 tree would also be tremendously useful as
an icebreaker.

  Ralf

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-05 19:49       ` peter fuerst
  2007-12-05 20:37         ` David Daney
@ 2007-12-06 11:41         ` Ralf Baechle
  1 sibling, 0 replies; 26+ messages in thread
From: Ralf Baechle @ 2007-12-06 11:41 UTC (permalink / raw)
  To: peter fuerst; +Cc: Thomas Bogendoerfer, Kumba, linux-mips

On Wed, Dec 05, 2007 at 08:49:53PM +0100, peter fuerst wrote:

> > there seems to be a missing understanding, why the cache
> > barriers are needed. I guess the patch could be improved
> > by pointing directly to the errata section of the R10k
> > user manual. Or even better copy the text out of the user
> > manual. That should make clear why this patch is needed.
> 
> Better copy, i guess. (Assuming copying whole paragraphs is still proper
> citation ;-) Along with the initial patch (.../2006-03.msg00090.html) as
> well as in the last letter so far (.../2006-05/msg01446.html) i pointed
> to the corresponding chapter in the R10k User's Manual and to the entry
> in the NetBSD eMail archive. In the last letter i tried to augment these
> by a summarizing explanation, but it seems i'm not very good at that...

I'm not sure how far "fair use" of the R10000 manual text can be stretched.
But afair Bill Earl (wje@sgi.com) posted a reasonable explanation which
also for the purposes of the gcc manual is much easier to understand.

  Ralf

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-05  9:39     ` Thomas Bogendoerfer
  2007-12-05 19:49       ` peter fuerst
@ 2007-12-08 17:52       ` Richard Sandiford
  2007-12-08 17:52         ` Richard Sandiford
  2007-12-08 19:24         ` Ralf Baechle
  1 sibling, 2 replies; 26+ messages in thread
From: Richard Sandiford @ 2007-12-08 17:52 UTC (permalink / raw)
  To: Thomas Bogendoerfer; +Cc: Kumba, Ralf Baechle, linux-mips

tsbogend@alpha.franken.de (Thomas Bogendoerfer) writes:
> On Wed, Dec 05, 2007 at 01:16:13AM -0500, Kumba wrote:
>> I've been out of it lately -- did the gcc side of things ever make it in, 
>> or do we need to go push on that some more?
>
> We need push on that. Looking at 
>
> http://gcc.gnu.org/ml/gcc-patches/2006-04/msg00291.html
>
> there seems to be a missing understanding, why the cache
> barriers are needed.

Heh.  Quite probably.  Which bit of my message don't you agree with?

FWIW, I was going off the original message as posted here:

    http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00090.html

The explanation of the chosen workaround seemed to be left to this bit
of http://mail-index.netbsd.org/port-sgimips/2000/06/29/0006.html:

    All is well with coherent IO systems.  On non coherent
    systems like Indigo2 and O2 this creates a race
    condition with DMA reads (IO->mem) where a stale
    cached data can be written back over the DMAed data.

    R10K Indigo2:

    This issue was figured out late the the R10K I2
    design cycle.  The problem was fixed by modifying
    the compiler and assembler to issue a cache barrier
    instruction to address 0(sp) as the first instruction
    in basic blocks that contain stores to registers
    other than $0 and $sp.

and from a compiler point of view, it would be nice to know
_why_ that was a reasonable workaround.  What I was really
looking for was: (a) a short description of the problem,
(b) a list of assumptions that the compiler is going to
make when working around the problem and (c) a description
of what said workarounds are.

My understanding of (a) is that, if a store is speculatively executed,
the target of the store might be fetched into cache and marked dirty.
We therefore want to avoid the speculative execution of stores if:

  (1) the addressed memory might be the target of a later DMA operation.
      If the DMA completes before the "dirty" cache line is flushed,
      the cached data might overwrite the DMAed data.

  (2) the addressed memory might be to IO-mapped cached memory
      (usually through the address being garbage).  The cached
      data will be written back to the IO region when flushed.

We also want to avoid speculative execution of loads if:

  (3) the addressed memory might be to load-sensitive IO-mapped cached
      memory (usually through the address being garbage).  The hardware
      would "see" loads that aren't actually executed.

Is that vaguely accurate?

I tried to piece together (b) by asking questions in the reviews,
but it would be great to have a single explanation.

The idea behind (c) is simple, of course: we insert a cache barrier
before the potentially-problematic stores (and, for certain
configurations, loads, although the original gcc patch had the
associated macro hard-wired to false).  The key is explaining how,
from a compiler internals viewpoint, we decide what is "potentially-
problematic".  This ties in with the assumptions for (b).

I'm sure my attempt at (a) above can be improved upon even if it's
vaguely right.  But...

> I guess the patch could be improved
> by pointing directly to the errata section of the R10k
> user manual.

...I think an integrated explanation of (a), (b) and (c) above
would be better than quoting large parts of the processor manual.
The processor manual is aimed at a much broader audience and has
a lot of superfluous info.  It also doesn't explain what _our_
assumptions are and what our chosen workaround is.

Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-08 17:52       ` Richard Sandiford
@ 2007-12-08 17:52         ` Richard Sandiford
  2007-12-08 19:24         ` Ralf Baechle
  1 sibling, 0 replies; 26+ messages in thread
From: Richard Sandiford @ 2007-12-08 17:52 UTC (permalink / raw)
  To: Thomas Bogendoerfer; +Cc: Kumba, Ralf Baechle, linux-mips

tsbogend@alpha.franken.de (Thomas Bogendoerfer) writes:
> On Wed, Dec 05, 2007 at 01:16:13AM -0500, Kumba wrote:
>> I've been out of it lately -- did the gcc side of things ever make it in, 
>> or do we need to go push on that some more?
>
> We need push on that. Looking at 
>
> http://gcc.gnu.org/ml/gcc-patches/2006-04/msg00291.html
>
> there seems to be a missing understanding, why the cache
> barriers are needed.

Heh.  Quite probably.  Which bit of my message don't you agree with?

FWIW, I was going off the original message as posted here:

    http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00090.html

The explanation of the chosen workaround seemed to be left to this bit
of http://mail-index.netbsd.org/port-sgimips/2000/06/29/0006.html:

    All is well with coherent IO systems.  On non coherent
    systems like Indigo2 and O2 this creates a race
    condition with DMA reads (IO->mem) where a stale
    cached data can be written back over the DMAed data.

    R10K Indigo2:

    This issue was figured out late the the R10K I2
    design cycle.  The problem was fixed by modifying
    the compiler and assembler to issue a cache barrier
    instruction to address 0(sp) as the first instruction
    in basic blocks that contain stores to registers
    other than $0 and $sp.

and from a compiler point of view, it would be nice to know
_why_ that was a reasonable workaround.  What I was really
looking for was: (a) a short description of the problem,
(b) a list of assumptions that the compiler is going to
make when working around the problem and (c) a description
of what said workarounds are.

My understanding of (a) is that, if a store is speculatively executed,
the target of the store might be fetched into cache and marked dirty.
We therefore want to avoid the speculative execution of stores if:

  (1) the addressed memory might be the target of a later DMA operation.
      If the DMA completes before the "dirty" cache line is flushed,
      the cached data might overwrite the DMAed data.

  (2) the addressed memory might be to IO-mapped cached memory
      (usually through the address being garbage).  The cached
      data will be written back to the IO region when flushed.

We also want to avoid speculative execution of loads if:

  (3) the addressed memory might be to load-sensitive IO-mapped cached
      memory (usually through the address being garbage).  The hardware
      would "see" loads that aren't actually executed.

Is that vaguely accurate?

I tried to piece together (b) by asking questions in the reviews,
but it would be great to have a single explanation.

The idea behind (c) is simple, of course: we insert a cache barrier
before the potentially-problematic stores (and, for certain
configurations, loads, although the original gcc patch had the
associated macro hard-wired to false).  The key is explaining how,
from a compiler internals viewpoint, we decide what is "potentially-
problematic".  This ties in with the assumptions for (b).

I'm sure my attempt at (a) above can be improved upon even if it's
vaguely right.  But...

> I guess the patch could be improved
> by pointing directly to the errata section of the R10k
> user manual.

...I think an integrated explanation of (a), (b) and (c) above
would be better than quoting large parts of the processor manual.
The processor manual is aimed at a much broader audience and has
a lot of superfluous info.  It also doesn't explain what _our_
assumptions are and what our chosen workaround is.

Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-08 17:52       ` Richard Sandiford
  2007-12-08 17:52         ` Richard Sandiford
@ 2007-12-08 19:24         ` Ralf Baechle
  2007-12-08 20:09           ` Richard Sandiford
  1 sibling, 1 reply; 26+ messages in thread
From: Ralf Baechle @ 2007-12-08 19:24 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Kumba, linux-mips, rsandifo

On Sat, Dec 08, 2007 at 05:52:06PM +0000, Richard Sandiford wrote:

> tsbogend@alpha.franken.de (Thomas Bogendoerfer) writes:
> > On Wed, Dec 05, 2007 at 01:16:13AM -0500, Kumba wrote:
> >> I've been out of it lately -- did the gcc side of things ever make it in, 
> >> or do we need to go push on that some more?
> >
> > We need push on that. Looking at 
> >
> > http://gcc.gnu.org/ml/gcc-patches/2006-04/msg00291.html
> >
> > there seems to be a missing understanding, why the cache
> > barriers are needed.
> 
> Heh.  Quite probably.  Which bit of my message don't you agree with?
> 
> FWIW, I was going off the original message as posted here:
> 
>     http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00090.html
> 
> The explanation of the chosen workaround seemed to be left to this bit
> of http://mail-index.netbsd.org/port-sgimips/2000/06/29/0006.html:
> 
>     All is well with coherent IO systems.  On non coherent
>     systems like Indigo2 and O2 this creates a race
>     condition with DMA reads (IO->mem) where a stale
>     cached data can be written back over the DMAed data.

It's not a race condition.

>     R10K Indigo2:
> 
>     This issue was figured out late the the R10K I2
>     design cycle.  The problem was fixed by modifying
>     the compiler and assembler to issue a cache barrier
>     instruction to address 0(sp) as the first instruction
>     in basic blocks that contain stores to registers
>     other than $0 and $sp.
> 
> and from a compiler point of view, it would be nice to know
> _why_ that was a reasonable workaround.  What I was really
> looking for was: (a) a short description of the problem,
> (b) a list of assumptions that the compiler is going to
> make when working around the problem and (c) a description
> of what said workarounds are.
> 
> My understanding of (a) is that, if a store is speculatively executed,
> the target of the store might be fetched into cache and marked dirty.
> We therefore want to avoid the speculative execution of stores if:
> 
>   (1) the addressed memory might be the target of a later DMA operation.
>       If the DMA completes before the "dirty" cache line is flushed,
>       the cached data might overwrite the DMAed data.
> 
>   (2) the addressed memory might be to IO-mapped cached memory
>       (usually through the address being garbage).  The cached
>       data will be written back to the IO region when flushed.
> 
> We also want to avoid speculative execution of loads if:
> 
>   (3) the addressed memory might be to load-sensitive IO-mapped cached
>       memory (usually through the address being garbage).  The hardware
>       would "see" loads that aren't actually executed.
> 
> Is that vaguely accurate?

Yes.

> I tried to piece together (b) by asking questions in the reviews,
> but it would be great to have a single explanation.
> 
> The idea behind (c) is simple, of course: we insert a cache barrier
> before the potentially-problematic stores (and, for certain
> configurations, loads, although the original gcc patch had the
> associated macro hard-wired to false).  The key is explaining how,
> from a compiler internals viewpoint, we decide what is "potentially-
> problematic".  This ties in with the assumptions for (b).

The principle for the compiler is a store is problematic unless proven
otherwise.  A speculative store relative to the stack pointer, frame
pointer or global pointer for example is harmless.

> I'm sure my attempt at (a) above can be improved upon even if it's
> vaguely right.  But...
> 
> > I guess the patch could be improved
> > by pointing directly to the errata section of the R10k
> > user manual.
> 
> ...I think an integrated explanation of (a), (b) and (c) above
> would be better than quoting large parts of the processor manual.
> The processor manual is aimed at a much broader audience and has
> a lot of superfluous info.  It also doesn't explain what _our_
> assumptions are and what our chosen workaround is.

There are two R10000 manuals, one from SGI and one from NEC and they're
differing quite a bit on the workaround.  The SGI one gives a large number
of suggestions on how to work around the behaviour some of which even
require hardware asistance by on the system board.  A long time ago
Bill Earl, one of the engineers at SGI responsible for the workaround
emailed me this explanation which I believe is quite reasonable:

[...]
     The R10000 "bug" is, in a sense, a feature, in that it improves
performance, and is harmless on machines with cache-coherent I/O.
Specifically, on a speculative store miss (a cache miss due to a
speculatively executed store instruction), the R10000 fetches the line
dirty-exclusive and marks it modified, in anticipation of the store.
If, however, the speculatively executed store never graduates (is
never committed), the line is left dirty, even though it has not been
modified.  If the line happens to be part of a buffer into which data
is being DMAed, a subsequent victim writeback of the dirty cache line
might overwrite good data from the DMA with the obsolete data in the
cache line.  This means that, one way or the other, a system with
non-cache-coherent I/O and an R10000 must avoid allowing the
processor to perform a speculative store miss with respect to memory
into which a DMA is taking place.

     Note that the Indigo2 and O2 have somewhat different workarounds.
The Indigo2 deals with the kernel side using a special compilation mode,
and the O2 deals with the kernel side using a special hardware feature
plus a generalization of the solution for the user mode part of the problem.
Both deal with the user mode by invalidating TLB entries for pages into
which data is being transferred via DMA, so that the processor cannot
resolve the virtual address, and hence cannot speculatively fetch
a cache line at that address, while the DMA is in progress.  The kernel
side is harder, since the TLB is not used for K0SEG and XKPHYS address
spaces, which is where things get complicated.
[...]

I should mention that the hardware assissted solution for the O2 which is
implemented using an CPLD codenamed "juice" is not currently used by
Linux that is it relies on the same software-only workaround as the
Indigo 2 R10000.

  Ralf

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-08 19:24         ` Ralf Baechle
@ 2007-12-08 20:09           ` Richard Sandiford
  2007-12-08 21:25             ` peter fuerst
  2007-12-09  4:38             ` Ralf Baechle
  0 siblings, 2 replies; 26+ messages in thread
From: Richard Sandiford @ 2007-12-08 20:09 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thomas Bogendoerfer, Kumba, linux-mips

Ralf Baechle <ralf@linux-mips.org> writes:
> On Sat, Dec 08, 2007 at 05:52:06PM +0000, Richard Sandiford wrote:
>> I tried to piece together (b) by asking questions in the reviews,
>> but it would be great to have a single explanation.
>> 
>> The idea behind (c) is simple, of course: we insert a cache barrier
>> before the potentially-problematic stores (and, for certain
>> configurations, loads, although the original gcc patch had the
>> associated macro hard-wired to false).  The key is explaining how,
>> from a compiler internals viewpoint, we decide what is "potentially-
>> problematic".  This ties in with the assumptions for (b).
>
> The principle for the compiler is a store is problematic unless proven
> otherwise.  A speculative store relative to the stack pointer, frame
> pointer or global pointer for example is harmless.

Right.  But just so we're on the same page (and I think we probably are),
my point was that those rules aren't intrinsically obvious.  They're
based on assumptions about how the code is written.  For example,
it assumes there's no DMAing into stack variables.  Maybe obvious,
but I think it needs to be stated explicitly.  Then there's the
language-lawyerly code I gave to Peter on gcc-patches@:

     void foo (int x)
     {
       int array[1];
       if (x)
         bar (array[0x1fff]);
     }

This function is valid if x is never true, so we cannot assume that all
accesses off the stack and frame pointers are actually in-frame.  You're
assuming either (i) the kernel doesn't use code like that or (ii) that
"garbage" addresses in the range [$sp - 0x8000, $sp + 0x7fff] will not
trigger the problem.  I imagine both are reasonable assumptions, and I'm
perfectly happy for us to make them.  But they're the kind of assumption
we need to state explicitly.

Peter's patch also treated accesses to constant integer and symbolic
addresses as safe.  Again, this involves making assumptions about how
constant integer and symbolic addresses are used, and this is a much
less obvious assumption than the stack one.  Again, I understand that
it's a reasonable assumption to make in the linux context, but it's one
we need to pin down.  E.g. there must be no run-time guarding of
target-specific constant integer IO-mapped addresses in cases where
those addresses might trigger the problem on other systems that the
same kernel image supports.

Despite appearances, I'm not trying to be awkward here ;)  I just think
the assumptions are too loosely-defined at the moment (or at least too
scattered around).  It would be nice to have some self-contained
description, targetted specifically at gcc and linux, that contains
anything a gcc hacker or user needs to know about the gcc patch.

Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-08 20:09           ` Richard Sandiford
@ 2007-12-08 21:25             ` peter fuerst
  2007-12-08 23:24               ` Richard Sandiford
  2007-12-09  4:38             ` Ralf Baechle
  1 sibling, 1 reply; 26+ messages in thread
From: peter fuerst @ 2007-12-08 21:25 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Ralf Baechle, Thomas Bogendoerfer, Kumba, linux-mips


Hi!

could text like this help to pin the assumptions down (from
"http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01446.html") ?

  "...
  What cases of $N can be exempted from this measure?
  - Stack-addresses and constant (static) addresses ("sd $M,symbol+n") will not
    be used for DMA, since DMA-buffers are allocated at runtime.
  - Uncached accesses will not be done speculatively, but they fall under the
    "constant"-case already or will not be recognized at compile-time.

  Besides the DMA-problem, depending on the mis-speculation path (up to four
  branches deep), one of the frequently reused multi-purpose registers $N
  will contain some "random" value, which may be a legal but invalid kernel-
  address (say a800000061234567), reaching the memory-controller...
  However, there are cases where a register $N's content is well defined, no
  matter what (mis-)speculation path took us to this instruction:
  - The stack-pointer points to the stack from kernel-initializtion on.
  - Constant addresses ("symbol+n") are well defined "per se".
  (Luckily, legal-but-invalid doesn't occur in user mode, where no cache-
  barriers can be used. There we get either an address-error or a TLB-miss,
  leaving memory/bus untouched.)
  ..."

kind regards

peter


On Sat, 8 Dec 2007, Richard Sandiford wrote:

> Date: Sat, 08 Dec 2007 20:09:31 +0000
> From: Richard Sandiford <rsandifo@nildram.co.uk>
> To: Ralf Baechle <ralf@linux-mips.org>
> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>, Kumba <kumba@gentoo.org>,
>      linux-mips@linux-mips.org
> Subject: Re: [UPDATED PATCH] IP28 support
>
> Ralf Baechle <ralf@linux-mips.org> writes:
> > On Sat, Dec 08, 2007 at 05:52:06PM +0000, Richard Sandiford wrote:
> >> I tried to piece together (b) by asking questions in the reviews,
> >> but it would be great to have a single explanation.
> >>
> >> The idea behind (c) is simple, of course: we insert a cache barrier
> >> before the potentially-problematic stores (and, for certain
> >> configurations, loads, although the original gcc patch had the
> >> associated macro hard-wired to false).  The key is explaining how,
> >> from a compiler internals viewpoint, we decide what is "potentially-
> >> problematic".  This ties in with the assumptions for (b).
> >
> > The principle for the compiler is a store is problematic unless proven
> > otherwise.  A speculative store relative to the stack pointer, frame
> > pointer or global pointer for example is harmless.
>
> Right.  But just so we're on the same page (and I think we probably are),
> my point was that those rules aren't intrinsically obvious.  They're
> based on assumptions about how the code is written.  For example,
> it assumes there's no DMAing into stack variables.  Maybe obvious,
> but I think it needs to be stated explicitly.  Then there's the
> language-lawyerly code I gave to Peter on gcc-patches@:
>
>      void foo (int x)
>      {
>        int array[1];
>        if (x)
>          bar (array[0x1fff]);
>      }
>
> This function is valid if x is never true, so we cannot assume that all
> accesses off the stack and frame pointers are actually in-frame.  You're
> assuming either (i) the kernel doesn't use code like that or (ii) that
> "garbage" addresses in the range [$sp - 0x8000, $sp + 0x7fff] will not
> trigger the problem.  I imagine both are reasonable assumptions, and I'm
> perfectly happy for us to make them.  But they're the kind of assumption
> we need to state explicitly.
>
> Peter's patch also treated accesses to constant integer and symbolic
> addresses as safe.  Again, this involves making assumptions about how
> constant integer and symbolic addresses are used, and this is a much
> less obvious assumption than the stack one.  Again, I understand that
> it's a reasonable assumption to make in the linux context, but it's one
> we need to pin down.  E.g. there must be no run-time guarding of
> target-specific constant integer IO-mapped addresses in cases where
> those addresses might trigger the problem on other systems that the
> same kernel image supports.
>
> Despite appearances, I'm not trying to be awkward here ;)  I just think
> the assumptions are too loosely-defined at the moment (or at least too
> scattered around).  It would be nice to have some self-contained
> description, targetted specifically at gcc and linux, that contains
> anything a gcc hacker or user needs to know about the gcc patch.
>
> Richard
>
>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-08 21:25             ` peter fuerst
@ 2007-12-08 23:24               ` Richard Sandiford
  0 siblings, 0 replies; 26+ messages in thread
From: Richard Sandiford @ 2007-12-08 23:24 UTC (permalink / raw)
  To: post; +Cc: Ralf Baechle, Thomas Bogendoerfer, Kumba, linux-mips

peter fuerst <post@pfrst.de> writes:
> could text like this help to pin the assumptions down (from
> "http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01446.html") ?
>
>   "...
>   What cases of $N can be exempted from this measure?
>   - Stack-addresses and constant (static) addresses ("sd $M,symbol+n") will not
>     be used for DMA, since DMA-buffers are allocated at runtime.
>   - Uncached accesses will not be done speculatively, but they fall under the
>     "constant"-case already or will not be recognized at compile-time.
>
>   Besides the DMA-problem, depending on the mis-speculation path (up to four
>   branches deep), one of the frequently reused multi-purpose registers $N
>   will contain some "random" value, which may be a legal but invalid kernel-
>   address (say a800000061234567), reaching the memory-controller...
>   However, there are cases where a register $N's content is well defined, no
>   matter what (mis-)speculation path took us to this instruction:
>   - The stack-pointer points to the stack from kernel-initializtion on.
>   - Constant addresses ("symbol+n") are well defined "per se".
>   (Luckily, legal-but-invalid doesn't occur in user mode, where no cache-
>   barriers can be used. There we get either an address-error or a TLB-miss,
>   leaving memory/bus untouched.)
>   ..."

Well, the explanation of the exceptions doesn't really address the
corner cases I was trying to draw attention to in the message you
replied to.  What about top of the stack + X?  Do we guarantee that
the code will never cause the compiler to generate a store to such
an address, even with an always-false guard?  Or do we guarantee
that stores and loads to [top-of-stack, top-of-stack + 0x7fff] can
be speculated safely?  Do we guarantee that every store and load to
a cached constant address in the kernel image will not result in
a harmful IO access on any target that the image supports?

Perhaps we should just turn this around slightly and instead say:
what must the compiler do, and when must it do it?  The reasons why
aren't that important from the compiler's perspective.  So if we can
just phrase it as:

-mr10k-cache-barrier=load-store
  Insert a cache barrier at the beginning of any sequentially-executed
  series of instructions that contains a load or store.  For the purposes
  of this option, GCC can ignore loads and stores that it can prove:

  (a) access a region in the range [-0x8000 + bottom of stack frame,
      0x7fff + top of stack frame]; or
  (b) access a link-time-constant address.

  Here, a ``sequentially-executed series'' is one in which calls,
  jumps and branches occur only as the last instruction.

-mr10k-cache-barrier=store
  Like -mr10k-cache-barrier=load-store, but ignore all loads.

-mr10k-cache-barrier=none
  ...

And if you guys are willing to make sure that's safe, and change
the kernel whenever you find instances that it isn't safe, then
that should be enough.  (Bear in mind that there's ongoing work
to do link-time optimisation in gcc, so translation-unit separation
is no real guarantee.)

Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-08 20:09           ` Richard Sandiford
  2007-12-08 21:25             ` peter fuerst
@ 2007-12-09  4:38             ` Ralf Baechle
  2007-12-10 11:00               ` Richard Sandiford
  1 sibling, 1 reply; 26+ messages in thread
From: Ralf Baechle @ 2007-12-09  4:38 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Kumba, linux-mips, rsandifo

On Sat, Dec 08, 2007 at 08:09:31PM +0000, Richard Sandiford wrote:

> Ralf Baechle <ralf@linux-mips.org> writes:
> > On Sat, Dec 08, 2007 at 05:52:06PM +0000, Richard Sandiford wrote:
> >> I tried to piece together (b) by asking questions in the reviews,
> >> but it would be great to have a single explanation.
> >> 
> >> The idea behind (c) is simple, of course: we insert a cache barrier
> >> before the potentially-problematic stores (and, for certain
> >> configurations, loads, although the original gcc patch had the
> >> associated macro hard-wired to false).  The key is explaining how,
> >> from a compiler internals viewpoint, we decide what is "potentially-
> >> problematic".  This ties in with the assumptions for (b).
> >
> > The principle for the compiler is a store is problematic unless proven
> > otherwise.  A speculative store relative to the stack pointer, frame
> > pointer or global pointer for example is harmless.
> 
> Right.  But just so we're on the same page (and I think we probably are),
> my point was that those rules aren't intrinsically obvious.  They're
> based on assumptions about how the code is written.  For example,
> it assumes there's no DMAing into stack variables.  Maybe obvious,
> but I think it needs to be stated explicitly.

Can't harm to be explicit.  Linux forbids DMA to the stack.  In that past
DMA to the stack has caused alot of grief for Linux ports on some
architectures.

> Then there's the language-lawyerly code I gave to Peter on gcc-patches@:
> 
>      void foo (int x)
>      {
>        int array[1];
>        if (x)
>          bar (array[0x1fff]);
>      }
> 
> This function is valid if x is never true, so we cannot assume that all
> accesses off the stack and frame pointers are actually in-frame.  You're
> assuming either (i) the kernel doesn't use code like that or (ii) that
> "garbage" addresses in the range [$sp - 0x8000, $sp + 0x7fff] will not
> trigger the problem.  I imagine both are reasonable assumptions, and I'm
> perfectly happy for us to make them.  But they're the kind of assumption
> we need to state explicitly.

Interesting test case.  I've been thinking about it myself but in the end
decieded to believe Peter's analysis since he's banged the head for longer
to the wall about this problem that I have ;-)  I'm quite but not absolutely
certain that this case cannot happen for realworld code, so I'd rather
err on the side of caution.

Peter & Thomas - we could make the stack thing bullet proof by vmallocing
stacks and ensuring a sufficient virtual address gap exists around the stack
such that the stack is the only addressable thing in the range of
$sp +0x7fff / -0x8000?

A -mr10k-cache-barrier=sp-is-safe option?

> Peter's patch also treated accesses to constant integer and symbolic
> addresses as safe.  Again, this involves making assumptions about how
> constant integer and symbolic addresses are used, and this is a much
> less obvious assumption than the stack one.

The latter assumption is also needed for -msym32 kernels, so it's well
proven to be valid.  The former hold, too.

>  Again, I understand that
> it's a reasonable assumption to make in the linux context, but it's one
> we need to pin down.  E.g. there must be no run-time guarding of
> target-specific constant integer IO-mapped addresses in cases where
> those addresses might trigger the problem on other systems that the
> same kernel image supports.

In case of a hypothetic multi-platform kernel of which at least one needs
the R10000 workarounds, all code would be uniformly compiled with the
magic -mr10k-cache-barrier option and all source level workaround would
be enabled.

> Despite appearances, I'm not trying to be awkward here ;)  I just think
> the assumptions are too loosely-defined at the moment (or at least too
> scattered around).  It would be nice to have some self-contained
> description, targetted specifically at gcc and linux, that contains
> anything a gcc hacker or user needs to know about the gcc patch.

Your help is certainly appreciated and trying to find the potencial holes
here will only help.

  Ralf

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-09  4:38             ` Ralf Baechle
@ 2007-12-10 11:00               ` Richard Sandiford
  2007-12-12 15:26                 ` peter fuerst
  0 siblings, 1 reply; 26+ messages in thread
From: Richard Sandiford @ 2007-12-10 11:00 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thomas Bogendoerfer, Kumba, linux-mips

Ralf Baechle <ralf@linux-mips.org> writes:
>> Then there's the language-lawyerly code I gave to Peter on gcc-patches@:
>> 
>>      void foo (int x)
>>      {
>>        int array[1];
>>        if (x)
>>          bar (array[0x1fff]);
>>      }
>> 
>> This function is valid if x is never true, so we cannot assume that all
>> accesses off the stack and frame pointers are actually in-frame.  You're
>> assuming either (i) the kernel doesn't use code like that or (ii) that
>> "garbage" addresses in the range [$sp - 0x8000, $sp + 0x7fff] will not
>> trigger the problem.  I imagine both are reasonable assumptions, and I'm
>> perfectly happy for us to make them.  But they're the kind of assumption
>> we need to state explicitly.
>
> Interesting test case.  I've been thinking about it myself but in the end
> decieded to believe Peter's analysis since he's banged the head for longer
> to the wall about this problem that I have ;-)  I'm quite but not absolutely
> certain that this case cannot happen for realworld code, so I'd rather
> err on the side of caution.
>
> Peter & Thomas - we could make the stack thing bullet proof by vmallocing
> stacks and ensuring a sufficient virtual address gap exists around the stack
> such that the stack is the only addressable thing in the range of
> $sp +0x7fff / -0x8000?

FWIW, my first cut at the option restrictions were based on what
the patch exempts (and doesn't exempt).  We could instead get gcc
to only exempt accesses that it can prove are either to the current
function's stack frame or to its stack arguments.  I.e. rather than
consider every $sp-based access to be safe, we'd instead do some
bounds checking on the value.  (We could also use MEM_ATTRS to
pick up cases where a stack variable is acceesed via something
other than the stack or frame pointers, as happens for large frames.)

>> Peter's patch also treated accesses to constant integer and symbolic
>> addresses as safe.  Again, this involves making assumptions about how
>> constant integer and symbolic addresses are used, and this is a much
>> less obvious assumption than the stack one.
>
> The latter assumption is also needed for -msym32 kernels, so it's well
> proven to be valid.  The former hold, too.
>
>>  Again, I understand that
>> it's a reasonable assumption to make in the linux context, but it's one
>> we need to pin down.  E.g. there must be no run-time guarding of
>> target-specific constant integer IO-mapped addresses in cases where
>> those addresses might trigger the problem on other systems that the
>> same kernel image supports.
>
> In case of a hypothetic multi-platform kernel of which at least one needs
> the R10000 workarounds, all code would be uniformly compiled with the
> magic -mr10k-cache-barrier option and all source level workaround would
> be enabled.

Hmm.  This probably shows I am misunderstanding the problem, but I was
thinking about the IO-mapped case.  I thought one of the problems was
that if you had a cached speculative load or store to an access-sensitive
IO-mapped address, the IO-mapped device might "see" that access even if it
doesn't take place.  Could you not have a situation where a KSEG0 or
XKSEG0 access is access-sensitive on one machine and not another?
The patch won't insert countermeasures before symbolic and constant
addresses, because it believes all such addresses to be safe.

I'm also a little worried that the compiler is free to make up accesses
that didn't exist in the original program, provided that those accesses
are never actually performed in cases where they'd be wrong.  So how about:

-mr10k-cache-barrier=load-store
  Insert a cache barrier at the beginning of any sequentially-executed
  series of instructions that contains a load or store.  For the purposes
  of this option, GCC can ignore loads and stores that it can prove
  are an in-range access to:

  (a) the current function's stack frame;
  (b) an incoming stack argument;
  (b) an object with a link-time-constant address; or
  (c) a block of uncached memory

  It can also ignore sequences that are always immediately preceded by
  an untaken branch-likely instruction.

  Here, a ``sequentially-executed series'' is one in which calls,
  jumps and branches occur only as the last instruction.

-mr10k-cache-barrier=store
  Like -mr10k-cache-barrier=load-store, but ignore all loads.

-mr10k-cache-barrier=none
  ...

Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-10 11:00               ` Richard Sandiford
@ 2007-12-12 15:26                 ` peter fuerst
  2007-12-12 18:09                   ` Richard Sandiford
  0 siblings, 1 reply; 26+ messages in thread
From: peter fuerst @ 2007-12-12 15:26 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Ralf Baechle, Thomas Bogendoerfer, Kumba, linux-mips


On Mon, 10 Dec 2007, Richard Sandiford wrote:

> Ralf Baechle <ralf@linux-mips.org> writes:
> >> Then there's the language-lawyerly code I gave to Peter on gcc-patches@:
> >>
> >>      void foo (int x)
> >>      {
> >>        int array[1];
> >>        if (x)
> >>          bar (array[0x1fff]);
> >>      }
> >>

A strange method to pass data... Of course, cooking up such an "ABI",
where local variables are accessed with a const offset that is not known at
compile-time to be valid, would subvert the test for $sp-based accesses...

> >> This function is valid if x is never true, so we cannot assume that all
> >> accesses off the stack and frame pointers are actually in-frame.  You're
> >> assuming either (i) the kernel doesn't use code like that or (ii) that
> >> "garbage" addresses in the range [$sp - 0x8000, $sp + 0x7fff] will not
> >> trigger the problem.  I imagine both are reasonable assumptions, and I'm
> >> perfectly happy for us to make them.  But they're the kind of assumption
> >> we need to state explicitly.
> >
> > Interesting test case.  I've been thinking about it myself but in the end
> > decieded to believe Peter's analysis since he's banged the head for longer
> > to the wall about this problem that I have ;-)  I'm quite but not absolutely
> > certain that this case cannot happen for realworld code, so I'd rather
> > err on the side of caution.
> >
> > Peter & Thomas - we could make the stack thing bullet proof by vmallocing
> > stacks and ensuring a sufficient virtual address gap exists around the stack
> > such that the stack is the only addressable thing in the range of
> > $sp +0x7fff / -0x8000?

...but having an address-gap, virtual or by unused memory, should make it save
even with such code.

Typical "realworld" examples for speculative access to stack-variables are

void foo (int x)             void foo ()
{                            {
  int array[N];                int array[N], i;
  if (x < N)                   for (i = 0; i < N; i++)
    bar (array[x]);              array[i] = 0;
}                            }
i.e. accesses with non-const offsets, i.e. no longer $sp-based, which always
will trigger a CB.

>
> FWIW, my first cut at the option restrictions were based on what
> the patch exempts (and doesn't exempt).  We could instead get gcc
> to only exempt accesses that it can prove are either to the current
> function's stack frame or to its stack arguments.  I.e. rather than
> consider every $sp-based access to be safe, we'd instead do some

"every $sp-based access" (set(mem(plus(sp)(const_int)))) is restricted
to local variables too, with the constant offset being either
- compiler-generated or
- deliberately put in the source (however including the above example)

> bounds checking on the value.
Fine, if that is possible.

>                                (We could also use MEM_ATTRS to
> pick up cases where a stack variable is acceesed via something
> other than the stack or frame pointers, as happens for large frames.)

Aren't these always accesses with non-constant offset, where a CB can't be
avoided, even if they are recognized as being actually relative to $sp ?

>
> >> Peter's patch also treated accesses to constant integer and symbolic
> >> addresses as safe.  Again, this involves making assumptions about how
> >> constant integer and symbolic addresses are used, and this is a much
> >> less obvious assumption than the stack one.
> >
> > The latter assumption is also needed for -msym32 kernels, so it's well
> > proven to be valid.  The former hold, too.
> >
> >>  Again, I understand that
> >> it's a reasonable assumption to make in the linux context, but it's one
> >> we need to pin down.  E.g. there must be no run-time guarding of
> >> target-specific constant integer IO-mapped addresses in cases where
> >> those addresses might trigger the problem on other systems that the
> >> same kernel image supports.
> >
> > In case of a hypothetic multi-platform kernel of which at least one needs
> > the R10000 workarounds, all code would be uniformly compiled with the
> > magic -mr10k-cache-barrier option and all source level workaround would
> > be enabled.
>
> Hmm.  This probably shows I am misunderstanding the problem, but I was
> thinking about the IO-mapped case.  I thought one of the problems was
> that if you had a cached speculative load or store to an access-sensitive
> IO-mapped address, the IO-mapped device might "see" that access even if it
> doesn't take place.  Could you not have a situation where a KSEG0 or
> XKSEG0 access is access-sensitive on one machine and not another?
> The patch won't insert countermeasures before symbolic and constant
> addresses, because it believes all such addresses to be safe.
>

The threat to IO-addresses comes from the addressing register in the speculated
mem-instruction (set(mem(plus(reg)...), containing one of the addresses as
"garbage".

Symbolic addresses are well defined from link-time on, no matter what history
before the access.  They either point (set(mem(plus(symbol_ref)...) to
- some variable in the cached area, what is harmless (unless DMA-related),
  or to
- IO-devices, accessed uncached, i.e. non-speculative,
unless there is a programming-error ;)
The same holds for const_int used as address.

If used for DMA and also directly accessed, symbolic addresses could be
problematic though:
extern char big_fat_dma_buffer[N];
if (!dma_running)
	*big_fat_dma_buffer = 0;
else
	...
(However, as soon as accessed with non-constant offset - e.g. in a loop,... -
the symbol_ref disappears from the mem-instruction which will trigger a CB)

Btw. with 4.x symbolic addresses are practically (without backtrace analysis)
not exempted from CBs, since they no longer show up in the mem-instruction
and (set(mem(lo_sum(reg)... is seen instead.

> I'm also a little worried that the compiler is free to make up accesses
> that didn't exist in the original program, provided that those accesses
The cache-barrier itself ?

> are never actually performed in cases where they'd be wrong.  So how about:
>
> -mr10k-cache-barrier=load-store
>   Insert a cache barrier at the beginning of any sequentially-executed
>   series of instructions that contains a load or store.  For the purposes
>   of this option, GCC can ignore loads and stores that it can prove
>   are an in-range access to:
>
>   (a) the current function's stack frame;
>   (b) an incoming stack argument;
>   (b) an object with a link-time-constant address; or
>   (c) a block of uncached memory
Can we recognize uncached memory in the instruction ?

>
>   It can also ignore sequences that are always immediately preceded by
>   an untaken branch-likely instruction.
Fine!

>
>   Here, a ``sequentially-executed series'' is one in which calls,
>   jumps and branches occur only as the last instruction.
>
> -mr10k-cache-barrier=store
>   Like -mr10k-cache-barrier=load-store, but ignore all loads.
>
> -mr10k-cache-barrier=none
>   ...
>
> Richard
>
>
>

kind regards

peter

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-12 15:26                 ` peter fuerst
@ 2007-12-12 18:09                   ` Richard Sandiford
  2007-12-12 18:22                     ` Richard Sandiford
  0 siblings, 1 reply; 26+ messages in thread
From: Richard Sandiford @ 2007-12-12 18:09 UTC (permalink / raw)
  To: peter fuerst; +Cc: Ralf Baechle, Thomas Bogendoerfer, Kumba, linux-mips

peter fuerst <pf@pfrst.de> writes:
>> Ralf Baechle <ralf@linux-mips.org> writes:
>> >> Then there's the language-lawyerly code I gave to Peter on gcc-patches@:
>> >>
>> >>      void foo (int x)
>> >>      {
>> >>        int array[1];
>> >>        if (x)
>> >>          bar (array[0x1fff]);
>> >>      }
>> >>
>
> A strange method to pass data... Of course, cooking up such an "ABI",
> where local variables are accessed with a const offset that is not known at
> compile-time to be valid, would subvert the test for $sp-based accesses...

Well, as I said when I gave that example originally, it's unlikely that
the example would be written in that form.  But hide the constants and
checks in configurable macros, and the general idea becomes a little
more feasible.

>> FWIW, my first cut at the option restrictions were based on what
>> the patch exempts (and doesn't exempt).  We could instead get gcc
>> to only exempt accesses that it can prove are either to the current
>> function's stack frame or to its stack arguments.  I.e. rather than
>> consider every $sp-based access to be safe, we'd instead do some
>
> "every $sp-based access" (set(mem(plus(sp)(const_int)))) is restricted
> to local variables too, with the constant offset being either
> - compiler-generated or
> - deliberately put in the source (however including the above example)

That's not literally true.  SP+INT addresses can be used to access
stack arguments too, and 4.x can optimise some varargs accesses to
compile-time base+offset addresses.  And as I said, the compiler is
free to make up accesses that aren't in fact valid for cases where
the access isn't made.  E.g. if you had a loop with a stride of 128,
the compiler could unroll the loop as many times as it likes.  Some
of the unrolled iterations might access areas outside the stack frame.
(You would hope that the compiler would be intelligent enough to crop
the iteration count in such cases, because the extra iterations should
never be used in valid code.  But that isn't the point.  The compiler
doesn't _need_ to crop the iteration count for correctness, and we're
talking about something we _do_ need for correctness.)

>> bounds checking on the value.
> Fine, if that is possible.

FWIW, the frame info is available in cfun->machine->frame at the time
your code runs.

>>                                (We could also use MEM_ATTRS to
>> pick up cases where a stack variable is acceesed via something
>> other than the stack or frame pointers, as happens for large frames.)
>
> Aren't these always accesses with non-constant offset, where a CB can't be
> avoided, even if they are recognized as being actually relative to $sp ?

The MEM_ATTRS I meant were MEM_EXPR + MEM_OFFSET, which only apply where
there is a known constant offset.

>> > In case of a hypothetic multi-platform kernel of which at least one needs
>> > the R10000 workarounds, all code would be uniformly compiled with the
>> > magic -mr10k-cache-barrier option and all source level workaround would
>> > be enabled.
>>
>> Hmm.  This probably shows I am misunderstanding the problem, but I was
>> thinking about the IO-mapped case.  I thought one of the problems was
>> that if you had a cached speculative load or store to an access-sensitive
>> IO-mapped address, the IO-mapped device might "see" that access even if it
>> doesn't take place.  Could you not have a situation where a KSEG0 or
>> XKSEG0 access is access-sensitive on one machine and not another?
>> The patch won't insert countermeasures before symbolic and constant
>> addresses, because it believes all such addresses to be safe.
>>
>
> The threat to IO-addresses comes from the addressing register in the speculated
> mem-instruction (set(mem(plus(reg)...), containing one of the addresses as
> "garbage".
>
> Symbolic addresses are well defined from link-time on, no matter what history
> before the access.  They either point (set(mem(plus(symbol_ref)...) to
> - some variable in the cached area, what is harmless (unless DMA-related),
>   or to
> - IO-devices, accessed uncached, i.e. non-speculative,
> unless there is a programming-error ;)
> The same holds for const_int used as address.

I think you're missing my point.  If you access an I/O-mapped device
through KSEG2 or an uncached XKPHYS address, is it not also physically
possible (though clearly unwise) to access it through KSEG0 or a cached
XKPHYS address too?  So can you guarantee that every const_int cached
address in a multi-platform kernel is not I/O-mapped on any of the r10k
platforms?  Or can you guarantee that the compiler will not manufacture
such an address from an otherwise harmless address?  Again, the key thing
is to think about what the compiler can validly do on non-r10k platforms,
however silly it might seem, and then make sure the workarounds cope
with it.

>> I'm also a little worried that the compiler is free to make up accesses
>> that didn't exist in the original program, provided that those accesses
> The cache-barrier itself ?

No, in general.  Optimisers (particularly loop optimisers) can invent
accesses that didn't exist in the original source code.  Normally they
would only be executed in correct circumstances, but with this
speculative execution, that might not be true.

>> are never actually performed in cases where they'd be wrong.  So how about:
>>
>> -mr10k-cache-barrier=load-store
>>   Insert a cache barrier at the beginning of any sequentially-executed
>>   series of instructions that contains a load or store.  For the purposes
>>   of this option, GCC can ignore loads and stores that it can prove
>>   are an in-range access to:
>>
>>   (a) the current function's stack frame;
>>   (b) an incoming stack argument;
>>   (b) an object with a link-time-constant address; or
>>   (c) a block of uncached memory
> Can we recognize uncached memory in the instruction ?

Well, I was just thinking about teaching the compiler about KSEG2,
the always-uncached XKPHYS addresses, etc.  (Sorry for messing up
the bullet letters there!)  The idea is that we have a correlation
between symbolic constants and C objects, so we can check whether
an offset in a symbolic constant is within the object.  We already
have code to do this in other situations.  But there is no correlation
between const_int addresses and C objects, and we cannot be sure that
a given const_int address existed in the original source code, so
I think the only safe thing is to check its uncached properties instead.

I know all this must be frustrating.  I'm sure your patches work great
as they are with current and past kernels, and current and past compilers.
The problem is that, if it becomes a mainline gcc feature, it needs to be
defined from first principles.  And we need to do that without assuming
that the accesses we're looking at existed in the original source code.

FWIW, I'm happy to help update the patch once we've agreed on an
option spec.

Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [UPDATED PATCH] IP28 support
  2007-12-12 18:09                   ` Richard Sandiford
@ 2007-12-12 18:22                     ` Richard Sandiford
  0 siblings, 0 replies; 26+ messages in thread
From: Richard Sandiford @ 2007-12-12 18:22 UTC (permalink / raw)
  To: peter fuerst; +Cc: Ralf Baechle, Thomas Bogendoerfer, Kumba, linux-mips

Richard Sandiford <rsandifo@nildram.co.uk> writes:
> through KSEG2 or an uncached XKPHYS address, is it not also physically

er, I meant KSEG1 of course.  Same mistake later.

Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2008-01-19 23:56 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-12-23  1:44 [UPDATED PATCH] IP28 support peter fuerst
2007-12-23  9:39 ` Richard Sandiford
2007-12-24  0:39   ` post
2008-01-16 19:32   ` peter fuerst
2008-01-19 14:14     ` Richard Sandiford
2008-01-19 23:56       ` post
  -- strict thread matches above, loose matches on Subject: below --
2007-12-02 12:00 Thomas Bogendoerfer
2007-11-29  9:54 Thomas Bogendoerfer
2007-11-29 13:01 ` Ralf Baechle
2007-12-05  6:16   ` Kumba
2007-12-05  9:39     ` Thomas Bogendoerfer
2007-12-05 19:49       ` peter fuerst
2007-12-05 20:37         ` David Daney
2007-12-06 11:44           ` Ralf Baechle
2007-12-06 11:41         ` Ralf Baechle
2007-12-08 17:52       ` Richard Sandiford
2007-12-08 17:52         ` Richard Sandiford
2007-12-08 19:24         ` Ralf Baechle
2007-12-08 20:09           ` Richard Sandiford
2007-12-08 21:25             ` peter fuerst
2007-12-08 23:24               ` Richard Sandiford
2007-12-09  4:38             ` Ralf Baechle
2007-12-10 11:00               ` Richard Sandiford
2007-12-12 15:26                 ` peter fuerst
2007-12-12 18:09                   ` Richard Sandiford
2007-12-12 18:22                     ` Richard Sandiford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox