load_unaligned() and "uld" instruction

Linux MIPS Architecture development

* load_unaligned() and "uld" instruction
@ 2000-09-25 18:48 Jun Sun
  2000-09-25 21:16 ` Dominic Sweetman
  0 siblings, 1 reply; 33+ messages in thread
From: Jun Sun @ 2000-09-25 18:48 UTC (permalink / raw)
  To: linux-mips, linux-mips

The USB sub-system uses "unaligned.h" file to access unaligned data. 
All the unaligned data access functions depend on "uld" and "usw"
instructions, which are not available on many CPUs.

I wonder if there is a version of unaligned access functions which do
not depend on those instructions.  If not, I can probably write one.
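
For illustration only, a version that avoids "uld"/"usw" entirely could be
built on memcpy(), as in the minimal sketch below.  The helper names are
hypothetical and are not taken from the kernel's unaligned.h; the values are
read and written in the CPU's native byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper names, used only for this sketch. */

/* Load a 32-bit value from an address with no alignment guarantee.
 * memcpy() leaves it to the compiler to pick a legal access sequence. */
static inline uint32_t load_unaligned_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Store a 32-bit value to an address with no alignment guarantee. */
static inline void store_unaligned_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* 64-bit load, the portable counterpart of what "uld" is used for. */
static inline uint64_t load_unaligned_u64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Because nothing here names a specific instruction, the same source should
build at any ISA level; whether the generated code is as fast as an
lwl/lwr pair depends on the compiler.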

Any suggestions?

Jun


* Re: load_unaligned() and "uld" instruction
  2000-09-25 18:48 load_unaligned() and "uld" instruction Jun Sun
@ 2000-09-25 21:16 ` Dominic Sweetman
  2000-09-25 21:36   ` Jun Sun
  0 siblings, 1 reply; 33+ messages in thread
From: Dominic Sweetman @ 2000-09-25 21:16 UTC (permalink / raw)
  To: Jun Sun; +Cc: linux-mips, linux-mips

Jun Sun (jsun@mvista.com) writes:

> The USB sub-system uses "unaligned.h" file to access unaligned data. 
> All the unaligned data access functions depend on "uld" and "usw"
> instructions, which are not available on many CPUs.

You won't find the instruction 'uld' in *any* MIPS CPU.

uld is an assembler macro-instruction translating into a 

  ldl
  ldr

pair (the instructions are called load-double-left and
load-double-right).  The exact translation depends on whether you're
running big-endian or little-endian... but the 32-bit version on a
big-endian CPU is that 

  ulw $1, <address>

is assembled as

  lwl $1, <address>
  lwr $1, <address+3>

The way that the load-left and load-right work together is kind of
tricky to get your head round.  
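
To make the pairing more concrete, here is a rough C model of the big-endian
32-bit case.  The function names (lwl_be, lwr_be, ulw_be) are made up for this
sketch; they model what the instructions do to a register, assuming memory is
a plain byte array, and are not part of any toolchain or kernel API.

```c
#include <stdint.h>

/* Read the aligned big-endian word that contains byte address addr. */
static uint32_t aligned_word_be(const uint8_t *mem, uint32_t addr)
{
    const uint8_t *w = mem + (addr & ~3u);
    return ((uint32_t)w[0] << 24) | ((uint32_t)w[1] << 16) |
           ((uint32_t)w[2] << 8)  |  (uint32_t)w[3];
}

/* Model of big-endian lwl: bytes from addr to the end of its aligned word
 * go into the high end of the register; the low bytes are preserved. */
static uint32_t lwl_be(uint32_t reg, const uint8_t *mem, uint32_t addr)
{
    unsigned shift = 8 * (addr & 3u);            /* 0, 8, 16 or 24 */
    uint32_t keep = shift ? (0xFFFFFFFFu >> (32 - shift)) : 0;
    return (aligned_word_be(mem, addr) << shift) | (reg & keep);
}

/* Model of big-endian lwr: bytes from the start of the aligned word up to
 * addr go into the low end of the register; the high bytes are preserved. */
static uint32_t lwr_be(uint32_t reg, const uint8_t *mem, uint32_t addr)
{
    unsigned shift = 8 * (3 - (addr & 3u));      /* 24, 16, 8 or 0 */
    uint32_t keep = shift ? (0xFFFFFFFFu << (32 - shift)) : 0;
    return (aligned_word_be(mem, addr) >> shift) | (reg & keep);
}

/* The ulw macro on a big-endian CPU: lwl at addr, lwr at addr + 3. */
static uint32_t ulw_be(const uint8_t *mem, uint32_t addr)
{
    uint32_t reg = 0;
    reg = lwl_be(reg, mem, addr);
    reg = lwr_be(reg, mem, addr + 3);
    return reg;
}
```

With the bytes 00 11 22 33 44 55 66 77 in memory, ulw_be(mem, 1) first picks
up 11 22 33 with lwl and then fills in 44 with lwr.  Each half touches only
one aligned word, so neither access is itself unaligned.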

So far as I know, all 64-bit MIPS CPUs implement ldl/ldr and the store
equivalents.  MIPS patented these instructions, so clones like Lexra's
don't implement the 32-bit versions (lwl, lwr etc).

-- 
Dominic Sweetman
Algorithmics Ltd
The Fruit Farm, Ely Road, Chittering, CAMBS CB5 9PH, ENGLAND
phone: +44 1223 706200 / fax: +44 1223 706250 / http://www.algor.co.uk


* Re: load_unaligned() and "uld" instruction
  2000-09-25 21:16 ` Dominic Sweetman
@ 2000-09-25 21:36   ` Jun Sun
  2000-09-25 23:29     ` Ralf Baechle
  2000-09-26  6:22     ` Kevin D. Kissell
  0 siblings, 2 replies; 33+ messages in thread
From: Jun Sun @ 2000-09-25 21:36 UTC (permalink / raw)
  To: Dominic Sweetman; +Cc: linux-mips, linux-mips

Dominic Sweetman wrote:
> 
> Jun Sun (jsun@mvista.com) writes:
> 
> > The USB sub-system uses "unaligned.h" file to access unaligned data.
> > All the unaligned data access functions depend on "uld" and "usw"
> > instructions, which are not available on many CPUs.
> 
> You won't find the instruction 'uld' in *any* MIPS CPU.
> 
> uld is an assembler macro-instruction translating into a
> 
>   ldl
>   ldr
> 
> pair (the instructions are called load-double-left and
> load-double-right).  The exact translation depends on whether you're
> running big-endian or little-endian... but the 32-bit version on a
> big-endian CPU is that
> 
>   ulw $1, <address>
> 
> is assembled as
> 
>   lwl $1, <address>
>   lwr $1, <address+3>
> 
> The way that the load-left and load-right work together is kind of
> tricky to get your head round.
> 
> So far as I know, all 64-bit MIPS CPUs implement ldl/ldr and the store
> equivalents.  MIPS patented these instructions, so clones like Lexra's
> don't implement the 32-bit versions (lwl, lwr etc).
> 
> --
> Dominic Sweetman
> Algorithmics Ltd
> The Fruit Farm, Ely Road, Chittering, CAMBS CB5 9PH, ENGLAND
> phone: +44 1223 706200 / fax: +44 1223 706250 / http://www.algor.co.uk

Dominic,

Thanks for the clarification.

I looked at my problem again, and it turns out that it was caused by
the "-mips2" compiler option.  If I use "-mips3", the complaint goes away,
which seems to make sense - assuming "uld" and "usw" were introduced in
MIPS III.

This actually brings up another question (which I thought I had posted
before).  Take a look at arch/mips/Makefile: you will find that most CPUs
use the -mips2 compiler option.  While -mips2 is safe, it cannot take
advantage of "uld" etc.  Is there any reason that we don't want to use
-mips3, at least for some of the later CPUs?

If we have to use the "-mips2" option, is there a clean way which allows us
to use the "uld/usw" instructions (instead of manually tweaking the
compilation for each file that uses them)?

Another question is that in the same file most CPUs will take another
compiler option such as "-mcpu=r8000", in which case the cpu model
usually does NOT correspond to the actual CPU.  Why is that?

Thanks.

Jun


* Re: load_unaligned() and "uld" instruction
  2000-09-25 21:36   ` Jun Sun
@ 2000-09-25 23:29     ` Ralf Baechle
  2000-09-26  6:22     ` Kevin D. Kissell
  1 sibling, 0 replies; 33+ messages in thread
From: Ralf Baechle @ 2000-09-25 23:29 UTC (permalink / raw)
  To: Jun Sun; +Cc: Dominic Sweetman, linux-mips, linux-mips

On Mon, Sep 25, 2000 at 02:36:39PM -0700, Jun Sun wrote:

> I looked at my problem again, and it turns out that it was caused by
> the "-mips2" compiler option.  If I use "-mips3", the complaint goes away,
> which seems to make sense - assuming "uld" and "usw" were introduced in
> MIPS III.
> 
> This actually brings up another question (which I thought I had posted
> before).  Take a look at arch/mips/Makefile: you will find that most CPUs
> use the -mips2 compiler option.  While -mips2 is safe, it cannot take
> advantage of "uld" etc.  Is there any reason that we don't want to use
> -mips3, at least for some of the later CPUs?

You cannot use any kind of 64-bit operation in the 32-bit kernel except
for the $zero register.  This is because all exceptions, as far as they
store / restore the integer registers at all, will only deal with the lower
32 bits of the registers.  In other words, any interrupt will corrupt the
upper 32 bits of the gp registers.
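
A small C model may help show the effect.  It assumes the exception path
saves a register with a 32-bit store and restores it with a 32-bit load, and
that the load sign-extends into the 64-bit register as lw does on a MIPS III
CPU.  The function name is hypothetical.

```c
#include <stdint.h>

/* Model of a 64-bit register passing through a 32-bit save/restore:
 * "sw" keeps only the low 32 bits, and "lw" sign-extends them back
 * into the full register. */
static uint64_t exception_roundtrip32(uint64_t reg)
{
    uint32_t slot = (uint32_t)reg;              /* sw: low 32 bits only */
    return (uint64_t)(int64_t)(int32_t)slot;    /* lw: sign-extended    */
}
```

A register holding 0x123456789ABCDEF0 before the interrupt comes back as
0xFFFFFFFF9ABCDEF0, while any value that was already a sign-extended 32-bit
quantity survives unchanged.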

Some time ago I tried to enable the use of the full 64-bit registers in
the kernel - it ended up ugly as hell, especially because we still want
to be able to share most of the code with the R3000.

> If we have to use the "-mips2" option, is there a clean way which allows us
> to use the "uld/usw" instructions (instead of manually tweaking the
> compilation for each file that uses them)?
> 
> Another question is that in the same file most CPUs will take another
> compiler option such as "-mcpu=r8000", in which case the cpu model
> usually does NOT correspond to the actual CPU.  Why is that?

-mcpu=<somecpu> chooses what CPU gcc will schedule instructions for.  No
matter what value you choose for <somecpu> the code will run on all CPUs.
-mips<n> chooses which ISA level gcc will generate code for; that code
won't run on CPUs with an ISA level less than <n>.

  Ralf


* Re: load_unaligned() and "uld" instruction
  2000-09-25 21:36   ` Jun Sun
  2000-09-25 23:29     ` Ralf Baechle
@ 2000-09-26  6:22     ` Kevin D. Kissell
  2000-09-26  6:22       ` Kevin D. Kissell
                         ` (2 more replies)
  1 sibling, 3 replies; 33+ messages in thread
From: Kevin D. Kissell @ 2000-09-26  6:22 UTC (permalink / raw)
  To: Jun Sun, Dominic Sweetman; +Cc: linux-mips, linux-mips

> Dominic,
> 
> Thanks for the clarification.

I'll second that - he beat me to it!

> I looked at my problem again, and it turns out that it was caused by
> the "-mips2" compiler option.  If I use "-mips3", the complaint goes away,
> which seems to make sense - assuming "uld" and "usw" were
> introduced in MIPS III.

The "load word left/right" and "store word left/right" instructions are
part of the original MIPS I ISA.  On the other hand, "uld" represents
a load of an unaligned quad or "doubleword" of 64 bits, and uses the
64-bit load double left/right instructions that are part of the 64-bit
MIPS III ISA.

> This actually brings up another question (which I thought I had posted
> before).  Take a look at arch/mips/Makefile: you will find that most CPUs
> use the -mips2 compiler option.  While -mips2 is safe, it cannot take
> advantage of "uld" etc.  Is there any reason that we don't want to use
> -mips3, at least for some of the later CPUs?
> 
> If we have to use the "-mips2" option, is there a clean way which allows us
> to use the "uld/usw" instructions (instead of manually tweaking the
> compilation for each file that uses them)?

This is a general problem that I've had to fight with the 
"main line" MIPS/Linux distribution.  Most of the work
being done is being done on SGI platforms, and all
SGI systems since the Crimson have had 64-bit CPUs.
Older DECStations use R3000s, and more importantly,
many of the new embedded MIPS designs use "MIPS32"
processors that have R4000-like system coprocessors,
but only 32-bit data paths.  I had to do a fairly complete
redesign of the 2.2 semaphore support code, for example,
in order to get it to rely only on the 32-bit forms of load
locked and store conditional.  It's clear that I'll have to do
something similar with the unaligned accesses in the USB 
support code before it will run on the MIPS 4Kc and 
similar CPUs.
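
As a loose illustration of relying only on 32-bit atomic operations, the
sketch below uses a C11 compare-and-swap retry loop, which has the same shape
as a 32-bit ll/sc loop (load, compute, conditionally store, retry on failure).
This is not the 2.2 semaphore code; the type and function names are
hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit counting semaphore, for illustration only. */
typedef struct {
    _Atomic int32_t count;
} sem32_t;

/* Try to take the semaphore without blocking.  The loop mirrors an
 * ll/sc sequence: read the 32-bit count, and retry if the conditional
 * update fails because the count changed in the meantime. */
static bool sem32_trydown(sem32_t *s)
{
    int32_t old = atomic_load(&s->count);
    while (old > 0) {
        /* On failure, old is refreshed with the current count. */
        if (atomic_compare_exchange_weak(&s->count, &old, old - 1))
            return true;
    }
    return false;
}

/* Release the semaphore. */
static void sem32_up(sem32_t *s)
{
    atomic_fetch_add(&s->count, 1);
}
```

Only a 32-bit word is ever read or written atomically here, which is the
property needed on CPUs that lack the 64-bit forms.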

> Another question is that in the same file most CPUs will take another
> compiler option such as "-mcpu=r8000", in which case the cpu model
> usually does NOT correspond to the actual CPU.  Why is that?

The -mcpu tells the compiler and assembler for what kind
of pipeline it should optimise, which is independent of the
ISA level.  "-mcpu=r8000", for example, tells the tools that
the CPU is superscalar. Thus one sees that option selected 
for the R5000 platforms, even though the R5000 and R8000
pipelines are otherwise very dissimilar.

            Regards,

            Kevin K.


* Re: load_unaligned() and "uld" instruction
  2000-09-26  6:22     ` Kevin D. Kissell
  2000-09-26  6:22       ` Kevin D. Kissell
@ 2000-09-26  9:08       ` Dominic Sweetman
  2000-09-26  9:08         ` Dominic Sweetman
  2000-09-29 17:22         ` Ralf Baechle
  2000-09-26 18:04       ` Jun Sun
  2 siblings, 2 replies; 33+ messages in thread
From: Dominic Sweetman @ 2000-09-26  9:08 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Jun Sun, Dominic Sweetman, linux-mips, linux-mips

Kevin D. Kissell (kevink@mips.com) writes:

> > Another question is that in the same file most CPUs will take another
> > compiler option such as "-mcpu=r8000", in which case the cpu model
> > usually does NOT correspond to the actual CPU.  Why is that?
> 
> The -mcpu tells the compiler and assembler for what kind
> of pipeline it should optimise, which is independent of the
> ISA level.  "-mcpu=r8000", for example, tells the tools that
> the CPU is superscalar. Thus one sees that option selected 
> for the R5000 platforms, even though the R5000 and R8000
> pipelines are otherwise very dissimilar.

Hmm.  I wish it was that simple.  But some MIPS CPUs have 
instruction set additions which are not related to the mips1, mips2,
etc.  For example, a whole collection of parts with a vaguely
"embedded" orientation has integer multiply/accumulate instructions.

Algorithmics' version of GCC (and, I'm sure, others) picks up on the
-mcpu=xxx flag to do that.  In fact, I don't think there's any other
way to allow the compiler to warn you of some bizarre omissions from
one or two rogue CPUs.

But until compiler support for MIPS Linux is more systematic, you'd be
better being conservative.  And you don't want to unnecessarily
multiply kernel versions - so in general, don't say "-mcpu=" anything
for kernel builds.

The Linux convention is "-mips2"; which is quite odd, because the
MIPS-II ISA was incarnate in just one CPU (the R6000).  A few units
were made around 1990 and even fewer worked; the project was overtaken
by the (-mips3, 64-bit) R4000.

Subsequently, and confusingly, "-mips2" has been re-used to mean
"-mips3 but don't assume 64-bit registers".  Except for floating
point.  Maybe.  (it's sometimes not a good idea to re-use a term).

Ralf wrote:

> You cannot use any kind of 64-bit operation for the 32-bit kernel...

Outside SGI circles, I believe, "32-bit kernels" are all that are
likely to work...

> ... except for the $zero register.  This is because all exceptions,
> as far as they store / restore the integer registers at all, will
> only deal with the lower 32 bits of the registers.  In other words, any
> interrupt will corrupt the upper 32 bits of the gp registers.

Even calling a subroutine compiled 32-bit may corrupt one of the
registers which are supposed to be preserved.

As Kevin indicates, it would probably be worth some effort to converge
on a kernel which would:

1. build for either 32-bit ("MIPS32" and near-miss) and 64-bit
  (MIPS3, MIPS4 and MIPS64) CPUs.

2. Allow 64-bit operations on 64-bit CPUs, without insisting that
   C data types grow.  Need to save the whole of registers and compile
   "long long" and "double" data types...

This is possible, but needs some thought.  AFAIK, the GCC currently
used for Linux changes the whole calling convention when -mips3 is
selected, which makes (2) pretty difficult.

-- 
Dominic Sweetman
Algorithmics Ltd
The Fruit Farm, Ely Road, Chittering, CAMBS CB5 9PH, ENGLAND
phone: +44 1223 706200 / fax: +44 1223 706250 / http://www.algor.co.uk


* Re: load_unaligned() and "uld" instruction
  2000-09-26  9:08       ` Dominic Sweetman
  2000-09-26  9:08         ` Dominic Sweetman
@ 2000-09-29 17:22         ` Ralf Baechle
  2000-10-09 14:49           ` Dominic Sweetman
  1 sibling, 1 reply; 33+ messages in thread
From: Ralf Baechle @ 2000-09-29 17:22 UTC (permalink / raw)
  To: Dominic Sweetman; +Cc: Kevin D. Kissell, linux-mips, linux-mips

On Tue, Sep 26, 2000 at 10:08:15AM +0100, Dominic Sweetman wrote:

> Hmm.  I wish it was that simple.  But some MIPS CPUs have 
> instruction set additions which are not related to the mips1, mips2,
> etc.  For example, a whole collection of parts with a vaguely
> "embedded" orientation has integer multiply/accumulate instructions.
> 
> Algorithmics' version of GCC (and, I'm sure, others) picks up on the
> -mcpu=xxx flag to do that.  In fact, I don't think there's any other
> way to allow the compiler to warn you of some bizarre omissions from
> one or two rogue CPUs.

Ouch.  The gcc documentation says this:

`-mcpu=CPU TYPE'
     Assume the defaults for the machine type CPU TYPE when scheduling
     instructions.  The choices for CPU TYPE are `r2000', `r3000',
     `r4000', `r4400', `r4600', and `r6000'.  While picking a specific
     CPU TYPE will schedule things appropriately for that particular
     chip, the compiler will not generate any code that does not meet
     level 1 of the MIPS ISA (instruction set architecture) without the
     `-mips2' or `-mips3' switches being used.

So in other words I wouldn't expect anything like mmad to be used unless
-mmad is also chosen.  -mcpu not influencing the set of instructions
used to build a program is a general gcc convention, not only for
MIPS.  So if the Algorithmics compiler does things differently I'd consider
it to be off the track.

> But until compiler support for MIPS Linux is more systematic, you'd be
> better being conservative.  And you don't want to unnecessarily
> multiply kernel versions - so in general, don't say "-mcpu=" anything
> for kernel builds.

> The Linux convention is "-mips2"; which is quite odd, because the
> MIPS-II ISA was incarnate in just one CPU (the R6000).  A few units
> were made around 1990 and even fewer worked; the project was overtaken
> by the (-mips3, 64-bit) R4000.
> 
> Subsequently, and confusingly, "-mips2" has been re-used to mean
> "-mips3 but don't assume 64-bit registers".  Except for floating
> point.  Maybe.  (it's sometimes not a good idea to re-use a term).

In the kernel we actually don't care very much about floating point.

> Outside SGI circles, I believe, "32-bit kernels" are all that are
> likely to work...

Currently.  Some embedded people are actually asking for more than the
512MB memory supported by the 32-bit kernel.  So expect the 64-bit
kernel to become the predominant race in the not too distant future.
Also expect embedded SMP kernels in the not too far future.

No, I don't feel at all like adding highmem support to the 32-bit kernel.

> > ... except for the $zero register.  This is because all exceptions,
> > as far as they store / restore the integer registers at all, will
> > only deal with the lower 32 bits of the registers.  In other words, any
> > interrupt will corrupt the upper 32 bits of the gp registers.
> 
> Even calling a subroutine compiled 32-bit may corrupt one of the
> registers which are supposed to be preserved.

Sure, but that's kind of expected and obvious when following the
instruction sequence as it gets executed while the corruption by an
exception was pretty unobvious when I first ran into it ...

> As Kevin indicates, it would probably be worth some effort to converge
> on a kernel which would:
> 
> 1. build for either 32-bit ("MIPS32" and near-miss) and 64-bit
>   (MIPS3, MIPS4 and MIPS64) CPUs.
> 
> 2. Allow 64-bit operations on 64-bit CPUs, without insisting that
>    C data types grow.  Need to save the whole of registers and compile
>    "long long" and "double" data types...

I was thinking about moving all the 64-bit CPUs over to the mips64 kernel
and leaving the `mips' kernel to the true 32-bit stuff.  If you go and
download a 2.0.14 tarball you'll see that I already once tried to support
full 64-bit operation but only a 32-bit address space, together with
real 32-bit CPUs, in the `mips' architecture.  The result was fairly ugly,
so having learned from that I would prefer to keep 32-bit and 64-bit
stuff separate.

Most users will currently still not want to use a 64-bit address space
for apps.  That's ok, we can add support for 2-level page tables to
`mips64'.  That's already been done for example for x86 and looks
fairly sane and maintainable.

> This is possible, but needs some thought.  AFAIK, the GCC currently
> used for Linux changes the whole calling convention when -mips3 is
> selected, which makes (2) pretty difficult.

The calling conventions used by -mips3 are slightly confusing, if not even
dangerous.  Older gccs use a non-standard calling convention which essentially
is a blind extension of the 32-bit ABI to 64-bit.  Newer gccs support
the N32 and 64 ABIs.  Unfortunately gcc currently does not support building
a single compiler that supports all three of the 32, N32 and 64 ABIs.

  Ralf


* Re: load_unaligned() and "uld" instruction
  2000-09-29 17:22         ` Ralf Baechle
@ 2000-10-09 14:49           ` Dominic Sweetman
  0 siblings, 0 replies; 33+ messages in thread
From: Dominic Sweetman @ 2000-10-09 14:49 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: Dominic Sweetman, sde, Kevin D. Kissell, linux-mips, linux-mips

It started when I wrote:

> > Hmm.  I wish it was that simple.  But some MIPS CPUs have 
> > instruction set additions which are not related to the mips1, mips2,
> > etc.  For example, a whole collection of parts with a vaguely
> > "embedded" orientation has integer multiply/accumulate instructions.
> > 
> > Algorithmics' version of GCC (and, I'm sure, others) picks up on the
> > -mcpu=xxx flag to do that.  In fact, I don't think there's any other
> > way to allow the compiler to warn you of some bizarre omissions from
> > one or two rogue CPUs.

Ralf Baechle (ralf@oss.sgi.com) replied:

> Ouch.  The gcc documentation says this:
> 
> `-mcpu=CPU TYPE'
>      Assume the defaults for the machine type CPU TYPE when
>      scheduling instructions.  The choices for CPU TYPE are `r2000',
>      `r3000', `r4000', `r4400', `r4600', and `r6000'.  While picking
>      a specific CPU TYPE will schedule things appropriately for that
>      particular chip, the compiler will not generate any code that
>      does not meet level 1 of the MIPS ISA (instruction set
>      architecture) without the `-mips2' or `-mips3' switches being
>      used.
> 
> So in other words I wouldn't expect anything like mmad to be used
> unless -mmad is also chosen.  -mcpu not influencing the set
> of instructions used to build a program is a general gcc
> convention, not only for MIPS.  So if the Algorithmics compiler does
> things differently I'd consider it to be off the track.

I think we comply with a somewhat weaker reading of the same
paragraph, in that no "MIPS III" instruction will be used unless you
say -mips3 (or greater).

I could also be pedantic and point out that the effect of
"-mcpu=r3900" (for instance) is not defined by that quotation...

-mmad: as you know (but all readers might not) the integer
multiply-accumulate instructions are not in *any* numbered MIPS
instruction set - at least not until MIPS32, which is a different
series.  If they existed as a single, coherent add-on a single "-mmad"
flag would be the best solution - but they don't: no two
manufacturers' implementations are quite the same.

And the Vr41xx is MIPS III, except that it leaves out the "semaphore"
instructions LL and SC.  We want our toolchain to know these
instructions aren't there, and it seems natural to overload the
-mcpu=r4100 flag for this purpose.  Perhaps we'll propose a change to
the manual!

> > Outside SGI circles, I believe, "32-bit kernels" are all that are
> > likely to work...
> 
> Currently.  Some embedded people are actually asking for more than
> the 512MB memory supported by the 32-bit kernel.  So expect the
> 64-bit kernel to become the predominant race in the not too distant
> future.

I can see why that might be sensible.  Most MIPS CPUs except the
lowest-end are now 64-bit, so why try to fix the memory limitation
twice?

I can sketch some reasons, though, why this might not be automatically
and obviously correct outside SGI:

1. Linux on other architectures doesn't depend on being able to
   address the whole of physical memory through an "unmapped" window
   like MIPS' kseg0.  (So this dependency can't extend into
   machine-independent code).

2. One effect of making the kernel 64-bit will be the memory
   swallowed by all those double-size pointers.

3. You're missing the advantage of a neat trick in the MIPS
   architecture, where 32-bit code running on a mips3+ CPU
   automatically "sign-extends" 32-bit pointers to generate valid
   64-bit addresses.

   So it's not obvious why you shouldn't go the other way, and use
   32-bit pointers inside a kernel which supports 64-bit-pointer
   applications.  
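
The sign-extension trick in point 3 can be sketched in C as follows.  The
function name is hypothetical, and the example addresses assume the usual
MIPS layout with kseg0 starting at 0x80000000.

```c
#include <stdint.h>

/* Model of how a 32-bit pointer value becomes a 64-bit address on a
 * mips3+ CPU: the 32-bit value is sign-extended, so bit 31 is copied
 * into bits 32..63. */
static uint64_t sign_extend_ptr32(uint32_t ptr32)
{
    return (uint64_t)(int64_t)(int32_t)ptr32;
}
```

A user-space pointer such as 0x00400000 maps to the same value in the low
end of the 64-bit space, while a kseg0 pointer such as 0x80000000 becomes
0xFFFFFFFF80000000, which lands in the compatibility region at the top of
the 64-bit address space.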

> Also expect embedded SMP kernels in the not to far future.

That's orthogonal to the pointer size.

> > Even calling a subroutine compiled 32-bit may corrupt one of the
> > registers which are supposed to be preserved.
> 
> Sure, but that's kind of expected and obvious when following the
> instruction sequence as it gets executed while the corruption by an
> exception was pretty unobvious when I first ran into it ...

(With me it was the other way around... the interrupt problem was
obvious, but I found it harder to see how the C compiler puns data
between compiler-world types and "register" data types.)

I guess anyone interested needs to be very careful to make the
distinction (familiar to old hands) between:

1. Using a "64-bit capable" CPU (MIPS III or higher), which has 64-bit
   registers, data path and so on...

2. Compiling in an environment where some C variables are implemented
   with 64-bit mips3+ instructions or rely on 64-bit registers.

3. Compiling in an environment where C pointers become 64-bit objects.

It's easy to slip into saying "64-bit" to mean "whichever of
these I'm currently thinking of."

You mentioned Kevin's suggested virtues for a kernel:

> > 1. build for either 32-bit ("MIPS32" and near-miss) and 64-bit
> >   (MIPS3, MIPS4 and MIPS64) CPUs.

Kevin works for MIPS, who have invented MIPS32 to try to stem
incompatible proliferation of the instruction set of MIPS CPUs with
only 32-bit registers and data paths.  This is still new - few, if
any, MIPS32 CPUs have shipped in systems yet.

Linux kernels to run on 32-bit CPUs should perhaps rely on just the
MIPS I instruction set plus a usable TLB (MIPS MMU hardware).  It's
true there are two major branches of the CPU-control instructions, but
it's not that hard to cover up, and surely not a good use of scarce
resources to assume compliance to MIPS32 just now.

> > 2. Allow 64-bit operations on 64-bit CPUs, without insisting that
> >    [standard integer/pointer] C data types grow.  Need to save the
> >    whole of registers and compile "long long" and "double" data
> >    types...

Algorithmics thought that was a good idea, and it's a door we've kept
open to our "embedded" customers, where 64-bit pointers are not
much wanted.  It does create a lot of unexpected side-effects in
return for rather intangible benefits, so I sympathise with Ralf on
that one.

> I was thinking about moving all the 64-bit CPUs over to the mips64
> kernel and leave the `mips' kernel to the true 32-bit stuff.

I think by "64-bit CPUs" you mean all of my (1-3) above, and by "true
32-bit stuff" you mean... I'm really not sure what.

Somewhere buried under this is the problem of maintaining a Linux/MIPS
kernel and providing any kind of confidence that it will (at any
particular version) build and run correctly on "any reasonable MIPS
CPU".

To provide stability on variant platforms means identifying the
interfaces between variant-dependent and -independent code, freezing
those interfaces and treating them with great respect.  I think that's
still foreign to most of the Linux community, because they've grown up
with PCs.

It may simply be the best decision to allow the MIPS kernel landscape
to fragment into islands, with the "compatibility" layer at the
kernel/application interface (and some informal conventions to ease
device driver porting).

> Most users will currently still not want to use a 64-bit address
> space for apps.  That's ok, we can add support for 2-level page
> tables to `mips64'.

I can't see, offhand, why a kernel which can map a large user space
for applications with 64-bit pointers should require different page
tables for applications which use 32-bit pointers.  32-bit pointers
generate perfectly good 64-bit addresses.  The userspace layout of
32-bit-pointer applications needs to feature stack space (for example)
within reach of the 32-bit pointers - but does it really need such
large changes to the VM code?

> The calling conventions used by -mips3 are slightly confusing, if not
> even dangerous.  Older gccs use a non-standard calling convention
> which essentially is a blind extension of the 32-bit ABI to
> 64-bit...
>
> Newer gccs support the N32 and 64 ABIs.  Unfortunately currently gcc
> does not support building a single compiler that supports all three
> 32, N32 and 64 ABIs.

While it would be nice to fix it, a single compiler which does all
three is perhaps not so critical... Using o32 puts you in such a
different universe that having a separate compiler is not such a big
deal. 

-- 
Dominic Sweetman
Algorithmics Ltd
The Fruit Farm, Ely Road, Chittering, CAMBS CB5 9PH, ENGLAND
phone: +44 1223 706200 / fax: +44 1223 706250 / http://www.algor.co.uk

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-09-26  6:22     ` Kevin D. Kissell
  2000-09-26  6:22       ` Kevin D. Kissell
  2000-09-26  9:08       ` Dominic Sweetman
@ 2000-09-26 18:04       ` Jun Sun
  2000-09-27 10:06         ` Maciej W. Rozycki
  2000-10-05 12:13         ` Ralf Baechle
  2 siblings, 2 replies; 33+ messages in thread
From: Jun Sun @ 2000-09-26 18:04 UTC (permalink / raw)
  To: Kevin D. Kissell, ralf; +Cc: Dominic Sweetman, linux-mips, linux-mips

"Kevin D. Kissell" wrote:
> >
> > If we have to use "-mips2" option, is there a clean way which allows us
> > to use "uld/usw" instructions (instead of manually tweaking the compilation
> > for each file that uses them)?
>

Ralf, before the perfect solution is found, the following patch makes
the gcc complaint go away.  It just uses the ".set mips3" pragma.
 
> It's clear that I'll have to do
> something similar with the unaligned accesses in the USB
> support code before it will run on the MIPS 4Kc and
> similar CPUs.
> 

I am pretty close to getting USB running with v2.4-test5.  The unaligned
access is the minor problem.  The bigger problem I am fighting with now
is bus_to_virt()/virt_to_bus() and USB interrupt.

Jun

=====================================

--- linux/include/asm-mips/unaligned.h.orig     Mon Sep 25 14:02:52 2000
+++ linux/include/asm-mips/unaligned.h  Tue Sep 26 10:53:31 2000
@@ -19,7 +19,7 @@
 {
        unsigned long long __res;
 
-       __asm__("uld\t%0,(%1)"
+       __asm__(".set\tmips3\n\tuld\t%0,(%1)"
                :"=&r" (__res)
                :"r" (__addr));
 
@@ -33,7 +33,7 @@
 {
        unsigned long __res;
 
-       __asm__("ulw\t%0,(%1)"
+       __asm__(".set\tmips3\n\tulw\t%0,(%1)"
                :"=&r" (__res)
                :"r" (__addr));
 
@@ -47,7 +47,7 @@
 {
        unsigned long __res;
 
-       __asm__("ulh\t%0,(%1)"
+       __asm__(".set\tmips3\n\tulh\t%0,(%1)"
                :"=&r" (__res)
                :"r" (__addr));
 
@@ -60,7 +60,7 @@
 extern __inline__ void stq_u(unsigned long __val, unsigned long long * __addr)
 {
        __asm__ __volatile__(
-               "usd\t%0,(%1)"
+               ".set\tmips3\n\tusd\t%0,(%1)"
                : /* No results */
                :"r" (__val),
                 "r" (__addr));
@@ -72,7 +72,7 @@
 extern __inline__ void stl_u(unsigned long __val, unsigned int * __addr)
 {
        __asm__ __volatile__(
-               "usw\t%0,(%1)"
+               ".set\tmips3\n\tusw\t%0,(%1)"
                : /* No results */
                :"r" (__val),
                 "r" (__addr));
@@ -84,7 +84,7 @@
 extern __inline__ void stw_u(unsigned long __val, unsigned short * __addr)
 {
        __asm__ __volatile__(
-               "ush\t%0,(%1)"
+               ".set\tmips3\n\tush\t%0,(%1)"
                : /* No results */
                :"r" (__val),
                 "r" (__addr));

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-09-26 18:04       ` Jun Sun
@ 2000-09-27 10:06         ` Maciej W. Rozycki
  2000-10-06  0:43           ` Ralf Baechle
  2000-10-05 12:13         ` Ralf Baechle
  1 sibling, 1 reply; 33+ messages in thread
From: Maciej W. Rozycki @ 2000-09-27 10:06 UTC (permalink / raw)
  To: Jun Sun; +Cc: Kevin D. Kissell, ralf, Dominic Sweetman, linux-mips, linux-mips

On Tue, 26 Sep 2000, Jun Sun wrote:

> --- linux/include/asm-mips/unaligned.h.orig     Mon Sep 25 14:02:52 2000
> +++ linux/include/asm-mips/unaligned.h  Tue Sep 26 10:53:31 2000
> @@ -19,7 +19,7 @@
>  {
>         unsigned long long __res;
>  
> -       __asm__("uld\t%0,(%1)"
> +       __asm__(".set\tmips3\n\tuld\t%0,(%1)"
>                 :"=&r" (__res)
>                 :"r" (__addr));
>  
[etc.]

 Please don't.  Gcc already has means to generate proper unaligned
accesses.  See include/asm-alpha/unaligned.h for how to achieve them in a
portable way (i.e. using packed structs) without the problematic inline
asm.

 And please use ".set mips0" (or ".set push" and ".set pop",
appropriately) after using any ".set mips*" directive (or any other ".set"
directive, for that matter) so as not to adversely affect any other code.
Improper coding of such constructs bites R3K people badly.

 Better yet, configure your compiler appropriately and avoid switching ISA
levels in the code if at all possible.
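
For reference, the packed-struct technique mentioned above can be sketched roughly as follows.  This is a minimal sketch, assuming gcc (or a compatible compiler) and its "packed" and "may_alias" type attributes; the type and function names are made up for illustration and are not the ones used in include/asm-alpha/unaligned.h.

```c
#include <stdint.h>

/* Hypothetical wrapper types.  "packed" drops the alignment requirement
 * to 1, so the compiler has to emit an access sequence that is safe at
 * any byte address.  "may_alias" lets these types overlay arbitrary
 * buffers without tripping type-based alias analysis. */
struct una_u32 { uint32_t v; } __attribute__((packed, may_alias));
struct una_u64 { uint64_t v; } __attribute__((packed, may_alias));

/* Load a 32-bit value from an address with no alignment guarantee. */
static inline uint32_t load_u32_una(const void *p)
{
	return ((const struct una_u32 *)p)->v;
}

/* Load a 64-bit value from an address with no alignment guarantee. */
static inline uint64_t load_u64_una(const void *p)
{
	return ((const struct una_u64 *)p)->v;
}

/* Store a 32-bit value to an address with no alignment guarantee. */
static inline void store_u32_una(uint32_t val, void *p)
{
	((struct una_u32 *)p)->v = val;
}
```

Because no inline asm and no ".set" directive is involved, the compiler stays in charge of which ISA level the generated sequence uses.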

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-09-27 10:06         ` Maciej W. Rozycki
@ 2000-10-06  0:43           ` Ralf Baechle
  2000-10-06  9:54             ` Maciej W. Rozycki
  0 siblings, 1 reply; 33+ messages in thread
From: Ralf Baechle @ 2000-10-06  0:43 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Jun Sun, Kevin D. Kissell, Dominic Sweetman, linux-mips,
	linux-mips

On Wed, Sep 27, 2000 at 12:06:31PM +0200, Maciej W. Rozycki wrote:

>  Please don't.  Gcc already has means to generate proper unaligned
> accesses.  See include/asm-alpha/unaligned.h for how to achieve them in a
> portable way (i.e. using packed structs) without the problematic inline
> asm.

That's all very nice and guess what - I tried it when I originally wrote
unaligned.h for Linux.  Try building the mentioned Alpha code with an older
compiler like egcs 1.0.3a and take a look at it [1].  23 instructions for
loading a doubleword - that's just mind-boggling.

  Ralf

[1] free barf bag on request.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-10-06  0:43           ` Ralf Baechle
@ 2000-10-06  9:54             ` Maciej W. Rozycki
  2000-10-06 16:21               ` Ralf Baechle
  0 siblings, 1 reply; 33+ messages in thread
From: Maciej W. Rozycki @ 2000-10-06  9:54 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: Jun Sun, Kevin D. Kissell, Dominic Sweetman, linux-mips,
	linux-mips

On Fri, 6 Oct 2000, Ralf Baechle wrote:

> That's all very nice and guess what - I tried it when I originally wrote
> unaligned.h for Linux.  Try building the mentioned Alpha code with an older
> compiler like egcs 1.0.3a and take a look at it [1].  23 instructions for
> loading a doubleword - that's just mind-boggling.

 Have you actually looked at the code?  They fall back to an inline asm
for pre-egcs 1.1.2 for exactly that reason for now.  It's surprising,
OTOH, as I am sure native egcs 1.0.3 did build a proper lwl/lwr sequence
for me on Ultrix a few years ago...  Maybe it's just a MIPS backend
configuration problem for other targets? 

 I vote for dual code for now and then we may remove the egcs 1.0.3
compatibility cruft one day (for 2.6, for example). 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-10-06  9:54             ` Maciej W. Rozycki
@ 2000-10-06 16:21               ` Ralf Baechle
  0 siblings, 0 replies; 33+ messages in thread
From: Ralf Baechle @ 2000-10-06 16:21 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Jun Sun, Kevin D. Kissell, Dominic Sweetman, linux-mips,
	linux-mips

On Fri, Oct 06, 2000 at 11:54:18AM +0200, Maciej W. Rozycki wrote:

>  I vote for dual code for now and then we may remove the egcs 1.0.3
> compatibility cruft one day (for 2.6, for example). 

Not much point in that - with current compilers we end up with code that
performs identically for both the C and assembler variants.  So whenever we
finally retire egcs 1.0.3 I think we should switch completely to the
new compiler and the unaligned.h written in C.

  Ralf

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-09-26 18:04       ` Jun Sun
  2000-09-27 10:06         ` Maciej W. Rozycki
@ 2000-10-05 12:13         ` Ralf Baechle
  2000-10-06  1:11           ` Jun Sun
  1 sibling, 1 reply; 33+ messages in thread
From: Ralf Baechle @ 2000-10-05 12:13 UTC (permalink / raw)
  To: Jun Sun; +Cc: Kevin D. Kissell, Dominic Sweetman, linux-mips, linux-mips

On Tue, Sep 26, 2000 at 11:04:12AM -0700, Jun Sun wrote:

> > > If we have to use "-mips2" option, is there a clean way which allows us
> > > to use "uld/usw" instructions (instead of manually tweaking the compilation
> > > for each file that uses them)?
> >
> 
> Ralf, before the perfect solution is found, the following patch makes
> the gcc complaint go away.  It just uses the ".set mips3" pragma.

It's still perfectly broken.  Uld is a 64-bit instruction, meaning you could
still get into problems with register corruption or even reserved instruction
exceptions on 32-bit CPUs.  Not to mention that nobody noticed that
the constraints of the inline assembler were broken for all access sizes,
plus there was a cast that would have cut off the upper 32 bits of a 64-bit
access in any case.  That's fixed now.

> I am pretty close to getting USB running with v2.4-test5.  The unaligned
> access is the minor problem.  The bigger problem I am fighting with now
> is bus_to_virt()/virt_to_bus() and USB interrupt.

The unaligned exception handler is fairly expensive.  I suggest you
try to get proper alignment and, where that is not possible, go through
the entire code and use get_unaligned.  It's going to make a noticeable
difference in performance.

  Ralf

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-10-05 12:13         ` Ralf Baechle
@ 2000-10-06  1:11           ` Jun Sun
  2000-10-05 19:41             ` Kevin D. Kissell
  0 siblings, 1 reply; 33+ messages in thread
From: Jun Sun @ 2000-10-06  1:11 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Kevin D. Kissell, Dominic Sweetman, linux-mips, linux-mips

Ralf Baechle wrote:
> 
> On Tue, Sep 26, 2000 at 11:04:12AM -0700, Jun Sun wrote:
> 
> > > > If we have to use "-mips2" option, is there a clean way which allows us
> > > > to use "uld/usw" instructions (instead of manually tweaking the compilation
> > > > for each file that uses them)?
> > >
> >
> > Ralf, before the perfect solution is found, the following patch makes
> > the gcc complaint go away.  It just uses the ".set mips3" pragma.
> 
> It's still perfectly broken.  Uld is a 64-bit instruction, meaning you could
> still get into problems with register corruption or even reserved instruction
> exceptions on 32-bit CPUs.  Not to mention that nobody noticed that
> the constraints of the inline assembler were broken for all access sizes,
> plus there was a cast that would have cut off the upper 32 bits of a 64-bit
> access in any case.  That's fixed now.
> 

With my limited wisdom, I am totally confused by this paragraph.

I think you mentioned a couple of times before where 64-bit instructions
corrupt registers in 32-bit mode.  I think I have done that before with
R5000 R4500.  I did not notice any corruption.  What exactly is the
corruption you are referring to?

With the second half, are you saying the "cut-off-upper-32-bit" bug
actually hides the register corruption problem?  If so, maybe we need
the "cut-off-upper-32-bit" bug for the 32-bit MIPS tree.

Anyway, in short, what is your suggestion for fixing this bug?

Maciej suggested that we use packed struct of gcc (I assume gcc will
generate two loads and get the results with some bit masking and
shifting).  That does not sound too bad, although that does require one
to use the newer gcc.
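
What gcc actually emits for a packed struct depends on the target and the compiler version.  A related, fully portable way to get the same effect is to assemble the value from single bytes with shifts, which never performs a misaligned access.  The sketch below does that for little-endian fields (USB descriptors store multi-byte fields in little-endian order).  It is a byte-wise variant, not the two-load sequence guessed at above, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Read a 16-bit little-endian field from any byte address. */
uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}

/* Read a 32-bit little-endian field from any byte address. */
uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```

This costs more instructions than an lwl/lwr pair, but it works with any compiler and on CPUs that lack those instructions.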

> > I am pretty close to getting USB running with v2.4-test5.  The unaligned
> > access is the minor problem.  The bigger problem I am fighting with now
> > is bus_to_virt()/virt_to_bus() and USB interrupt.
> 
> The unaligned exception handler is fairly expensive.  I suggest you
> try to get proper alignment and, where that is not possible, go through
> the entire code and use get_unaligned.  It's going to make a noticeable
> difference in performance.
> 

Fortunately, the USB guys have already used get_unaligned() in all the
places - I hope.

Jun

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-10-06  1:11           ` Jun Sun
@ 2000-10-05 19:41             ` Kevin D. Kissell
  2000-10-05 19:41               ` Kevin D. Kissell
  2000-10-06  4:32               ` Jun Sun
  0 siblings, 2 replies; 33+ messages in thread
From: Kevin D. Kissell @ 2000-10-05 19:41 UTC (permalink / raw)
  To: Ralf Baechle, Jun Sun; +Cc: linux-mips, linux-mips, Dominic Sweetman

> > > Ralf, before the perfect solution is found, the following patch makes
> > > the gcc complaint go away.  It just uses the ".set mips3" pragma.

Which, as Ralf correctly observes, will generate code that will
crash on 32-bit CPUs, and apparently do entirely the wrong
thing for other reasons on the 64-bit ones.

> > It's still perfectly broken.  Uld is a 64-bit instruction, meaning you could
> > still get into problems with register corruption or even reserved instruction
> > exceptions on 32-bit CPUs.  Not to mention that nobody noticed that
> > the constraints of the inline assembler were broken for all access sizes,
> > plus there was a cast that would have cut off the upper 32 bits of a 64-bit
> > access in any case.  That's fixed now.
> >
>
> With my limited wisdom, I am totally confused by this paragraph.
>
> I think you mentioned a couple of times before where 64-bit instructions
> corrupt registers in 32-bit mode.  I think I have done that before with
> R5000 R4500.  I did not notice any corruption.  What exactly is the
> corruption you are referring to?

Uld is an unaligned doubleword load macro that should generate
a LDL/LDR sequence if MIPS III, IV, V or MIPS64 is enabled in
the compiler/assembler.  That sequence should either execute
correctly or deliver a reserved instruction exception.  No
MIPS-compatible CPU should silently fail or corrupt registers.

> With the second half, are you saying the "cut-off-upper-32-bit" bug
> actually hides the register corruption problem?  If so, maybe we need
> the "cut-off-upper-32-bit" bug for the 32-bit MIPS tree.

This is a joke, right?

            Kevin K.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-10-05 19:41             ` Kevin D. Kissell
  2000-10-05 19:41               ` Kevin D. Kissell
@ 2000-10-06  4:32               ` Jun Sun
  2000-10-05 22:10                 ` Kevin D. Kissell
  2000-10-06 16:28                 ` Ralf Baechle
  1 sibling, 2 replies; 33+ messages in thread
From: Jun Sun @ 2000-10-06  4:32 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Ralf Baechle, linux-mips, linux-mips, Dominic Sweetman

"Kevin D. Kissell" wrote:
> 
> > > > Ralf, before the perfect solution is found, the following patch makes
> > > > the gcc complaint go away.  It just uses the ".set mips3" pragma.
> 
> Which, as Ralf correctly observes, will generate code that will
> crash on 32-bit CPUs, 

Why will it crash 32-bit CPUs?  On my R5432 CPU, the lwl/lwr sequence
executes just fine.

Or do you mean it will crash SOME 32-bit CPUs?  Do those 32-bit CPUs
support lwl or lwr?  If they don't, they should generate a reserved
instruction exception.  If they do, I don't see any problem. 

> > With the second half, are you saying the "cut-off-upper-32-bit" bug
> > actually hides the register corruption problem?  If so, maybe we need
> > the "cut-off-upper-32-bit" bug for the 32-bit MIPS tree.
> 
> This is a joke, right?
> 

Not entirely.  I was thinking if the unaligned load/store instruction
corrupts the upper 32 bit content on SOME cpus, maybe we do need to cut
the upper 32bit as a workaround.  Well, I hope it is not necessary.

Jun

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-10-06  4:32               ` Jun Sun
@ 2000-10-05 22:10                 ` Kevin D. Kissell
  2000-10-05 22:10                   ` Kevin D. Kissell
  2000-10-06  5:53                   ` Jun Sun
  2000-10-06 16:28                 ` Ralf Baechle
  1 sibling, 2 replies; 33+ messages in thread
From: Kevin D. Kissell @ 2000-10-05 22:10 UTC (permalink / raw)
  To: Jun Sun; +Cc: Ralf Baechle, linux-mips, linux-mips, Dominic Sweetman

Jun Sun wrote:
> "Kevin D. Kissell" wrote:
> >
> > > > > Ralf, before the perfect solution is found, the following patch makes
> > > > > the gcc complaint go away.  It just uses the ".set mips3" pragma.
> >
> > Which, as Ralf correctly observes, will generate code that will
> > crash on 32-bit CPUs,
>
> Why will it crash 32-bit CPUs?  On my R5432 CPU, the lwl/lwr sequence
> executes just fine.
>
> Or do you mean it will crash SOME 32-bit CPUs?  Do those 32-bit CPUs
> support lwl or lwr?  If they don't, they should generate a reserved
> instruction exception.  If they do, I don't see any problem.

Please re-read my previous message.  I wasn't talking about the
MIPS I lwl/lwr sequence for loading an unaligned 32-bit word, I was
talking about the MIPS III ldl/ldr sequence for loading an unaligned
64-bit doubleword.
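
The way the left/right pair cooperates can be modelled in plain C.  The following is only a model of the big-endian lwl/lwr pair operating on a byte array, under the assumption that "ulw rt, addr" expands to "lwl rt, addr" followed by "lwr rt, addr+3" as described earlier in the thread; the function names are invented for illustration.  ldl/ldr follow the same pattern on 8-byte doublewords, with the second access at addr+7.

```c
#include <stddef.h>
#include <stdint.h>

/* Fetch the aligned big-endian word that contains byte index addr. */
static uint32_t aligned_word_be(const uint8_t *mem, size_t addr)
{
	size_t base = addr & ~(size_t)3;

	return ((uint32_t)mem[base] << 24) | ((uint32_t)mem[base + 1] << 16) |
	       ((uint32_t)mem[base + 2] << 8) | (uint32_t)mem[base + 3];
}

/* Model of big-endian lwl: the bytes from addr to the end of its aligned
 * word go into the most significant bytes of rt; the rest of rt is kept. */
static uint32_t lwl_be(uint32_t rt, const uint8_t *mem, size_t addr)
{
	unsigned shift = 8u * (unsigned)(addr & 3);
	uint32_t keep = shift ? (rt & ((UINT32_C(1) << shift) - 1)) : 0;

	return (aligned_word_be(mem, addr) << shift) | keep;
}

/* Model of big-endian lwr: the bytes from the start of the aligned word
 * up to addr go into the least significant bytes of rt. */
static uint32_t lwr_be(uint32_t rt, const uint8_t *mem, size_t addr)
{
	unsigned shift = 8u * (3u - (unsigned)(addr & 3));
	uint32_t keep = shift ? (rt & ~(UINT32_MAX >> shift)) : 0;

	return (aligned_word_be(mem, addr) >> shift) | keep;
}

/* Model of the "ulw" macro: lwl at addr, then lwr at addr + 3. */
uint32_t ulw_be(const uint8_t *mem, size_t addr)
{
	uint32_t rt = 0;

	rt = lwl_be(rt, mem, addr);
	rt = lwr_be(rt, mem, addr + 3);
	return rt;
}
```

Each half touches only one aligned word, which is why neither instruction raises an address error on its own.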

            Kevin K.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-10-05 22:10                 ` Kevin D. Kissell
  2000-10-05 22:10                   ` Kevin D. Kissell
@ 2000-10-06  5:53                   ` Jun Sun
  2000-10-05 23:14                     ` Kevin D. Kissell
                                       ` (2 more replies)
  1 sibling, 3 replies; 33+ messages in thread
From: Jun Sun @ 2000-10-06  5:53 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Ralf Baechle, linux-mips, linux-mips

"Kevin D. Kissell" wrote:
> 
> Jun Sun wrote:
> > "Kevin D. Kissell" wrote:
> > >
> > > > > > > Ralf, before the perfect solution is found, the following patch makes
> > > > > > > the gcc complaint go away.  It just uses the ".set mips3" pragma.
> > >
> > > Which, as Ralf correctly observes, will generate code that will
> > > crash on 32-bit CPUs,
> >
> > Why will it crash 32-bit CPUs?  On my R5432 CPU, the lwl/lwr sequence
> > executes just fine.
> >
> > Or do you mean it will crash SOME 32-bit CPUs?  Do those 32-bit CPUs
> > support lwl or lwr?  If they don't, they should generate a reserved
> > instruction exception.  If they do, I don't see any problem.
> 
> Please re-read my previous message.  I wasn't talking about the
> MIPS I lwl/lwr sequence for loading an unaligned 32-bit word, I was
> talking about the MIPS III ldl/ldr sequence for loading an unaligned
> 64-bit doubleword.
> 
>             Kevin K.

Ahh, my bad.  

Although the usb does use get_unaligned(u64) (ldl/ldr), it actually does
not run into it - at least in my test so far.  That probably explains
why my fix runs on the R5432 CPU so far.

Ralf, I notice you have fixed it in the CVS tree.  Just did a test, and
it looks good here.

Thanks.

Jun

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-10-06  5:53                   ` Jun Sun
@ 2000-10-05 23:14                     ` Kevin D. Kissell
  2000-10-05 23:14                       ` Kevin D. Kissell
  2000-10-06 16:32                     ` Ralf Baechle
  2000-10-07  1:35                     ` Jun Sun
  2 siblings, 1 reply; 33+ messages in thread
From: Kevin D. Kissell @ 2000-10-05 23:14 UTC (permalink / raw)
  To: Jun Sun; +Cc: Ralf Baechle, linux-mips, linux-mips

> > > Why will it crash 32-bit CPUs?  On my R5432 CPU, the lwl/lwr sequence
> > > executes just fine.
> > >
> > > Or do you mean it will crash SOME 32-bit CPUs?  Do those 32-bit CPUs
> > > support lwl or lwr?  If they don't, they should generate a reserved
> > > instruction exception.  If they do, I don't see any problem.
> > 
> > Please re-read my previous message.  I wasn't talking about the
> > MIPS I lwl/lwr sequence for loading an unaligned 32-bit word, I was
> > talking about the MIPS III ldl/ldr sequence for loading an unaligned
> > 64-bit doubleword.
> > 
> >             Kevin K.
> 
> Ahh, my bad.  
> 
> Although the usb does use get_unaligned(u64) (ldl/ldr), it actually does
> not run into it - at least in my test so far.  That probably explains
> why my fix runs on the R5432 CPU so far.

The 5432 may have a 32-bit external bus, but it's still (as far
as I know) a 64-bit part internally, so as long as you're executing
in kernel mode, the ldl/ldr's should work as designed.

            Kevin K.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-10-06  5:53                   ` Jun Sun
  2000-10-05 23:14                     ` Kevin D. Kissell
@ 2000-10-06 16:32                     ` Ralf Baechle
  2000-10-07  1:35                     ` Jun Sun
  2 siblings, 0 replies; 33+ messages in thread
From: Ralf Baechle @ 2000-10-06 16:32 UTC (permalink / raw)
  To: Jun Sun; +Cc: Kevin D. Kissell, linux-mips, linux-mips

On Thu, Oct 05, 2000 at 10:53:34PM -0700, Jun Sun wrote:

> Although the usb does use get_unaligned(u64) (ldl/ldr), it actually does
> not run into it - at least in my test so far.  That probably explains
> why my fix runs on the R5432 CPU so far.

No, you just never hit the window where your 64-bit reg got corrupted by
an exception.  The old broken macros also had a cast to long in them
which was truncating the loaded 64-bit word, so in 100% of cases the upper
32 bits were modified in creative ways.  So I guess you were just lucky and
never hit the case where this actually bites.

  Ralf

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-10-06  5:53                   ` Jun Sun
  2000-10-05 23:14                     ` Kevin D. Kissell
  2000-10-06 16:32                     ` Ralf Baechle
@ 2000-10-07  1:35                     ` Jun Sun
  2000-10-06 22:26                       ` Ralf Baechle
  2 siblings, 1 reply; 33+ messages in thread
From: Jun Sun @ 2000-10-07  1:35 UTC (permalink / raw)
  To: Ralf Baechle, linux-mips, linux-mips

Jun Sun wrote:
> 
> Ralf, I notice you have fixed it in the CVS tree.  Just did a test, and
> it looks good here.
> 

I was too soon to say that ... :-)

While the __ldq_u() did work, I had a couple of syntax problems with
put_unaligned().  See the patch below.

In addition, my usb subsystem now hangs.  It might mean a bug in the new
unaligned.h or the fix to unaligned.h reveals another bug.  I will let
you know.

Jun

--- unaligned.h.ralf    Fri Oct  6 18:32:34 2000
+++ unaligned.h Fri Oct  6 18:01:43 2000
@@ -117,20 +117,20 @@
        __val;                                                          \
 })

-#define put_unaligned(x,ptr)                                           \
+#define put_unaligned(val,ptr)                                         \
 do {                                                                   \
        switch (sizeof(*(ptr))) {                                       \
        case 1:                                                         \
-               *(unsigned char *)ptr = (val);                          \
+               *(unsigned char *)(ptr) = (val);                        \
                break;                                                  \
        case 2:                                                         \
-               __stw_u(val, (unsigned short *)ptr);                    \
+               __stw_u(val, (unsigned short *)(ptr));                  \
                break;                                                  \
        case 4:                                                         \
-               __stl_u(val, (unsigned int *)ptr);                      \
+               __stl_u(val, (unsigned int *)(ptr));                    \
                break;                                                  \
        case 8:                                                         \
-               __stq_u(val, (unsigned long long *)ptr);                \
+               __stq_u(val, (unsigned long long *)(ptr));              \
                break;                                                  \
        default:                                                        \
                __put_unaligned_bad_length();                           \

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: load_unaligned() and "uld" instruction
  2000-10-07  1:35                     ` Jun Sun
@ 2000-10-06 22:26                       ` Ralf Baechle
  0 siblings, 0 replies; 33+ messages in thread
From: Ralf Baechle @ 2000-10-06 22:26 UTC (permalink / raw)
  To: Jun Sun; +Cc: linux-mips, linux-mips

On Fri, Oct 06, 2000 at 06:35:15PM -0700, Jun Sun wrote:

> While the __ldq_u() did work, I had a couple of syntax problems with
> put_unaligned().  See the patch below.
> 
> In addition, my usb subsystem now hangs.  It might mean a bug in the new
> unaligned.h or the fix to unaligned.h reveals another bug.  I will let
> you know.

I already had a patch for the x vs. val thing in the CVS, so I just took
the part of your patch which adds the additional brackets, to make
sure the semantics are identical to those of function calls.
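
Why the brackets matter can be shown with a small sketch.  The two macros below are hypothetical and only compute a byte address, so the difference can be observed at run time.  In the store form "*(unsigned char *)ptr = (val)", an expression argument such as "words + 1" would most likely not compile at all, because the left-hand side is no longer an lvalue.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macros: identical except for the brackets around ptr. */
#define BYTE_ADDR_BARE(ptr)      ((unsigned char *)ptr)
#define BYTE_ADDR_BRACKETED(ptr) ((unsigned char *)(ptr))

static uint32_t words[2];

/* Expands to (unsigned char *)words + 1: the cast binds first, so the
 * pointer advances by a single byte. */
size_t offset_bare(void)
{
	return (size_t)(BYTE_ADDR_BARE(words + 1) - (unsigned char *)words);
}

/* Expands to (unsigned char *)(words + 1): the pointer advances by one
 * uint32_t element, exactly as it would for a function argument. */
size_t offset_bracketed(void)
{
	return (size_t)(BYTE_ADDR_BRACKETED(words + 1) - (unsigned char *)words);
}
```

With the bare macro the caller silently gets a different address than a function taking the same argument would see.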

  Ralf


* Re: load_unaligned() and "uld" instruction
  2000-10-06  4:32               ` Jun Sun
  2000-10-05 22:10                 ` Kevin D. Kissell
@ 2000-10-06 16:28                 ` Ralf Baechle
  2000-10-07  1:24                   ` Jun Sun
  1 sibling, 1 reply; 33+ messages in thread
From: Ralf Baechle @ 2000-10-06 16:28 UTC (permalink / raw)
  To: Jun Sun; +Cc: Kevin D. Kissell, linux-mips, linux-mips, Dominic Sweetman

On Thu, Oct 05, 2000 at 09:32:41PM -0700, Jun Sun wrote:

> > > > > Ralf, before the perfect solution is found, the following patch makes
> > > > > the gcc complain go away.  It just use ".set mips3" pragma.
> > 
> > Which, as Ralf correctly observes, will generate code that will
> > crash on 32-bit CPUs, 
> 
> Why will it crash 32-bit CPUs?  On my R5432 CPU, the lwl/lwr sequence
> executes just fine.

That's a 64-bit CPU with a 32-bit bus ...

> Or do you mean it will crash SOME 32-bit CPUs?  Do those 32-bit CPUs
> support lwl or lwr?  If they don't, they should generate a reserved
> instruction exception.  If they do, I don't see any problem. 

It will crash all 32-bit CPUs.

> Not entirely.  I was thinking if the unaligned load/store instruction
> corrupts the upper 32 bit content on SOME cpus, maybe we do need to cut
> the upper 32bit as a workaround.  Well, I hope it is not necessary.

No, it happens on all CPUs.  Interrupt handling only saves and restores
the lower 32 bits of the registers.  Partly this is for the sake of
compatibility with 32-bit CPUs, partly it's because an 8kb kernel stack
otherwise wouldn't be sufficient; we'd have to go up to 16kb stacks, which
in turn has a potential influence on memory management that can reduce the
reliability of the kernel when low on memory, and it increases the overhead.
In short, unless a system has a serious need for 64-bit, supporting 64-bit
is quite a loss.

  Ralf


* Re: load_unaligned() and "uld" instruction
  2000-10-06 16:28                 ` Ralf Baechle
@ 2000-10-07  1:24                   ` Jun Sun
  2000-10-06 20:46                     ` Kevin D. Kissell
  0 siblings, 1 reply; 33+ messages in thread
From: Jun Sun @ 2000-10-07  1:24 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Kevin D. Kissell, linux-mips, linux-mips, Dominic Sweetman

Ralf Baechle wrote:
> 
> On Thu, Oct 05, 2000 at 09:32:41PM -0700, Jun Sun wrote:
> 
> > > > > > Ralf, before the perfect solution is found, the following patch makes
> > > > > > the gcc complain go away.  It just use ".set mips3" pragma.
> > >
> > > Which, as Ralf correctly observes, will generate code that will
> > > crash on 32-bit CPUs,
> >
> > Why will it crash 32-bit CPUs?  On my R5432 CPU, the lwl/lwr sequence
> > executes just fine.
> 
> That's a 64-bit CPU with a 32-bit bus ...
> 

That is what the manual claims.  However I did find something strange.

I ran the following code on the R5432:

0x8019dc34 <my_get_unaligned+4>:        ldl     $a2,7($a0)
0x8019dc38 <my_get_unaligned+8>:        ldr     $a2,0($a0)
0x8019dc3c <my_get_unaligned+12>:       srl     $a2,$a2,0x10

As Kevin has guessed, it actually runs fine.  However, the register
content in $a2 is not right.  Basically it appears that $a2 is a 32-bit
register instead of 64-bit register.  I put a srl instruction to make
sure I was not fooled by gdb.
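
For comparison, here is a rough portable sketch of the value the
ldl/ldr pair above is meant to produce.  The offsets 7 and 0 match the
little-endian form of the sequence, so the sketch assembles a
little-endian 64-bit value.  get_unaligned_le64, sample and
check_sample are made-up names for illustration, not anything from
unaligned.h.

```c
#include <assert.h>
#include <stdint.h>

/* Build a 64-bit little-endian value one byte at a time.  Only byte
 * loads are used, so the address may have any alignment. */
uint64_t get_unaligned_le64(const void *addr)
{
    const unsigned char *p = addr;
    uint64_t v = 0;
    int i;

    for (i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Sample data: 0x1122334455667788 stored little-endian at offset 1,
 * i.e. at an odd address relative to the start of the array. */
const unsigned char sample[9] = {
    0x00, 0x88, 0x77, 0x66, 0x55, 0x44, 0x33, 0x22, 0x11
};

/* Usage: the unaligned read returns the full 64-bit value. */
void check_sample(void)
{
    assert(get_unaligned_le64(sample + 1) == 0x1122334455667788ULL);
}
```

This is slower than ldl/ldr but needs no special instructions, which
is the kind of fallback asked about at the start of the thread.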

I know the R5432 is derived from the R5000 for 32-bit systems, so I guess
there are probably a lot of shortcuts for 64-bit operations.

Jun


* Re: load_unaligned() and "uld" instruction
  2000-10-07  1:24                   ` Jun Sun
@ 2000-10-06 20:46                     ` Kevin D. Kissell
  2000-10-06 20:46                       ` Kevin D. Kissell
  2000-10-07  7:16                       ` Jun Sun
  0 siblings, 2 replies; 33+ messages in thread
From: Kevin D. Kissell @ 2000-10-06 20:46 UTC (permalink / raw)
  To: Jun Sun, Ralf Baechle; +Cc: linux-mips, linux-mips, Dominic Sweetman

Jun Sun wrote:
> Ralf Baechle wrote:
> >
> > On Thu, Oct 05, 2000 at 09:32:41PM -0700, Jun Sun wrote:
> >
> > > > > > > Ralf, before the perfect solution is found, the following patch makes
> > > > > > > the gcc complain go away.  It just use ".set mips3" pragma.
> > > >
> > > > Which, as Ralf correctly observes, will generate code that will
> > > > crash on 32-bit CPUs,
> > >
> > > Why will it crash 32-bit CPUs?  On my R5432 CPU, the lwl/lwr sequence
> > > executes just fine.
> >
> > That's a 64-bit CPU with a 32-bit bus ...
> >
>
> That is what the manual claims.  However I did find something strange.
>
> I run the following code on R5432:
>
> 0x8019dc34 <my_get_unaligned+4>:        ldl     $a2,7($a0)
> 0x8019dc38 <my_get_unaligned+8>:        ldr     $a2,0($a0)
> 0x8019dc3c <my_get_unaligned+12>:       srl     $a2,$a2,0x10
>
> As Kevin has guessed, it actually runs fine.  However, the register
> content in $a2 is not right.  Basically it appears that $a2 is a 32-bit
> register instead of 64-bit register.  I put a srl instruction to make
> sure I was not fooled by gdb.

Please read the instruction manual for srl more closely.
In order to preserve binary compatibility with 32-bit MIPS
CPUs, srl, sll, and sra always work *as if* only a 32-bit register
is implemented.  If you want to shift the full 64 bits, you need
to use explicit 64-bit shifts: dsrl, dsll, dsra, etc.  Use a dsrl
instead of an srl and you *may* see what you are expecting.
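
A small C model of the difference, as a sketch only: model_srl and
model_dsrl are invented names, and the model follows the "as if only a
32-bit register is implemented" description above.  It ignores any
restrictions the architecture manual places on srl operands that are
not sign-extended 32-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* srl on a 64-bit register: shift only the low word, then sign-extend
 * the 32-bit result into the full register. */
uint64_t model_srl(uint64_t rt, unsigned sa)
{
    uint32_t word = (uint32_t)rt >> (sa & 31);
    uint64_t rd = word;

    if (word & 0x80000000u)
        rd |= 0xFFFFFFFF00000000ULL;
    return rd;
}

/* dsrl: logical right shift of all 64 bits (shift amounts 0..31). */
uint64_t model_dsrl(uint64_t rt, unsigned sa)
{
    return rt >> (sa & 31);
}
```

With 0x1122334455667788 in the register and a shift of 0x10, model_srl
returns 0x5566, which looks exactly like a 32-bit register, while
model_dsrl returns 0x0000112233445566.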

But there is also the issue that Ralf alluded to in earlier
messages on this thread:  If your kernel exception
handler is only saving and restoring register state
using 32-bit loads and stores, the upper 32 bits of
the registers will tend to decay into sign-extensions
of the least significant 32 bits.
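
The decay can be modelled in a few lines of C.  This is only a sketch
of the effect, assuming a 32-bit store followed by a sign-extending
32-bit load; save_restore_32 is a hypothetical name and not a kernel
routine.

```c
#include <assert.h>
#include <stdint.h>

/* Model a register that is saved with a 32-bit store and restored
 * with a 32-bit load on a 64-bit CPU. */
uint64_t save_restore_32(uint64_t reg)
{
    uint32_t slot = (uint32_t)reg;  /* store: only the low word is saved */
    uint64_t out = slot;            /* load: the low word comes back ... */

    if (slot & 0x80000000u)         /* ... sign-extended into bits 63..32 */
        out |= 0xFFFFFFFF00000000ULL;
    return out;
}
```

Whatever was in the upper half before the exception is gone afterwards:
0x1122334455667788 comes back as 0x0000000055667788, and
0x1122334480000001 comes back as 0xFFFFFFFF80000001.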

            Regards,

            Kevin K.


* Re: load_unaligned() and "uld" instruction
  2000-10-06 20:46                     ` Kevin D. Kissell
  2000-10-06 20:46                       ` Kevin D. Kissell
@ 2000-10-07  7:16                       ` Jun Sun
  1 sibling, 0 replies; 33+ messages in thread
From: Jun Sun @ 2000-10-07  7:16 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: linux-mips, linux-mips

"Kevin D. Kissell" wrote:
> 
> Jun Sun wrote:
> > That is what the manual claims.  However I did find something strange.
> >
> > I run the following code on R5432:
> >
> > 0x8019dc34 <my_get_unaligned+4>:        ldl     $a2,7($a0)
> > 0x8019dc38 <my_get_unaligned+8>:        ldr     $a2,0($a0)
> > 0x8019dc3c <my_get_unaligned+12>:       srl     $a2,$a2,0x10
> >
> > As Kevin has guessed, it actually runs fine.  However, the register
> > content in $a2 is not right.  Basically it appears that $a2 is a 32-bit
> > register instead of 64-bit register.  I put a srl instruction to make
> > sure I was not fooled by gdb.
> 
> Please read the instruction manual for srl more closely.
> In order to preserve binary compatibility with 32-bit MIPS
> CPUs, srl, sll, and sra always work *as if* only a 32-bit register
> is implemented.  If you want to shift the full 64 bits, you need
> to use explicit 64-bit shifts: dsrl, dsll, dsra, etc.  Use a dsrl
> instead of an srl and you *may* see what you are expecting.
> 

Just re-did the test with dsrl.  It does show that the upper 32 bits are
loaded correctly by ldl/ldr.  The result was still not completely right,
due to the inline assembler bug noted by Ralf earlier.  That bug discards
the upper 32 bits upon function return.

Thanks, Kevin.


Jun

... learn something new each day ...


end of thread, other threads:[~2000-10-09 14:37 UTC | newest]

Thread overview: 33+ messages
2000-09-25 18:48 load_unaligned() and "uld" instruction Jun Sun
2000-09-25 21:16 ` Dominic Sweetman
2000-09-25 21:36   ` Jun Sun
2000-09-25 23:29     ` Ralf Baechle
2000-09-26  6:22     ` Kevin D. Kissell
2000-09-26  6:22       ` Kevin D. Kissell
2000-09-26  9:08       ` Dominic Sweetman
2000-09-26  9:08         ` Dominic Sweetman
2000-09-29 17:22         ` Ralf Baechle
2000-10-09 14:49           ` Dominic Sweetman
2000-09-26 18:04       ` Jun Sun
2000-09-27 10:06         ` Maciej W. Rozycki
2000-10-06  0:43           ` Ralf Baechle
2000-10-06  9:54             ` Maciej W. Rozycki
2000-10-06 16:21               ` Ralf Baechle
2000-10-05 12:13         ` Ralf Baechle
2000-10-06  1:11           ` Jun Sun
2000-10-05 19:41             ` Kevin D. Kissell
2000-10-05 19:41               ` Kevin D. Kissell
2000-10-06  4:32               ` Jun Sun
2000-10-05 22:10                 ` Kevin D. Kissell
2000-10-05 22:10                   ` Kevin D. Kissell
2000-10-06  5:53                   ` Jun Sun
2000-10-05 23:14                     ` Kevin D. Kissell
2000-10-05 23:14                       ` Kevin D. Kissell
2000-10-06 16:32                     ` Ralf Baechle
2000-10-07  1:35                     ` Jun Sun
2000-10-06 22:26                       ` Ralf Baechle
2000-10-06 16:28                 ` Ralf Baechle
2000-10-07  1:24                   ` Jun Sun
2000-10-06 20:46                     ` Kevin D. Kissell
2000-10-06 20:46                       ` Kevin D. Kissell
2000-10-07  7:16                       ` Jun Sun
