Linux MIPS Architecture development
* sti() does not work.
@ 2001-07-03 22:48 Steven Liu
  2001-07-03 22:48 ` Steven Liu
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Steven Liu @ 2001-07-03 22:48 UTC (permalink / raw)
  To: linux-mips; +Cc: stevenliu@psdc.com

Hi All:

I am porting Linux to a MIPS R3000 and have a problem with sti(),
which is called in start_kernel().

As we know, sti() is defined as __sti() in
include/asm-mips/system.h:
 
extern __inline__ void  __sti(void)
{
	__asm__ __volatile__(
		".set\tnoreorder\n\t"
		".set\tnoat\n\t"
		"mfc0\t$1,$12\n\t"
		"ori\t$1,0x1f\n\t"
		"xori\t$1,0x1e\n\t"
		"mtc0\t$1,$12\n\t"               /* <----- problem here! */
		".set\tat\n\t"
		".set\treorder"
		: /* no outputs */
		: /* no inputs */
		: "$1", "memory");
}
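
For readers following the bit manipulation: the ori/xori pair forces the
low five bits of the Status value to 00001 and leaves every other bit
alone.  A minimal C model of that arithmetic, assuming nothing beyond the
two instructions quoted above (sti_status_value is a made-up name, not a
kernel function):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the kernel): models the value that the
 * ori/xori pair in __sti() writes back to the CP0 Status register ($12). */
static uint32_t sti_status_value(uint32_t status)
{
	uint32_t v = status;

	v |= 0x1f;	/* ori  $1,0x1f : set bits 4..0 */
	v ^= 0x1e;	/* xori $1,0x1e : clear bits 4..1, bit 0 stays set */

	/* Postcondition: low five bits are 00001, the rest is untouched. */
	assert((v & 0x1f) == 0x01);
	assert((v & ~UINT32_C(0x1f)) == (status & ~UINT32_C(0x1f)));
	return v;
}
```

With the Status value reported in this post, sti_status_value(0x1000fc00)
yields 0x1000fc01.  On an R3000 bit 0 is IEc and bits 1..4 are KUc, IEp,
KUp and IEo; on R4000-style CPUs the same bits are IE, EXL, ERL and KSU.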

Before calling this function, the Status register is 0x1000fc00 and the
Cause register is 0x00008000.
Clearly, this is a CPU timer interrupt.
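
The two register values can be decoded mechanically.  The sketch below
assumes only the standard field positions (interrupt mask in Status bits
15..8, pending interrupts in Cause bits 15..8); the helper names are
invented for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Status bits 15..8: interrupt mask IM7..IM0. */
static unsigned status_im(uint32_t status)
{
	return (status >> 8) & 0xff;
}

/* Cause bits 15..8: pending interrupts IP7..IP0. */
static unsigned cause_ip(uint32_t cause)
{
	return (cause >> 8) & 0xff;
}

/* Highest-numbered line that is both pending and unmasked, or -1 if none. */
static int highest_active_line(uint32_t status, uint32_t cause)
{
	unsigned active = status_im(status) & cause_ip(cause);
	int line;

	assert(active <= 0xff);
	for (line = 7; line >= 0; line--)
		if (active & (1u << line))
			return line;
	return -1;
}
```

For the values above, the mask is 0xfc (lines 7..2 unmasked), the pending
field is 0x80, and highest_active_line() returns 7, while Status bit 0
(interrupt enable) is still clear.  IP7 is the line that R4000-style CPUs
with Count/Compare registers use for their timer.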

When the mtc0 instruction above is executed, the system hangs and
control never reaches the timer handler.

Any help is greatly appreciated.

Thank you.

Steven Liu

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: sti() does not work.
  2001-07-03 22:48 sti() does not work Steven Liu
  2001-07-03 22:48 ` Steven Liu
@ 2001-07-04 10:23 ` Thiemo Seufer
  2001-07-04 12:23   ` Gleb O. Raiko
  2001-07-04 13:26   ` Ralf Baechle
  2001-07-04 13:29 ` Ralf Baechle
  2 siblings, 2 replies; 13+ messages in thread
From: Thiemo Seufer @ 2001-07-04 10:23 UTC (permalink / raw)
  To: linux-mips

Steven Liu wrote:
> Hi All:
> 
> I am working on the porting Linux to mips R3000 and  have a problem
> about sti( ) which is called in start_kernel( ).
> 
> As we know, sti() is defined as __sti( ) in the
> include/asm-mips/system.h:
>  
> extern __inline__ void  __sti(void)
> {
> 	__asm__ __volatile__(
> 		".set\tnoreorder\n\t"
> 		".set\tnoat\n\t"
> 		"mfc0\t$1,$12\n\t"
> 		"ori\t$1,0x1f\n\t"
> 		"xori\t$1,0x1e\n\t"
> 		"mtc0\t$1,$12\n\t"               /* <----- problem  here! */

Some nops should follow here on a MIPS I system to make sure $12
is written (why is noreorder used here?).

> 		".set\tat\n\t"
> 		".set\treorder"
> 		: /* no outputs */
> 		: /* no inputs */
> 		: "$1", "memory");
> }

HTH,
Thiemo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: sti() does not work.
  2001-07-04 10:23 ` Thiemo Seufer
@ 2001-07-04 12:23   ` Gleb O. Raiko
  2001-07-04 13:26   ` Ralf Baechle
  1 sibling, 0 replies; 13+ messages in thread
From: Gleb O. Raiko @ 2001-07-04 12:23 UTC (permalink / raw)
  To: Thiemo Seufer; +Cc: linux-mips

Thiemo Seufer wrote:
> > extern __inline__ void  __sti(void)
> > {
> >       __asm__ __volatile__(
> >               ".set\tnoreorder\n\t"
> >               ".set\tnoat\n\t"
> >               "mfc0\t$1,$12\n\t"
> >               "ori\t$1,0x1f\n\t"
> >               "xori\t$1,0x1e\n\t"
> >               "mtc0\t$1,$12\n\t"               /* <----- problem  here! */
> 
> Here should follow some nop's on a MIPS I system to make sure $12
> is written (why is noreorder used here?).
> 

Support for the R3000 in 2.2 has been broken for a long time. Use 2.4 instead.

Regards,
Gleb.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: sti() does not work.
  2001-07-04 10:23 ` Thiemo Seufer
  2001-07-04 12:23   ` Gleb O. Raiko
@ 2001-07-04 13:26   ` Ralf Baechle
  2001-07-05 11:35     ` Maciej W. Rozycki
  1 sibling, 1 reply; 13+ messages in thread
From: Ralf Baechle @ 2001-07-04 13:26 UTC (permalink / raw)
  To: Thiemo Seufer; +Cc: linux-mips

On Wed, Jul 04, 2001 at 12:23:29PM +0200, Thiemo Seufer wrote:

> > extern __inline__ void  __sti(void)
> > {
> > 	__asm__ __volatile__(
> > 		".set\tnoreorder\n\t"
> > 		".set\tnoat\n\t"
> > 		"mfc0\t$1,$12\n\t"
> > 		"ori\t$1,0x1f\n\t"
> > 		"xori\t$1,0x1e\n\t"
> > 		"mtc0\t$1,$12\n\t"               /* <----- problem  here! */
> 
> Here should follow some nop's on a MIPS I system to make sure $12
> is written

There are no nops there since we simply don't care how many cycles
after the mtc0 the interrupts actually get enabled.  The worst case is the
R4000's 8-stage pipeline, where we have a latency of 3 cycles, clearly
nothing that justifies wasting memory and cycles on nops.

> (why is noreorder used here?).

Without the .set noreorder the assembler would be free to do arbitrary
reordering of the object code generated.  Gas doesn't do that but there
are other assemblers that do flow analysis and may generate object code
that doesn't look very much like the source they were fed with.

  Ralf

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: sti() does not work.
  2001-07-03 22:48 sti() does not work Steven Liu
  2001-07-03 22:48 ` Steven Liu
  2001-07-04 10:23 ` Thiemo Seufer
@ 2001-07-04 13:29 ` Ralf Baechle
  2 siblings, 0 replies; 13+ messages in thread
From: Ralf Baechle @ 2001-07-04 13:29 UTC (permalink / raw)
  To: Steven Liu; +Cc: linux-mips, stevenliu@psdc.com

On Tue, Jul 03, 2001 at 03:48:04PM -0700, Steven Liu wrote:

> I am working on the porting Linux to mips R3000 and  have a problem
> about sti( ) which is called in start_kernel( ).

> Before calling this function, status_register = 0x1000fc00 and
> cause_register=0x00008000. 
> Clearly, this is an interrupt of the CPU timer. 

The R3000 doesn't have a CPU timer, so either you're porting to something
other than the R3000 or you don't have a CPU timer.

> When mtc0 instruction above is executed, the system hangs and the
> control does not go to the timer handler.

When the mtc0 gets executed you take the pending interrupt, which goes to
the general exception vector (0x80000080 on an R3000; 0x80000180 is the
R4000-style address).  That's magic done in hardware.
So it looks like your interrupt handler is buggy.
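
The hardware behaviour described here can be written down as two small
predicates.  This is a sketch under stated assumptions, not kernel code:
the function names are invented, the vector addresses are the documented
R3000 and R4000 general exception vectors, and the R4000 requirement that
EXL and ERL be clear is deliberately left out:

```c
#include <assert.h>
#include <stdint.h>

#define ST_IE   UINT32_C(0x00000001)	/* Status bit 0: interrupt enable */
#define ST_BEV  UINT32_C(0x00400000)	/* Status bit 22: bootstrap vectors */

/* Nonzero if an interrupt is taken: enabled, pending and unmasked. */
static int interrupt_taken(uint32_t status, uint32_t cause)
{
	uint32_t unmasked = (status >> 8) & (cause >> 8) & 0xff;

	return (status & ST_IE) != 0 && unmasked != 0;
}

/* General exception vector; r4k_style = 1 selects the R4000 layout. */
static uint32_t general_exception_vector(uint32_t status, int r4k_style)
{
	assert(r4k_style == 0 || r4k_style == 1);
	if (status & ST_BEV)
		return r4k_style ? UINT32_C(0xbfc00380) : UINT32_C(0xbfc00180);
	return r4k_style ? UINT32_C(0x80000180) : UINT32_C(0x80000080);
}
```

With Cause = 0x00008000, interrupt_taken() is false for Status =
0x1000fc00 and true for 0x1000fc01, the value sti() writes.  BEV is clear
in that Status value, so the handler has to be installed at the RAM
vector for the CPU family actually in use.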

  Ralf

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: sti() does not work.
  2001-07-04 13:26   ` Ralf Baechle
@ 2001-07-05 11:35     ` Maciej W. Rozycki
  2001-07-13 11:35       ` Ralf Baechle
  0 siblings, 1 reply; 13+ messages in thread
From: Maciej W. Rozycki @ 2001-07-05 11:35 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thiemo Seufer, linux-mips

On Wed, 4 Jul 2001, Ralf Baechle wrote:

> > > extern __inline__ void  __sti(void)
> > > {
> > > 	__asm__ __volatile__(
> > > 		".set\tnoreorder\n\t"
> > > 		".set\tnoat\n\t"
> > > 		"mfc0\t$1,$12\n\t"
> > > 		"ori\t$1,0x1f\n\t"
> > > 		"xori\t$1,0x1e\n\t"
> > > 		"mtc0\t$1,$12\n\t"               /* <----- problem  here! */
> > 
> > Here should follow some nop's on a MIPS I system to make sure $12
> > is written
> 
> There are no nops there since we simply don't care how many cycles
> after the mtc0 the interrupts actually get enabled.  Worst case is the
> R4000's 8 stage pipeline where we have a latency of 3 cycles, clearly
> nothing that justifies wasting memory and cycles for nops.

 Still there is a nop missing after mfc0 if this is to be executed on a
MIPS I CPU.  The 2.4.x code is fine, though, so nothing to worry about. 
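
A sketch of where that nop would go, assuming the __sti() body quoted
earlier in this thread.  The name sti_mips1_sketch is made up and this is
not the 2.4.x code; the asm is compiled only when __mips__ is defined, so
the function is an empty stub on any other host, and on MIPS it must run
in kernel mode:

```c
/* Sketch only: the quoted __sti() with the mfc0 delay slot filled. */
static inline void sti_mips1_sketch(void)
{
#if defined(__mips__)
	__asm__ __volatile__(
		".set\tnoreorder\n\t"
		".set\tnoat\n\t"
		"mfc0\t$1,$12\n\t"
		"nop\n\t"	/* MIPS I: $1 is not valid in the next slot */
		"ori\t$1,0x1f\n\t"
		"xori\t$1,0x1e\n\t"
		"mtc0\t$1,$12\n\t"
		".set\tat\n\t"
		".set\treorder"
		: /* no outputs */
		: /* no inputs */
		: "$1", "memory");
#endif
}
```

Without the nop, and with .set noreorder in effect, the ori may operate on
whatever $1 held before the mfc0, so the value written back to Status is
unpredictable on a MIPS I CPU.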

> > (why is noreorder used here?).
> 
> Without the .set noreorder the assembler would be free to do arbitrary
> reordering of the object code generated.  Gas doesn't do that but there
> are other assemblers that do flow analysis and may generate object code
> that doesn't look very much like the source they were fed with.

 Hmm, I would consider that a bug in such an assembler.  The mtc0 and
possibly the mfc0 opcode should be treated as reordering barriers as they
may involve side effects an assembler might not be aware of. 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: sti() does not work.
  2001-07-05 11:35     ` Maciej W. Rozycki
@ 2001-07-13 11:35       ` Ralf Baechle
  2001-07-13 14:01         ` Maciej W. Rozycki
  0 siblings, 1 reply; 13+ messages in thread
From: Ralf Baechle @ 2001-07-13 11:35 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Thiemo Seufer, linux-mips

On Thu, Jul 05, 2001 at 01:35:11PM +0200, Maciej W. Rozycki wrote:

> > > (why is noreorder used here?).
> > 
> > Without the .set noreorder the assembler would be free to do arbitrary
> > reordering of the object code generated.  Gas doesn't do that but there
> > are other assemblers that do flow analysis and may generate object code
> > that doesn't look very much like the source they were fed with.
> 
>  Hmm, I would consider that a bug in such an assembler.  The mtc0 and
> possibly the mfc0 opcode should be treated as reordering barriers as they
> may involve side effects an assembler might not be aware of. 

Assembler is the art of using side effects so things are fairly explicit.
Optimizations are controlled using

  .set noreorder / reorder
  .set volatile / novolatile
  .set nomove / move
  .set nobopt / bopt

  Ralf

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: sti() does not work.
  2001-07-13 11:35       ` Ralf Baechle
@ 2001-07-13 14:01         ` Maciej W. Rozycki
  2001-07-14 11:04           ` Ralf Baechle
  0 siblings, 1 reply; 13+ messages in thread
From: Maciej W. Rozycki @ 2001-07-13 14:01 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thiemo Seufer, linux-mips

On Fri, 13 Jul 2001, Ralf Baechle wrote:

> >  Hmm, I would consider that a bug in such an assembler.  The mtc0 and
> > possibly the mfc0 opcode should be treated as reordering barriers as they
> > may involve side effects an assembler might not be aware of. 
> 
> Assembler is the art of using side effects so things are fairly explicit.
> Optimizations are controlled using
> 
>   .set noreorder / reorder
>   .set volatile / novolatile
>   .set nomove / move
>   .set nobopt / bopt

 Sure, but sometimes ".set reorder" allows you to achieve better
optimization across various ISAs without a need to resort to the
preprocessor.  Consider the following code: 

	lw	$1,($2)
	addu	$3,$1

You need an instruction between the two for a MIPS I CPU but MIPS II+ CPUs
interlock here if no instruction is placed.  Assuming no real instruction
can be reordered here, a nop must be inserted if the code gets compiled
for a MIPS I CPU but no instruction is preferred otherwise.  The assembler
does it automatically if the ".set reorder" directive is active, but you
need to decide yourself if it is not.

 Actually with mfc0 there is no problem -- you need a nop in a case like
the one above, as coprocessor transfers never interlock; at least the docs
state so.  But who believes docs without a grain of salt, so please
correct me if I am wrong (I don't have appropriate hardware to perform a
test). 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: sti() does not work.
  2001-07-13 14:01         ` Maciej W. Rozycki
@ 2001-07-14 11:04           ` Ralf Baechle
  2001-07-14 11:39             ` Kevin D. Kissell
  2001-07-16 12:46             ` Maciej W. Rozycki
  0 siblings, 2 replies; 13+ messages in thread
From: Ralf Baechle @ 2001-07-14 11:04 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Thiemo Seufer, linux-mips

On Fri, Jul 13, 2001 at 04:01:29PM +0200, Maciej W. Rozycki wrote:

>  Sure, but sometimes ".set reorder" allows you to achieve better
> optimization across various ISAs without a need to resort to the
> preprocessor.  Consider the following code: 
> 
> 	lw	$1,($2)
> 	addu	$3,$1
> 
> You need an instruction between the two for a MIPS I CPU but MIPS II+ CPUs
> interlock here if no instruction is placed.  Assuming no real instruction
> can be reordered here, a nop must be inserted if the code gets compiled
> for a MIPS I CPU but no instruction is preferred otherwise.  The assembler
> does it automatically if the ".set reorder" directive is active, but you
> need to decide yourself if it is not.
> 
>  Actually with mfc0 there is no problem -- you need a nop in the case like
> the above one as coprocessor transfers never interlock; at least docs
> state so.  But who believes docs without a grain of salt, so please
> correct me if I am wrong (I don't have appropriate hardware to perform a
> test). 

Real wild pig hackers on the R3000 wrote code that knows the old register
value is still available in the load delay slot.  So you can implement
var1++; var2++ as:

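	.set	noreorder
	lw	$reg, var1($gp)		# reg = var1 (arrives after the delay slot)
	nop				# load delay slot
	addiu	$reg, $reg, 1		# reg = var1 + 1
	lw	$reg, var2($gp)		# start loading var2
	sw	$reg, var1($gp)		# delay slot: reg is still var1 + 1
	addiu	$reg, $reg, 1		# reg = var2 + 1
	sw	$reg, var2($gp)		# store var2 + 1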

	.common	var1, 4, 4
	.common	var2, 4, 4

Of course only safe with interrupts disabled.  So in a sense introducing
the load interlock broke semantics of MIPS machine code ;-)
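
To see why the sequence depends on the delay slot, here is a toy C model.
It is a sketch with invented names, and it assumes one particular
behaviour: a loaded value lands in the register only after the following
instruction has executed.  Setting interlocked to 1 models a CPU where the
load result is visible at once:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum op { OP_NOP, OP_LW, OP_SW, OP_ADDIU };

struct insn {
	enum op op;
	unsigned rt;		/* register operand */
	uint32_t *mem;		/* memory operand for lw/sw, else NULL */
	uint32_t imm;		/* immediate for addiu */
};

struct cpu {
	uint32_t reg[32];
	int interlocked;	/* 0: load delay slot, 1: load lands at once */
	int pend_valid;		/* a load is still in flight */
	unsigned pend_reg;
	uint32_t pend_val;
};

static void step(struct cpu *c, const struct insn *i)
{
	/* A load issued by the previous instruction lands after this one. */
	int commit = c->pend_valid;
	unsigned creg = c->pend_reg;
	uint32_t cval = c->pend_val;

	assert(i->rt < 32);
	c->pend_valid = 0;
	switch (i->op) {
	case OP_LW:
		if (c->interlocked) {
			c->reg[i->rt] = *i->mem;
		} else {
			c->pend_valid = 1;
			c->pend_reg = i->rt;
			c->pend_val = *i->mem;
		}
		break;
	case OP_SW:
		*i->mem = c->reg[i->rt];
		break;
	case OP_ADDIU:
		c->reg[i->rt] += i->imm;
		break;
	case OP_NOP:
		break;
	}
	if (commit)
		c->reg[creg] = cval;
}

/* Runs the seven-instruction sequence shown above on the toy CPU. */
static void double_increment(uint32_t *var1, uint32_t *var2, int interlocked)
{
	enum { R = 8 };		/* arbitrary scratch register */
	const struct insn prog[] = {
		{ OP_LW,    R, var1, 0 },
		{ OP_NOP,   0, NULL, 0 },
		{ OP_ADDIU, R, NULL, 1 },
		{ OP_LW,    R, var2, 0 },
		{ OP_SW,    R, var1, 0 },	/* sits in the load delay slot */
		{ OP_ADDIU, R, NULL, 1 },
		{ OP_SW,    R, var2, 0 },
	};
	struct cpu c = { {0}, 0, 0, 0, 0 };
	size_t n;

	c.interlocked = interlocked;
	for (n = 0; n < sizeof prog / sizeof prog[0]; n++)
		step(&c, &prog[n]);
}
```

Starting from var1 = 10 and var2 = 20, the delay-slot model ends with 11
and 21.  The interlocked model stores the freshly loaded var2 into var1
and ends with 20 and 21, so the trick only works on hardware that behaves
like the first model.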

  Ralf

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: sti() does not work.
  2001-07-14 11:04           ` Ralf Baechle
@ 2001-07-14 11:39             ` Kevin D. Kissell
  2001-07-14 11:39               ` Kevin D. Kissell
  2001-07-16 12:46             ` Maciej W. Rozycki
  1 sibling, 1 reply; 13+ messages in thread
From: Kevin D. Kissell @ 2001-07-14 11:39 UTC (permalink / raw)
  To: Ralf Baechle, Maciej W. Rozycki; +Cc: Thiemo Seufer, linux-mips

> Real wild pig hackers on R3000 were writing code which knows that in the
> load delay slot they still have the old register value available.  So you
> can implement var1++; var2++ as:
> 
> .set noreorder
> lw $reg, var1($gp)
> nop
> addiu $reg, $reg, 1
> lw $reg, var2($gp)
> sw $reg, var1($gp)
> addiu $reg, $reg, 1
> sw $reg, var2($gp)
> 
> .common var1, 4, 4
> .common var2, 4, 4
> 
> Of course only safe with interrupts disabled.  So in a sense introducing
> the load interlock broke semantics of MIPS machine code ;-)

Architecturally, the target register value is UNDEFINED during
the load delay slot on a MIPS I CPU.  Anyone who coded to any
particular assumption regarding its value was coding to a 
specific CPU implementation.  Introducing the load interlock
in later versions of the ISA and later implementations did not
reach backward in time and break the old hardware.  The
implementation-specific code still works for its specific 
implementation.  Refining the spec did not break the code for later
implementations - it was *always* broken for later implementations! ;-)

In a less pedantic tone, there actually is an architecturally
legal case where an assembly coder can justify the use of
noreorder for something other than CP0 pipeline hazards.
If what I want to do is to test a value, branch on the result,
and modify that value regardless of whether the branch is
taken, I can code something like:

    .set noreorder
    bltz    t0,foo
    sra    t0,t0,2
    .set reorder
    <other code>
foo:

Whereas otherwise I need to either consume another
register or replicate the shift both after the branch and
after foo.  If I'm very very lucky, the assembler will "hoist"
such a replicated instruction into the delay slot - a  good
compiler back-end optimiser certainly would.  But I'm not 
aware of any MIPS assembler that would perform that
optimisation - certainly the GNU assembler does not.
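
In C terms, the noreorder sequence above behaves like the sketch below:
the branch condition is evaluated on the old value, and the shift in the
delay slot runs whether or not the branch is taken.  The names are
invented for illustration, and sra() is a portable stand-in for the
instruction because C leaves >> on negative values
implementation-defined:

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic right shift that does not rely on the compiler's >> . */
static int32_t sra(int32_t v, unsigned n)
{
	assert(n < 32);
	if (v >= 0)
		return v >> n;
	/* For negative v: floor(v / 2^n) == -((~v) >> n) - 1. */
	return -(int32_t)(~(uint32_t)v >> n) - 1;
}

struct branch_result {
	int32_t t0;		/* value of t0 after the delay slot */
	int took_branch;	/* 1 if control went to foo */
};

static struct branch_result test_and_shift(int32_t t0)
{
	struct branch_result r;

	r.took_branch = (t0 < 0);	/* bltz t0,foo : tests the old value */
	r.t0 = sra(t0, 2);		/* sra t0,t0,2 : runs on both paths */
	return r;
}
```

For example, test_and_shift(-8) reports the branch as taken with t0 = -2,
and test_and_shift(12) falls through with t0 = 3.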

            Kevin K.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: sti() does not work.
  2001-07-14 11:04           ` Ralf Baechle
  2001-07-14 11:39             ` Kevin D. Kissell
@ 2001-07-16 12:46             ` Maciej W. Rozycki
  1 sibling, 0 replies; 13+ messages in thread
From: Maciej W. Rozycki @ 2001-07-16 12:46 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thiemo Seufer, linux-mips

On Sat, 14 Jul 2001, Ralf Baechle wrote:

> Real wild pig hackers on R3000 were writing code which knows that in the
> load delay slot they still have the old register value available.  So you
> can implement var1++; var2++ as:

 That's crazy...

> Of course only safe with interrupts disabled.  So in a sense introducing
> the load interlock broke semantics of MIPS machine code ;-)

 That broke the MIPS' virtue as well, as MIPS stands for "Microprocessor
without Interlocked Pipeline Stages" (actually mfhi/mflo broke that in the
first place, but it was less significant due to the multiplier being a
separate unit). 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2001-07-16 16:02 UTC | newest]

Thread overview: 13+ messages
2001-07-03 22:48 sti() does not work Steven Liu
2001-07-03 22:48 ` Steven Liu
2001-07-04 10:23 ` Thiemo Seufer
2001-07-04 12:23   ` Gleb O. Raiko
2001-07-04 13:26   ` Ralf Baechle
2001-07-05 11:35     ` Maciej W. Rozycki
2001-07-13 11:35       ` Ralf Baechle
2001-07-13 14:01         ` Maciej W. Rozycki
2001-07-14 11:04           ` Ralf Baechle
2001-07-14 11:39             ` Kevin D. Kissell
2001-07-14 11:39               ` Kevin D. Kissell
2001-07-16 12:46             ` Maciej W. Rozycki
2001-07-04 13:29 ` Ralf Baechle
