* Re: thread-ready ABIs
       [not found] <m3elkoa5dw.fsf@myware.mynet>
@ 2002-01-18 18:19 ` H . J . Lu
  2002-01-18 18:31   ` Ulrich Drepper
                     ` (4 more replies)
  0 siblings, 5 replies; 94+ messages in thread
From: H . J . Lu @ 2002-01-18 18:19 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: GNU libc hacker, linux-mips

On Thu, Jan 17, 2002 at 04:07:23PM -0800, Ulrich Drepper wrote:
> The time is near when we (well, I) will start a drastic move toward
> generally using thread registers.  Even in non-threaded code.
> 
> This means that unless all architectures get thread registers (or
> equivalent things like Alpha's special code) we'll have a two class
> society of platforms where all code written for the platforms without
> thread register can be run on the other systems, but not vice versa.
> 
> From what I see today we have thread registers only on Alpha, x86,
> IA-64, SH, and x86_64.  SPARC shouldn't be too much of a problem.  Sun
> is using %g6 or %g7 (forgot which one) and since they define the ABI
> no big complications are expected.
> 
> Now, what about the rest?  I assume cris isn't much of a problem
> since it's a purely embedded machine.
> 
> 
> Arm: don't know whether this should fall in the same category.
> Philip?
> 
> 
> m68k: Well, maybe it's time to retire these machines.  But on the
> other hand, there are those useless address registers.  Andreas, Jes?
> 
> 
> PPC (32-bit) is known to be a problem.  I've seen several proposals as
> to what register to use but haven't seen a final decision.  Problems
> with the different PPC implementations are probably hindering this.
> Geoff, could you please make a decision?  I hope the PPC64 ABI already
> allocated a thread register.
> 
> 
> S390: I have no idea.  Martin, please comment and make a decision.
> 
> 
> MIPS: Who feels responsible?  Andreas, HJ?
> 

I don't see there are any registers we can use without breaking ABI.
On the other hand, can we change the mips kernel to save k0 or k1 for
user space?

> 
> PA: no idea.  HP has no 32-bit ELF so.  But they have 64-bit ELF and
> it definitely has a thread register.
> 
> 
> 
> Please consider this a high priority task now.  I've been warning
> about this for a long time.  Jakub is working on some code and once
> this is ready for me to use I'll make lots of changes to ld.so and the
> locale handling and from that point on we have the two classes of
> architectures.
> 
> Oh, this now also concerns Hurd.  So, Roland, how far is using LDTs on
> Hurd/x86?
> 


H.J.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 18:19 ` thread-ready ABIs H . J . Lu
@ 2002-01-18 18:31   ` Ulrich Drepper
  2002-01-18 19:08     ` H . J . Lu
  2002-01-20  0:14     ` Ralf Baechle
  2002-01-18 20:03   ` Maciej W. Rozycki
                     ` (3 subsequent siblings)
  4 siblings, 2 replies; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-18 18:31 UTC (permalink / raw)
  To: H . J . Lu; +Cc: GNU libc hacker, linux-mips

"H . J . Lu" <hjl@lucon.org> writes:

> I don't see there are any registers we can use without breaking ABI.
> On the other hand, can we change the mips kernel to save k0 or k1 for
> user space?

Are these registers which are readable by normal users but writable
only in ring 0?  If yes, this is definitely worthwhile (similar to how
x86 works).  The only problem will be the MIPS variants which don't
have this register.  I bet there are some.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 18:31   ` Ulrich Drepper
@ 2002-01-18 19:08     ` H . J . Lu
  2002-01-18 19:20       ` Ulrich Drepper
  2002-01-20  0:14     ` Ralf Baechle
  1 sibling, 1 reply; 94+ messages in thread
From: H . J . Lu @ 2002-01-18 19:08 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: GNU libc hacker, linux-mips

On Fri, Jan 18, 2002 at 10:31:17AM -0800, Ulrich Drepper wrote:
> "H . J . Lu" <hjl@lucon.org> writes:
> 
> > I don't see there are any registers we can use without breaking ABI.
> > On the other hand, can we change the mips kernel to save k0 or k1 for
> > user space?
> 
> Are these registers which are readable by normal users but writable
> only in ring 0?  If yes, this is definitely worthwhile (similar to how

I can write to k0/k1. But the value is not preserved by the kernel.

> x86 works).  The only problem will be the MIPS variants which don't
> have this register.  I bet there are some.

I don't think so. k0/k1 are reserved for the OS. I don't know if the OS
can restore them for user space or not.


H.J.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 19:08     ` H . J . Lu
@ 2002-01-18 19:20       ` Ulrich Drepper
  2002-01-19 12:14           ` Dominic Sweetman
  0 siblings, 1 reply; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-18 19:20 UTC (permalink / raw)
  To: H . J . Lu; +Cc: GNU libc hacker, linux-mips

"H . J . Lu" <hjl@lucon.org> writes:

> I can write to k0/k1. But the value is not preserved by the kernel.

Strange.  This means the registers cannot have been used so far and if
the kernel can be changed it is free.

> I don't think so. k0/k1 are reserved for the OS. I don't know if the OS
> can restore them for user space or not.

There are so many different MIPS implementations that I wouldn't bet
on it.  One would have to look at the minimum architecture definition.
Also, what do the new MIPS32 cores do?

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 18:19 ` thread-ready ABIs H . J . Lu
  2002-01-18 18:31   ` Ulrich Drepper
@ 2002-01-18 20:03   ` Maciej W. Rozycki
  2002-01-18 20:20     ` Ulrich Drepper
  2002-01-18 21:23   ` Daniel Jacobowitz
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 94+ messages in thread
From: Maciej W. Rozycki @ 2002-01-18 20:03 UTC (permalink / raw)
  To: H . J . Lu; +Cc: Ulrich Drepper, GNU libc hacker, linux-mips

On Fri, 18 Jan 2002, H . J . Lu wrote:

> > This means that unless all architectures get thread registers (or
> > equivalent things like Alpha's special code) we'll have a two class
> > society of platforms where all code written for the platforms without
> > thread register can be run on the other systems, but not vice versa.
[...]
> On the other hand, can we change the mips kernel to save k0 or k1 for
> user space?

 No way.  MIPS doesn't predefine any stack-switching hardware and it
doesn't save any registers on exceptions (except from copying pc to cp0's
epc or errorepc).  The k0, k1 registers are defined as reserved for the
kernel use to switch to a kernel stack and save current values of other
registers upon a kernel entry due to an exception.  The general exception
handler (used for almost everything, including interrupts for most
systems) uses the registers this way to "bootstrap" itself.

 The dedicated TLB exception handler, which needs to be very fast to
achieve any reasonable performance, uses only these two registers,
without even touching anything else.

 As a result, anything written to k0 or k1 is lost at the first exception
that happens afterwards.

 Of course, this use of k0, k1 is purely conventional -- they are ordinary
32-bit general-purpose registers from the hardware point of view.  Only
zero and, to some extent, ra registers are different on MIPS. 

 The usage of all 32 registers is fixed in the ABI for MIPS.  But what
about that Alpha's special code?  It could possibly be reused given
Alpha's large similarity to MIPS.

  Maciej

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 20:03   ` Maciej W. Rozycki
@ 2002-01-18 20:20     ` Ulrich Drepper
  2002-01-18 20:50       ` Maciej W. Rozycki
  0 siblings, 1 reply; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-18 20:20 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: H . J . Lu, GNU libc hacker, linux-mips

"Maciej W. Rozycki" <macro@ds2.pg.gda.pl> writes:

> But what about that Alpha's special code?  It could possibly be
> reused given the large Alpha's similarity to MIPS.

No.  Alpha has certain builtin code which looks similar to calls or
software interrupts but are executed in the CPU.  This allows access
to some memory in the CPU which is almost as fast as a normal register
access.  MIPS doesn't have such hardware.  If you cannot find a
register you're doomed.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 20:20     ` Ulrich Drepper
@ 2002-01-18 20:50       ` Maciej W. Rozycki
  2002-01-18 21:02         ` Ulrich Drepper
  0 siblings, 1 reply; 94+ messages in thread
From: Maciej W. Rozycki @ 2002-01-18 20:50 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: H . J . Lu, linux-mips

On 18 Jan 2002, Ulrich Drepper wrote:

> > But what about that Alpha's special code?  It could possibly be
> > reused given the large Alpha's similarity to MIPS.
> 
> No.  Alpha has certain builtin code which looks similar to calls or
> software interrupts but are executed in the CPU.  This allows access

 Yep, PALcode is possibly the most significant difference.

> to some memory in the CPU which is almost as fast as a normal register
> access.  MIPS doesn't have such hardware.  If you cannot find a
> register you're doomed.

 Hmm, why would an ABI reserve spare registers for a possible future use
that might never happen?  We can probably define a new ABI specifically
for Linux, though, if the gain surpasses the loss. 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 20:50       ` Maciej W. Rozycki
@ 2002-01-18 21:02         ` Ulrich Drepper
  2002-01-18 21:35           ` Maciej W. Rozycki
  0 siblings, 1 reply; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-18 21:02 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: H . J . Lu, linux-mips

"Maciej W. Rozycki" <macro@ds2.pg.gda.pl> writes:

> Hmm, why would an ABI reserve spare registers for a possible future
> use that might never happen?  We can probably define a new ABI
> specifically for Linux, though, if the gain surpasses the loss.

I don't really care what is done for MIPS and there is no reason to
find excuses for not having the foresight.  I just present the facts:
if there is no thread register or something equally fast MIPS will be
one of the platforms which will have only a subset of the
functionality of the other Linux architectures and not all
applications will be able to be compiled for MIPS.  That's all.  If
this is fine (e.g., for MIPS on embedded platforms) then all is good.
If somebody wants to use threads and MIPS there is a problem.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 18:19 ` thread-ready ABIs H . J . Lu
  2002-01-18 18:31   ` Ulrich Drepper
  2002-01-18 20:03   ` Maciej W. Rozycki
@ 2002-01-18 21:23   ` Daniel Jacobowitz
  2002-01-19  0:35     ` Kevin D. Kissell
  2002-01-22  1:39   ` Richard Henderson
  4 siblings, 0 replies; 94+ messages in thread
From: Daniel Jacobowitz @ 2002-01-18 21:23 UTC (permalink / raw)
  To: H . J . Lu; +Cc: Ulrich Drepper, GNU libc hacker, linux-mips

On Fri, Jan 18, 2002 at 10:19:08AM -0800, H . J . Lu wrote:
> > MIPS: Who feels responsible?  Andreas, HJ?
> > 
> 
> I don't see there are any registers we can use without breaking ABI.
> On the other hand, can we change the mips kernel to save k0 or k1 for
> user space?

No, there are no free registers and $k0/$k1 are needed by the kernel
for exceptions.  The only way I can see to do this would be to change
the ABI.

There are none available; the least used that I see is $v1, but $v1 is
used to return half of a double precision return value.  We would have
to steal one of the existing call-saved or call-clobbered registers.

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-18 21:24 Justin Carlson
  2002-01-18 21:31 ` Ulrich Drepper
  2002-01-18 21:42 ` Maciej W. Rozycki
  0 siblings, 2 replies; 94+ messages in thread
From: Justin Carlson @ 2002-01-18 21:24 UTC (permalink / raw)
  To: drepper; +Cc: linux-mips

For those of us who are slightly behind, could you give some brief
summary of what this thread register hullabaloo is about?  I hadn't been
following this thread, but a search of the archives makes it look like
it hasn't really been explained yet.

_Why_ do we need a general register which is read-only to userland?  Are
you trying to store thread-context information in a fast way?  Why does
this need to happen?

Depending on what the exact requirements are, I could see several ways
to free up a register:

We could, theoretically, free up k1 or k0 (but not both) at the expense
of some time in the stackframe setup at the userland/kernel boundary and
some time in the fast TLB handler.  This wouldn't be read-only from
userland, though, but is that really a hard requirement?  

There is precedent for hijacking some CP0 registers for purposes other
than originally intended, e.g., the WATCH registers for holding the
kernel stack pointer.  I don't have a mips spec in front of me, though,
so I don't know if any CP0 registers are readable from userland: I seem
to remember that all mfc0 ops are privileged at the instruction level,
not the register level, though.

-Justin

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 21:24 Justin Carlson
@ 2002-01-18 21:31 ` Ulrich Drepper
  2002-01-18 21:42 ` Maciej W. Rozycki
  1 sibling, 0 replies; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-18 21:31 UTC (permalink / raw)
  To: Justin Carlson; +Cc: linux-mips

Justin Carlson <justincarlson@cmu.edu> writes:

> _Why_ do we need a general register which is read-only to userland?  Are
> you trying to store thread-context information in a fast way?  Why does
> this need to happen?

Read-only is no requirement.  It is possible to live with this
arrangement is all I said.  If it's a normal register, fine, this is
how it works on most platforms.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 21:02         ` Ulrich Drepper
@ 2002-01-18 21:35           ` Maciej W. Rozycki
  2002-01-18 21:44             ` Ulrich Drepper
  0 siblings, 1 reply; 94+ messages in thread
From: Maciej W. Rozycki @ 2002-01-18 21:35 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: H . J . Lu, linux-mips

On 18 Jan 2002, Ulrich Drepper wrote:

> I don't really care what is done for MIPS and there is no reason to
> find excuses for not having the foresight.  I just present the facts:

 Tell that to the SysV committee. ;-)

 BTW, the i386 ABI supplement defines no spare registers, either -- all
are already assigned.  Where did you get extraneous registers for the i386
from (especially given the usual register shortage there)?  Maybe we could
use the same approach for MIPS.  Where to look for the code in glibc in a
current snapshot?

 One possible approach is to reserve GOT entries for thread registers. 
While not as fast as CPU's registers, if frequently accessed they would
stick in the cache.  Since the ABI mandates the code to keep a pointer to
the GOT in the gp register, accesses to got entries need only a single
instruction.  I haven't thought on it much -- someone might have a better
idea. 

> if there is no thread register or something equally fast MIPS will be
> one of the platforms which will have only a subset of the
> functionality of the other Linux architectures and not all
> applications will be able to be compiled for MIPS.  That's all.  If
> this is fine (e.g., for MIPS on embedded platforms) then all is good.
> If somebody wants to use threads and MIPS there is a problem.

 I have only workstation/server MIPS systems and I do care. 

  Maciej

 PS. Too bad libc-hacker rejects my submissions...

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 21:24 Justin Carlson
  2002-01-18 21:31 ` Ulrich Drepper
@ 2002-01-18 21:42 ` Maciej W. Rozycki
  1 sibling, 0 replies; 94+ messages in thread
From: Maciej W. Rozycki @ 2002-01-18 21:42 UTC (permalink / raw)
  To: Justin Carlson; +Cc: drepper, linux-mips

On 18 Jan 2002, Justin Carlson wrote:

> We could, theoretically, free up k1 or k0 (but not both) at the expense
> of some time in the stackframe setup at the userland/kernel boundary and
> some time in the fast TLB handler.  This wouldn't be read-only from
> userland, though, but is that really a hard requirement?  

 Much, *much* time, especially in the case of TLB exceptions.

> There is precedent for hijacking some CP0 registers for purposes other
> than originally intended, e.g., the WATCH registers for holding the
> kernel stack pointer.  I don't have a mips spec in front of me, though,

 That's not used exactly as a stack pointer, but as a safeguard.  I'm still
thinking about a better use of this register, e.g. as a watchpoint for gdb.

> so I don't know if any CP0 registers are readable from userland: I seem
> to remember that all mfc0 ops are privileged at the instruction level,
> not the register level, though.

 Technically you can make cp0 registers r/w accessible from the userland,
but that's unacceptable for us.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 21:35           ` Maciej W. Rozycki
@ 2002-01-18 21:44             ` Ulrich Drepper
  2002-01-18 22:17               ` Maciej W. Rozycki
  0 siblings, 1 reply; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-18 21:44 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: H . J . Lu, linux-mips

"Maciej W. Rozycki" <macro@ds2.pg.gda.pl> writes:

> Where did you get extraneous registers for the i386
> from (especially given the usual register shortage there)?

%gs

> Maybe we could use the same approach for MIPS.

I doubt it.

> Where to look for the code in glibc in a current snapshot?

%gs has been used for a long time: linuxthreads/sysdeps/386/useldt.h

>  One possible approach is to reserve GOT entries for thread registers. 
> While not as fast as CPU's registers, if frequently accessed they would
> stick in the cache.  Since the ABI mandates the code to keep a pointer to
> the GOT in the gp register, accesses to got entries need only a single
> instruction.  I haven't thought on it much -- someone might have a better
> idea. 

How would you have different values for different threads?  It would
mean having multiple GOTs which is a resource waste and a nightmare in
resource management.
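
To illustrate the objection, here is a minimal sketch in C, not actual
glibc or kernel code.  It assumes a toy model in which threads are plain
integer ids; the names shared_got_slot, thread_reg and the accessor
functions are hypothetical.  A single GOT slot is shared by the whole
process, so the last writer wins, while a per-thread register keeps one
value per thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one GOT slot, shared by every thread in the process. */
static uintptr_t shared_got_slot;

/* Hypothetical model: one "thread register" value per thread. */
enum { MAX_THREADS = 4 };
static uintptr_t thread_reg[MAX_THREADS];

/* Storing through the GOT ignores which thread is running. */
void set_via_got(int tid, uintptr_t value)
{
    (void)tid;
    shared_got_slot = value;
}

uintptr_t get_via_got(int tid)
{
    (void)tid;
    return shared_got_slot;
}

/* Storing through a thread register keeps the values apart. */
void set_via_thread_reg(int tid, uintptr_t value)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    thread_reg[tid] = value;
}

uintptr_t get_via_thread_reg(int tid)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    return thread_reg[tid];
}
```

If thread 0 stores 0x1000 and thread 1 then stores 0x2000, get_via_got(0)
yields thread 1's value, whereas get_via_thread_reg(0) still yields
0x1000.  Getting the second behaviour out of the GOT would need one GOT
per thread, which is the waste described above.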

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 21:44             ` Ulrich Drepper
@ 2002-01-18 22:17               ` Maciej W. Rozycki
  0 siblings, 0 replies; 94+ messages in thread
From: Maciej W. Rozycki @ 2002-01-18 22:17 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: H . J . Lu, linux-mips

On 18 Jan 2002, Ulrich Drepper wrote:

> > Where did you get extraneous registers for the i386
> > from (especially given the usual register shortage there)?
> 
> %gs

 Ah well, then you just have it by an accident and not because it was
specifically designed to be spare...

> > Maybe we could use the same approach for MIPS.
> 
> I doubt it.

 Indeed.

> > Where to look for the code in glibc in a current snapshot?
> 
> %gs is used for a long time linuxthreads/sysdeps/386/useldt.h

 Thanks.

> >  One possible approach is to reserve GOT entries for thread registers. 
> > While not as fast as CPU's registers, if frequently accessed they would
> > stick in the cache.  Since the ABI mandates the code to keep a pointer to
> > the GOT in the gp register, accesses to got entries need only a single
> > instruction.  I haven't thought on it much -- someone might have a better
> > idea. 
> 
> How would you have different values for different threads?  It would
> mean having multiple GOTs which is a resource waste and a nightmare in
> resource management.

 OK, now I understand you need some kind of a tid, which need not be
writable.  A read-only register can be moderately easily provided by
either k0 or k1 if exit paths of exceptions reload the given one.  The
trail code of exceptions only needs one of them at most. 

  Maciej

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-19  0:35     ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-19  0:35 UTC (permalink / raw)
  To: H . J . Lu, Ulrich Drepper; +Cc: GNU libc hacker, linux-mips

> On Thu, Jan 17, 2002 at 04:07:23PM -0800, Ulrich Drepper wrote:
> > The time is near when we (well, I) will start a drastic move toward
> > generally using thread registers.  Even in non-threaded code.
> > 
> > This means that unless all architectures get thread registers (or
> > equivalent things like Alpha's special code) we'll have a two class
> > society of platforms where all code written for the platforms without
> > thread register can be run on the other systems, but not vice versa.

[snip]

> > MIPS: Who feels responsible?  Andreas, HJ?
> 
> I don't see there are any registers we can use without breaking ABI.
> On the other hand, can we change the mips kernel to save k0 or k1 for
> user space?

Thank you for posting this to linux-mips, since I'm not sure 
that anyone at MIPS is on the GNU_libc_hacker list.

It would, in principle, be possible to save/restore k0
or k1 (but not both) if no other clever solution can be found.  
There are other VM OSes that manage to do so for MIPS, 
for other outside-the-old-ABI reasons.  It does, of course,
add some instructions and some memory traffic to the 
low-level exception handling, and we would have to look 
at whether we would want to make such a feature standard 
or specific to a "thread-ready" kernel build.

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-19  0:35     ` Kevin D. Kissell
  (?)
@ 2002-01-19  4:11     ` H . J . Lu
  2002-01-19 12:27       ` Dominic Sweetman
  2002-01-20 10:38       ` Machida Hiroyuki
  -1 siblings, 2 replies; 94+ messages in thread
From: H . J . Lu @ 2002-01-19  4:11 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Ulrich Drepper, GNU libc hacker, linux-mips

On Sat, Jan 19, 2002 at 01:35:38AM +0100, Kevin D. Kissell wrote:
> 
> It would, in principle, be possible to save/restore k0
> or k1 (but not both) if no other clever solution can be found.  
> There are other VM OSes that manage to do so for MIPS, 
> for other outside-the-old-ABI reasons.  It does, of course,
> add some instructions and some memory traffic to the 
> low-level exception handling , and we would have to look 
> at whether we would want to make such a feature standard 
> or specific to a "thread-ready" kernel build.

I like the read-only k0 idea. We just need to make a system call to
tell kernel what value to put in k0 before returning to the user space.
It shouldn't be too hard to implement. I will try it next week.
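
A minimal sketch of the idea in C, as a model rather than kernel code.
It assumes the kernel keeps the requested value in per-thread state and
reloads k0 as the last step of every return to user space; struct cpu,
struct thread, sys_set_thread_reg, exception_entry and exception_exit
are all hypothetical names, and no such system call exists yet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two kernel scratch registers. */
struct cpu {
    uintptr_t k0;
    uintptr_t k1;
};

/* Hypothetical per-thread kernel state holding the value user space asked for. */
struct thread {
    uintptr_t user_k0;
};

/* Stand-in for the proposed system call: remember the value for this thread. */
void sys_set_thread_reg(struct thread *t, uintptr_t value)
{
    assert(t != 0);
    t->user_k0 = value;
}

/* Exception entry: the handler uses k0/k1 as scratch, destroying their contents. */
void exception_entry(struct cpu *c)
{
    c->k0 = 0xdeadbeef;
    c->k1 = 0xdeadbeef;
}

/* Exception exit: reload k0 for the thread being resumed; k1 stays scratch. */
void exception_exit(struct cpu *c, const struct thread *current)
{
    assert(current != 0);
    c->k0 = current->user_k0;
}
```

User code can still write k0 in this model, but whatever it writes is
replaced at the next exception, so the register is effectively read-only.
Passing a different thread to exception_exit also covers a context switch.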


H.J.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-19 12:14           ` Dominic Sweetman
  0 siblings, 0 replies; 94+ messages in thread
From: Dominic Sweetman @ 2002-01-19 12:14 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: H . J . Lu, GNU libc hacker, linux-mips


Well, just about k0/k1:

So far as the hardware and instruction set is concerned, 
k0/k1 are just two of the 32 general purpose registers.  There's
nothing special about them and a program in user mode can read/write
them.

By a mere software convention, they're reserved.  But this is an
important software convention, because MIPS hardware does so little to
help out on an exception or interrupt.  Couple that to the lack of any
absolute addressing mode, and any exception handler pretty much has to
have a GP register it can write without saving, in order to be able to
point to the register-save area.

[You could, maybe, do something tricky with a negative offset
from the (constant zero) $0 register and special mapping]

OK, so that's one of them.  The second is used to reduce the length
and run-time of the tiny exception handler which is used to refill the
TLB when a page translation is not loaded.

The OS doesn't rely on user programs not corrupting these registers,
of course: it typically uses them only in non-interruptible code
sequences.  But since the OS changes them under the feet of user
programs, the convention that you don't use them is pretty strongly
enforced.

Dominic Sweetman
Algorithmics Ltd

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-19  4:11     ` H . J . Lu
@ 2002-01-19 12:27       ` Dominic Sweetman
  2002-01-19 19:42         ` H . J . Lu
  2002-01-19 22:21           ` Kevin D. Kissell
  2002-01-20 10:38       ` Machida Hiroyuki
  1 sibling, 2 replies; 94+ messages in thread
From: Dominic Sweetman @ 2002-01-19 12:27 UTC (permalink / raw)
  To: H . J . Lu; +Cc: Kevin D. Kissell, Ulrich Drepper, GNU libc hacker, linux-mips


H . J . Lu (hjl@lucon.org) writes:

> > It would, in principle, be possible to save/restore k0
> > or k1 (but not both) if no other clever solution can be found.  
> > There are other VM OSes that manage to do so for MIPS, 
> > for other outside-the-old-ABI reasons.  It does, of course,
> > add some instructions and some memory traffic to the 
> > low-level exception handling , and we would have to look 
> > at whether we would want to make such a feature standard 
> > or specific to a "thread-ready" kernel build.
> 
> I like the read-only k0 idea. We just need to make a system call to
> tell kernel what value to put in k0 before returning to the user space.
> It shouldn't be too hard to implement. I will try it next week.

You could, I guess, wire a TLB entry to map the thread register into
the highest virtual memory region of the machine (the top of 'kseg2'),
which is accessible in a single instruction as a negative offset from
$0.  The kernel can write it through kseg0 or 64-bit equivalent, if
you're a bit careful about cache aliases.

Reading something out of the cache is pretty cheap: would that be
close enough to a 'register' to do the job?  There's no change to
critical routines, that way.
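
To make the "negative offset from $0" point concrete, here is a small
sketch in plain C (nothing MIPS-specific is executed) of the
effective-address arithmetic. A MIPS load adds a sign-extended 16-bit
immediate to the base register, and $0 always reads as zero. The helper
name ea_from_zero and the 32-bit address space are illustrative
assumptions only, not something proposed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of a MIPS load/store whose base register is $0.
 * The hardware adds the sign-extended 16-bit immediate to the base,
 * and $0 always reads as zero, so the result is just the sign-extended
 * offset.  A 32-bit address space is assumed here. */
static uint32_t ea_from_zero(int32_t offset)
{
    /* The immediate field is only 16 bits wide. */
    assert(offset >= INT16_MIN && offset <= INT16_MAX);
    return (uint32_t)offset;
}
```

For example, ea_from_zero(-4) is 0xfffffffc, the address that
"lw $2, -4($0)" would read, and the lowest address reachable this way
is ea_from_zero(-32768), i.e. 0xffff8000.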

Dominic Sweetman
Algorithmics Ltd.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-19 12:27       ` Dominic Sweetman
@ 2002-01-19 19:42         ` H . J . Lu
  2002-01-21 13:27           ` Maciej W. Rozycki
  2002-01-19 22:21           ` Kevin D. Kissell
  1 sibling, 1 reply; 94+ messages in thread
From: H . J . Lu @ 2002-01-19 19:42 UTC (permalink / raw)
  To: Dominic Sweetman
  Cc: Kevin D. Kissell, Ulrich Drepper, GNU libc hacker, linux-mips

On Sat, Jan 19, 2002 at 12:27:52PM +0000, Dominic Sweetman wrote:
> 
> H . J . Lu (hjl@lucon.org) writes:
> 
> > > It would, in principle, be possible to save/restore k0
> > > or k1 (but not both) if no other clever solution can be found.  
> > > There are other VM OSes that manage to do so for MIPS, 
> > > for other outside-the-old-ABI reasons.  It does, of course,
> > > add some instructions and some memory traffic to the 
> > > low-level exception handling, and we would have to look
> > > at whether we would want to make such a feature standard 
> > > or specific to a "thread-ready" kernel build.
> > 
> > I like the read-only k0 idea. We just need to make a system call to
> > tell kernel what value to put in k0 before returning to the user space.
> > It shouldn't be too hard to implement. I will try it next week.
> 
> You could, I guess, wire a TLB entry to map the thread register into
> the highest virtual memory region of the machine (the top of 'kseg2'),
> which is accessible in a single instruction as a negative offset from
> $0.  The kernel can write it through kseg0 or 64-bit equivalent, if
> you're a bit careful about cache aliases.

But it has to be a per thread value.

> 
> Reading something out of the cache is pretty cheap: would that be
> close enough to a 'register' to do the job?  There's no change to
> critical routines, that way.
> 

This is a patch against 2.4.16. Will this restore k1 to a known per
thread value?


H.J.
--- include/asm-mips/stackframe.h.thread	Wed Dec 12 12:34:53 2001
+++ include/asm-mips/stackframe.h	Sat Jan 19 11:36:38 2002
@@ -191,6 +191,7 @@ __asm__ (                               
 		lw	$2,  PT_R2(sp)
 
 #define RESTORE_SP_AND_RET                               \
+		lw	$27, PT_R27(sp);                 \
 		.set	push;				 \
 		.set	noreorder;			 \
 		lw	k0, PT_EPC(sp);                  \
@@ -229,6 +230,7 @@ __asm__ (                               
 		lw	$2,  PT_R2(sp)
 
 #define RESTORE_SP_AND_RET                               \
+		lw	$27, PT_R27(sp);                 \
 		lw	sp,  PT_R29(sp);                 \
 		.set	mips3;				 \
 		eret;					 \
@@ -237,6 +239,7 @@ __asm__ (                               
 #endif
 
 #define RESTORE_SP                                       \
+		lw	$27, PT_R27(sp);                 \
 		lw	sp,  PT_R29(sp);                 \
 
 #define RESTORE_ALL                                      \

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-19 22:21           ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-19 22:21 UTC (permalink / raw)
  To: Dominic Sweetman, H . J . Lu; +Cc: Ulrich Drepper, GNU libc hacker, linux-mips

> > > It would, in principle, be possible to save/restore k0
> > > or k1 (but not both) if no other clever solution can be found.  
> > > There are other VM OSes that manage to do so for MIPS, 
> > > for other outside-the-old-ABI reasons.  It does, of course,
> > > add some instructions and some memory traffic to the 
> > > low-level exception handling, and we would have to look
> > > at whether we would want to make such a feature standard 
> > > or specific to a "thread-ready" kernel build.
> > 
> > I like the read-only k0 idea. We just need to make a system call to
> > tell kernel what value to put in k0 before returning to the user space.
> > It shouldn't be too hard to implement. I will try it next week.
> 
> You could, I guess, wire a TLB entry to map the thread register into
> the highest virtual memory region of the machine (the top of 'kseg2'),
> which is accessible in a single instruction as a negative offset from
> $0.

Funny you should mention this.  I was thinking about it
yesterday in this context as something else that I've seen 
done in some non-Linux MIPS OSes, and something that 
I think would be a better solution for CPU-specific fast 
storage in SMP configurations than some of the hacks that
I've seen proposed for SMP MIPS/Linux so far.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 18:31   ` Ulrich Drepper
  2002-01-18 19:08     ` H . J . Lu
@ 2002-01-20  0:14     ` Ralf Baechle
  1 sibling, 0 replies; 94+ messages in thread
From: Ralf Baechle @ 2002-01-20  0:14 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: H . J . Lu, GNU libc hacker, linux-mips

On Fri, Jan 18, 2002 at 10:31:17AM -0800, Ulrich Drepper wrote:

> > I don't see there are any registers we can use without breaking ABI.
> > On the other hand, can we change the mips kernel to save k0 or k1 for
> > user space?

These are reserved for kernel use.  Saving them is not a good idea as it
would impact performance of TLB exception handlers which are extremely
performance sensitive.

> Are these registers which are readable by normal users but writable
> only in ring 0?  If yes, this is definitely worthwhile (similar to how
> x86 works).  The only problem will be the MIPS variants which don't
> have this register.  I bet there are some.

No.

  Ralf

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-19  0:35     ` Kevin D. Kissell
  (?)
  (?)
@ 2002-01-20  0:24     ` Ralf Baechle
  2002-01-21 23:22       ` Ulrich Drepper
  -1 siblings, 1 reply; 94+ messages in thread
From: Ralf Baechle @ 2002-01-20  0:24 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: H . J . Lu, Ulrich Drepper, GNU libc hacker, linux-mips

On Sat, Jan 19, 2002 at 01:35:38AM +0100, Kevin D. Kissell wrote:

> Thank you for posting this to linux-mips, since I'm not sure 
> that anyone at MIPS is on the GNU_libc_hacker list.
> 
> It would, in principle, be possible to save/restore k0
> or k1 (but not both) if no other clever solution can be found.  
> There are other VM OSes that manage to do so for MIPS, 
> for other outside-the-old-ABI reasons.  It does, of course,
> add some instructions and some memory traffic to the 
> low-level exception handling, and we would have to look
> at whether we would want to make such a feature standard 
> or specific to a "thread-ready" kernel build.

Changing the kernel for the small number of threaded applications that
exist, and taking a performance impact for the kernel itself and anything
that's using threads, is an exquisite example of a bad tradeoff.

  Ralf

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-19  4:11     ` H . J . Lu
  2002-01-19 12:27       ` Dominic Sweetman
@ 2002-01-20 10:38       ` Machida Hiroyuki
  2002-01-20 11:58           ` Kevin D. Kissell
  1 sibling, 1 reply; 94+ messages in thread
From: Machida Hiroyuki @ 2002-01-20 10:38 UTC (permalink / raw)
  To: hjl; +Cc: kevink, drepper, libc-hacker, linux-mips

From: "H . J . Lu" <hjl@lucon.org>
Subject: Re: thread-ready ABIs
Date: Fri, 18 Jan 2002 20:11:39 -0800

> I like the read-only k0 idea. We just need to make a system call to
> tell kernel what value to put in k0 before returning to the user space.
> It shouldn't be too hard to implement. I will try it next week.
> 
> 
> H.J.

Please don't use k1; we already use k1 to implement a fast
test-and-set method for MIPS1 machines.  We plan to merge
the method into the main glibc and kernel trees.

With this method you can use test-and-set without a system call on
MIPS1 machines. The paper describing it is at
	http://lc.linux.or.jp/lc2001/papers/tas-ps2-paper.pdf
	(sorry, in Japanese only)

The abstract of the paper is attached below:

The Implementation of user level test-and-set on PS2 Linux

In a multi-threaded environment like Linux, a fast user-level mutual
exclusion mechanism is strongly required. But MIPS chips designed
for embedded, single-processor use, like the Emotion Engine, have
no atomic test-and-set instruction. We implemented fast
user-level mutual exclusion on PS2 Linux without invoking a system
call and incurring its cost. This method uses the memory protection
facility of the operating system to detect preemption and nullify the
operation. In this paper, we present the method and its evaluation.


---
Hiroyuki Machida
Sony Corp.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-20 11:58           ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-20 11:58 UTC (permalink / raw)
  To: hjl, Machida Hiroyuki; +Cc: drepper, libc-hacker, linux-mips

> > I like the read-only k0 idea. We just need to make a system call to
> > tell kernel what value to put in k0 before returning to the user space.
> > It shouldn't be too hard to implement. I will try it next week.
> > 
> > H.J.
> 
> Please don't use k1, we already use k1 to implement fast
> > test-and-set method for MIPS1 machine.  We plan to merge
> the method into main glibc and kernel tree.

I don't read Japanese, but I've worked on similar
methods for semaphores using k1, so I can guess
roughly what you've done.   We'll have to be very
careful if we want to have both a thread-register
extended ABI *and* this approach to semaphores.
The TLB miss handler must inevitably destroy one
or the other of k0/k1, though it can avoid destroying
both.  Thus the merge of thread-register+semaphore
must not require that both be preserved on an
arbitrary memory reference.  That may or may not
be possible, so it would be good if you guys at
Sony could post your code ASAP so we can see
what can and cannot be merged.

This situation also behooves us to verify that
k0/k1 are really and truly the only candidates
for a thread register in MIPS.  I haven't been
involved in any of the earlier discussions,
and am not on the libc-hacker mailing list
(and thus cannot post to it, by the way), but
was it considered to simply use one of the
static registers (say, s7/$23) in the existing
ABI?  Assuming it was set up correctly at
process startup, it would be preserved by
pre-thread library and .o modules, and could
be exploited by newly generated code.  Losing
a static register would be a hit on code generation
efficiency and performance, at least in principle.
Was this the reason why the idea was rejected,
or is there a more fundamental technical problem?

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-20 11:58           ` Kevin D. Kissell
  (?)
@ 2002-01-20 13:16           ` Machida Hiroyuki
  2002-01-22  6:27             ` patches for test-and-set without ll/sc (Re: thread-ready ABIs) Machida Hiroyuki
  -1 siblings, 1 reply; 94+ messages in thread
From: Machida Hiroyuki @ 2002-01-20 13:16 UTC (permalink / raw)
  To: kevink; +Cc: hjl, drepper, libc-hacker, linux-mips



From: "Kevin D. Kissell" <kevink@mips.com>
Subject: Re: thread-ready ABIs
Date: Sun, 20 Jan 2002 12:58:00 +0100

> I don't read Japanese, but I've worked on similar
> methods for semaphores using k1, so I can guess
> roughly what you've done.   We'll have to be very
> careful if we want to have both a thread-register
> extended ABI *and* this approach to semaphores.
> The TLB miss handler must inevitably destroy one
> or the other of k0/k1, though it can avoid destroying
> both.  Thus the merge of thread-register+semaphore
> must not require that both be preserved on an
> arbitrary memory reference.  That may or may not
> be possible, so it would be good if you guys at
> Sony could post your code ASAP so we can see
> what can and cannot be merged.

We released the source code to the public on the PS2 Linux (beta
version) disc, but I don't think you can get the disc outside of
Japan. The patches included in those SRPMs are not separated by
function; that is, a single big patch contains the r5900 port,
r5900-specific device drivers, and other enhancements.
I can put the kernel and glibc SRPMs from that disc on your FTP site
if you really want SRPMs with such a dirty patch.
If you do, please send me your FTP site and upload instructions.

In any case, I'll write a short description of what our test-and-set
does and try to make a separate patch for the method.


> This situation also behooves us to verify that
> k0/k1 are really and truly the only candidates
> for a thread register in MIPS.  I haven't been
	<snip>

Sorry, I haven't read the libc-hacker archives yet...

---
Hiroyuki Machida
Sony Corp.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-20 11:58           ` Kevin D. Kissell
  (?)
  (?)
@ 2002-01-20 19:19           ` H . J . Lu
  2002-01-21  9:39               ` Kevin D. Kissell
  2002-01-21 13:43             ` Maciej W. Rozycki
  -1 siblings, 2 replies; 94+ messages in thread
From: H . J . Lu @ 2002-01-20 19:19 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Machida Hiroyuki, drepper, GNU C Library, linux-mips

On Sun, Jan 20, 2002 at 12:58:00PM +0100, Kevin D. Kissell wrote:
> > > I like the read-only k0 idea. We just need to make a system call to
> > > tell kernel what value to put in k0 before returning to the user space.
> > > It shouldn't be too hard to implement. I will try it next week.
> > > 
> > > H.J.
> > 
> > Please don't use k1, we already use k1 to implement fast
> > > test-and-set method for MIPS1 machine.  We plan to merge
> > the method into main glibc and kernel tree.
> 
> I don't read Japanese, but I've worked on similar
> methods for semaphores using k1, so I can guess
> roughly what you've done.   We'll have to be very
> careful if we want to have both a thread-register
> extended ABI *and* this approach to semaphores.
> The TLB miss handler must inevitably destroy one
> or the other of k0/k1, though it can avoid destroying
> both.  Thus the merge of thread-register+semaphore
> must not require that both be preserved on an
> arbitrary memory reference.  That may or may not
> be possible, so it would be good if you guys at
> Sony could post your code ASAP so we can see
> what can and cannot be merged.

As I understand it, we don't need the k1-based semaphore on MIPS II or
above, so many processors can still benefit from the thread register. We
can use a system call to load the thread register. So it is a trade-off
between a system call and k1 for the thread register and the semaphore.
We can make it a configure-time option. Since PS2 is already using k1
for the semaphore, I'd like to see that merged ASAP so that nothing we
do breaks PS2.

> 
> This situation also behooves us to verify that
> k0/k1 are really and truly the only candidates
> for a thread register in MIPS.  I haven't been
> involved in any of the earlier discussions,
> and am not on the libc-hacker mailing list
> (and thus cannot post to it, by the way), but

You haven't missed anything. I changed it to the glibc alpha list.

> was it considered to simply use one of the
> static registers (say, s7/$23) in the existing
> ABI?  Assuming it was set up correctly at
> process startup, it would be preserved by
> pre-thread library and .o modules, and could
> be exploited by newly generated code.  Losing
> a static register would be a hit on code generation
> efficiency and performance, at least in principle.
> Was this the reason why the idea was rejected,
> or is there a more fundamental technical problem?

We never considered it since it is an invasive ABI change. Besides
the performance issue, it may break existing code. I prefer k1 if at
all possible.


H.J.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-21  9:39               ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-21  9:39 UTC (permalink / raw)
  To: H . J . Lu; +Cc: Machida Hiroyuki, drepper, GNU C Library, linux-mips

> > > > I like the read-only k0 idea. We just need to make a system call to
> > > > tell kernel what value to put in k0 before returning to the user space.
> > > > It shouldn't be too hard to implement. I will try it next week.
> > > >
> > > > H.J.
> > >
> > > Please don't use k1, we already use k1 to implement fast
> > > test-and-set method for MIPS1 machine.  We plan to merge
> > > the method into main glibc and kernel tree.
[snip]
> >
> > This situation... behooves us to verify that
> > k0/k1 are really and truly the only candidates
> > for a thread register in MIPS.  I haven't been
> > involved in any of the earlier discussions,
> > and am not on the libc-hacker mailing list
> > (and thus cannot post to it, by the way), but
> > was it considered to simply use one of the
> > static registers (say, s7/$23) in the existing
> > ABI?  Assuming it was set up correctly at
> > process startup, it would be preserved by
> > pre-thread library and .o modules, and could
> > be exploited by newly generated code.  Losing
> > a static register would be a hit on code generation
> > efficiency and performance, at least in principle.
> > Was this the reason why the idea was rejected,
> > or is there a more fundamental technical problem?
>
> We never considered it since it is an invasive ABI change. Besides
> the performance issue, it may break existing code. I prefer k1 if at
> all possible.

If anything, assuming that k0 or k1 are sane in
compiler-generated code is more of a violation
of the ABI than imposing an optional use of s7.
Sony's use in libraries is somewhat less intrusive.

Please explain how the use of a static register as
described above would break existing codes.
It's a common technique to bind a static register
to a global variable.  Linking to libraries with no
knowledge of this variable breaks nothing, since
by the ABI, all use of "s" registers requires that
they be preserved and returned intact to the caller.
It seems to me to be quite straightforward to apply
this technique to the thread register.  The *only*
issue I see is that of performance, and it is by
no means clear how severe this would be.
In the compiled code that I have examined
for compiler efficiency in the past, it's pretty
rare that *all* static registers are actually used.
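
For reference, binding a register to a global variable might look like
the sketch below. It assumes GCC's global register variable extension
and the names thread_self, current_thread and set_current_thread, all
chosen for illustration. On anything other than MIPS with GCC it falls
back to an ordinary global so that the sketch still compiles.
legacy_callee stands in for a pre-thread library routine that knows
nothing about the binding.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && defined(__mips__)
/* GCC extension: keep this global permanently in s7 ($23). */
register void *thread_self __asm__("$23");
#else
/* Fallback so the sketch builds elsewhere; not a register binding. */
static void *thread_self;
#endif

/* Stand-in for old code unaware of thread_self.  Under the ABI it may
 * use s7 internally but must restore it before returning. */
static int legacy_callee(int x)
{
    return x * 2 + 1;
}

/* New code reads the thread pointer straight from the binding. */
static void *current_thread(void)
{
    return thread_self;
}

/* Done once at thread startup (hypothetical helper). */
static void set_current_thread(void *descriptor)
{
    assert(descriptor != NULL);
    thread_self = descriptor;
}
```

After set_current_thread(&desc), a call into legacy_callee() leaves
current_thread() returning &desc, which is the property the argument
above relies on.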

I consider this to be a fairly serious issue, and
will take it up with the people who own the ABI
within MIPS.  I hope to be able to make an
"official" recommendation shortly.

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-19 19:42         ` H . J . Lu
@ 2002-01-21 13:27           ` Maciej W. Rozycki
  0 siblings, 0 replies; 94+ messages in thread
From: Maciej W. Rozycki @ 2002-01-21 13:27 UTC (permalink / raw)
  To: H . J . Lu
  Cc: Dominic Sweetman, Kevin D. Kissell, Ulrich Drepper,
	GNU libc alpha, linux-mips

On Sat, 19 Jan 2002, H . J . Lu wrote:

> But it has to be a per thread value.

 But threads run under different contexts in Linux, AFAIK.

> This is a patch against 2.4.16. Will this restore k1 to a known per
> thread value?

 It wouldn't -- k1 isn't saved.  And there are TLB exception handlers (the
most important performance issue here) that don't use framing at all --
see arch/mips/mm/tlbex-r?k.S.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-20 19:19           ` thread-ready ABIs H . J . Lu
  2002-01-21  9:39               ` Kevin D. Kissell
@ 2002-01-21 13:43             ` Maciej W. Rozycki
  1 sibling, 0 replies; 94+ messages in thread
From: Maciej W. Rozycki @ 2002-01-21 13:43 UTC (permalink / raw)
  To: H . J . Lu
  Cc: Kevin D. Kissell, Machida Hiroyuki, drepper, GNU C Library,
	linux-mips

On Sun, 20 Jan 2002, H . J . Lu wrote:

> As I understand it, we don't need the k1-based semaphore on MIPS II or
> above, so many processors can still benefit from the thread register. We
> can use a system call to load the thread register. So it is a trade-off
> between a system call and k1 for the thread register and the semaphore.
> We can make it a configure-time option. Since PS2 is already using k1
> for the semaphore, I'd like to see that merged ASAP so that nothing we
> do breaks PS2.

 I believe we need not trade anything off if we split k1 into two parts. 
We could use e.g. the 31 MSBs for the thread register and the LSB for the
ll/sc equivalent.  Other splits are possible if the ll/sc emulation needs
more bits. 
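
A rough sketch of such a split, in portable C rather than kernel code.
The meaning of the flag bit, the macro and function names, and the
assumption that the thread pointer has its low bit clear (i.e. is at
least 2-byte aligned) are illustrative only; the real ll/sc emulation
may well need a different layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of a 32-bit k1:
 *   bit  0     : flag used by the ll/sc emulation
 *   bits 31..1 : thread pointer (low bit implied zero) */
#define K1_TAS_FLAG  ((uint32_t)1)
#define K1_TP_MASK   ((uint32_t)~K1_TAS_FLAG)

/* Combine a thread pointer and the emulation flag into one word. */
static uint32_t k1_pack(uint32_t thread_ptr, int flag)
{
    /* The pointer must not collide with the flag bit. */
    assert((thread_ptr & K1_TAS_FLAG) == 0);
    return thread_ptr | (flag ? K1_TAS_FLAG : 0);
}

/* Recover the thread pointer, ignoring the flag bit. */
static uint32_t k1_thread_ptr(uint32_t k1)
{
    return k1 & K1_TP_MASK;
}

/* Recover the emulation flag (0 or 1). */
static int k1_flag(uint32_t k1)
{
    return (int)(k1 & K1_TAS_FLAG);
}
```

The cost on the reader's side is one extra mask operation whenever the
thread pointer is fetched from the shared register.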

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-21  9:39               ` Kevin D. Kissell
  (?)
@ 2002-01-21 13:56               ` Maciej W. Rozycki
  2002-01-21 18:24                 ` H . J . Lu
  -1 siblings, 1 reply; 94+ messages in thread
From: Maciej W. Rozycki @ 2002-01-21 13:56 UTC (permalink / raw)
  To: Kevin D. Kissell
  Cc: H . J . Lu, Machida Hiroyuki, drepper, GNU C Library, linux-mips

On Mon, 21 Jan 2002, Kevin D. Kissell wrote:

> If anything, assuming that k0 or k1 are sane in
> compiler-generated code is more of a violation
> of the ABI than imposing an optional use of s7.
> Sony's use in libraries is somewhat less intrusive.

 Hmm, it's a glibc/kernel internal implementation detail.  I don't think
this is an ABI violation, as from a program's point of view k0/k1 are
still "undefined -- do not use".

> It's a common technique to bind a static register
> to a global variable.  Linking to libraries with no
> knowledge of this variable breaks nothing, since
> by the ABI, all use of "s" registers requires that
> they be preserved and returned intact to the caller.
> It seems to me to be quite straightforward to apply
> this technique to the thread register.  The *only*
> issue I see is that of performance, and it is by
> no means clear how severe this would be.

 The k0/k1 approach is a performance hit as well.  Possibly a worse one,
as you lose a few cycles unconditionally every exception, while having one
static register less in the code can be dealt with by a compiler in a more
flexible way.  

> In the compiled code that I have examined
> for compiler efficiency in the past, it's pretty
> rare that *all* static registers are actually used.

 Even with one register less there are still eight remaining, indeed.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-21 13:56               ` Maciej W. Rozycki
@ 2002-01-21 18:24                 ` H . J . Lu
  2002-01-21 18:36                   ` Ulrich Drepper
  0 siblings, 1 reply; 94+ messages in thread
From: H . J . Lu @ 2002-01-21 18:24 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Kevin D. Kissell, Machida Hiroyuki, drepper, GNU C Library,
	linux-mips

On Mon, Jan 21, 2002 at 02:56:21PM +0100, Maciej W. Rozycki wrote:
> 
> > It's a common technique to bind a static register
> > to a global variable.  Linking to libraries with no
> > knowledge of this variable breaks nothing, since
> > by the ABI, all use of "s" registers requires that
> > they be preserved and returned intact to the caller.
> > It seems to me to be quite straightforward to apply
> > this technique to the thread register.  The *only*
> > issue I see is that of performance, and it is by
> > no means clear how severe this would be.
> 
>  The k0/k1 approach is a performance hit as well.  Possibly a worse one,
> as you lose a few cycles unconditionally every exception, while having one
> static register less in the code can be dealt with by a compiler in a more
> flexible way.  
> 
> > In the compiled code that I have examined
> > for compiler efficiency in the past, it's pretty
> > rare that *all* static registers are actually used.
> 
>  Even with one register less there are still eight remaining, indeed.

If people believe it won't be a big problem, we can tell gcc not to use
$23, at least when compiling glibc.  The question is, should $23 be
fixed outside of glibc? Ulrich, should applications have access to
the thread register directly? If not, we may add a switch to tell gcc that
$23 is fixed and compile glibc with it.



H.J.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-21 18:24                 ` H . J . Lu
@ 2002-01-21 18:36                   ` Ulrich Drepper
  2002-01-21 18:52                     ` H . J . Lu
  0 siblings, 1 reply; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-21 18:36 UTC (permalink / raw)
  To: H . J . Lu
  Cc: Maciej W. Rozycki, Kevin D. Kissell, Machida Hiroyuki,
	GNU C Library, linux-mips

"H . J . Lu" <hjl@lucon.org> writes:

> Ulrich, should applications have access to thread register directly?

It doesn't matter.  The value isn't changed in the lifetime of a
thread.  So the overhead of a syscall wouldn't be too much.  And
protection against programs overwriting the register isn't necessary.
It's the program's fault if that happens.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-21 18:36                   ` Ulrich Drepper
@ 2002-01-21 18:52                     ` H . J . Lu
  2002-01-21 18:58                       ` H . J . Lu
                                         ` (3 more replies)
  0 siblings, 4 replies; 94+ messages in thread
From: H . J . Lu @ 2002-01-21 18:52 UTC (permalink / raw)
  To: Ulrich Drepper
  Cc: Maciej W. Rozycki, Kevin D. Kissell, Machida Hiroyuki,
	GNU C Library, linux-mips

On Mon, Jan 21, 2002 at 10:36:26AM -0800, Ulrich Drepper wrote:
> "H . J . Lu" <hjl@lucon.org> writes:
> 
> > Ulrich, should applications have access to thread register directly?
> 
> It doesn't matter.  The value isn't changed in the lifetime of a
> thread.  So the overhead of a syscall wouldn't be too much.  And
> protection against programs overwriting the register isn't necessary.
> It's the program's fault if that happens.

The question is if we should reserve $23 outside of glibc. $23 is
a saved register in the MIPS ABI. It doesn't change across function
calls. If applications outside of glibc don't need to access the
thread register directly, that means $23 can be used as a saved
register. We don't have to change anything when compiling applications.
We only need to compile glibc with $23 reserved as the thread register.


H.J.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-21 18:52                     ` H . J . Lu
@ 2002-01-21 18:58                       ` H . J . Lu
  2002-01-21 18:59                       ` Daniel Jacobowitz
                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 94+ messages in thread
From: H . J . Lu @ 2002-01-21 18:58 UTC (permalink / raw)
  To: Ulrich Drepper
  Cc: Maciej W. Rozycki, Kevin D. Kissell, Machida Hiroyuki,
	GNU C Library, linux-mips

On Mon, Jan 21, 2002 at 10:52:53AM -0800, H . J . Lu wrote:
> On Mon, Jan 21, 2002 at 10:36:26AM -0800, Ulrich Drepper wrote:
> > "H . J . Lu" <hjl@lucon.org> writes:
> > 
> > > Ulrich, should applications have access to thread register directly?
> > 
> > It doesn't matter.  The value isn't changed in the lifetime of a
> > thread.  So the overhead of a syscall wouldn't be too much.  And
> > protection against programs overwriting the register isn't necessary.
> > It's the program's fault if that happens.
> 
> The question is if we should reserve $23 outside of glibc. $23 is
> a saved register in the MIPS ABI. It doesn't change across function
> calls. If applications outside of glibc don't need to access the
> thread register directly, that means $23 can be used as a saved
> register. We don't have to change anything when compiling applications.
> We only need to compile glibc with $23 reserved as the thread register.

In other words, is a thread register purely a convention within glibc,
as long as it doesn't change when entering glibc?



H.J.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-21 18:52                     ` H . J . Lu
  2002-01-21 18:58                       ` H . J . Lu
@ 2002-01-21 18:59                       ` Daniel Jacobowitz
  2002-01-21 19:05                         ` H . J . Lu
  2002-01-21 21:04                           ` Kevin D. Kissell
  2002-01-21 19:30                       ` Geoff Keating
  2002-01-21 21:07                       ` Ulrich Drepper
  3 siblings, 2 replies; 94+ messages in thread
From: Daniel Jacobowitz @ 2002-01-21 18:59 UTC (permalink / raw)
  To: H . J . Lu
  Cc: Ulrich Drepper, Maciej W. Rozycki, Kevin D. Kissell,
	Machida Hiroyuki, GNU C Library, linux-mips

On Mon, Jan 21, 2002 at 10:52:53AM -0800, H . J . Lu wrote:
> On Mon, Jan 21, 2002 at 10:36:26AM -0800, Ulrich Drepper wrote:
> > "H . J . Lu" <hjl@lucon.org> writes:
> > 
> > > Ulrich, should applications have access to thread register directly?
> > 
> > It doesn't matter.  The value isn't changed in the lifetime of a
> > thread.  So the overhead of a syscall wouldn't be too much.  And
> > protection against programs overwriting the register isn't necessary.
> > It's the program's fault if that happens.
> 
> The question is if we should reserve $23 outside of glibc. $23 is
> a saved register in the MIPS ABI. It doesn't change across function
> calls. If applications outside of glibc don't need to access the
> thread register directly, that means $23 can be used as a saved
> register. We don't have to change anything when compiling applications.
> We only need to compile glibc with $23 reserved as the thread register.

That's not right.  If it is call-saved in the application, that means
the application can use it.  Main may have to restore it before it
returns to __libc_start_main, but that doesn't do you any good.

It doesn't change across function calls, but it does change inside
function calls.
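
As a rough illustration of that last point, the sketch below models $23
as an ordinary C variable.  It is a model only: reg23, THREAD_POINTER,
libc_sees_thread_pointer() and app_function() are made-up names, not
glibc or MIPS ABI symbols.  The application function obeys the
call-saved rule perfectly, yet a libc routine called from inside it
finds the register clobbered.

```c
#include <stdint.h>

/* Hypothetical thread pointer value that glibc placed in $23. */
#define THREAD_POINTER ((uintptr_t)0x1000)

/* Models register $23 (s7); on real hardware this is a register. */
static uintptr_t reg23 = THREAD_POINTER;

/* Hypothetical libc routine that trusts $23 to hold the thread
   pointer.  Returns 1 if the register is intact, 0 if clobbered. */
static int libc_sees_thread_pointer(void)
{
    return reg23 == THREAD_POINTER;
}

/* Application code compiled with no knowledge of the reservation.
   It treats $23 as a normal call-saved register: save, use, restore. */
static int app_function(void)
{
    uintptr_t saved = reg23;              /* prologue: save $23       */
    reg23 = 42;                           /* $23 now holds a local    */
    int ok = libc_sees_thread_pointer();  /* libc call while in use   */
    reg23 = saved;                        /* epilogue: restore $23    */
    return ok;
}
```

With reg23 starting out as THREAD_POINTER, app_function() returns 0
because libc ran while the scratch value was live, whereas a call to
libc_sees_thread_pointer() made after app_function() has returned
yields 1 again.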

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-21 18:59                       ` Daniel Jacobowitz
@ 2002-01-21 19:05                         ` H . J . Lu
  2002-01-21 19:09                           ` Daniel Jacobowitz
  2002-01-21 21:04                           ` Kevin D. Kissell
  1 sibling, 1 reply; 94+ messages in thread
From: H . J . Lu @ 2002-01-21 19:05 UTC (permalink / raw)
  To: Ulrich Drepper, Maciej W. Rozycki, Kevin D. Kissell,
	Machida Hiroyuki, GNU C Library, linux-mips

On Mon, Jan 21, 2002 at 01:59:10PM -0500, Daniel Jacobowitz wrote:
> On Mon, Jan 21, 2002 at 10:52:53AM -0800, H . J . Lu wrote:
> > On Mon, Jan 21, 2002 at 10:36:26AM -0800, Ulrich Drepper wrote:
> > > "H . J . Lu" <hjl@lucon.org> writes:
> > > 
> > > > Ulrich, should applications have access to thread register directly?
> > > 
> > > It doesn't matter.  The value isn't changed in the lifetime of a
> > > thread.  So the overhead of a syscall wouldn't be too much.  And
> > > protection against programs overwriting the register isn't necessary.
> > > It's the program's fault if that happens.
> > 
> > The question is if we should reserve $23 outside of glibc. $23 is
> > a saved register in the MIPS ABI. It doesn't change across function
> > calls. If applications outside of glibc don't need to access the
> > thread register directly, that means $23 can be used as a saved
> > register. We don't have to change anything when compiling applications.
> > We only need to compile glibc with $23 reserved as the thread register.
> 
> That's not right.  If it is call-saved in the application, that means
> the application can use it.  Main may have to restore it before it
> returns to __libc_start_main, but that doesn't do you any good.
> 
> It doesn't change across function calls, but it does change inside
> function calls.

What is wrong about using a thread register as long as it contains
the right value when it is accessed as a thread pointer? If
applications don't have access to the thread pointer, I don't see the
problem using the thread register.


H.J.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-21 19:05                         ` H . J . Lu
@ 2002-01-21 19:09                           ` Daniel Jacobowitz
  2002-01-21 19:18                             ` H . J . Lu
  0 siblings, 1 reply; 94+ messages in thread
From: Daniel Jacobowitz @ 2002-01-21 19:09 UTC (permalink / raw)
  To: H . J . Lu
  Cc: Ulrich Drepper, Maciej W. Rozycki, Kevin D. Kissell,
	Machida Hiroyuki, GNU C Library, linux-mips

On Mon, Jan 21, 2002 at 11:05:33AM -0800, H . J . Lu wrote:
> On Mon, Jan 21, 2002 at 01:59:10PM -0500, Daniel Jacobowitz wrote:
> > On Mon, Jan 21, 2002 at 10:52:53AM -0800, H . J . Lu wrote:
> > > On Mon, Jan 21, 2002 at 10:36:26AM -0800, Ulrich Drepper wrote:
> > > > "H . J . Lu" <hjl@lucon.org> writes:
> > > > 
> > > > > Ulrich, should applications have access to thread register directly?
> > > > 
> > > > It doesn't matter.  The value isn't changed in the lifetime of a
> > > > thread.  So the overhead of a syscall wouldn't be too much.  And
> > > > protection against programs overwriting the register isn't necessary.
> > > > It's the program's fault if that happens.
> > > 
> > > The question is if we should reserve $23 outside of glibc. $23 is
> > > a saved register in the MIPS ABI. It doesn't change across function
> > > calls. If applications outside of glibc don't need to access the
> > > thread register directly, that means $23 can be used as a saved
> > > register. We don't have to change anything when compiling applications.
> > > We only need to compile glibc with $23 reserved as the thread register.
> > 
> > That's not right.  If it is call-saved in the application, that means
> > the application can use it.  Main may have to restore it before it
> > returns to __libc_start_main, but that doesn't do you any good.
> > 
> > It doesn't change across function calls, but it does change inside
> > function calls.
> 
> What is wrong about using a thread register as long as it contains
> the right value when it is accessed as a thread pointer? If
> applications don't have access to the thread pointer, I don't see the
> problem using the thread register.

When is the thread pointer accessed?  My understanding was that it
would be needed for the lifetime of the application, in functions
called from the application.  In that case its value can not be
trusted.

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-21 19:09                           ` Daniel Jacobowitz
@ 2002-01-21 19:18                             ` H . J . Lu
  0 siblings, 0 replies; 94+ messages in thread
From: H . J . Lu @ 2002-01-21 19:18 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Ulrich Drepper, Maciej W. Rozycki, Kevin D. Kissell,
	Machida Hiroyuki, GNU C Library, linux-mips

On Mon, Jan 21, 2002 at 02:09:32PM -0500, Daniel Jacobowitz wrote:
> 
> When is the thread pointer accessed?  My understanding was that it
> would be needed for the lifetime of the application, in functions
> called from the application.  In that case its value can not be
> trusted.

You are right. If we use $23 as the thread pointer, we have to change
the ABI. All existing assembly code would have to be checked.


H.J.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-21 18:52                     ` H . J . Lu
  2002-01-21 18:58                       ` H . J . Lu
  2002-01-21 18:59                       ` Daniel Jacobowitz
@ 2002-01-21 19:30                       ` Geoff Keating
  2002-01-21 21:07                       ` Ulrich Drepper
  3 siblings, 0 replies; 94+ messages in thread
From: Geoff Keating @ 2002-01-21 19:30 UTC (permalink / raw)
  To: hjl; +Cc: drepper, macro, kevink, machida, libc-alpha, linux-mips

> Date: Mon, 21 Jan 2002 10:52:53 -0800
> From: "H . J . Lu" <hjl@lucon.org>
> Cc: "Maciej W. Rozycki" <macro@ds2.pg.gda.pl>,
>    "Kevin D. Kissell" <kevink@mips.com>,
>    Machida Hiroyuki <machida@sm.sony.co.jp>,
>    GNU C Library <libc-alpha@sources.redhat.com>, linux-mips@oss.sgi.com

> On Mon, Jan 21, 2002 at 10:36:26AM -0800, Ulrich Drepper wrote:
> > "H . J . Lu" <hjl@lucon.org> writes:
> > 
> > > Ulrich, should applications have access to thread register directly?
> > 
> > It doesn't matter.  The value isn't changed in the lifetime of a
> > thread.  So the overhead of a syscall wouldn't be too much.  And
> > protection against programs overwriting the register isn't necessary.
> > It's the program's fault if that happens.
> 
> The question is if we should reserve $23 outside of glibc. $23 is
> a saved register in the MIPS ABI. It doesn't change across function
> calls. If applications outside of glibc don't need to access the
> thread register directly, that means $23 can be used as a saved
> register. We don't have to change anything when compiling applications.
> We only need to compile glibc with $23 reserved as the thread register.

This won't work, will it?  We need a register that application code is
not allowed to change ever, not one that is saved and restored.  Even
if the user knows that no glibc routines are called between the save
and the restore, a signal could happen.
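
The signal case can be modelled in a few lines of C.  This is an
illustration only: a plain variable stands in for the register, and
reg23_model, on_signal() and signal_during_scratch_use() are
hypothetical names.  The handler runs while the application has the
register in use as scratch, with no glibc call made by the application
between its save and its restore.

```c
#include <signal.h>

/* Hypothetical thread pointer value kept in $23. */
#define THREAD_POINTER 0x1000

/* Models register $23; sig_atomic_t so the handler may read it. */
static volatile sig_atomic_t reg23_model = THREAD_POINTER;

/* Records what the handler observed: 1 intact, 0 clobbered. */
static volatile sig_atomic_t handler_saw_tp = -1;

/* Stands in for a handler that needs the thread pointer. */
static void on_signal(int signo)
{
    (void)signo;
    handler_saw_tp = (reg23_model == THREAD_POINTER);
}

/* Application code that saves $23, uses it as scratch, and restores
   it.  The signal is raised synchronously to keep the model
   deterministic; a real signal could arrive at any instruction. */
static int signal_during_scratch_use(void)
{
    signal(SIGUSR1, on_signal);
    sig_atomic_t saved = reg23_model;  /* save                      */
    reg23_model = 42;                  /* register used as scratch  */
    raise(SIGUSR1);                    /* handler runs right here   */
    reg23_model = saved;               /* restore                   */
    return handler_saw_tp;
}
```

In this model signal_during_scratch_use() returns 0: the handler saw
the scratch value even though the register was correctly restored
before the function returned.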

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-21 21:04                           ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-21 21:04 UTC (permalink / raw)
  To: Daniel Jacobowitz, H . J . Lu
  Cc: Ulrich Drepper, Maciej W. Rozycki, Machida Hiroyuki,
	GNU C Library, linux-mips

From: "Daniel Jacobowitz" <dan@debian.org>
> On Mon, Jan 21, 2002 at 10:52:53AM -0800, H . J . Lu wrote:
> > On Mon, Jan 21, 2002 at 10:36:26AM -0800, Ulrich Drepper wrote:
> > > "H . J . Lu" <hjl@lucon.org> writes:
> > >
> > > > Ulrich, should applications have access to thread register directly?
> > >
> > > It doesn't matter.  The value isn't changed in the lifetime of a
> > > thread.  So the overhead of a syscall wouldn't be too much.  And
> > > protection against programs overwriting the register isn't necessary.
> > > It's the program's fault if that happens.
> >
> > The question is if we should reserve $23 outside of glibc. $23 is
> > a saved register in the MIPS ABI. It doesn't change across function
> > calls. If applications outside of glibc don't need to access the
> > thread register directly, that means $23 can be used as a saved
> > register. We don't have to change anything when compiling applications.
> > We only need to compile glibc with $23 reserved as the thread register.
>
> That's not right.  If it is call-saved in the application, that means
> the application can use it.  Main may have to restore it before it
> returns to __libc_start_main, but that doesn't do you any good.
>
> It doesn't change across function calls, but it does change inside
> function calls.

You are quite correct, and you have stated the problem very
succinctly.  We cannot, as I had hoped, simply superimpose
a thread pointer on a static register and keep it otherwise
invisible to the code generator.  So it's not the "easy way
out".

That does not necessarily mean that it's the wrong
solution.  As Maciej has pointed out, from the standpoint
of performance, making the kernel do gymnastics to
preserve or set up a "k" register on each trap may
well be worse for overall performance than having
one fewer "s" register.   Stealing the "s" register
would involve a change to the ABI and the compiler,
but would make it invisible to the kernel.  Using
(or abusing) a "k" register would technically also
require a change to the ABI (though one that would
be more perfectly backward compatible), a smaller
perturbation of the compiler (thread pointer stuff
gets added, but the s-register complement is unchanged),
and a kernel hack that may or may not conflict with
other people's work (e.g. Sony).  I've been kicking this
around with my colleagues at MIPS, and as I say,
I hope to be back with a semi-official recommendation
shortly.  Meanwhile, I'm very interested in others' views
of the pros, cons, and alternatives.

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-21 18:52                     ` H . J . Lu
                                         ` (2 preceding siblings ...)
  2002-01-21 19:30                       ` Geoff Keating
@ 2002-01-21 21:07                       ` Ulrich Drepper
  3 siblings, 0 replies; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-21 21:07 UTC (permalink / raw)
  To: H . J . Lu
  Cc: Maciej W. Rozycki, Kevin D. Kissell, Machida Hiroyuki,
	GNU C Library, linux-mips

"H . J . Lu" <hjl@lucon.org> writes:

> The question is if we should reserve $23 outside of glibc. $23 is
> a saved register in the MIPS ABI. It doesn't change across function
> calls. If applications outside of glibc don't need to access the
> thread register directly, that means $23 can be used as a saved
> register.

It depends on the final decisions on the thread ABI, but it is best to
assume that compiler-generated code will access the register.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-20  0:24     ` Ralf Baechle
@ 2002-01-21 23:22       ` Ulrich Drepper
  2002-01-21 23:57           ` Kevin D. Kissell
  0 siblings, 1 reply; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-21 23:22 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Kevin D. Kissell, H . J . Lu, GNU libc hacker, linux-mips

Ralf Baechle <ralf@oss.sgi.com> writes:

> Changing the kernel for the small number of threaded applications that
> exists and taking a performance impact for the kernel itself and anything
> that's using threads is an exquisite example for a bad tradeoff.

Well, it seems you haven't read what I wrote.  It's not about a small
number of threaded applications anymore.  The thread register will be
part of the ABI and all applications, threaded or not, will use it.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-21 23:57           ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-21 23:57 UTC (permalink / raw)
  To: Ralf Baechle, Ulrich Drepper
  Cc: Mike Uhler, linux-mips, GNU libc hacker, H . J . Lu

"Ulrich Drepper" <drepper@redhat.com> writes:
>
> Ralf Baechle <ralf@oss.sgi.com> writes:
>
> > Changing the kernel for the small number of threaded applications that
> > exists and taking a performance impact for the kernel itself and anything
> > that's using threads is an exquisite example for a bad tradeoff.
>
> Well, it seems you haven't read what I wrote.  It's not about a small
> number of threaded applications anymore.  The thread register will be
> part of the ABI and all applications, threaded or not, will use it.

As MIPS "owns" the ABI, whether or not the thread register
becomes a part of it is not something that anyone outside
of MIPS can simply decree.   I'd very much appreciate it if
someone would explain to me just what this register is used
for, and why a register needs to be permanently allocated
for this purpose.  There may still be other ways to solve the
problem without doing violence to the kernel or to the ABI.

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22  0:16             ` Ulrich Drepper
  0 siblings, 0 replies; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-22  0:16 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Ralf Baechle, Mike Uhler, linux-mips, H . J . Lu

"Kevin D. Kissell" <kevink@mips.com> writes:

> As MIPS "owns" the ABI, whether or not the thread register
> becomes a part of it is not something that anyone outside
> of MIPS can simply decree.

Well, MIPS might define the "official" ABI, but nobody is forced to use
it, and if nobody uses it, it's not worth anything.

> I'd very much appreciate it if someone would explain to me just what
> this register is used for, and why a register needs to be permanently
> allocated for this purpose.

Simply look at the ABIs for some less-backward processors.  Read the
thread-local storage section in the IA-64 ABI specification.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-18 18:19 ` thread-ready ABIs H . J . Lu
                     ` (3 preceding siblings ...)
  2002-01-19  0:35     ` Kevin D. Kissell
@ 2002-01-22  1:39   ` Richard Henderson
  4 siblings, 0 replies; 94+ messages in thread
From: Richard Henderson @ 2002-01-22  1:39 UTC (permalink / raw)
  To: H . J . Lu; +Cc: Ulrich Drepper, GNU libc hacker, linux-mips

On Fri, Jan 18, 2002 at 10:19:08AM -0800, H . J . Lu wrote:
> On the other hand, can we change the mips kernel to save k0 or k1 for
> user space?

I doubt it.  Traditionally these are clobbered by the TLB fill trap.


r~

^ permalink raw reply	[flat|nested] 94+ messages in thread

* patches for test-and-set without ll/sc (Re: thread-ready ABIs)
  2002-01-20 13:16           ` Machida Hiroyuki
@ 2002-01-22  6:27             ` Machida Hiroyuki
  2002-01-22  6:37               ` Ulrich Drepper
  0 siblings, 1 reply; 94+ messages in thread
From: Machida Hiroyuki @ 2002-01-22  6:27 UTC (permalink / raw)
  To: kevink, hjl, drepper, libc-hacker, linux-mips

[-- Attachment #1: Type: Text/Plain, Size: 4228 bytes --]


Hi, all.

As I said on 1/20, here is a short description of our test-and-set
implementation, together with patches for linux-2.4.17 and
glibc-2.2.3.

=====================================================================

We implemented a fast and safe user-level test-and-set function for
single MIPS CPUs. With this method you need neither LL/SC nor
sysmips(). (To be exact, sysmips() is needed for initialization, but
once initialized we don't use it any more.)


  NOTE: This method assumes a single processor. It cannot be used
  on SMP.


WHAT'S CHANGED:

  * kernel side change #1
	Set a specific constant (we call this value
	"_TST_ACCESS_MAGIC") in k1 on every transition from kernel
	mode to user mode. This means you can still use k1 in any
	exception handler just as before this method was introduced,
	except that you have to do
		"li	k1, _TST_ACCESS_MAGIC"
	immediately before
		"eret"
	or
		"j	k0"
		"rfe"
	.
	We choose the value of _TST_ACCESS_MAGIC so that using it
	as an address causes a SEGV fault.


  * kernel side change #2
	In the memory fault handler, the kernel checks for a write
	access to _TST_ACCESS_MAGIC from a fixed address range of the
	user process (EPC within _TST_START_MAGIC to
	_TST_START_MAGIC+PAGE_SIZE). If the condition is met, the
	kernel restarts the user process from _TST_START_MAGIC.


  * kernel side change #3
	We add a pseudo device driver "/dev/tst" that provides the
	test_and_set procedure at the same virtual address
	(_TST_START_MAGIC) in every user process.

	
    _TST_START_MAGIC:
	        .set noreorder
	0:
	        move    k1, a0
	        lw      v0, 0(a0)
	        nop
	        bnez    v0, 1f
	        nop
	        bne     k1, a0, 0b
	        nop			....<point A>
	        sw      a1, 0(k1)
	1:
	        jr      ra
	        nop


  * glibc change:

	We implement test_and_set(addr, val) as follows:

		mmap /dev/tst at _TST_START_MAGIC, if not yet mapped.
		Call _TST_START_MAGIC(addr, val).
	
	If we can't open /dev/tst, use sysmips() as a last resort.


HOW IT WORKS:
	If no context switch occurs inside the _TST_START_MAGIC()
	procedure, nobody else can change the mutex variable, so the
	procedure simply runs as written.

	If a context switch does occur inside _TST_START_MAGIC(),
	somebody else may change the mutex variable. In particular,
	we must not store to the mutex variable if a context switch
	occurred at <point A>.
	In our method the kernel sets k1 to _TST_ACCESS_MAGIC on every
	transition to user mode, so "sw      a1, 0(k1)" causes a
	SEGV fault if a context switch occurred at <point A>.
	The SEGV fault handler catches this situation and restarts the
	user process from the top of _TST_START_MAGIC().
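
The restart logic can be modelled in plain C, as a sketch under stated
assumptions: k1 is an ordinary variable, magic_cell stands in for the
faulting address _TST_ACCESS_MAGIC, and the point_a hook is a
hypothetical way to inject a context switch at <point A>. None of these
helper names come from the patches, and the real sequence is the
assembly shown earlier, not this code.

```c
#include <stddef.h>

/* Stand-in for the faulting address _TST_ACCESS_MAGIC (assumption:
   the real value is defined by the kernel patch, not shown here). */
static int magic_cell;
#define TST_ACCESS_MAGIC (&magic_cell)

static int *k1;            /* models register k1                     */
static int restart_count;  /* how often the "kernel" restarted us    */

/* Models any trip through the kernel: on return to user mode the
   kernel reloads k1 with the magic value. */
static void kernel_return_to_user(void)
{
    k1 = TST_ACCESS_MAGIC;
}

/* Hypothetical preemption at <point A>: another thread runs, takes
   the lock, and we come back through the kernel. */
static void other_thread_takes_lock(int *addr)
{
    *addr = 7;
    kernel_return_to_user();
}

/* C model of the sequence at _TST_START_MAGIC.  Returns the old value
   of *addr; 0 means val was stored and the caller holds the lock. */
static int tst_model(int *addr, int val, void (*point_a)(int *))
{
    for (;;) {
        k1 = addr;                     /* move k1, a0               */
        int old = *addr;               /* lw   v0, 0(a0)            */
        if (old != 0)                  /* bnez v0, 1f               */
            return old;
        if (k1 != addr)                /* bne  k1, a0, 0b           */
            continue;
        if (point_a != NULL)           /* <point A>                 */
            point_a(addr);
        if (k1 == TST_ACCESS_MAGIC) {  /* the store would fault ... */
            restart_count++;           /* ... so the fault handler  */
            continue;                  /* restarts from the top     */
        }
        *k1 = val;                     /* sw   a1, 0(k1)            */
        return old;
    }
}
```

Called with a NULL hook on a free lock, tst_model() stores val and
returns 0. Called with other_thread_takes_lock as the hook, the store is
suppressed, the sequence restarts once, and the caller gets back the
other thread's value instead of overwriting it.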


PATCHES:

I attached three patches;
	1. patch for linux kernel 2.4.17 (SourceForge tree)
	2. patch for glibc 2.2.3  (of HHL 2.0)
	3. patch for linuxthread 2.2.3 (of HHL 2.0)

To test those patches, you must
	turn on CONFIG_MIPS_TST_DEV when configuring the kernel,
	have a working version of sysmips(MIPS_ATOMIC_SET),
	update the kernel headers before building glibc, and
	make the /dev/tst device ("mknod /dev/tst c 123 0"; 123 is a
	temporary major number for this device).

I've tested on an ITE board. For testing, I made small changes to
"drivers/char/Config.in" and "arch/mips/kernel/sysmip.c" to enable
CONFIG_MIPS_TST_DEV and to make sysmips() work on the ITE board. Those
changes are not included in the patch set.

===================================================================

    You can find the paper about it at
	http://lc.linux.or.jp/lc2001/papers/tas-ps2-paper.pdf
	(sorry, in Japanese only)

    The abstract of the paper follows:

	The Implementation of user level test-and-set on PS2 Linux
	In the multi-thread environment like Linux, a fast
	user-level mutual exclusion mechanism is strongly
	required. But MIPS chips designed for embedded and single
	processor, like the Emotion Engine, have no atomic
	test-and-set instruction. We implemented the fast user-level
	mutual exclusion without invoking system-call and its costs,
	on the PS2 Linux. This method utilizes the memory protection 
	facility of Operating System, to detect preemption and
	nullify the operation. In this paper, we present the method
	and its evaluation.  

---
Hiroyuki Machida
Sony Corp.

[-- Attachment #2: linux-2.4-mips-tas.patch --]
[-- Type: Text/Plain, Size: 18770 bytes --]

Index: arch/mips/kernel/entry.S
===================================================================
RCS file: /cvsroot/linux-mips/linux/arch/mips/kernel/entry.S,v
retrieving revision 1.14
diff -u -p -r1.14 entry.S
--- arch/mips/kernel/entry.S	2001/12/10 17:46:47	1.14
+++ arch/mips/kernel/entry.S	2002/01/22 05:13:37
@@ -161,6 +161,7 @@ handle_vced:
 		addiu	k1, 1
 		sw	k1, %lo(vced_count)(k0)
 #endif
+		TST_DEV_EPILOGUE
 		eret
 
 handle_vcei:
@@ -172,6 +173,7 @@ handle_vcei:
 		addiu	k1, 1
 		sw	k1, %lo(vcei_count)(k0)
 #endif
+		TST_DEV_EPILOGUE
 		eret
 		.set    pop
 		END(except_vec3_r4000)
Index: arch/mips/kernel/gdb-low.S
===================================================================
RCS file: /cvsroot/linux-mips/linux/arch/mips/kernel/gdb-low.S,v
retrieving revision 1.4
diff -u -p -r1.4 gdb-low.S
--- arch/mips/kernel/gdb-low.S	2002/01/02 17:06:08	1.4
+++ arch/mips/kernel/gdb-low.S	2002/01/22 05:13:38
@@ -304,6 +304,7 @@
 		lw	v1,GDB_FR_REG3(sp)
 		lw	v0,GDB_FR_REG2(sp)
 		lw	$1,GDB_FR_REG1(sp)
+		TST_DEV_EPILOGUE
 #if defined(CONFIG_CPU_R3000) || defined(CONFIG_CPU_TX39XX)
 		lw	k0, GDB_FR_EPC(sp)
 		lw	sp, GDB_FR_REG29(sp)		/* Deallocate stack */
Index: arch/mips/kernel/r2300_misc.S
===================================================================
RCS file: /cvsroot/linux-mips/linux/arch/mips/kernel/r2300_misc.S,v
retrieving revision 1.2
diff -u -p -r1.2 r2300_misc.S
--- arch/mips/kernel/r2300_misc.S	2001/10/09 21:37:55	1.2
+++ arch/mips/kernel/r2300_misc.S	2002/01/22 05:13:38
@@ -130,9 +130,10 @@
 1:	tlbwr; \
 2:
 
-#define RET(reg) \
+#define RET(reg) /* don't pass k1 to REG */ \
 	mfc0	reg, CP0_EPC; \
 	nop; \
+	TST_DEV_EPILOGUE /* this may use k1 */ \
 	jr	reg; \
 	 rfe
 			
Index: arch/mips/kernel/r4k_misc.S
===================================================================
RCS file: /cvsroot/linux-mips/linux/arch/mips/kernel/r4k_misc.S,v
retrieving revision 1.5
diff -u -p -r1.5 r4k_misc.S
--- arch/mips/kernel/r4k_misc.S	2001/10/09 21:37:55	1.5
+++ arch/mips/kernel/r4k_misc.S	2002/01/22 05:13:38
@@ -167,6 +167,7 @@ invalid_tlbl:
 	 tlbwi
 1:
 	nop
+	TST_DEV_EPILOGUE
 	.set	mips3	
 	eret
 	.set	mips0
@@ -191,6 +192,7 @@ nopage_tlbl:
 	 tlbwi
 1:
 	nop
+	TST_DEV_EPILOGUE
 	.set	mips3	
 	eret
 	.set	mips0
@@ -225,6 +227,7 @@ nopage_tlbs:
 	 tlbwi
 1:
 	nop
+	TST_DEV_EPILOGUE
 	.set	mips3
 	eret
 	.set	mips0
Index: arch/mips/mm/fault.c
===================================================================
RCS file: /cvsroot/linux-mips/linux/arch/mips/mm/fault.c,v
retrieving revision 1.8
diff -u -p -r1.8 fault.c
--- arch/mips/mm/fault.c	2001/12/07 19:28:37	1.8
+++ arch/mips/mm/fault.c	2002/01/22 05:14:37
@@ -26,6 +26,10 @@
 #include <asm/system.h>
 #include <asm/uaccess.h>
 
+#if defined (CONFIG_MIPS_TST_DEV) || defined (CONFIG_MIPS_TST_DEV_MODULE)
+#include <linux/tst_dev.h>
+#endif
+
 #define development_version (LINUX_VERSION_CODE & 0x100)
 
 /*
@@ -160,6 +164,28 @@ bad_area:
 bad_area_nosemaphore:
 	/* User mode accesses just cause a SIGSEGV */
 	if (user_mode(regs)) {
+#if defined (CONFIG_MIPS_TST_DEV) || defined (CONFIG_MIPS_TST_DEV_MODULE)
+		/* TEST AND SET magic code */
+		/* Restart the user program from _TST_START_MAGIC
+		  when all of the following conditions are met:
+
+		1. The user program tried to write to the _TST_ACCESS_MAGIC address.
+		2. That write was issued from code in the page starting at
+			_TST_START_MAGIC.
+		 */
+
+		if (address == _TST_ACCESS_MAGIC && write ) {
+
+			unsigned long pc;
+			pc =  (unsigned long)regs->cp0_epc;
+			if ( _TST_START_MAGIC <= pc
+			     && pc < (_TST_START_MAGIC + PAGE_SIZE)){
+
+				regs->cp0_epc = (unsigned long)_TST_START_MAGIC;
+				return;
+			}
+		}
+#endif /* defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE) */
 		tsk->thread.cp0_badvaddr = address;
 		tsk->thread.error_code = write;
 #if 0
Index: arch/mips/mm/tlbex-r3k.S
===================================================================
RCS file: /cvsroot/linux-mips/linux/arch/mips/mm/tlbex-r3k.S,v
retrieving revision 1.3
diff -u -p -r1.3 tlbex-r3k.S
--- arch/mips/mm/tlbex-r3k.S	2002/01/04 18:04:53	1.3
+++ arch/mips/mm/tlbex-r3k.S	2002/01/22 05:14:37
@@ -48,9 +48,10 @@
 	lw	k0, (k1)
 	nop
 	mtc0	k0, CP0_ENTRYLO0
-	mfc0	k1, CP0_EPC
+	mfc0	k0, CP0_EPC
 	tlbwr
-	jr	k1
+	TST_DEV_EPILOGUE
+	jr	k0
 	rfe
 	END(except_vec0_r2300)
 
@@ -155,9 +156,11 @@
 1:	tlbwr; \
 2:
 
-#define RET(reg) \
+
+#define RET(reg) /* don't pass k1 to REG */ \
 	mfc0	reg, CP0_EPC; \
 	nop; \
+	TST_DEV_EPILOGUE /* this may use k1 */ \
 	jr	reg; \
 	 rfe
 			
Index: arch/mips/mm/tlbex-r4k.S
===================================================================
RCS file: /cvsroot/linux-mips/linux/arch/mips/mm/tlbex-r4k.S,v
retrieving revision 1.4
diff -u -p -r1.4 tlbex-r4k.S
--- arch/mips/mm/tlbex-r4k.S	2002/01/04 18:04:53	1.4
+++ arch/mips/mm/tlbex-r4k.S	2002/01/22 05:14:37
@@ -90,6 +90,7 @@
 	tlbwr					# write random tlb entry
 1:
 	nop
+	TST_DEV_EPILOGUE
 	eret					# return from trap
 	END(except_vec0_r4000)
 
@@ -117,6 +118,7 @@
 	nop
 	tlbwr
 	nop
+	TST_DEV_EPILOGUE
 	eret
 	END(except_vec0_r4600)
 
@@ -156,6 +158,7 @@
 	nop
 	tlbwr					# write random tlb entry
 	nop					# traditional nop
+	TST_DEV_EPILOGUE
 	eret					# return from trap
 	END(except_vec0_nevada)
 
@@ -187,6 +190,7 @@
 	tlbwr
 1:
 	nop
+	TST_DEV_EPILOGUE
 	eret
 	END(except_vec0_r45k_bvahwbug)
 
@@ -219,6 +223,7 @@
 	tlbwr
 1:
 	nop
+	TST_DEV_EPILOGUE
 	eret
 	END(except_vec0_r4k_mphwbug)
 #endif
@@ -250,6 +255,7 @@
 	tlbwr
 1:
 	nop
+	TST_DEV_EPILOGUE
 	eret
 	END(except_vec0_r4k_250MHZhwbug)
 
@@ -284,6 +290,7 @@
 	tlbwr
 1:
 	nop
+	TST_DEV_EPILOGUE
 	eret
 	END(except_vec0_r4k_MP250MHZhwbug)
 #endif
@@ -454,6 +461,7 @@ invalid_tlbl:
 	 tlbwi
 1:
 	nop
+	TST_DEV_EPILOGUE
 	.set	mips3	
 	eret
 	.set	mips0
@@ -478,6 +486,7 @@ nopage_tlbl:
 	 tlbwi
 1:
 	nop
+	TST_DEV_EPILOGUE
 	.set	mips3	
 	eret
 	.set	mips0
@@ -508,6 +517,7 @@ nopage_tlbs:
 	 tlbwi
 1:
 	nop
+	TST_DEV_EPILOGUE
 	.set	mips3
 	eret
 	.set	mips0
@@ -638,6 +648,7 @@ END(get_real_pte)
 	lui	k0, %hi(__saved_at)
 	lw	$at, %lo(__saved_at)(k0)	# restore at
 	lw	ra, %lo(__saved_ra)(k0)		# restore ra
+	TST_DEV_EPILOGUE
 	eret					# return from trap
 	END(translate_pte)
 
Index: drivers/char/Config.in
===================================================================
RCS file: /cvsroot/linux-mips/linux/drivers/char/Config.in,v
retrieving revision 1.29
diff -u -p -r1.29 Config.in
--- drivers/char/Config.in	2001/12/02 19:05:31	1.29
+++ drivers/char/Config.in	2002/01/22 05:33:21
@@ -269,3 +269,10 @@ if [ "$CONFIG_MIPS_ITE8172" = "y" ]; the
   tristate ' ITE GPIO' CONFIG_ITE_GPIO
 fi
 endmenu
+
+if [ "$CONFIG_MIPS" = "y" -a "$CONFIG_CPU_HAS_LLSC" = "n" ]; then
+   mainmenu_option next_comment
+   comment 'MIPS specific pseudo device driver'
+   tristate '  MIPS1 fast test and set support' CONFIG_MIPS_TST_DEV
+   endmenu
+fi
Index: drivers/char/Makefile
===================================================================
RCS file: /cvsroot/linux-mips/linux/drivers/char/Makefile,v
retrieving revision 1.24
diff -u -p -r1.24 Makefile
--- drivers/char/Makefile	2001/12/05 19:49:28	1.24
+++ drivers/char/Makefile	2002/01/22 05:33:21
@@ -24,7 +24,7 @@ obj-y	 += mem.o tty_io.o n_tty.o tty_ioc
 export-objs     :=	busmouse.o console.o keyboard.o sysrq.o \
 			misc.o pty.o random.o selection.o serial.o \
 			sonypi.o tty_io.o tty_ioctl.o generic_serial.o \
-			au1000_gpio.o lcd.o
+			au1000_gpio.o lcd.o tst_dev.o
 
 mod-subdirs	:=	joystick ftape drm pcmcia
 
@@ -247,6 +247,8 @@ obj-$(CONFIG_SH_WDT) += shwdt.o
 obj-$(CONFIG_EUROTECH_WDT) += eurotechwdt.o
 obj-$(CONFIG_SOFT_WATCHDOG) += softdog.o
 obj-$(CONFIG_VR41XX_WDT) += vr41xxwdt.o
+
+obj-$(CONFIG_MIPS_TST_DEV) += tst_dev.o
 
 subdir-$(CONFIG_MWAVE) += mwave
 ifeq ($(CONFIG_MWAVE),y)
Index: include/asm-mips/stackframe.h
===================================================================
RCS file: /cvsroot/linux-mips/linux/include/asm-mips/stackframe.h,v
retrieving revision 1.3
diff -u -p -r1.3 stackframe.h
--- include/asm-mips/stackframe.h	2001/10/24 23:32:54	1.3
+++ include/asm-mips/stackframe.h	2002/01/22 05:33:22
@@ -16,6 +16,15 @@
 #include <asm/offset.h>
 #include <linux/config.h>
 
+#if defined (CONFIG_MIPS_TST_DEV) || defined (CONFIG_MIPS_TST_DEV_MODULE)
+#include <linux/tst_dev.h>
+#define	TST_DEV_EPILOGUE \
+		li	k1, _TST_ACCESS_MAGIC;
+#else
+#define	TST_DEV_EPILOGUE
+#endif
+
+
 #define SAVE_AT                                          \
 		.set	push;                            \
 		.set	noat;                            \
@@ -195,6 +204,7 @@ __asm__ (                               
 		.set	noreorder;			 \
 		lw	k0, PT_EPC(sp);                  \
 		lw	sp,  PT_R29(sp);                 \
+		TST_DEV_EPILOGUE			 \
 		jr	k0;                              \
 		 rfe;					 \
 		.set	pop
@@ -230,6 +240,7 @@ __asm__ (                               
 
 #define RESTORE_SP_AND_RET                               \
 		lw	sp,  PT_R29(sp);                 \
+		TST_DEV_EPILOGUE			 \
 		.set	mips3;				 \
 		eret;					 \
 		.set	mips0
Index: include/linux/tst_dev.h
===================================================================
--- /dev/null	Wed May  6 05:32:27 1998
+++ include/linux/tst_dev.h	Mon Jan 21 14:29:50 2002
@@ -0,0 +1,37 @@
+/*
+ * tst_dev.h - MIPS1 TEST and SET pseudo device interface
+ */
+
+
+#ifndef _TST_DEV_H
+#define _TST_DEV_H
+
+#define TST_DEVICE_NAME	"tst"
+
+#ifndef _LANGUAGE_ASSEMBLY
+
+#include <linux/types.h>
+
+struct _tst_area_info {
+	__u32 	magic;
+	__u32 	pad1;
+	void 	*map_addr;
+#if _MIPS_SZPTR==32
+	__u32 	pad2;
+#endif
+	};
+
+#endif /*_LANGUAGE_ASSEMBLY*/
+
+#define _TST_INFO_MAGIC			0x20000304	/* obsolete */
+#define _TST_INFO_MAGIC_2ARGS		0x20000305
+
+#ifdef __KERNEL__
+#define _TST_ACCESS_MAGIC	0x00200000
+#define _TST_START_MAGIC	0x00300000
+#endif  /* __KERNEL__ */
+
+#endif  /*_TST_DEV_H */
+
+
+
Index: drivers/char/tst_dev.c
===================================================================
--- /dev/null	Wed May  6 05:32:27 1998
+++ drivers/char/tst_dev.c	Tue Jan 22 12:01:16 2002
@@ -0,0 +1,404 @@
+/*
+ * tst_dev.c - Test and Set device for MIPS CPUs that lack LL/SC.
+ *
+ *        Copyright (C) 2000, 2001, 2002  Sony Computer Entertainment Inc.
+ *        Copyright 2001, 2002  Sony Corp.
+ *
+ * This file is subject to the terms and conditions of the GNU General
+ * Public License Version 2. See the file "COPYING" in the main
+ * directory of this archive for more details.
+ *
+ */
+
+#include <linux/autoconf.h>
+
+#ifndef CONFIG_MIPS
+#error "Sorry, this device is for MIPS only."
+#endif
+#if  !defined(CONFIG_PREEMPT) && defined(CONFIG_SMP)
+#error "Not on this device"
+#endif
+
+/*
+ * 	Setup/Clean up Driver Module
+ */
+
+#ifdef MODULE
+
+#ifndef EXPORT_SYMTAB
+#define EXPORT_SYMTAB
+#endif
+
+#if defined(CONFIG_MODVERSIONS) && !defined(MODVERSIONS)
+#define MODVERSIONS
+#endif
+
+#ifdef MODVERSIONS
+#include <linux/modversions.h>
+#endif
+
+#endif /* MODULE */
+
+#include <linux/init.h>
+
+#include <linux/errno.h>	/* error codes */
+#include <linux/kernel.h>	/* printk() */
+#include <linux/fs.h>		/* file op. */
+#include <linux/proc_fs.h>	/* proc fs file op. */
+#include <linux/mman.h>
+#include <linux/pagemap.h>
+#include <asm/io.h>
+#include <asm/uaccess.h>	/* copy to/from user space */
+#include <asm/page.h>		/* page size */
+#include <asm/pgtable.h>	/* PAGE_READONLY */
+
+#include <linux/tst_dev.h>
+
+#include <linux/module.h>
+
+#include <linux/major.h>
+
+#ifndef TSTDEV_MAJOR
+#define TSTDEV_MAJOR    123
+#endif
+
+static int tst_major=TSTDEV_MAJOR;	
+
+EXPORT_SYMBOL(tst_major);	/* export symbol */
+MODULE_PARM(tst_major,"i");	/* settable as a parameter at load time */
+
+
+/*
+ * File Operations table
+ *	please refer <linux/fs.h> for other methods.
+ */
+
+static struct file_operations  tst_fops; 
+
+
+#ifndef TST_DEVICE_NAME
+#define  TST_DEVICE_NAME "tst"
+#endif 
+
+
+static spinlock_t lock;
+static struct page * tst_code_buffer = 0 ;
+static const __u32 tst_code[] = {
+/*   0:*/   0x0080d821,        //move    $k1,$a0
+/*   4:*/   0x8c820000,        //lw      $v0,0($a0)
+/*   8:*/   0x00000000,        //nop
+/*   c:*/   0x14400004,        //bnez    $v0,0x20
+/*  10:*/   0x00000000,        //nop
+/*  14:*/   0x1764fffa,        //bne     $k1,$a0,0x0
+/*  18:*/   0x00000000,        //nop
+/*  1c:*/   0xaf650000,        //sw      $a1,0($k1)
+/*  20:*/   0x03e00008,        //jr      $ra
+/*  24:*/   0x00000000,        //nop
+			0};
+
+
+EXPORT_SYMBOL(tst_code_buffer);	/* export symbol */
+
+/********************
+
+#include<asm/regdef.h>
+
+        .set noreorder
+0:
+        move    k1 ,a0
+        lw      v0, 0(a0)
+	nop
+        bnez    v0, 1f
+        nop
+        bne     k1, a0, 0b
+        nop
+        sw      a1, 0(k1)
+1:
+        jr      ra
+        nop
+
+*********************/
+
+
+
+
+static 
+int try_init_code_buffer(void)
+{
+
+	spin_lock(&lock);
+	if (!tst_code_buffer) {
+		tst_code_buffer = alloc_pages(GFP_KERNEL, 0);
+		spin_unlock(&lock);
+
+		if (!tst_code_buffer)
+			return -EBUSY;
+
+		clear_page(page_address(tst_code_buffer));
+
+		memcpy (page_address(tst_code_buffer), (void *)tst_code, 
+			sizeof (tst_code));	/* size in bytes of the whole array */
+
+	} else {
+		spin_unlock(&lock);
+	}
+	return 0;
+}
+
+#ifdef MODULE
+static 
+void try_free_code_buffer(void)
+{
+
+	spin_lock(&lock);
+	if (tst_code_buffer) {
+		struct page *pg = tst_code_buffer;
+		tst_code_buffer=0;
+		spin_unlock(&lock);
+		put_page (pg);
+	} else {
+		spin_unlock(&lock);
+	}
+}
+#endif
+
+static get_info_t get_tst_info;
+
+
+/*
+ * Caller of (*get_info)() is  proc_file_read() in fs/proc/generic.c
+ */
+static 
+int get_tst_info (char *buf, 	/*  allocated area for info */
+	       char **start, 	/*  return your own area if you allocate one */
+	       off_t pos,	/*  pos arg of vfs read */
+	       int count)	/*  readable bytes */
+{
+
+/* SPRINTF does not exist in the kernel */
+#define MY_BUFSIZE 256
+#define MARGIN 16
+	char mybuf[MY_BUFSIZE+MARGIN];
+
+	int len;
+
+	len = sprintf(mybuf,
+		      "_TST_INFO_MAGIC:\t0x%8.8x\n"
+		      "_TST_START_MAGIC:\t0x%8.8x\n"
+		      "_TST_ACCESS_MAGIC:\t0x%8.8x\n",
+		      _TST_INFO_MAGIC_2ARGS,
+		      _TST_START_MAGIC,
+		      _TST_ACCESS_MAGIC
+		      );
+	if (len >= MY_BUFSIZE) mybuf[MY_BUFSIZE] = '\0';
+
+	if ( pos >= len ) return 0;	/* nothing left past the end */
+	if ( pos+count >= len )
+		count = len-pos;
+	memcpy (buf, mybuf+pos, count);
+	return count;
+}
+
+#ifdef MODULE
+
+#define tst_dev_init init_module
+
+void
+cleanup_module (void)
+{
+	/* free code buffer */
+	try_free_code_buffer();
+
+	/* unregister /proc entry */
+
+	(void) remove_proc_entry(TST_DEVICE_NAME, NULL);
+
+	/* unregister chrdev */
+	unregister_chrdev(tst_major, TST_DEVICE_NAME);
+}
+
+
+#endif /* MODULE */
+
+
+int __init tst_dev_init(void)
+{
+
+	int result;
+
+	spin_lock_init(&lock);
+
+	result = register_chrdev(tst_major, TST_DEVICE_NAME , &tst_fops);
+	if (result < 0) {
+		printk(KERN_WARNING 
+		       TST_DEVICE_NAME ": can't get major %d\n",tst_major);
+		return result;
+	}
+	if (tst_major == 0) tst_major = result; /* dynamic */
+
+	/*
+	 * register /proc entry, if you want.
+	 */
+
+
+	if (!create_proc_info_entry(TST_DEVICE_NAME, 0, NULL, &get_tst_info)) {
+		printk(KERN_WARNING 
+		       TST_DEVICE_NAME ": can't get proc entry\n");
+		unregister_chrdev(tst_major, TST_DEVICE_NAME);
+		return -ENOMEM;	/* result is >= 0 here, so do not return it */
+	}
+
+	(void) try_init_code_buffer();
+
+	return 0;
+}
+
+#ifndef MODULE
+__initcall(tst_dev_init);
+#endif
+
+
+//========================================================================
+
+/*
+ * VMA Operations
+ */
+
+static void tst_vma_open(struct vm_area_struct *vma)
+{
+    MOD_INC_USE_COUNT;
+}
+
+static void tst_vma_close(struct vm_area_struct *vma)
+{
+    MOD_DEC_USE_COUNT;
+}
+
+struct page *
+tst_vma_nopage (struct vm_area_struct * area, 
+			unsigned long address, int write_access)
+{
+	if ( address  != _TST_START_MAGIC 
+	    || area->vm_start  != _TST_START_MAGIC
+	    || area->vm_pgoff != 0 )
+		return 0;
+
+	get_page(tst_code_buffer);
+	flush_page_to_ram(tst_code_buffer);
+	return tst_code_buffer;
+}
+
+
+static struct vm_operations_struct tst_vm_ops = {
+	open:tst_vma_open,
+	close:tst_vma_close,
+	nopage:tst_vma_nopage,
+};
+
+//========================================================================
+
+/*
+ * 	Device File Operations
+ */
+
+
+/*
+ * Open and Close
+ */
+
+static int tst_open (struct inode *p_inode, struct file *p_file)
+{
+	
+	int ret_code;
+        if ( p_file->f_mode & FMODE_WRITE ) {
+                return -EPERM;
+        }
+	
+	ret_code =  try_init_code_buffer ();
+	if (ret_code) {
+		return ret_code;
+	}
+
+	/* 
+	 * if you want store something for later processing, do it on
+	 * p_file->private_data .
+	 */
+        MOD_INC_USE_COUNT;
+        return 0;          /* success */
+}
+
+static int tst_release (struct inode *p_inode, struct file *p_file)
+{
+	MOD_DEC_USE_COUNT;
+	return 0;
+}
+
+
+/*
+ * Mmap
+ */
+static int tst_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	unsigned long size;
+
+	if (vma->vm_start != _TST_START_MAGIC)
+		return -ENXIO;
+
+	if (vma->vm_pgoff != 0)
+		return -ENXIO;
+
+	size = vma->vm_end - vma->vm_start;
+	if (size != PAGE_SIZE)
+		return -EINVAL;
+
+	vma->vm_ops = &tst_vm_ops;
+
+	tst_vma_open(vma);
+
+	return 0;
+}
+
+
+/*
+ * Read
+ */
+static ssize_t tst_read(struct file *p_file, char * p_buff, size_t count, 
+		   loff_t * p_pos)
+{
+	
+	struct _tst_area_info info;
+	int data;
+	struct inode * p_inode;
+	int info_size = sizeof(info);
+
+	p_inode = p_file->f_dentry->d_inode;
+	data = MAJOR(p_inode->i_rdev);
+
+	info.magic = _TST_INFO_MAGIC_2ARGS;
+	info.pad1 = 0;
+	info.map_addr = (void *)_TST_START_MAGIC;
+#if _MIPS_SZPTR==32
+	info.pad2 = 0;
+#endif
+
+	if (*p_pos >= info_size) return 0;	/* EOF; avoids size_t wraparound */
+	if (*p_pos + count >= info_size)
+		count = info_size - *p_pos;
+	if(copy_to_user(p_buff,((char *)&info)+*p_pos, count)) {
+		return -EFAULT;
+	}
+	*p_pos += count;
+	return count;
+}
+
+static
+struct file_operations  tst_fops = {
+	/* ssize_t (*read) (struct file *, char *, size_t, loff_t *); */
+	read:tst_read,
+	/* int (*open) (struct inode *, struct file *); */
+	open:tst_open,
+	/* int (*release) (struct inode *, struct file *);*/
+	release:tst_release, 
+	/* int (*mmap) (struct file *, struct vm_area_struct *); */
+	mmap:tst_mmap,
+};
Index: arch/mips/kernel/head.S
===================================================================
RCS file: /cvsroot/linux-mips/linux/arch/mips/kernel/head.S,v
retrieving revision 1.12
diff -u -p -r1.12 head.S
--- arch/mips/kernel/head.S	2002/01/04 18:04:53	1.12
+++ arch/mips/kernel/head.S	2002/01/22 05:48:36
@@ -101,6 +101,7 @@
 		addiu	k1, k1, 4
 1:		mtc0	k1, $24
 		RESTORE_ALL
+		TST_DEV_EPILOGUE
 		.word	0x4200001f      # deret, return EJTAG debug exception.
 		 nop
 		.set	at

[-- Attachment #3: glibc-2.2.3-mips-tas.patch --]
[-- Type: Text/Plain, Size: 6678 bytes --]

Index: sysdeps/unix/sysv/linux/mips/Makefile
===================================================================
RCS file: /export/cvsroot/CoPE/cmplrs/glibc-2.2/sysdeps/unix/sysv/linux/mips/Makefile,v
retrieving revision 1.1.3.1
diff -u -p -r1.1.3.1 Makefile
--- sysdeps/unix/sysv/linux/mips/Makefile	13 Dec 2001 05:28:00 -0000	1.1.3.1
+++ sysdeps/unix/sysv/linux/mips/Makefile	22 Jan 2002 05:22:35 -0000
@@ -5,7 +5,7 @@ sysdep_routines += rt_sigsuspend rt_sigp
 endif
 
 ifeq ($(subdir),misc)
-sysdep_routines += cachectl cacheflush sysmips _test_and_set
+sysdep_routines += cachectl cacheflush sysmips _test_and_set mips1_tst
 
 sysdep_headers += sys/cachectl.h sys/sysmips.h sys/tas.h
 endif
Index: sysdeps/unix/sysv/linux/mips/Versions
===================================================================
RCS file: /export/cvsroot/CoPE/cmplrs/glibc-2.2/sysdeps/unix/sysv/linux/mips/Versions,v
retrieving revision 1.1.3.1
diff -u -p -r1.1.3.1 Versions
--- sysdeps/unix/sysv/linux/mips/Versions	13 Dec 2001 05:28:00 -0000	1.1.3.1
+++ sysdeps/unix/sysv/linux/mips/Versions	22 Jan 2002 05:22:35 -0000
@@ -17,5 +17,8 @@ libc {
   GLIBC_2.2 {
     # _*
     _test_and_set;
+    # mips1 test and set
+    __mips1_tst_func;
+    __mips1_tst_func_2args;
   }
 }
Index: sysdeps/unix/sysv/linux/mips/sys/tas.h
===================================================================
RCS file: /export/cvsroot/CoPE/cmplrs/glibc-2.2/sysdeps/unix/sysv/linux/mips/sys/tas.h,v
retrieving revision 1.1.3.1
diff -u -p -r1.1.3.1 tas.h
--- sysdeps/unix/sysv/linux/mips/sys/tas.h	13 Dec 2001 05:28:00 -0000	1.1.3.1
+++ sysdeps/unix/sysv/linux/mips/sys/tas.h	22 Jan 2002 05:22:35 -0000
@@ -23,6 +23,7 @@
 #include <features.h>
 #include <sgidefs.h>
 #include <sys/sysmips.h>
+#include <linux/config.h>
 
 __BEGIN_DECLS
 
@@ -34,7 +35,8 @@ extern int _test_and_set (int *p, int v)
 #  define _EXTERN_INLINE extern __inline
 # endif
 
-# if (_MIPS_ISA >= _MIPS_ISA_MIPS2)
+# if (_MIPS_ISA >= _MIPS_ISA_MIPS2) && \
+       !(defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE))
 
 _EXTERN_INLINE int
 _test_and_set (int *p, int v) __THROW
@@ -59,15 +61,19 @@ _test_and_set (int *p, int v) __THROW
   return r;
 }
 
-# else /* !(_MIPS_ISA >= _MIPS_ISA_MIPS2) */
+# else /* !((_MIPS_ISA >= _MIPS_ISA_MIPS2) && \
+       !(defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE)))*/
+
+extern int (*__mips1_tst_func_2args)(volatile int *, int);
 
 _EXTERN_INLINE int
 _test_and_set (int *p, int v) __THROW
 {
-  return sysmips (MIPS_ATOMIC_SET, (int) p, v, 0);
+  return (*__mips1_tst_func_2args)((volatile int *)p, v); 
 }
 
-# endif /* !(_MIPS_ISA >= _MIPS_ISA_MIPS2) */
+# endif /* !((_MIPS_ISA >= _MIPS_ISA_MIPS2) && \
+       !(defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE)))*/
 
 #endif /* __USE_EXTERN_INLINES */
 
Index: sysdeps/unix/sysv/linux/mips/mips1_tst.c
===================================================================
--- /dev/null	Wed May  6 05:32:27 1998
+++ sysdeps/unix/sysv/linux/mips//mips1_tst.c	Mon Jan 21 19:35:38 2002
@@ -0,0 +1,155 @@
+/*
+- mips1_tst.c: fast test and set using /dev/tst.
+
+        Copyright (C) 2000  Sony Computer Entertainment Inc.
+        Copyright 2002  Sony Corp. 
+
+This file is subject to the terms and conditions of the GNU Library
+General Public License Version 2 or later. See the file "COPYING.LIB" 
+in the main directory of this archive for more details.
+*/
+
+#include <unistd.h>
+#include <fcntl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include<sys/sysmips.h>
+
+#include<linux/tst_dev.h>
+#include<asm/sgidefs.h>
+
+
+static int init_tst_func(volatile int *spinlock);
+static int init_tst_func_2args(volatile int *spinlock, int val);
+
+static volatile int lock_setup=0;	/* MIPS_ATOMIC_SET operates on an int */
+
+/*
+ * __mips1_tst_func: 
+ *	This interface was originally designed for glibc-2.0.[67] and 
+ *	used on PS2 linux with glibc-2.2.2. This interface can NOT accept 
+ *	2nd arg of _test_and_set() in glibc-2.2.
+ *	We maintain this old interface to keep binary compatibility.
+ *
+ * __mips1_tst_func_2args:
+ *	New interface which can accept 2nd arg of _test_and_set() 
+ *	in glibc-2.2.
+ *
+ */
+int (*__mips1_tst_func)(volatile int *) = (void *)init_tst_func;
+int (*__mips1_tst_func_2args)(volatile int *, int) = 
+					(void *)init_tst_func_2args;
+
+static int sysmips_tst_1arg(volatile int *spinlock)
+{
+    return sysmips((const int)MIPS_ATOMIC_SET,
+	    (const int)spinlock,
+	    (const int)1,
+	    (const int)NULL);
+}
+
+static int sysmips_tst_2args(volatile int *spinlock, int val)
+{
+    return sysmips((const int)MIPS_ATOMIC_SET,
+	    (const int)spinlock,
+	    (const int)val,
+	    (const int)NULL);
+}
+
+static int tst_1arg(volatile int *spinlock)
+{
+	return (*__mips1_tst_func_2args)(spinlock, 1) ;
+}
+
+static int tst_2args(volatile int *spinlock, int val)
+{
+	static volatile int lock;
+	int retval;
+
+	while ((*__mips1_tst_func)(&lock)!=0);
+	retval = *spinlock;
+	if (retval==0) {
+		*spinlock = val;
+	}
+	lock = 0;
+	return retval;
+}
+
+static void setup_tst_func(void)
+{
+	struct _tst_area_info info;
+	int fd;
+	void *addr;
+	int res;
+	size_t size = getpagesize();
+	int device_accept_2args = 0;
+
+	while (sysmips((const int)MIPS_ATOMIC_SET,
+		(const int)&lock_setup,
+		(const int)1,
+		(const int)NULL)) ;
+	
+	if ( __mips1_tst_func != init_tst_func ){
+		lock_setup=0;
+		return;
+	}
+
+	fd = open( "/dev/" TST_DEVICE_NAME , O_RDONLY);
+	if (fd < 0)  goto fail;
+
+	res = read ( fd, &info, sizeof(info));
+	switch (res == (int) sizeof(info) ? info.magic : 0) {
+	    case _TST_INFO_MAGIC:
+	    	break;
+	    case _TST_INFO_MAGIC_2ARGS:
+	    	device_accept_2args = 1;
+	    	break;
+	    default:
+		close(fd);
+		goto fail;
+	}
+
+	addr=(void *)mmap(info.map_addr, size,
+		PROT_READ|PROT_EXEC, MAP_SHARED|MAP_FIXED,
+		fd, 0);
+	close(fd);
+
+	if (addr != info.map_addr ) {
+		if (addr != (void *)0 && addr !=(void *) -1)
+			munmap(addr,size);
+		goto fail;
+	}
+
+	if (device_accept_2args) {
+		/* Use new device interface */
+		__mips1_tst_func_2args = addr;
+		__mips1_tst_func = tst_1arg;
+	} else {
+		/* Use old device interface */
+		__mips1_tst_func_2args = tst_2args;
+		__mips1_tst_func = addr;
+		
+	}
+	lock_setup=0;
+    	return;
+    fail:
+    	/* last resort */
+	__mips1_tst_func = sysmips_tst_1arg;
+	__mips1_tst_func_2args = sysmips_tst_2args;
+	lock_setup=0;
+    	return;
+}
+
+static int init_tst_func(volatile int *spinlock)
+{
+	setup_tst_func();
+	return (*__mips1_tst_func)(spinlock) ;
+}
+
+static int init_tst_func_2args(volatile int *spinlock, int val)
+{
+	setup_tst_func();
+	return (*__mips1_tst_func_2args)(spinlock, val) ;
+}
+

[-- Attachment #4: linuxthread-2.2.3-mips-tas.patch --]
[-- Type: Text/Plain, Size: 3423 bytes --]

Index: linuxthreads/sysdeps/mips/pspinlock.c
===================================================================
RCS file: /export/cvsroot/CoPE/cmplrs/glibc-2.2/linuxthreads/sysdeps/mips/pspinlock.c,v
retrieving revision 1.1.3.1
diff -u -p -r1.1.3.1 pspinlock.c
--- linuxthreads/sysdeps/mips/pspinlock.c	13 Dec 2001 05:28:33 -0000	1.1.3.1
+++ linuxthreads/sysdeps/mips/pspinlock.c	22 Jan 2002 05:25:59 -0000
@@ -23,7 +23,9 @@
 #include <sys/tas.h>
 #include "internals.h"
 
-#if (_MIPS_ISA >= _MIPS_ISA_MIPS2)
+
+#if (_MIPS_ISA >= _MIPS_ISA_MIPS2) && \
+       !(defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE))
 
 /* This implementation is similar to the one used in the Linux kernel.  */
 int
@@ -49,7 +51,8 @@ __pthread_spin_lock (pthread_spinlock_t 
   return 0;
 }
 
-#else /* !(_MIPS_ISA >= _MIPS_ISA_MIPS2) */
+#else /* !((_MIPS_ISA >= _MIPS_ISA_MIPS2) && \
+       !(defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE)))*/
 
 int
 __pthread_spin_lock (pthread_spinlock_t *lock)
@@ -58,7 +61,8 @@ __pthread_spin_lock (pthread_spinlock_t 
   return 0;
 }
 
-#endif /* !(_MIPS_ISA >= _MIPS_ISA_MIPS2) */
+#endif /* !((_MIPS_ISA >= _MIPS_ISA_MIPS2) && \
+       !(defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE)))*/
 
 weak_alias (__pthread_spin_lock, pthread_spin_lock)
 
@@ -66,8 +70,7 @@ weak_alias (__pthread_spin_lock, pthread
 int
 __pthread_spin_trylock (pthread_spinlock_t *lock)
 {
-  /* To be done.  */
-  return 0;
+  return (_test_and_set (lock, 1) ? EBUSY : 0);
 }
 weak_alias (__pthread_spin_trylock, pthread_spin_trylock)
 
Index: linuxthreads/sysdeps/mips/pt-machine.h
===================================================================
RCS file: /export/cvsroot/CoPE/cmplrs/glibc-2.2/linuxthreads/sysdeps/mips/pt-machine.h,v
retrieving revision 1.1.3.1
diff -u -p -r1.1.3.1 pt-machine.h
--- linuxthreads/sysdeps/mips/pt-machine.h	13 Dec 2001 05:28:33 -0000	1.1.3.1
+++ linuxthreads/sysdeps/mips/pt-machine.h	22 Jan 2002 05:25:59 -0000
@@ -33,7 +33,8 @@
 
 /* Spinlock implementation; required.  */
 
-#if (_MIPS_ISA >= _MIPS_ISA_MIPS2)
+#if (_MIPS_ISA >= _MIPS_ISA_MIPS2) && \
+       !(defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE))
 
 PT_EI long int
 testandset (int *spinlock)
@@ -60,14 +61,16 @@ testandset (int *spinlock)
   return ret;
 }
 
-#else /* !(_MIPS_ISA >= _MIPS_ISA_MIPS2) */
+#else /* !((_MIPS_ISA >= _MIPS_ISA_MIPS2) && \
+       !(defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE)))*/
 
 PT_EI long int
 testandset (int *spinlock)
 {
   return _test_and_set (spinlock, 1);
 }
-#endif /* !(_MIPS_ISA >= _MIPS_ISA_MIPS2) */
+#endif /* !((_MIPS_ISA >= _MIPS_ISA_MIPS2) && \
+       !(defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE)))*/
 
 
 /* Get some notion of the current stack.  Need not be exactly the top
@@ -78,7 +81,8 @@ register char * stack_pointer __asm__ ("
 
 /* Compare-and-swap for semaphores. */
 
-#if (_MIPS_ISA >= _MIPS_ISA_MIPS2)
+#if (_MIPS_ISA >= _MIPS_ISA_MIPS2) && \
+       !(defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE))
 
 #define HAS_COMPARE_AND_SWAP
 PT_EI int
@@ -106,4 +110,5 @@ __compare_and_swap (long int *p, long in
   return ret;
 }
 
-#endif /* (_MIPS_ISA >= _MIPS_ISA_MIPS2) */
+#endif /* !((_MIPS_ISA >= _MIPS_ISA_MIPS2) && \
+       !(defined(CONFIG_MIPS_TST_DEV) || defined(CONFIG_MIPS_TST_DEV_MODULE)))*/


* Re: patches for test-and-set without ll/sc (Re: thread-ready ABIs)
  2002-01-22  6:27             ` patches for test-and-set without ll/sc (Re: thread-ready ABIs) Machida Hiroyuki
@ 2002-01-22  6:37               ` Ulrich Drepper
  2002-01-22  6:46                 ` Machida Hiroyuki
  2002-01-24  9:56                   ` Andreas Jaeger
  0 siblings, 2 replies; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-22  6:37 UTC (permalink / raw)
  To: Machida Hiroyuki; +Cc: kevink, hjl, libc-hacker, linux-mips

Machida Hiroyuki <machida@sm.sony.co.jp> writes:

>   * glibc change:
> 
> 	We implement  test_and_set(addr, val) as follows,
> 
> 		Do mmap /dev/tst to _TST_START_MAGIC, if not yet mapped.
> 		call _TST_START_MAGIC(addr, val)
> 	
> 	If we can't open /dev/tst then, use sysmips() as final resort.

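The dispatch described in the quoted text (map the fast routine on
first use, otherwise fall back to sysmips()) can be sketched with a
lazily initialized function pointer. This is a single-threaded sketch
under stated assumptions, not the code from the glibc patch: all
sketch_* names are hypothetical, sketch_probe stands in for opening
and mapping /dev/tst, and sketch_fallback_tas is a non-atomic
stand-in for the sysmips(MIPS_ATOMIC_SET) path.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*tas_fn)(volatile int *, int);

/* Hypothetical probe: returns the mapped fast routine, or NULL when
 * the device cannot be opened or mapped. NULL means "no probe". */
static tas_fn (*sketch_probe)(void) = NULL;

/* Stand-in for the slow path; NOT atomic, for illustration only. */
static int sketch_fallback_tas(volatile int *p, int v)
{
    int old = *p;

    if (old == 0)
        *p = v;
    return old;
}

static int sketch_init_tas(volatile int *p, int v);

/* Starts out pointing at the initializer and is replaced on first use. */
static tas_fn sketch_tas_impl = sketch_init_tas;

static int sketch_init_tas(volatile int *p, int v)
{
    tas_fn fast = sketch_probe ? sketch_probe() : NULL;

    sketch_tas_impl = fast ? fast : sketch_fallback_tas;
    return sketch_tas_impl(p, v);
}

static int sketch_test_and_set(volatile int *p, int v)
{
    assert(p != NULL);
    return sketch_tas_impl(p, v);
}
```

After the first call every later call goes straight to the selected
routine, so the probe cost is paid once. A real implementation also
has to serialize the first call between threads, which this sketch
leaves out.
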
First, the patch as it is unacceptable.  A file with copyright Sony?
All the code must be copyrighted by the FSF.  Sony will have to assign
the copyright for the code to the FSF.

Also, no such change can be accepted until the necessary kernel
changes are in the official kernel sources.  I cannot make any
exceptions since otherwise all kinds of people want to see support for
their local hack added.

Furthermore, the symbols were not available in version 2.2.  Therefore
they cannot be exported with this version.  It'll either be 2.2.6 (if
there ever will be such a release) or 2.3.

And finally, the patch should be sent to the glibc MIPS maintainer for
review.  The question is who feels responsible...

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------


* Re: patches for test-and-set without ll/sc (Re: thread-ready ABIs)
  2002-01-22  6:37               ` Ulrich Drepper
@ 2002-01-22  6:46                 ` Machida Hiroyuki
  2002-01-22  6:56                   ` Ulrich Drepper
  2002-01-24  9:56                   ` Andreas Jaeger
  1 sibling, 1 reply; 94+ messages in thread
From: Machida Hiroyuki @ 2002-01-22  6:46 UTC (permalink / raw)
  To: drepper; +Cc: kevink, hjl, libc-hacker, linux-mips


From: Ulrich Drepper <drepper@redhat.com>
Subject: Re: patches for test-and-set without ll/sc (Re: thread-ready ABIs)
Date: 21 Jan 2002 22:37:02 -0800

> First, the patch as it is unacceptable.  A file with copyright Sony?
> All the code must be copyrighted by the FSF.  Sony will have to assign
> the copyright for the code to the FSF.

Please let us know why. Actually, glibc includes code copyrighted by
Sun, and gcc includes code copyrighted by HP and SGI.

---
Hiroyuki Machida
Sony Corp.


* Re: patches for test-and-set without ll/sc (Re: thread-ready ABIs)
  2002-01-22  6:46                 ` Machida Hiroyuki
@ 2002-01-22  6:56                   ` Ulrich Drepper
  0 siblings, 0 replies; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-22  6:56 UTC (permalink / raw)
  To: Machida Hiroyuki; +Cc: kevink, hjl, libc-hacker, linux-mips

Machida Hiroyuki <machida@sm.sony.co.jp> writes:

> Please let us know why. Actually, glibc includes code copyrighted by
> Sun and gcc includes code copyrighted by HP and SGI.

It contains public domain code and the rest of the code is assigned.
If there is a header saying that somebody from a company wrote the
code this is mentioned but the person also has a document signed.

If you cannot live with this the code cannot be accepted.  Any further
discussion you have to have with the legal people at the FSF.  I've no
time for this.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22  9:37               ` Dominic Sweetman
  0 siblings, 0 replies; 94+ messages in thread
From: Dominic Sweetman @ 2002-01-22  9:37 UTC (permalink / raw)
  To: Ulrich Drepper
  Cc: Kevin D. Kissell, Ralf Baechle, Mike Uhler, linux-mips,
	H . J . Lu


Kevin asked:

> > I'd very much appreciate it if someone would explain to me just
> > what this register is used for, and why a register needs to be
> > permanently allocated for this purpose.

Ulrich Drepper (drepper@redhat.com) writes:

> Simply look at the ABIs for some less-backward processors.  Read the
> thread-local storage section in the IA-64 ABI specification.

Sometimes when you're busy it's understandable to respond with "RTFM".
But to fail to provide a URL is not very respectful: other people
reading this list are quite smart, they're just smart about different
things from you.

URL please.

Dominic Sweetman
Algorithmics Ltd
The Fruit Farm, Ely Road, Chittering, CAMBS CB5 9PH, ENGLAND
phone: +44 1223 706200 / fax: +44 1223 706250 / home: +44 20 7226 0032
http://www.algor.co.uk

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22  9:59             ` Dominic Sweetman
  0 siblings, 0 replies; 94+ messages in thread
From: Dominic Sweetman @ 2002-01-22  9:59 UTC (permalink / raw)
  To: Kevin D. Kissell
  Cc: Ralf Baechle, Ulrich Drepper, Mike Uhler, linux-mips,
	GNU libc hacker, H . J . Lu


Kevin,

Since nobody seems to be prepared to essay a brief definition of a
thread register, I'll make one up from first principles and maybe the
experts will beat it into shape.

Multiple threads in a Linux process share the same address space: code
and data.  A thread has its own unique stack, but since (by
definition) it shares all its data with every other thread it has no
identity - there is no thread-unique static data.  That means it has
no handle to acquire and manage any thread-specific variables.

[Some threads purists would probably maintain that's a Good Thing:
 threads to them are like electrons to quantum physicists,
 indistinguishable by definition].

Linux is not noted for computer science purity; so an OS-maintained
"thread identity" variable which is cheap to read in user space sounds
a useful thing to have.

A patient Linux expert (if any such are reading this list) might like
to say what value is typically held (a pointer? an index?) and how
it's used (my money's on "wrapped in an impenetrable macro").
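
A minimal sketch of the kind of thread-specific variable in question,
assuming POSIX threads are available.  The names slot_key, worker and
tsd_is_private are made up for illustration.  Each thread stores a
value under a shared key and reads back only its own.  Before it can
answer, the library has to work out which thread is calling, and that
lookup is the step a dedicated register would make cheap:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* One process-wide key; every thread stores its own value under it. */
static pthread_key_t slot_key;

/* Store a per-thread value, then read it back through the shared key. */
static void *worker(void *arg)
{
    /* NULL is what pthread_getspecific() returns for "nothing stored". */
    assert(arg != NULL);
    if (pthread_setspecific(slot_key, arg) != 0)
        return NULL;
    /* To answer this call the library must first identify the caller. */
    return pthread_getspecific(slot_key);
}

/* Hypothetical helper: returns 1 if two threads each read back only the
   value they stored themselves, 0 otherwise. */
int tsd_is_private(void)
{
    pthread_t t1, t2;
    void *r1 = NULL, *r2 = NULL;
    int ok;

    if (pthread_key_create(&slot_key, NULL) != 0)
        return 0;
    if (pthread_create(&t1, NULL, worker, (void *)(intptr_t)1) != 0) {
        pthread_key_delete(slot_key);
        return 0;
    }
    if (pthread_create(&t2, NULL, worker, (void *)(intptr_t)2) != 0) {
        pthread_join(t1, NULL);
        pthread_key_delete(slot_key);
        return 0;
    }
    pthread_join(t1, &r1);
    pthread_join(t2, &r2);
    ok = ((intptr_t)r1 == 1 && (intptr_t)r2 == 2);
    pthread_key_delete(slot_key);
    return ok;
}
```

Calling tsd_is_private() is expected to return 1.  The stored value
here is a small integer cast to a pointer; a pointer to a per-thread
structure would be the more likely choice in practice, but that is an
assumption rather than something stated in this discussion.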

In a more baroque (synonym for "less backward"?) architecture there
are usually registers hanging about which no compiler or OS author has
previously figured out any use for, which can be bent to this purpose.
Unfortunately, MIPS's original architects committed the grave error of
making all the registers useful.

I quite like the idea of putting the thread value at a known offset in
low virtual memory, but I expect the kernel keeps virtual page 0
invalid to catch null pointers and that instructions start at the
first boundary which doesn't create cache alias problems...

--
Dominic Sweetman
Algorithmics Ltd
The Fruit Farm, Ely Road, Chittering, CAMBS CB5 9PH, ENGLAND
phone: +44 1223 706200 / fax: +44 1223 706250 / home: +44 20 7226 0032
http://www.algor.co.uk

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22 12:18               ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-22 12:18 UTC (permalink / raw)
  To: Dominic Sweetman
  Cc: Ralf Baechle, Ulrich Drepper, Mike Uhler, MIPS/Linux List (SGI),
	GNU libc hacker, H . J . Lu

> Since nobody seems to be prepared to essay a brief definition of a
> thread register, I'll make one up from first principles and maybe the
> experts will beat it into shape.

Thank you, Dom, for trying to inject some civility into this debate.

> Multiple threads in a Linux process share the same address space: code
> and data.  A thread has its own unique stack, but since (by
> definition) it shares all its data with every other thread it has no
> identity - there is no thread-unique static data.  That means it has
> no handle to acquire and manage any thread-specific variables.
> 
> [Some threads purists would probably maintain that's a Good Thing:
>  threads to them are like electrons to quantum physicists,
>  indistinguishable by definition].
> 
> Linux is not noted for computer science purity; so an OS-maintained
> "thread identity" variable which is cheap to read in user space sounds
> a useful thing to have.
> 
> A patient Linux expert (if any such are reading this list) might like
> to say what value is typically held (a pointer? an index?) and how
> it's used (my money's on "wrapped in an impenetrable macro").
> 
> In a more baroque (synonym for "less backward"?) architecture there
> are usually registers hanging about which no compiler or OS author has
> previously figured out any use for, which can be bent to this purpose.
> Unfortunately, MIPS's original architects committed the grave error of
> making all the registers useful.
> 
> I quite like the idea of putting the thread value at a known offset in
> low virtual memory, but I expect the kernel keeps virtual page 0
> invalid to catch null pointers and that instructions start at the
> first boundary which doesn't create cache alias problems...

I think that the problem is complicated by the fact that
there may be a many->many mapping of kernel threads
(and CPUs) to user-land threads.  In that case, no single
low-memory address can be correct for all kernel threads.
However, since every kernel thread should have its own
stack segment, it would appear to me that having a
variable "under" the stack would satisfy the need for
per-kernel-thread storage at a knowable location.
I suspect that there is a second-order problem in that
the base stack address may differ for instances of
the same binary launched under different circumstances.
But I don't think that renders the problem impossible.
One could have a global pointer, resolvable at link
time, which could be set to SP+delta by whatever
we call crt0 these days, and which should provide the
required semantics.  Each user thread startup or 
context switch could follow the global pointer to the 
kernel-thread-specific memory location which 
could be used as the pointer to the user-thread
specific data area.

Even with the double indirection, that strikes me
as far more efficient than performing a system call
on every thread startup to set up a magic value to be 
returned in a k-register (as some have suggested) 
and considerably less messy, technically and commercially, 
than pulling a register out of the ABI and rendering it 
useless for programs which happen not to be executing
the threads library  (as others have proposed).

I await news as to why this is impossible and/or
unacceptable, and I shall endeavor to modify
or withdraw the suggestion accordingly.

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-22 12:18               ` Kevin D. Kissell
  (?)
@ 2002-01-22 15:21               ` Daniel Jacobowitz
  2002-01-22 15:44                 ` Dominic Sweetman
  2002-01-22 16:05                   ` Kevin D. Kissell
  -1 siblings, 2 replies; 94+ messages in thread
From: Daniel Jacobowitz @ 2002-01-22 15:21 UTC (permalink / raw)
  To: Kevin D. Kissell
  Cc: Dominic Sweetman, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), GNU libc hacker, H . J . Lu

On Tue, Jan 22, 2002 at 01:18:03PM +0100, Kevin D. Kissell wrote:
> I think that the problem is complicated by the fact that
> there may be a many->many mapping of kernel threads
> (and CPUs) to user-land threads.  In that case, no single
> low-memory address can be correct for all kernel threads.
> However, since every kernel thread should have its own
> stack segment, it would appear to me that having a
> variable "under" the stack would satisfy the need for
> per-kernel-thread storage at a knowable location.
> I suspect that there is a second-order problem in that
> the base stack address may differ for instances of
> the same binary launched under different circumstances.
> But I don't think that renders the problem impossible.
> One could have a global pointer, resolvable at link
> time, which could be set to SP+delta by whatever
> we call crt0 these days, and which should provide the
> required semantics.  Each user thread startup or 

Resolvable at link time and set by crt0 seem to be mutually
exclusive... but perhaps I'm misunderstanding you.

In any case, that's not the real problem.  Linux user threads do not
have true separate stacks.  They share their _entire_ address space;
the stacks are all bounded (default is 2MB) and grouped together at the
top of the available memory region.

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-22 15:21               ` Daniel Jacobowitz
@ 2002-01-22 15:44                 ` Dominic Sweetman
  2002-01-22 21:44                   ` Tommy S. Christensen
  2002-01-22 16:05                   ` Kevin D. Kissell
  1 sibling, 1 reply; 94+ messages in thread
From: Dominic Sweetman @ 2002-01-22 15:44 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Kevin D. Kissell, Dominic Sweetman, Ralf Baechle, Ulrich Drepper,
	Mike Uhler, MIPS/Linux List (SGI), H . J . Lu


> In any case, that's not the real problem.  Linux user threads do not
> have true separate stacks.  They share their _entire_ address space;
> the stacks are all bounded (default is 2MB) and grouped together at
> the top of the available memory region.

Quite.

A comment by Kevin reminded me of the real constraint (which the
experts probably take for granted): this system is supposed to work on
shared-memory multiprocessors and multithreaded CPUs.

In both cases two or more threads within an address space can be
active simultaneously.  On a multithreaded CPU (in particular) there's
only one TLB, so memory (including any memory specially handled by the
kernel) is all held in common.  The *only* thing available to a user
privilege program which distinguishes the threads is the CPU register
set.

(Well, and the stack, which is a difference inherited from the value
in the stack pointer register.  But the stack pointer is not really
going to help much to return a thread-characteristic pointer or ID.)
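
For comparison, a sketch of what a program gains once the toolchain
can address per-thread data directly, assuming a C11 compiler with
_Thread_local support.  The names my_counter, bump and
tls_copies_are_independent are hypothetical.  How the compiler finds
each thread's copy is platform-specific and not shown here:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Each thread gets its own copy of this variable. */
static _Thread_local int my_counter;

static void *bump(void *arg)
{
    /* A new thread starts with its own zero-initialised copy. */
    assert(my_counter == 0);
    my_counter += (int)(intptr_t)arg;
    return (void *)(intptr_t)my_counter;
}

/* Hypothetical helper: returns 1 if a write made in another thread
   leaves the calling thread's copy untouched, 0 otherwise. */
int tls_copies_are_independent(void)
{
    pthread_t t;
    void *result = NULL;

    my_counter = 100;                     /* the caller's copy */
    if (pthread_create(&t, NULL, bump, (void *)(intptr_t)5) != 0)
        return 0;
    pthread_join(t, &result);
    return (intptr_t)result == 5 && my_counter == 100;
}
```

tls_copies_are_independent() is expected to return 1: the new thread
sees 0 and then 5, while the caller still sees 100.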

So MIPS really do need to figure out which register can be freed up.
Well, at least I know why now.  Hope the rest of you aren't too bored!

Dominic Sweetman
Algorithmics Ltd
http://www.algor.co.uk

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22 16:05                   ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-22 16:05 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Dominic Sweetman, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), GNU libc hacker, H . J . Lu

> On Tue, Jan 22, 2002 at 01:18:03PM +0100, Kevin D. Kissell wrote:
> > I think that the problem is complicated by the fact that
> > there may be a many->many mapping of kernel threads
> > (and CPUs) to user-land threads.  In that case, no single
> > low-memory address can be correct for all kernel threads.
> > However, since every kernel thread should have its own
> > stack segment, it would appear to me that having a
> > variable "under" the stack would satisfy the need for
> > per-kernel-thread storage at a knowable location.
> > I suspect that there is a second-order problem in that
> > the base stack address may differ for instances of
> > the same binary launched under different circumstances.
> > But I don't think that renders the problem impossible.
> > One could have a global pointer, resolvable at link
> > time, which could be set to SP+delta by whatever
> > we call crt0 these days, and which should provide the
> > required semantics.  Each user thread startup or 
> 
> Resolvable at link time and set by crt0 seem to be mutually
> exclusive... but perhaps I'm misunderstanding you.

You are.  The *address* of the pointer to the pointer
can be resolved at link time.  The *value* of the pointer
to the pointer is set by crt0 (if stack origins are not
intrinsically fixed at link time - if they are, the indirection
is not necessary).

> In any case, that's not the real problem.  Linux user threads do not
> have true separate stacks.  They share their _entire_ address space;
> the stacks are all bounded (default is 2MB) and grouped together at the
> top of the available memory region.

Exactly.  But if all we are worried about is thread
specific data for user threads multiplexed on exactly
one kernel thread, we could probably get by with a
simple global variable for the thread pointer for the
current user thread running in the process.   It's the
case of multiple user threads running within multiple
*kernel* threads (e.g. created by fork()) that complicates
things, and makes people want to use a register
or other storage resource associated with exactly one
kernel thread (and CPU).  A permanently assigned
register, as we have seen, creates various complications,
so I'm looking for another kernel-thread-specific resource,
of which I believe the stack region is the best candidate.
Each process/task/program would have a single global
variable, which points to a common address in the
stack region of each kernel thread, which is used
to store the address of the user-thread-specific
data of the user thread executing on that kernel thread.
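
A rough sketch of the double indirection described above, assuming
the fork()-style model in which each process has its own
copy-on-write stack.  All names (thread_data, slot_ptr,
current_thread, id_seen_in_forked_child) are invented for the
example, and a local variable stands in for the slot that crt0 would
reserve:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct thread_data { int id; };      /* hypothetical per-thread record */

/* The address of this global is fixed at link time; its value is set at
   startup to point at a slot in the stack region. */
static struct thread_data **slot_ptr;

/* Two loads: global -> stack slot -> current thread's data. */
static struct thread_data *current_thread(void)
{
    return *slot_ptr;
}

/* Returns the id the forked child found through the slot, or -1. */
int id_seen_in_forked_child(int child_id)
{
    struct thread_data parent = { 1 }, child = { child_id };
    struct thread_data *slot = &parent;   /* the slot lives on this stack */
    int status = 0, parent_ok;
    pid_t pid;

    assert(child_id >= 0 && child_id <= 255);  /* must fit an exit status */
    slot_ptr = &slot;                          /* what crt0 would do */

    pid = fork();
    if (pid < 0) {
        slot_ptr = NULL;
        return -1;
    }
    if (pid == 0) {
        slot = &child;             /* writes the child's private copy */
        _exit(current_thread()->id);
    }
    if (waitpid(pid, &status, 0) < 0) {
        slot_ptr = NULL;
        return -1;
    }
    parent_ok = (current_thread()->id == 1);   /* parent's slot unchanged */
    slot_ptr = NULL;                           /* slot goes out of scope */
    if (!parent_ok || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

id_seen_in_forked_child(7) is expected to return 7: the same virtual
address holds a different value in each process.  The sketch depends
entirely on the stacks not being shared.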

Of course, I still haven't seen an informed description
of the actual problem that Ulrich and H.J. are trying to
solve, so it may in fact be simpler (or more complex).

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-22 16:05                   ` Kevin D. Kissell
  (?)
@ 2002-01-22 16:34                   ` Daniel Jacobowitz
  2002-01-22 17:08                       ` Kevin D. Kissell
  -1 siblings, 1 reply; 94+ messages in thread
From: Daniel Jacobowitz @ 2002-01-22 16:34 UTC (permalink / raw)
  To: Kevin D. Kissell
  Cc: Dominic Sweetman, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), H . J . Lu

On Tue, Jan 22, 2002 at 05:05:45PM +0100, Kevin D. Kissell wrote:
> > In any case, that's not the real problem.  Linux user threads do not
> > have true separate stacks.  They share their _entire_ address space;
> > the stacks are all bounded (default is 2MB) and grouped together at the
> > top of the available memory region.
> 
> Exactly.  But if all we are worried about is thread
> specific data for user threads multiplexed on exactly
> one kernel thread, we could probably get by with a
> simple global variable for the thread pointer for the
> current user thread running in the process.   It's the
> case of multiple user threads running within multiple
> *kernel* threads (e.g. created by fork()) that complicates
> things, and makes people want to use a register
> or other storage resource associated with exactly one
> kernel thread (and CPU).  A permanently assigned
> register, as we have seen, creates various complications,
> so I'm looking for another kernel-thread-specific resource,
> of which I believe the stack region is the best candidate.
> Each process/task/program would have a single global
> variable, which points to a common address in the
> stack region of each kernel thread, which is used
> to store the address of the user-thread-specific
> data of the user thread executing on that kernel thread.

Perhaps I'm mangling terminology.  LinuxThreads is a one-to-one mapping
of kernel threads to user threads.  All the kernel threads, and thus
all the user threads, share the same memory region - including the
stack region.  Their stacks are differentiated solely by different
values in the stack pointer register.  Thus I don't think what you're
suggesting is possible.

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22 17:08                       ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-22 17:08 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Dominic Sweetman, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), H . J . Lu

> Perhaps I'm mangling terminology.  LinuxThreads is a one-to-one mapping
> of kernel threads to user threads.  All the kernel threads, and thus
> all the user threads, share the same memory region - including the
> stack region.  Their stacks are differentiated solely by different
> values in the stack pointer register.  Thus I don't think what you're
> suggesting is possible.

I don't see how fork() semantics can be preserved unless
the stack regions are replicated (copy-on-write) on a fork().
Under ATT and BSD Unix (which is where I did most of
my kernel hacking in the old days) that was the *only*
way to get a new kernel thread, so it was "obvious"
that my proposed hack would work.  Linux does have
the clone() function as well, and if LinuxThreads are
implemented in terms of clone(foo, stakptr, CLONE_VM, arg),
you are correct, the proposed scheme would not work
without modification.

One such modification would be to have each newly
cloned thread explicitly allocate and map a 1-page
VM region that is private to the kernel thread, and bound 
to a known virtual address that is common to all threads
within the task.  That known virtual address would take the 
place of the below-the-stack storage location I described 
earlier.  The same algorithm would apply - one has a globally 
known address that maps to different storage per-thread, 
which can be used to store the address of the (globally visible) 
per-thread information.  The set-up is slightly more complicated 
and heavyweight than the fork()-based model I suggested, 
but one could in principle eliminate one level of indirection 
on the lookups at run-time.

            Regards,

            Kevin K.
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-22 17:08                       ` Kevin D. Kissell
  (?)
@ 2002-01-22 17:13                       ` Daniel Jacobowitz
  2002-01-22 17:34                           ` Kevin D. Kissell
  2002-01-27 20:24                           ` Alan Cox
  -1 siblings, 2 replies; 94+ messages in thread
From: Daniel Jacobowitz @ 2002-01-22 17:13 UTC (permalink / raw)
  To: Kevin D. Kissell
  Cc: Dominic Sweetman, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), H . J . Lu

On Tue, Jan 22, 2002 at 06:08:12PM +0100, Kevin D. Kissell wrote:
> > Perhaps I'm mangling terminology.  LinuxThreads is a one-to-one mapping
> > of kernel threads to user threads.  All the kernel threads, and thus
> > all the user threads, share the same memory region - including the
> > stack region.  Their stacks are differentiated solely by different
> > values in the stack pointer register.  Thus I don't think what you're
> > suggesting is possible.
> 
> I don't see how fork() semantics can be preserved unless
> the stack regions are replicated (copy-on-write) on a fork().
> Under ATT and BSD Unix (which is where I did most of
> my kernel hacking in the old days) that was the *only*
> way to get a new kernel thread, so it was "obvious"
> that my proposed hack would work.  Linux does have
> the clone() function as well, and if LinuxThreads are
> implemented in terms of clone(foo, stakptr, CLONE_VM, arg),
> you are correct, the proposed scheme would not work
> without modification.

Which it is.  Fork shares no memory regions; vfork/clone share all
memory regions.  AFAIK there is no share-heap-but-not-stack option in
Linux.

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22 17:34                           ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-22 17:34 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Dominic Sweetman, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), H . J . Lu

> > > Perhaps I'm mangling terminology.  LinuxThreads is a one-to-one mapping
> > > of kernel threads to user threads.  All the kernel threads, and thus
> > > all the user threads, share the same memory region - including the
> > > stack region.  Their stacks are differentiated solely by different
> > > values in the stack pointer register.  Thus I don't think what you're
> > > suggesting is possible.
> >
> > I don't see how fork() semantics can be preserved unless
> > the stack regions are replicated (copy-on-write) on a fork().
> > Under ATT and BSD Unix (which is where I did most of
> > my kernel hacking in the old days) that was the *only*
> > way to get a new kernel thread, so it was "obvious"
> > that my proposed hack would work.  Linux does have
> > the clone() function as well, and if LinuxThreads are
> > implemented in terms of clone(foo, stakptr, CLONE_VM, arg),
> > you are correct, the proposed scheme would not work
> > without modification.
>
> Which it is.  Fork shares no memory regions;

Oh, come on.  If it doesn't share text regions, it's completely
brain dead!

> vfork/clone share all memory regions.  AFAIK there is no
> share-heap-but-not-stack option in Linux.

Yeah.  Not that it matters, but I had misremembered there being
finer-grained control than that on clone().  Probably confused
it with something that someone overlaid on Mach once upon a time...

Anyway, do you see a hole or a serious performance
problem with my modified proposal (explicit mmap()
to create the necessary storage)?


            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-22 17:34                           ` Kevin D. Kissell
  (?)
@ 2002-01-22 17:37                           ` Daniel Jacobowitz
  2002-01-22 17:47                               ` Kevin D. Kissell
  -1 siblings, 1 reply; 94+ messages in thread
From: Daniel Jacobowitz @ 2002-01-22 17:37 UTC (permalink / raw)
  To: Kevin D. Kissell
  Cc: Dominic Sweetman, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), H . J . Lu

On Tue, Jan 22, 2002 at 06:34:42PM +0100, Kevin D. Kissell wrote:
> > > > Perhaps I'm mangling terminology.  LinuxThreads is a one-to-one mapping
> > > > of kernel threads to user threads.  All the kernel threads, and thus
> > > > all the user threads, share the same memory region - including the
> > > > stack region.  Their stacks are differentiated solely by different
> > > > values in the stack pointer register.  Thus I don't think what you're
> > > > suggesting is possible.
> > >
> > > I don't see how fork() semantics can be preserved unless
> > > the stack regions are replicated (copy-on-write) on a fork().
> > > Under ATT and BSD Unix (which is where I did most of
> > > my kernel hacking in the old days) that was the *only*
> > > way to get a new kernel thread, so it was "obvious"
> > > that my proposed hack would work.  Linux does have
> > > the clone() function as well, and if LinuxThreads are
> > > implemented in terms of clone(foo, stakptr, CLONE_VM, arg),
> > > you are correct, the proposed scheme would not work
> > > without modification.
> >
> > Which it is.  Fork shares no memory regions;
> 
> Oh, come on.  If it doesn't share text regions, it's completely
> brain dead!

They aren't shared, they're duplicated.  They use the same physical
memory, and the same virtual addresses, but the page table entries are
separate.  That's what I meant.  No part of the page tables is shared
between parent and child after fork(), AFAIK.
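
A minimal sketch of the fork() side of this distinction, assuming a POSIX
system: the child gets its own copy-on-write view of memory, so a write made
in the child is never seen by the parent.  The helper name
parent_value_after_child_write is made up for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's view of `value` after the
 * child has overwritten its own copy, or -1 if fork/waitpid fails. */
int parent_value_after_child_write(void)
{
    volatile int value = 1;
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 2;      /* touches only the child's private copy */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return value;       /* the parent's copy is untouched */
}
```

The parent still reads the value it started with, which is the behaviour
the per-thread storage idea was relying on and which clone() with CLONE_VM
does not provide.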

> > vfork/clone share all memory regions.  AFAIK there is no
> > share-heap-but-not-stack option in Linux.
> 
> Yeah.  Not that it matters, but I had misremembered there being
> finer-grained control than that on clone().  Probably confused
> it with something that someone overlaid on Mach once upon a time...
> 
> Anyway, do you see a hole or a serious performance
> problem with my modified proposal (explicit mmap()
> to create the necessary storage)?

Same problem as with clone.  I recommend the clone manpage; it says:

       CLONE_VM
              If CLONE_VM is set, the calling process and the child processes
              run in the same memory space.  In particular, memory writes
              performed by the calling process or by the child process are
              also visible in the other process.  Moreover, any memory
              mapping or unmapping performed with mmap(2) or munmap(2) by
              the child or calling process also affects the other process.

              If CLONE_VM is not set, the child process runs in a separate
              copy of the memory space of the calling process at the time of
              clone.  Memory writes or file mappings/unmappings performed by
              one of the processes do not affect the other, as with fork(2).

That is, if any memory OR MAPPING is shared, they all are.

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22 17:41                 ` Ulrich Drepper
  0 siblings, 0 replies; 94+ messages in thread
From: Ulrich Drepper @ 2002-01-22 17:41 UTC (permalink / raw)
  To: Dominic Sweetman
  Cc: Kevin D. Kissell, Ralf Baechle, Mike Uhler, linux-mips,
	H . J . Lu

Dominic Sweetman <dom@algor.co.uk> writes:

> Sometimes when you're busy it's understandable to respond with "RTFM".
> But to fail to provide a URL is not very respectful: other people
> reading this list are quite smart, they're just smart about different
> things from you.

A simple Google search would have turned up

  http://developer.intel.com/design/itanium/downloads/24537003.pdf

as the first choice.  Section 7.5.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22 17:47                               ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-22 17:47 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Dominic Sweetman, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), H . J . Lu

> > Anyway, do you see a hole or a serious performance
> > problem with my modified proposal (explicit mmap()
> > to create the necessary storage)?
>
> Same problem as with clone.  I recommend the clone manpage; it says:
>
>        CLONE_VM
>               If CLONE_VM is set, the calling process and the child processes
>               run in the same memory space.  In particular, memory writes
>               performed by the calling process or by the child process are
>               also visible in the other process.  Moreover, any memory
>               mapping or unmapping performed with mmap(2) or munmap(2) by
>               the child or calling process also affects the other process.
>
>               If CLONE_VM is not set, the child process runs in a separate
>               copy of the memory space of the calling process at the time of
>               clone.  Memory writes or file mappings/unmappings performed by
>               one of the processes do not affect the other, as with fork(2).
>
> That is, if any memory OR MAPPING is shared, they all are.

Daniel, you didn't read my message.  The per-thread memory
would be allocated *after* the clone() in pthread_create().
More specifically, pthread_create() would set it up so that
the function passed to clone for invocation was in fact a
wrapper that sets up the memory and thread data before
invoking the application function passed to pthread_create().

Now, if the idea is that the clone() system call is supposed
to cause the thread to be born, like Athena, full-grown from
the head of Zeus, with the analog to the thread register
already set up when it leaves the kernel, then I would be inclined
to concede that we need to change the ABI, the kernel, and
compilers, and I would ask just what we get for our trouble.
But if we are permitted the pthreads abstraction, there's a
lot that can be done transparently.

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-22 17:47                               ` Kevin D. Kissell
  (?)
@ 2002-01-22 17:57                               ` Daniel Jacobowitz
  2002-01-22 18:18                                   ` Kevin D. Kissell
  -1 siblings, 1 reply; 94+ messages in thread
From: Daniel Jacobowitz @ 2002-01-22 17:57 UTC (permalink / raw)
  To: Kevin D. Kissell
  Cc: Dominic Sweetman, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), H . J . Lu

On Tue, Jan 22, 2002 at 06:47:55PM +0100, Kevin D. Kissell wrote:
> > > Anyway, do you see a hole or a serious performance
> > > problem with my modified proposal (explicit mmap()
> > > to create the necessary storage)?
> >
> > Same problem as with clone.  I recommend the clone manpage; it says:
> >
> >        CLONE_VM
> >               If CLONE_VM is set, the calling process and the child processes
> >               run in the same memory space.  In particular, memory writes
> >               performed by the calling process or by the child process are
> >               also visible in the other process.  Moreover, any memory
> >               mapping or unmapping performed with mmap(2) or munmap(2) by
> >               the child or calling process also affects the other process.
> >
> >               If CLONE_VM is not set, the child process runs in a separate
> >               copy of the memory space of the calling process at the time of
> >               clone.  Memory writes or file mappings/unmappings performed by
> >               one of the processes do not affect the other, as with fork(2).
> >
> > That is, if any memory OR MAPPING is shared, they all are.
> 
> Daniel, you didn't read my message.  The per-thread memory
> would be allocated *after* the clone() in pthread_create().
> More specifically, pthread_create() would set it up so that
> the function passed to clone for invocation was in fact a
> wrapper that sets up the memory and thread data before
> invoking the application function passed to pthread_create().
> 
> Now, if the idea is that the clone() system call is supposed
> to cause the thread to be born, like Athena, full-grown from
> the head of Zeus, with the analog to the thread register
> already set up when it leaves the kernel, then I would be inclined
> to concede that we need to change the ABI, the kernel, and
> compilers, and I would ask just what we get for our trouble.
> But if we are permitted the pthreads abstraction, there's a
> lot that can be done transparently.

No, you didn't read my manpage quote, Kevin.  Or we're just talking
past each other.  The problem is not that existing mappings are shared,
but that "any memory mapping or unmapping performed with mmap(2)
or munmap(2) by the child or calling process also affects the other
process".  That is, if the child maps some private storage, the parent
will see it too.  Thus we can not use the private storage as a
thread-local storage unless we already have some thread-local way to
say where it is for this particular thread, and we're back where we
started.

Does that make sense, or am I missing your objection?

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22 18:18                                   ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-22 18:18 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Dominic Sweetman, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), H . J . Lu

> No, you didn't read my manpage quote, Kevin.  Or we're just talking
> past each other.  The problem is not that existing mappings are shared,
> but that "any memory mapping or unmapping performed with mmap(2)
> or munmap(2) by the child or calling process also affects the other
> process".  That is, if the child maps some private storage, the parent
> will see it too.  Thus we can not use the private storage as a
> thread-local storage unless we already have some thread-local way to
> say where it is for this particular thread, and we're back where we
> started.
> 
> Does that make sense, or am I missing your objection?

It doesn't necessarily make *sense*, in that it seems to
be a pretty crippled memory model ;-) but I do see your
objection.  Sorry to have seemed dense; I'm doing several
things at once on a couple of screens this evening and
reading too quickly.  I had misread that as underscoring
that the effects of mmap() calls *prior* to the clone() were
inherited.  Feh.  Well, we aren't likely to have the luxury
of fixing the underlying design of pthreads for Linux
to use a fork()-based model with explicit sharing
(which has its own problems, of course), so we may well 
be looking at ABI abuse.  I was really, really, hoping 
to avoid that, in that gcc/Linux is far from the only user 
(and commercially speaking, far from being the most 
important user) of the ABI, and any change that breaks 
backward compatibility and cross-platform compatibility 
would be a Very Bad Thing.

More on this later, and thanks for your (civil) comments,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-22 15:44                 ` Dominic Sweetman
@ 2002-01-22 21:44                   ` Tommy S. Christensen
  2002-01-22 21:53                       ` Kevin D. Kissell
  2002-01-23  1:12                     ` Jason Gunthorpe
  0 siblings, 2 replies; 94+ messages in thread
From: Tommy S. Christensen @ 2002-01-22 21:44 UTC (permalink / raw)
  To: Dominic Sweetman
  Cc: Daniel Jacobowitz, Kevin D. Kissell, Ralf Baechle, Ulrich Drepper,
	Mike Uhler, MIPS/Linux List (SGI), H . J . Lu

Dominic Sweetman wrote:
> 
> > In any case, that's not the real problem.  Linux user threads do not
> > have true separate stacks.  They share their _entire_ address space;
> > the stacks are all bounded (default is 2MB) and grouped together at
> > the top of the available memory region.
> 
> Quite.
> 
> A comment by Kevin reminded me of the real constraint (which the
> experts probably take for granted): this system is supposed to work on
> shared-memory multiprocessors and multithreaded CPUs.
> 
> In both cases two or more threads within an address space can be
> active simultaneously.  On a multithreaded CPU (in particular) there's
> only one TLB, so memory (including any memory specially handled by the
> kernel) is all held in common.  The *only* thing available to a user
> privilege program which distinguishes the threads is the CPU register
> set.
> 
> (Well, and the stack, which is a difference inherited from the value
> in the stack pointer register.  But the stack pointer is not really
> going to help much to return a thread-characteristic pointer or ID.)

Well, why not use the stack?

I am not quite familiar with the requirements on this "thread register",
but couldn't something like this be made to work:
  #define TID *((sp & ~(STACK_SIZE-1)) + STACK_SIZE - TID_OFFSET)

It assumes a fixed maximum stack size (and alignment), which it should
be possible to meet (virtual memory is cheap). The STACK_SIZE could
probably even be a (process global!) variable if it is not desirable
to limit this at compile time.
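
A rough C rendering of the macro above, as a sketch only.  The STACK_SIZE
and TID_OFFSET values and all function names are hypothetical, and the
"stack pointer" is passed in as an ordinary address so the arithmetic can
be tried on a heap-allocated, STACK_SIZE-aligned region instead of a real
thread stack.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical values: any power-of-two STACK_SIZE works. */
#define STACK_SIZE ((uintptr_t)64 * 1024)
#define TID_OFFSET ((uintptr_t)sizeof(uintptr_t))

/* Address of the per-thread slot for the stack region containing sp. */
uintptr_t *tid_slot(const void *sp)
{
    uintptr_t base = (uintptr_t)sp & ~(STACK_SIZE - 1);
    return (uintptr_t *)(base + STACK_SIZE - TID_OFFSET);
}

/* Allocate a STACK_SIZE-aligned region and record tid in its slot. */
void *alloc_thread_stack(uintptr_t tid)
{
    void *stack = aligned_alloc(STACK_SIZE, STACK_SIZE);

    if (stack != NULL)
        *tid_slot(stack) = tid;
    return stack;
}

/* Recover the ID from any address inside the stack region. */
uintptr_t tid_from_sp(const void *sp)
{
    return *tid_slot(sp);
}
```

A real implementation would read the stack pointer register rather than
take sp as an argument, and would have to guarantee the size and alignment
for every stack, including the initial thread's.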

  -Tommy

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22 21:53                       ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-22 21:53 UTC (permalink / raw)
  To: Tommy S. Christensen, Dominic Sweetman
  Cc: Daniel Jacobowitz, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), H . J . Lu

"Tommy S. Christensen" <tommy.christensen@eicon.com> wrote:
>

> Dominic Sweetman wrote:
> > 
> > > In any case, that's not the real problem.  Linux user threads do not
> > > have true separate stacks.  They share their _entire_ address space;
> > > the stacks are all bounded (default is 2MB) and grouped together at
> > > the top of the available memory region.
> > 
> > Quite.
> > 
> > A comment by Kevin reminded me of the real constraint (which the
> > experts probably take for granted): this system is supposed to work on
> > shared-memory multiprocessors and multithreaded CPUs.
> > 
> > In both cases two or more threads within an address space can be
> > active simultaneously.  On a multithreaded CPU (in particular) there's
> > only one TLB, so memory (including any memory specially handled by the
> > kernel) is all held in common.  The *only* thing available to a user
> > privilege program which distinguishes the threads is the CPU register
> > set.
> > 
> > (Well, and the stack, which is a difference inherited from the value
> > in the stack pointer register.  But the stack pointer is not really
> > going to help much to return a thread-characteristic pointer or ID.)
> 
> Well, why not use the stack?
> 
> I am not quite familiar with the requirements on this "thread register",
> but couldn't something like this be made to work:
>   #define TID *((sp & ~(STACK_SIZE-1)) + STACK_SIZE - TID_OFFSET)
> 
> It assumes a fixed maximum stack size (and alignment), which it should
> be possible to meet (virtual memory is cheap). The STACK_SIZE could
> probably even be a (process global!) variable if it is not desirable
> to limit this at compile time.

Thanks for writing this up.  I had the same thought over dinner,
but I'm thoroughly discredited today, and it's better that it came
from someone else.   ;-)

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-22 23:13                         ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-22 23:13 UTC (permalink / raw)
  To: Kevin D. Kissell, Tommy S. Christensen, Dominic Sweetman
  Cc: Daniel Jacobowitz, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	MIPS/Linux List (SGI), H . J . Lu

> > Well, why not use the stack?
> >
> > I am not quite familiar with the requirements on this "thread register",
> > but couldn't something like this be made to work:
> >   #define TID *((sp & ~(STACK_SIZE-1)) + STACK_SIZE - TID_OFFSET)
> >
> > It assumes a fixed maximum stack size (and alignment), which it should
> > be possible to meet (virtual memory is cheap). The STACK_SIZE could
> > probably even be a (process global!) variable if it is not desirable
> > to limit this at compile time.
>
> Thanks for writing this up.  I had the same thought over dinner,
> but I'm thoroughly discredited today, and it's better that it came
> from someone else.   ;-)

That having been said, I don't think this scheme
will really work. Programs do build themselves
temporary stacks in the heap from time to time.
Signal stacks come to mind.
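
The signal-stack case can be sketched like this, assuming a POSIX system
with sigaltstack().  The helper name and the 64 KiB size are made up for
illustration.  While the handler runs, the stack pointer lies inside a
malloc()ed buffer, so masking it would find the buffer's aligned region
rather than the thread's own stack.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Address of a local variable seen inside the handler. */
static volatile uintptr_t handler_sp;

static void on_signal(int signo)
{
    volatile char marker = 0;

    (void)signo;
    handler_sp = (uintptr_t)&marker;
}

/* Hypothetical helper: 1 if the handler ran on the heap-allocated
 * alternate stack, 0 if it did not, -1 if setup failed. */
int handler_runs_on_heap_stack(void)
{
    const size_t size = 64 * 1024;      /* illustrative size */
    char *alt = malloc(size);
    struct sigaction sa, old_sa;
    stack_t ss;
    int on_alt;

    if (alt == NULL)
        return -1;
    ss.ss_sp = alt;
    ss.ss_size = size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0) {
        free(alt);
        return -1;
    }
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sa.sa_flags = SA_ONSTACK;           /* deliver on the alternate stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old_sa) != 0) {
        ss.ss_flags = SS_DISABLE;
        sigaltstack(&ss, NULL);
        free(alt);
        return -1;
    }
    handler_sp = 0;
    raise(SIGUSR1);                     /* handler runs before raise returns */
    on_alt = handler_sp >= (uintptr_t)alt &&
             handler_sp < (uintptr_t)alt + size;

    sigaction(SIGUSR1, &old_sa, NULL);  /* restore the previous handler */
    ss.ss_flags = SS_DISABLE;           /* stop using the buffer */
    sigaltstack(&ss, NULL);
    free(alt);
    return on_alt;
}
```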

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
  2002-01-22 21:44                   ` Tommy S. Christensen
  2002-01-22 21:53                       ` Kevin D. Kissell
@ 2002-01-23  1:12                     ` Jason Gunthorpe
  1 sibling, 0 replies; 94+ messages in thread
From: Jason Gunthorpe @ 2002-01-23  1:12 UTC (permalink / raw)
  To: Tommy S. Christensen, Ulrich Drepper; +Cc: MIPS/Linux List (SGI)


On Tue, 22 Jan 2002, Tommy S. Christensen wrote:

> Well, why not use the stack?
> 
> I am not quite familiar with the requirements on this "thread register",
> but couldn't something like this be made to work:
>   #define TID *((sp & ~(STACK_SIZE-1)) + STACK_SIZE - TID_OFFSET)

Last time I looked at how pthreads worked it did use the stack pointer to
decide what the TID is. It got rather ugly because the stack on thread 0
was not under program control, so it had all sorts of unknown properties.
But that could be fixed with kernel support I think.

The only reason I can think of to have a *fast* thread-local variable is
to implement thread-local storage. This is a good thing for glibc and
multi-threaded programs - the ultimate implementation would probably be to
have gcc know about it (if ia64 has dedicated hardware, it is not
unimaginable, and other compilers do implement this):

extern int errno __attribute__((thread_local));
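
A minimal sketch of what such compiler support can look like, assuming
GCC's __thread storage-class keyword (C11 spells it _Thread_local) rather
than the attribute spelling shown above. The names my_errno, struct job,
worker and run_worker are invented for illustration and do not come from
glibc:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One instance of this variable exists per thread.  __thread is a
   GCC/Clang extension; C11 spells the same thing _Thread_local. */
__thread int my_errno = 0;

/* Hypothetical work item: the value a thread stores, and what it read back. */
struct job {
    int value;
    int seen;
};

/* Thread body: writes and reads only the calling thread's copy of my_errno. */
void *worker(void *arg)
{
    struct job *j = arg;

    my_errno = j->value;
    j->seen = my_errno;
    return NULL;
}

/* Run one worker thread to completion; returns 0 on success. */
int run_worker(struct job *j)
{
    pthread_t t;

    assert(j != NULL);
    if (pthread_create(&t, NULL, worker, j) != 0)
        return -1;
    return pthread_join(t, NULL);
}
```

If the main thread sets my_errno to 5 and then calls run_worker() with a
job whose value is 42, the worker reads back 42 while the main thread's
copy still holds 5. Build with -pthread.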

On i386 this has often been done by using fs/gs to point to a block of RAM.

However, I expect you could probably also base the thread-local RAM on the
top/bottom of the stack, which means each procedure can compute the
(constant!) base in a couple of instructions. The runtime can know how
much to set aside before it begins executing the new thread. Aligning SP
can be done in a kernel-independent way for tid 0.
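
A minimal sketch of the stack-masking computation, assuming every stack
is exactly STACK_SIZE bytes and aligned to STACK_SIZE. The constant
values and the helper names tid_slot and make_stack are assumptions made
for illustration; a heap block from aligned_alloc() stands in for a real
thread stack:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed values, not taken from the thread: a 64 KiB stack with the
   ID stored in the last int-sized slot at the top of the stack. */
#define STACK_SIZE ((uintptr_t)1 << 16)
#define TID_OFFSET sizeof(int)

/* Map any address inside a stack to that stack's ID slot:
   round down to the stack base, then step to the top. */
int *tid_slot(uintptr_t sp)
{
    uintptr_t base = sp & ~(STACK_SIZE - 1);

    return (int *)(base + STACK_SIZE - TID_OFFSET);
}

/* Allocate a STACK_SIZE-aligned block standing in for a thread stack
   and record tid in its slot.  Returns the base, or NULL on failure. */
void *make_stack(int tid)
{
    void *base;

    assert((STACK_SIZE & (STACK_SIZE - 1)) == 0); /* mask needs a power of two */
    base = aligned_alloc(STACK_SIZE, STACK_SIZE);
    if (base != NULL)
        *tid_slot((uintptr_t)base) = tid;
    return base;
}
```

Any address within a block returned by make_stack() maps back to the same
slot, so two blocks created with IDs 7 and 9 yield 7 and 9 respectively.
The lookup gives the wrong answer as soon as code runs on a stack that
was not laid out this way, which is the signal-stack objection raised
earlier in the thread.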

I don't know if this is worse than making the TLB handler slower to free
up k0/k1; it depends entirely on how many functions will be using
thread-local stuff.

Jason

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: patches for test-and-set without ll/sc (Re: thread-ready ABIs)
@ 2002-01-24  9:56                   ` Andreas Jaeger
  0 siblings, 0 replies; 94+ messages in thread
From: Andreas Jaeger @ 2002-01-24  9:56 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: Machida Hiroyuki, kevink, hjl, libc-hacker, linux-mips

Ulrich Drepper <drepper@redhat.com> writes:

> Machida Hiroyuki <machida@sm.sony.co.jp> writes:
>
>>   * glibc change:
>> 
>> 	We implement  test_and_set(addr, val) as follows,
>> 
>> 		Do mmap /dev/tst to _TST_START_MAGIC, if not yet mapped.
>> 		call _TST_START_MAGIC(addr, val)
>> 	
>> 	If we can't open /dev/tst then, use sysmips() as final resort.
>
> First, the patch as it stands is unacceptable.  A file with copyright Sony?
> All the code must be copyrighted by the FSF.  Sony will have to assign
> the copyright for the code to the FSF.
>
> Also, no such change can be accepted until the necessary kernel
> changes are in the official kernel sources.  I cannot make any
> exceptions since otherwise all kinds of people want to see support for
> their local hack added.
>
> Furthermore, the symbols were not available in version 2.2.  Therefore
> they cannot be exported with this version.  It'll either be 2.2.6 (if
> there ever will be such a release) or 2.3.
>
> And finally, the patch should be sent to the glibc MIPS maintainer for
> review.  The question is who feels responsible...

I'll look into it later in more detail.

But for now, let me just say that I agree with Ulrich's comments.
Additionally, I'd like to hold off on adding this patch until:
- a solution for the thread register is found for MIPS (and that
  solution should not conflict with this patch)
- the kernel-side patches have been adopted.

Therefore please discuss this with the kernel and ABI folks, and then
let's look again at the issues.

Andreas
-- 
 Andreas Jaeger
  SuSE Labs aj@suse.de
   private aj@arthur.inka.de
    http://www.suse.de/~aj

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-27 20:24                           ` Alan Cox
  0 siblings, 0 replies; 94+ messages in thread
From: Alan Cox @ 2002-01-27 20:24 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Kevin D. Kissell, Dominic Sweetman, Ralf Baechle, Ulrich Drepper,
	Mike Uhler, "MIPS/Linux List (SGI)", H . J . Lu

> Which it is.  Fork shares no memory regions; vfork/clone share all
> memory regions.  AFAIK there is no share-heap-but-not-stack option in
> Linux.

That's a design decision. At the point you don't have identical mappings for
both threads you need two sets of page tables, and you take all the
performance hits that go with changing the current tables on a schedule.

It's a lot cheaper to use a different %esp for each thread.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: thread-ready ABIs
@ 2002-01-28  8:50                             ` Kevin D. Kissell
  0 siblings, 0 replies; 94+ messages in thread
From: Kevin D. Kissell @ 2002-01-28  8:50 UTC (permalink / raw)
  To: Daniel Jacobowitz, Alan Cox
  Cc: Dominic Sweetman, Ralf Baechle, Ulrich Drepper, Mike Uhler,
	"MIPS/Linux List (SGI)", H . J . Lu

> > Which it is.  Fork shares no memory regions; vfork/clone share all
> > memory regions.  AFAIK there is no share-heap-but-not-stack option in
> > Linux.
> 
> That's a design decision. At the point you don't have identical mappings for
> both threads you need two sets of page tables, and you take all the
> performance hits that go with changing the current tables on a schedule.
> 
> It's a lot cheaper to use a different %esp for each thread.

That's a point of view that reflects the PC-centric origins of
Linux.  Large-scale parallel systems, such as the current
high-end server offerings from Sun, IBM/Sequent, and SGI,
have a non-uniform memory access (NUMA) model on which
insisting on maintaining identical page tables for all threads of
a parallel program can result in an intolerable level of remote
memory accesses.  The usual technique employed on such
systems is a dynamic replication of frequently accessed
pages.  This of course implies greater OS overhead for all
operations on virtual memory maps, and additional overhead
to synchronize the copies, but that can be more than
compensated for by reducing the average memory latency
seen by the CPUs.  An even more extreme case, though
one that is perhaps less relevant to mainstream parallel
servers, is that of the emulation of shared memory ("virtual
shared memory" or VSM as we used to call it) on
message-based, highly parallel machines, which likewise depends on
having distinctly set-up and managed page tables for different
threads/processes within a parallel program.

There's nothing wrong with having an OS design that allows
common page tables to be used across the threads of a
parallel program running on a simple, small-scale SMP
platform like a dual or quad CPU PC.  But making that
the *only* way that thread parallelism is supported
is, in my opinion, a design error that will have to be fixed
one of these days.  And the longer the existing model
is enhanced and maintained, the harder it will be to fix.

All that having been said, please note that this issue is
orthogonal to the question of whether one should have
a single "process image" across all threads of a parallel
program.

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 94+ messages in thread

end of thread, other threads:[~2002-01-28  9:47 UTC | newest]

Thread overview: 94+ messages
     [not found] <m3elkoa5dw.fsf@myware.mynet>
2002-01-18 18:19 ` thread-ready ABIs H . J . Lu
2002-01-18 18:31   ` Ulrich Drepper
2002-01-18 19:08     ` H . J . Lu
2002-01-18 19:20       ` Ulrich Drepper
2002-01-19 12:14         ` Dominic Sweetman
2002-01-19 12:14           ` Dominic Sweetman
2002-01-20  0:14     ` Ralf Baechle
2002-01-18 20:03   ` Maciej W. Rozycki
2002-01-18 20:20     ` Ulrich Drepper
2002-01-18 20:50       ` Maciej W. Rozycki
2002-01-18 21:02         ` Ulrich Drepper
2002-01-18 21:35           ` Maciej W. Rozycki
2002-01-18 21:44             ` Ulrich Drepper
2002-01-18 22:17               ` Maciej W. Rozycki
2002-01-18 21:23   ` Daniel Jacobowitz
2002-01-19  0:35   ` Kevin D. Kissell
2002-01-19  0:35     ` Kevin D. Kissell
2002-01-19  4:11     ` H . J . Lu
2002-01-19 12:27       ` Dominic Sweetman
2002-01-19 19:42         ` H . J . Lu
2002-01-21 13:27           ` Maciej W. Rozycki
2002-01-19 22:21         ` Kevin D. Kissell
2002-01-19 22:21           ` Kevin D. Kissell
2002-01-20 10:38       ` Machida Hiroyuki
2002-01-20 11:58         ` Kevin D. Kissell
2002-01-20 11:58           ` Kevin D. Kissell
2002-01-20 13:16           ` Machida Hiroyuki
2002-01-22  6:27             ` patches for test-and-set without ll/sc (Re: thread-ready ABIs) Machida Hiroyuki
2002-01-22  6:37               ` Ulrich Drepper
2002-01-22  6:46                 ` Machida Hiroyuki
2002-01-22  6:56                   ` Ulrich Drepper
2002-01-24  9:56                 ` Andreas Jaeger
2002-01-24  9:56                   ` Andreas Jaeger
2002-01-20 19:19           ` thread-ready ABIs H . J . Lu
2002-01-21  9:39             ` Kevin D. Kissell
2002-01-21  9:39               ` Kevin D. Kissell
2002-01-21 13:56               ` Maciej W. Rozycki
2002-01-21 18:24                 ` H . J . Lu
2002-01-21 18:36                   ` Ulrich Drepper
2002-01-21 18:52                     ` H . J . Lu
2002-01-21 18:58                       ` H . J . Lu
2002-01-21 18:59                       ` Daniel Jacobowitz
2002-01-21 19:05                         ` H . J . Lu
2002-01-21 19:09                           ` Daniel Jacobowitz
2002-01-21 19:18                             ` H . J . Lu
2002-01-21 21:04                         ` Kevin D. Kissell
2002-01-21 21:04                           ` Kevin D. Kissell
2002-01-21 19:30                       ` Geoff Keating
2002-01-21 21:07                       ` Ulrich Drepper
2002-01-21 13:43             ` Maciej W. Rozycki
2002-01-20  0:24     ` Ralf Baechle
2002-01-21 23:22       ` Ulrich Drepper
2002-01-21 23:57         ` Kevin D. Kissell
2002-01-21 23:57           ` Kevin D. Kissell
2002-01-22  0:16           ` Ulrich Drepper
2002-01-22  0:16             ` Ulrich Drepper
2002-01-22  9:37             ` Dominic Sweetman
2002-01-22  9:37               ` Dominic Sweetman
2002-01-22 17:41               ` Ulrich Drepper
2002-01-22 17:41                 ` Ulrich Drepper
2002-01-22  9:59           ` Dominic Sweetman
2002-01-22  9:59             ` Dominic Sweetman
2002-01-22 12:18             ` Kevin D. Kissell
2002-01-22 12:18               ` Kevin D. Kissell
2002-01-22 15:21               ` Daniel Jacobowitz
2002-01-22 15:44                 ` Dominic Sweetman
2002-01-22 21:44                   ` Tommy S. Christensen
2002-01-22 21:53                     ` Kevin D. Kissell
2002-01-22 21:53                       ` Kevin D. Kissell
2002-01-22 23:13                       ` Kevin D. Kissell
2002-01-22 23:13                         ` Kevin D. Kissell
2002-01-23  1:12                     ` Jason Gunthorpe
2002-01-22 16:05                 ` Kevin D. Kissell
2002-01-22 16:05                   ` Kevin D. Kissell
2002-01-22 16:34                   ` Daniel Jacobowitz
2002-01-22 17:08                     ` Kevin D. Kissell
2002-01-22 17:08                       ` Kevin D. Kissell
2002-01-22 17:13                       ` Daniel Jacobowitz
2002-01-22 17:34                         ` Kevin D. Kissell
2002-01-22 17:34                           ` Kevin D. Kissell
2002-01-22 17:37                           ` Daniel Jacobowitz
2002-01-22 17:47                             ` Kevin D. Kissell
2002-01-22 17:47                               ` Kevin D. Kissell
2002-01-22 17:57                               ` Daniel Jacobowitz
2002-01-22 18:18                                 ` Kevin D. Kissell
2002-01-22 18:18                                   ` Kevin D. Kissell
2002-01-27 20:24                         ` Alan Cox
2002-01-27 20:24                           ` Alan Cox
2002-01-28  8:50                           ` Kevin D. Kissell
2002-01-28  8:50                             ` Kevin D. Kissell
2002-01-22  1:39   ` Richard Henderson
2002-01-18 21:24 Justin Carlson
2002-01-18 21:31 ` Ulrich Drepper
2002-01-18 21:42 ` Maciej W. Rozycki
