* [PATCH]: R10000 Needs LL/SC Workaround in Glibc
@ 2008-10-31 5:01 Kumba
2008-11-01 7:33 ` Kumba
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Kumba @ 2008-10-31 5:01 UTC (permalink / raw)
To: libc-ports; +Cc: Daniel Jacobowitz, Linux MIPS List
[-- Attachment #1: Type: text/plain, Size: 962 bytes --]
The attached patch adds a workaround for R10000 CPUs to use the branch likely
(beqzl) instruction in atomic operations, because revisions of the CPU before
3.0 misbehave, while revisions 2.6 and earlier will deadlock. This issue has
been noted on SGI IP28 (Indigo2 Impact R10000) systems and SGI IP27 Origin systems.
I drafted it after some discussion with several people in the Linux/MIPS IRC
Channel after discovering glibc didn't work quite right on my IP28 machine. The
patch is based on Debian bug #462112, viewable here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=462112
Feedback would be welcome on any suggestions for improving this patch (please
CC, as I'm not subscribed to the ML).
Thanks!
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
[-- Attachment #2: glibc-trunk-r10k-beqzl.patch --]
[-- Type: text/plain, Size: 3766 bytes --]
diff -Naurp libc.orig/ports/sysdeps/mips/bits/atomic.h libc/ports/sysdeps/mips/bits/atomic.h
--- libc.orig/ports/sysdeps/mips/bits/atomic.h 2005-03-28 04:14:59.000000000 -0500
+++ libc/ports/sysdeps/mips/bits/atomic.h 2008-10-30 23:39:37.000000000 -0400
@@ -53,6 +53,31 @@ typedef uintmax_t uatomic_max_t;
#define MIPS_SYNC_STR_1(X) MIPS_SYNC_STR_2(X)
#define MIPS_SYNC_STR MIPS_SYNC_STR_1(MIPS_SYNC)
+/* Certain revisions of the R10000 Processor need an LL/SC Workaround
+ enabled. Revisions before 3.0 misbehave on atomic operations, and
+ Revs 2.6 and lower deadlock after several seconds due to other errata.
+
+ To quote the R10K Errata:
+ Workaround: The basic idea is to inhibit the four instructions
+ from simultaneously becoming active in R10000. Padding all
+ ll/sc sequences with nops or changing the looping branch in the
+ routines to a branch likely (which is always predicted taken
+ by R10000) will work. The nops should go after the loop, and the
+ number of them should be 28. This number could be decremented for
+ each additional instruction in the ll/sc loop such as the lock
+ modifier(s) between the ll and sc, the looping branch and its
+ delay slot. For typical short routines with one ll/sc loop, any
+ instructions after the loop could also count as a decrement. The
+ nop workaround pollutes the cache more but would be a few cycles
+ faster if all the code is in the cache and the looping branch
+ is predicted not taken. */
+
+#ifndef (_MIPS_ARCH_R10000)
+#define R10K_BEQZ_INSN "beqz %1,1b\n"
+#else
+#define R10K_BEQZ_INSN "beqzl %1,1b\n"
+#endif
+
/* Compare and exchange. For all of the "xxx" routines, we expect a
"__prev" and a "__cmp" variable to be provided by the enclosing scope,
in which values are returned. */
@@ -74,7 +99,7 @@ typedef uintmax_t uatomic_max_t;
"bne %0,%2,2f\n\t" \
"move %1,%3\n\t" \
"sc %1,%4\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -98,7 +123,7 @@ typedef uintmax_t uatomic_max_t;
"bne %0,%2,2f\n\t" \
"move %1,%3\n\t" \
"scd %1,%4\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -192,7 +217,7 @@ typedef uintmax_t uatomic_max_t;
"ll %0,%3\n\t" \
"move %1,%2\n\t" \
"sc %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -216,7 +241,7 @@ typedef uintmax_t uatomic_max_t;
"lld %0,%3\n\t" \
"move %1,%2\n\t" \
"scd %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -251,7 +276,7 @@ typedef uintmax_t uatomic_max_t;
"ll %0,%3\n\t" \
"addu %1,%0,%2\n\t" \
"sc %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -275,7 +300,7 @@ typedef uintmax_t uatomic_max_t;
"lld %0,%3\n\t" \
"daddu %1,%0,%2\n\t" \
"scd %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
* Re: [PATCH]: R10000 Needs LL/SC Workaround in Glibc
From: Kumba @ 2008-11-01 7:33 UTC (permalink / raw)
To: libc-ports; +Cc: Daniel Jacobowitz, Linux MIPS List
[-- Attachment #1: Type: text/plain, Size: 1420 bytes --]
Kumba wrote:
>
> The attached patch adds a workaround for R10000 CPUs to use the branch
> likely (beqzl) instruction in atomic operations, because revisions of
> the CPU before 3.0 misbehave, while revisions 2.6 and earlier will
> deadlock. This issue has been noted on SGI IP28 (Indigo2 Impact R10000)
> systems and SGI IP27 Origin systems.
>
> I drafted it after some discussion with several people in the Linux/MIPS
> IRC Channel after discovering glibc didn't work quite right on my IP28
> machine. The patch is based on Debian bug #462112, viewable here:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=462112
>
> Feedback would be welcome on any suggestions for improving this patch
> (please CC, as I'm not subscribed to the ML).
My original patch had a typo: stray parentheses around the macro name in the
#ifndef. A fixed patch is attached.
I also raised this question on the equivalent patch on the gcc-patches ML:
should this check be strictly limited to when -march=r10000 is passed to the
compiler? I think -march=mips4 is probably better, but would using beqzl even
with -march=mips2, the ISA level that added branch likely, be even better?
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
[-- Attachment #2: glibc-trunk-r10k-beqzl.patch --]
[-- Type: text/plain, Size: 3764 bytes --]
diff -Naurp libc.orig/ports/sysdeps/mips/bits/atomic.h libc/ports/sysdeps/mips/bits/atomic.h
--- libc.orig/ports/sysdeps/mips/bits/atomic.h 2005-03-28 04:14:59.000000000 -0500
+++ libc/ports/sysdeps/mips/bits/atomic.h 2008-10-30 23:39:37.000000000 -0400
@@ -53,6 +53,31 @@ typedef uintmax_t uatomic_max_t;
#define MIPS_SYNC_STR_1(X) MIPS_SYNC_STR_2(X)
#define MIPS_SYNC_STR MIPS_SYNC_STR_1(MIPS_SYNC)
+/* Certain revisions of the R10000 Processor need an LL/SC Workaround
+ enabled. Revisions before 3.0 misbehave on atomic operations, and
+ Revs 2.6 and lower deadlock after several seconds due to other errata.
+
+ To quote the R10K Errata:
+ Workaround: The basic idea is to inhibit the four instructions
+ from simultaneously becoming active in R10000. Padding all
+ ll/sc sequences with nops or changing the looping branch in the
+ routines to a branch likely (which is always predicted taken
+ by R10000) will work. The nops should go after the loop, and the
+ number of them should be 28. This number could be decremented for
+ each additional instruction in the ll/sc loop such as the lock
+ modifier(s) between the ll and sc, the looping branch and its
+ delay slot. For typical short routines with one ll/sc loop, any
+ instructions after the loop could also count as a decrement. The
+ nop workaround pollutes the cache more but would be a few cycles
+ faster if all the code is in the cache and the looping branch
+ is predicted not taken. */
+
+#ifndef _MIPS_ARCH_R10000
+#define R10K_BEQZ_INSN "beqz %1,1b\n"
+#else
+#define R10K_BEQZ_INSN "beqzl %1,1b\n"
+#endif
+
/* Compare and exchange. For all of the "xxx" routines, we expect a
"__prev" and a "__cmp" variable to be provided by the enclosing scope,
in which values are returned. */
@@ -74,7 +99,7 @@ typedef uintmax_t uatomic_max_t;
"bne %0,%2,2f\n\t" \
"move %1,%3\n\t" \
"sc %1,%4\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -98,7 +123,7 @@ typedef uintmax_t uatomic_max_t;
"bne %0,%2,2f\n\t" \
"move %1,%3\n\t" \
"scd %1,%4\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -192,7 +217,7 @@ typedef uintmax_t uatomic_max_t;
"ll %0,%3\n\t" \
"move %1,%2\n\t" \
"sc %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -216,7 +241,7 @@ typedef uintmax_t uatomic_max_t;
"lld %0,%3\n\t" \
"move %1,%2\n\t" \
"scd %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -251,7 +276,7 @@ typedef uintmax_t uatomic_max_t;
"ll %0,%3\n\t" \
"addu %1,%0,%2\n\t" \
"sc %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -275,7 +300,7 @@ typedef uintmax_t uatomic_max_t;
"lld %0,%3\n\t" \
"daddu %1,%0,%2\n\t" \
"scd %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
* Re: [PATCH]: R10000 Needs LL/SC Workaround in Glibc
From: Ralf Baechle @ 2008-11-01 11:26 UTC (permalink / raw)
To: Kumba; +Cc: libc-ports, Daniel Jacobowitz, Linux MIPS List
On Fri, Oct 31, 2008 at 01:01:30AM -0400, Kumba wrote:
> +#ifndef (_MIPS_ARCH_R10000)
> +#define R10K_BEQZ_INSN "beqz %1,1b\n"
> +#else
> +#define R10K_BEQZ_INSN "beqzl %1,1b\n"
> +#endif
In the kernel we have very good knowledge about what types of processors
are being used for what configuration; much less so in userland, and the code
as you suggest it would result in a silent failure on affected R10000
machines if a version not built for the R10000 were being used - iow no
improvement over what we have right now. So for userland I'd prefer:
o MIPS I builds: use the 28 or so nops.
o Builds for MIPS II or better: always use the branch likely.
o A runtime test would have to be implemented pessimistically because it
would have to rely on /proc being mounted, which isn't available early in
the boot process. It's probably going to add more overhead than it
saves anyway.
There is a price for using branch likely - but not that high. In the grand
picture it'll almost certainly vanish in the benchmarking noise.
Ralf
* Re: [PATCH]: R10000 Needs LL/SC Workaround in Glibc
From: James Perkins @ 2008-11-01 17:23 UTC (permalink / raw)
To: Kumba; +Cc: libc-ports, Daniel Jacobowitz, Linux MIPS List
"move %1,%3\n\t" \
"sc %1,%4\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN \
acq "\n\t" \
".set pop\n" \
Is it possible to leave the parameters in the inline code and
remove them from the macro definition? I feel the code is more
readable without having to refer to the macro definition if
the parameters are left in place.
Cheers,
James
* Re: [PATCH]: R10000 Needs LL/SC Workaround in Glibc
From: Kumba @ 2008-11-04 7:16 UTC (permalink / raw)
To: libc-ports; +Cc: Ralf Baechle, Daniel Jacobowitz, Linux MIPS List
Ralf Baechle wrote:
>
> In the kernel we have very good knowledge about what types of processors
> are being used for what configuration; much less so in userland, and the code
> as you suggest it would result in a silent failure on affected R10000
> machines if a version not built for the R10000 were being used - iow no
> improvement over what we have right now. So for userland I'd prefer:
>
> o MIPS I builds: use the 28 or so nops.
> o Builds for MIPS II or better: always use the branch likely.
> o A runtime test would have to be implemented pessimistically because it
> would have to rely on /proc being mounted, which isn't available early in
> the boot process. It's probably going to add more overhead than it
> saves anyway.
>
> There is a price for using branch likely - but not that high. In the grand
> picture it'll almost certainly vanish in the benchmarking noise.
Good idea. I'll tinker with this once I wrap my head around the gcc-side of things.
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: [PATCH]: R10000 Needs LL/SC Workaround in Glibc
From: Kumba @ 2008-11-04 7:16 UTC (permalink / raw)
To: James Perkins; +Cc: libc-ports, Daniel Jacobowitz, Linux MIPS List
James Perkins wrote:
> "move %1,%3\n\t" \
> "sc %1,%4\n\t" \
> - "beqz %1,1b\n" \
> + R10K_BEQZ_INSN \
> acq "\n\t" \
> ".set pop\n" \
>
> Is it possible to leave the parameters in the inline code and
> remove them from the macro definition? I feel the code is more
> readable without having to refer to the macro definition if
> the parameters are left in place.
Makes sense, I'll keep this in mind for the next patch revision I make.
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: [PATCH]: R10000 Needs LL/SC Workaround in Glibc
From: Kumba @ 2008-11-23 4:16 UTC (permalink / raw)
To: libc-ports; +Cc: Daniel Jacobowitz, Linux MIPS List
James Perkins wrote:
> "move %1,%3\n\t" \
> "sc %1,%4\n\t" \
> - "beqz %1,1b\n" \
> + R10K_BEQZ_INSN \
> acq "\n\t" \
> ".set pop\n" \
>
> Is it possible to leave the parameters in the inline code and
> remove them from the macro definition? I feel the code is more
> readable without having to refer to the macro definition if
> the parameters are left in place.
Here's try #2. The gcc-side is already sent in and accepted. If I'm still
missing anything, please let me know!
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
2008-11-22 Joshua Kinard <kumba@gentoo.org>
* ports/sysdeps/mips/bits/atomic.h
(R10K_BEQZ_INSN, R10K_NOPS_INSN): Define depending on ISA.
(__arch_compare_and_exchange_xxx_32_int): Replace 'beqz' insn with
R10K_BEQZ_INSN and add R10K_NOPS_INSN.
(__arch_compare_and_exchange_xxx_64_int): Likewise
(__arch_exchange_xxx_32_int): Likewise
(__arch_exchange_xxx_64_int): Likewise
(__arch_exchange_and_add_32_int): Likewise
(__arch_exchange_and_add_64_int): Likewise
Index: ports/sysdeps/mips/bits/atomic.h
===================================================================
RCS file: /cvs/glibc/ports/sysdeps/mips/bits/atomic.h,v
retrieving revision 1.1
diff -u -p -r1.1 atomic.h
--- ports/sysdeps/mips/bits/atomic.h 28 Mar 2005 09:14:59 -0000 1.1
+++ ports/sysdeps/mips/bits/atomic.h 23 Nov 2008 03:22:53 -0000
@@ -49,6 +49,61 @@ typedef uintmax_t uatomic_max_t;
# define MIPS_SYNC sync
#endif
+/* Certain revisions of the R10000 Processor need an LL/SC Workaround
+ enabled. Revisions before 3.0 misbehave on atomic operations, and
+ Revs 2.6 and lower deadlock after several seconds due to other errata.
+
+ To quote the R10K Errata:
+ Workaround: The basic idea is to inhibit the four instructions
+ from simultaneously becoming active in R10000. Padding all
+ ll/sc sequences with nops or changing the looping branch in the
+ routines to a branch likely (which is always predicted taken
+ by R10000) will work. The nops should go after the loop, and the
+ number of them should be 28. This number could be decremented for
+ each additional instruction in the ll/sc loop such as the lock
+ modifier(s) between the ll and sc, the looping branch and its
+ delay slot. For typical short routines with one ll/sc loop, any
+ instructions after the loop could also count as a decrement. The
+ nop workaround pollutes the cache more but would be a few cycles
+ faster if all the code is in the cache and the looping branch
+ is predicted not taken. */
+
+#if (defined(_MIPS_ARCH_MIPS2) || defined(_MIPS_ARCH_MIPS3) || \
+ defined(_MIPS_ARCH_MIPS4))
+#define R10K_BEQZ_INSN "beqzl"
+#define R10K_NOPS_INSN ""
+#else
+#define R10K_BEQZ_INSN "beqz"
+#define R10K_NOPS_INSN "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n" \
+ "\tnop\n"
+#endif
+
#define MIPS_SYNC_STR_2(X) #X
#define MIPS_SYNC_STR_1(X) MIPS_SYNC_STR_2(X)
#define MIPS_SYNC_STR MIPS_SYNC_STR_1(MIPS_SYNC)
@@ -74,7 +129,8 @@ typedef uintmax_t uatomic_max_t;
"bne %0,%2,2f\n\t" \
"move %1,%3\n\t" \
"sc %1,%4\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN " %1,1b\n" \
+ R10K_NOPS_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -98,7 +154,8 @@ typedef uintmax_t uatomic_max_t;
"bne %0,%2,2f\n\t" \
"move %1,%3\n\t" \
"scd %1,%4\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN " %1,1b\n" \
+ R10K_NOPS_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -192,7 +249,8 @@ typedef uintmax_t uatomic_max_t;
"ll %0,%3\n\t" \
"move %1,%2\n\t" \
"sc %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN " %1,1b\n" \
+ R10K_NOPS_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -216,7 +274,8 @@ typedef uintmax_t uatomic_max_t;
"lld %0,%3\n\t" \
"move %1,%2\n\t" \
"scd %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN " %1,1b\n" \
+ R10K_NOPS_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -251,7 +310,8 @@ typedef uintmax_t uatomic_max_t;
"ll %0,%3\n\t" \
"addu %1,%0,%2\n\t" \
"sc %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN " %1,1b\n" \
+ R10K_NOPS_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
@@ -275,7 +335,8 @@ typedef uintmax_t uatomic_max_t;
"lld %0,%3\n\t" \
"daddu %1,%0,%2\n\t" \
"scd %1,%3\n\t" \
- "beqz %1,1b\n" \
+ R10K_BEQZ_INSN " %1,1b\n" \
+ R10K_NOPS_INSN \
acq "\n\t" \
".set pop\n" \
"2:\n\t" \
* Re: [PATCH]: R10000 Needs LL/SC Workaround in Glibc
From: Daniel Jacobowitz @ 2009-01-27 15:29 UTC (permalink / raw)
To: Kumba; +Cc: libc-ports, Linux MIPS List
On Sat, Nov 22, 2008 at 11:16:18PM -0500, Kumba wrote:
> Here's try #2. The gcc-side is already sent in and accepted. If I'm
> still missing anything, please let me know!
>
> Joshua Kinard
> Gentoo/MIPS
> kumba@gentoo.org
>
>
> 2008-11-22 Joshua Kinard <kumba@gentoo.org>
>
> * ports/sysdeps/mips/bits/atomic.h
> (R10K_BEQZ_INSN, R10K_NOPS_INSN): Define depending on ISA.
> (__arch_compare_and_exchange_xxx_32_int): Replace 'beqz' insn with
> R10K_BEQZ_INSN and add R10K_NOPS_INSN.
> (__arch_compare_and_exchange_xxx_64_int): Likewise
> (__arch_exchange_xxx_32_int): Likewise
> (__arch_exchange_xxx_64_int): Likewise
> (__arch_exchange_and_add_32_int): Likewise
> (__arch_exchange_and_add_64_int): Likewise
Thinking about this...
MIPS I: 28 NOPs is really horrid. Not so much on this processor if
the code is all in cache, but I guess that older/simpler processors
are going to sit for a number of cycles chewing through those NOPs.
Are distributions still building MIPS I code? Can we assume that
people who want to run glibc on an R10K can at least get something
for MIPS II?
MIPS II, MIPS III, MIPS IV: Using beqzl does not seem particularly
horrid - although it's still a shame since this branch is in fact
anti-likely. It will almost never be taken.
Other platforms: !(MIPS II or MIPS III or MIPS IV) is not the same as
(MIPS I)! Please don't activate this workaround on builds that won't
run on an R10K, like MIPS32.
--
Daniel Jacobowitz
CodeSourcery
* Re: [PATCH]: R10000 Needs LL/SC Workaround in Glibc
From: Maciej W. Rozycki @ 2009-01-27 16:13 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Kumba, libc-ports, Linux MIPS List
On Tue, 27 Jan 2009, Daniel Jacobowitz wrote:
> > 2008-11-22 Joshua Kinard <kumba@gentoo.org>
> >
> > * ports/sysdeps/mips/bits/atomic.h
> > (R10K_BEQZ_INSN, R10K_NOPS_INSN): Define depending on ISA.
> > (__arch_compare_and_exchange_xxx_32_int): Replace 'beqz' insn with
> > R10K_BEQZ_INSN and add R10K_NOPS_INSN.
> > (__arch_compare_and_exchange_xxx_64_int): Likewise
> > (__arch_exchange_xxx_32_int): Likewise
> > (__arch_exchange_xxx_64_int): Likewise
> > (__arch_exchange_and_add_32_int): Likewise
> > (__arch_exchange_and_add_64_int): Likewise
>
> Thinking about this...
>
> MIPS I: 28 NOPs is really horrid. Not so much on this processor if
> the code is all in cache, but I guess that older/simpler processors
> are going to sit for a number of cycles chewing through those NOPs.
> Are distributions still building MIPS I code? Can we assume that
> people who want to run glibc on an R10K can at least get something
> for MIPS II?
I agree this is horrible. I would rather not have a workaround for a
broken chip in the official sources at all than badly hit good chips
(comprising the vast majority). Unless this can be made a compile-time
option, so that whoever is interested in it can use "-march=mips1
-mfix-r10000" or suchlike to get it activated, I am against the change.
> MIPS II, MIPS III, MIPS IV: Using beqzl does not seem particularly
> horrid - although it's still a shame since this branch is in fact
> anti-likely. It will almost never be taken.
Again if only "-march=mips2 -mfix-r10000" etc. activates it, then I am
fine with that, otherwise it is a no-no for me.
> Other platforms: !(MIPS II or MIPS III or MIPS IV) is not the same as
> (MIPS I)! Please don't activate this workaround on builds that won't
> run on an R10K, like MIPS32.
Nothing to add here. ;)
Maciej