* [parisc-linux] Re: [parisc-linux-cvs] linux grundler
[not found] <20020804011707.2B9164860@dsl2.external.hp.com>
@ 2002-08-04 2:03 ` Grant Grundler
0 siblings, 0 replies; 26+ messages in thread
From: Grant Grundler @ 2002-08-04 2:03 UTC (permalink / raw)
To: parisc-linux; +Cc: parisc-linux-cvs, jes
Grant Grundler wrote:
> Log message:
> 2.4.18-pa62 acenic-0.91+minor fix
> Jes Sorensen asked me to test his latest driver...besides
> one minor build fix in acenic.h it loaded and booted.
>
> First time I tried the machine crashed. I didn't have /sbin/hotplug
> but CONFIG_HOTPLUG was enabled. May this serve as a reminder
> to others...
diff against whatever we had in our CVS before:
ftp://ftp.parisc-linux.org/patches/acenic-0.91-ggg.diff
and the original tarball (drivers/net/acenic.[ch]):
ftp://ftp.parisc-linux.org/patches/acenic-0.91.tar.gz
This line of output:
eth1: Enabling PCI Memory Mapped access - was not enabled by BIOS/Firmware
The driver isn't using pci_enable_device() and it should.
Jes is aware of the problem.
But I'm happy...another HP IO card supported by parisc-linux. ;^)
I tried to run netperf but netserver segfaults on parisc.
I'll rebuild to see if compiler changes have fixed netserver and try again.
Anyway, I need to figure out which other 1000SX card is connected
to the switch.
grant
a500:~# modprobe acenic
acenic.c: v0.91 07/31/2002 Jes Sorensen, linux-acenic@SunSITE.dk
http://home.cern.ch/~jes/gige/acenic.html
eth1: Enabling PCI Memory Mapped access - was not enabled by BIOS/Firmware
eth1: Alteon AceNIC Gigabit Ethernet at 0xfffffffffb000000, irq 322
Tigon II (Rev. 6), Firmware: 12.4.11, MAC: 00:30:6e:04:80:68
PCI bus width: 64 bits, speed: 66MHz, latency: 128 clks
eth1: Firmware up and running
eth1: Optical link UP (Full Duplex, Flow Control: TX RX)
a500:~# uname -a
Linux a500 2.4.18-pa61 #4 SMP Sat Aug 3 01:00:08 PDT 2002 parisc64 unknown unknown GNU/Linux
a500:~# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:30:6E:04:80:68
inet addr:192.168.0.21 Bcast:192.168.0.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:40 errors:0 dropped:0 overruns:0 frame:0
TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:8951 (8.7 KiB) TX bytes:1522 (1.4 KiB)
Interrupt:66
^ permalink raw reply [flat|nested] 26+ messages in thread
* [parisc-linux] Re: [parisc-linux-cvs] linux grundler
[not found] ` <20030208222746.GB19683@dsl2.external.hp.com>
@ 2003-02-08 23:23 ` Matthew Wilcox
2003-02-09 0:35 ` John David Anglin
` (2 more replies)
0 siblings, 3 replies; 26+ messages in thread
From: Matthew Wilcox @ 2003-02-08 23:23 UTC (permalink / raw)
To: Grant Grundler; +Cc: parisc-linux
On Sat, Feb 08, 2003 at 03:27:46PM -0700, Grant Grundler wrote:
> > Kudos to John David Anglin and Carlos O'Donnell for realizing
> > PA 2.0 is not strongly ordered like PA1.x is.
> > Read appendix G of PA-RISC 2.0 Architecture (Gerry Kane) for
> > details on "Memory Ordering Model".
I'm confused. This directly contradicts the comments in
include/asm-parisc/system.h. If this needs updating, so does that.
> > I've uploaded palinux-20030208.tgz to dsl2 and will try to build
> > a binutils.deb with this change as well.
Is all that's needed to take the latest binutils from debian unstable
and rebuild it on woody?
> +++ arch/parisc/Makefile 8 Feb 2003 06:21:32 -0000
> @@ -32,6 +32,10 @@ CROSS_COMPILE := hppa-linux-
> +ifdef CONFIG_PA20
> +CFLAGS += -mpa-risc-2-0
> +endif
I've reverted this one. We already have:
ifdef CONFIG_PA8X00
CFLAGS += -march=2.0 -mschedule=8000
endif
which covers the same cases.
> +++ include/asm-parisc/spinlock_t.h 8 Feb 2003 06:21:34 -0000
> +> I've attached a summary of the change, but basically, for PA 2.0, as
> +> long as the ",CO" (coherent operation) completer is specified, then the
> +> 16-byte alignment requirement for ldcw and ldcd is relaxed, and instead
> +> they only require "natural" alignment (4-byte for ldcw, 8-byte for
> +> ldcd).
That's interesting from an architecture PoV. From my recollection when jsm
was debugging problems on the 710, PCX-S is the only processor which actually
enforces the 16-byte alignment restriction on ldcw. So _practically_, we
don't need it unless we're supporting those old processors.
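A minimal sketch of one way to satisfy the 16-byte rule without relying on where the lock structure itself lands: reserve four words and use whichever one falls on a 16-byte boundary. The names demo_lock_t and demo_lock_word are hypothetical and are not taken from the kernel sources discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lock type: four 4-byte words, so exactly one of them is
 * 16-byte aligned wherever the (4-byte aligned) struct is placed. */
typedef struct {
    volatile unsigned int slot[4];
} demo_lock_t;

/* Return the word inside the lock that sits on a 16-byte boundary. */
static volatile unsigned int *demo_lock_word(demo_lock_t *l)
{
    uintptr_t addr = (uintptr_t)&l->slot[0];

    /* Round up to the next multiple of 16; the result stays inside slot[]
     * because the struct is 16 bytes long and at least 4-byte aligned. */
    addr = (addr + 15u) & ~(uintptr_t)15u;
    assert((addr & 15u) == 0);
    return (volatile unsigned int *)addr;
}
```

With the relaxed PA 2.0 rule quoted above, natural 4-byte alignment would be enough and the extra three words would be unnecessary; the padding only matters for processors that enforce the stricter requirement.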
> +#ifdef CONFIG_PA20
> +/* PA2.0 is not strongly ordered. ldcw enforces ordering
> + * and we need to make sure ordering is enforced on the unlock too.
> + */
> +#define spin_unlock(x) \
> + __asm__ __volatile__ ("stw,o %%sp,0(%0)" : : "r" (x) : "memory" )
> +#else
> +
> +/* PA1.1 is strongly ordered. No issues here. */
> #define spin_unlock(x) do { (x)->lock = 1; } while(0)
> +
> +#endif
Actually... this may be a long-standing bug in our spinlocks. There's nothing
to prevent gcc reordering writes around this assignment. We need a barrier()
before the assignment, or maybe it'd be as well to do the assignment in an
asm() statement.
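A minimal sketch of the "barrier() before the assignment" variant, assuming GCC-style inline asm. The demo_ names are hypothetical; the lock convention (1 = free, 0 = held) follows the spin_unlock() in the patch above. This only constrains the compiler and does nothing about processor reordering on PA 2.0.

```c
#include <assert.h>

/* Compiler-only barrier: an empty asm with a "memory" clobber tells GCC
 * that memory may change here, so it may not move loads or stores across it. */
#define demo_barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical lock type; 1 = free, 0 = held. */
typedef struct {
    volatile unsigned int lock;
} demo_spinlock_t;

static inline void demo_spin_unlock(demo_spinlock_t *x)
{
    assert(x->lock == 0);   /* caller is expected to hold the lock */
    demo_barrier();         /* writes made under the lock stay above this point */
    x->lock = 1;            /* release */
}
```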
> #define spin_unlock_wait(x) do { barrier(); } while(((volatile spinlock_t *)(x))->lock == 0)
I wonder if this one was working around some obscure compiler bug a
while back. I don't see why we need to cast to a volatile spinlock_t *
given that lock is already defined as volatile.
One final point.... up till now, we've been telling people it's OK to
run kernels configured for PA1.1 on PA2.0 processors. This patch says
to me that's not safe. Do we need our distros (yeah, I hear there'll
soon be more than Debian supporting PA) to ship 5 flavours of kernel
(PA1.1 UP & SMP, PA2.0 32-bit SMP, 64-bit UP and 64-bit SMP) rather than
the current four?
--
"It's not Hollywood. War is real, war is primarily not about defeat or
victory, it is about death. I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-08 23:23 ` Matthew Wilcox
@ 2003-02-09 0:35 ` John David Anglin
2003-02-09 0:49 ` Randolph Chung
2003-02-09 3:27 ` Grant Grundler
2003-02-09 3:10 ` Grant Grundler
2003-02-09 8:11 ` Grant Grundler
2 siblings, 2 replies; 26+ messages in thread
From: John David Anglin @ 2003-02-09 0:35 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: grundler, parisc-linux
> That's interesting from an architecture PoV. From my recollection when jsm
> was debugging problems on the 710, PCX-S is the only processor which actually
> enforces the 16-byte alignment restriction on ldcw. So _practically_, we
> don't need it unless we're supporting those old processors.
I am pretty sure that any PA 1.x machine needs the alignment.
> > +#ifdef CONFIG_PA20
> > +/* PA2.0 is not strongly ordered. ldcw enforces ordering
> > + * and we need to make sure ordering is enforced on the unlock too.
> > + */
> > +#define spin_unlock(x) \
> > + __asm__ __volatile__ ("stw,o %%sp,0(%0)" : : "r" (x) : "memory" )
If you change the above to
__asm__ __volatile__ ("stw,ma %%sp,0(%0)" : : "r" (x) : "memory")
it should work on both PA11 and PA20. The ordered completer is only
PA 2.
> Actually... this may be a long-standing bug in our spinlocks. There's nothing
> to prevent gcc reordering writes around this assignment. We need a barrier()
> before the assignment, or maybe it'd be as well to do the assignment in an
> asm() statement.
>
> > #define spin_unlock_wait(x) do { barrier(); } while(((volatile spinlock_t *)(x))->lock == 0)
I think the volatile provides the barrier in my suggested version.
Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 0:35 ` John David Anglin
@ 2003-02-09 0:49 ` Randolph Chung
2003-02-09 0:56 ` Randolph Chung
2003-02-09 3:27 ` Grant Grundler
1 sibling, 1 reply; 26+ messages in thread
From: Randolph Chung @ 2003-02-09 0:49 UTC (permalink / raw)
To: John David Anglin; +Cc: Matthew Wilcox, grundler, parisc-linux
> I think the volatile provides the barrier in my suggested version.
volatile would prevent reordering accesses to a particular spinlock,
but does it ensure that code that happens after the spinlock will not
be reordered to happen before the spin-unlock?
for example, if we had:
volatile spinlock l1, l2;
lock(l1)
do something
unlock(l1)
lock(l2)
do something else
unlock(l2)
will gcc ever move the lock(l2) before the unlock(l1)?
(for reference, see this thread:
http://lists.debian.org/debian-gcc/2002/debian-gcc-200210/msg00058.html)
randolph
--
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 0:49 ` Randolph Chung
@ 2003-02-09 0:56 ` Randolph Chung
2003-02-09 2:03 ` Matthew Wilcox
2003-02-09 2:11 ` John David Anglin
0 siblings, 2 replies; 26+ messages in thread
From: Randolph Chung @ 2003-02-09 0:56 UTC (permalink / raw)
To: John David Anglin; +Cc: Matthew Wilcox, grundler, parisc-linux
> lock(l1)
> do something
> unlock(l1)
>
> lock(l2)
> do something else
> unlock(l2)
>
> will gcc ever move the lock(l2) before the unlock(l1)?
well, bad example, since l1 and l2 are both volatile, but in general,
nothing will prevent part of "do something else" to happen inside the l1
lock, right?
randolph
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 0:56 ` Randolph Chung
@ 2003-02-09 2:03 ` Matthew Wilcox
2003-02-09 2:18 ` John David Anglin
2003-02-09 14:55 ` James Bottomley
2003-02-09 2:11 ` John David Anglin
1 sibling, 2 replies; 26+ messages in thread
From: Matthew Wilcox @ 2003-02-09 2:03 UTC (permalink / raw)
To: Randolph Chung; +Cc: John David Anglin, Matthew Wilcox, grundler, parisc-linux
On Sat, Feb 08, 2003 at 04:56:08PM -0800, Randolph Chung wrote:
> > lock(l1)
> > do something
> > unlock(l1)
> >
> > lock(l2)
> > do something else
> > unlock(l2)
> >
> > will gcc ever move the lock(l2) before the unlock(l1)?
>
> well, bad example, since l1 and l2 are both volatile, but in general,
> nothing will prevent part of "do something else" to happen inside the l1
> lock, right?
I think it's even worse than that. What stops gcc reordering:
typedef struct {
spinlock_t lock;
volatile int counter;
} rwlock_t;
static __inline__ void _raw_read_lock(rwlock_t *rw)
{
while (__ldcw (&(x)->lock) == 0) \
while (((x)->lock) == 0) ; } while (0)
rw->counter++;
do { (x)->lock = 1; } while(0)
}
to:
while (__ldcw (&(x)->lock) == 0) \
while (((x)->lock) == 0) ; } while (0)
do { (x)->lock = 1; } while(0)
rw->counter++;
--
"It's not Hollywood. War is real, war is primarily not about defeat or
victory, it is about death. I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 0:56 ` Randolph Chung
2003-02-09 2:03 ` Matthew Wilcox
@ 2003-02-09 2:11 ` John David Anglin
1 sibling, 0 replies; 26+ messages in thread
From: John David Anglin @ 2003-02-09 2:11 UTC (permalink / raw)
To: tausq; +Cc: willy, grundler, parisc-linux
> > lock(l1)
> > do something
> > unlock(l1)
> >
> > lock(l2)
> > do something else
> > unlock(l2)
> >
> > will gcc ever move the lock(l2) before the unlock(l1)?
>
> well, bad example, since l1 and l2 are both volatile, but in general,
> nothing will prevent part of "do something else" to happen inside the l1
> lock, right?
Volatile would stop gcc from moving code across either the lock or the unlock.
A blockage is in fact implemented as an UNSPEC_VOLATILE instruction.
The PA 8000 has a large reorder buffer, so there are similar issues.
It's guaranteed that instructions will appear to execute in order on a
processor but the memory updates as viewed from another processor may
occur out of order. To ensure that other processors have a consistent
view of what's happening, one must use ordered instructions for the lock
and unlock operations. This will force all stores and loads inside the
lock to complete. Further, if cache flush operations occur inside
a lock, you must also do a "sync" before unlocking to ensure that these
operations complete. The ldcw insn is strongly ordered. A load and
store on PA 2.0 can be made ordered with the correct completer. Cache
flush operations are weakly ordered. On PA 1.x, all load and store
instructions are strongly ordered, so ordering isn't an issue.
The current spinlock reset is done in high-level C. So, gcc can move code
that isn't dependent across the reset. PA 2.0 processors can also do
the same.
Appendix G describes ordering in much more detail. Hopefully, I haven't
butchered what's said there too badly.
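As a portable illustration of "ordered instructions for the lock and unlock operations", here is a sketch using C11 atomics instead of PA-RISC instructions. It is an analogy, not the kernel's code: the exchange-to-zero stands in for ldcw (load and clear), the release store stands in for an ordered store, and all demo_ names are hypothetical. It does not model the cache-flush "sync" requirement mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock; 1 = free, 0 = held, matching the convention in this thread. */
typedef struct {
    atomic_uint word;
} demo_olock_t;

#define DEMO_OLOCK_INIT { 1 }

/* Load-and-clear analogue: fetch the old value and leave 0 behind.
 * Acquire ordering keeps the critical section from moving above it. */
static inline bool demo_trylock(demo_olock_t *l)
{
    return atomic_exchange_explicit(&l->word, 0u, memory_order_acquire) != 0u;
}

static inline void demo_lock(demo_olock_t *l)
{
    while (!demo_trylock(l)) {
        /* Spin on plain loads until the lock looks free, then retry. */
        while (atomic_load_explicit(&l->word, memory_order_relaxed) == 0u)
            ;
    }
}

/* Release ordering keeps the critical section from moving below the reset,
 * for both the compiler and the processor. */
static inline void demo_unlock(demo_olock_t *l)
{
    atomic_store_explicit(&l->word, 1u, memory_order_release);
}
```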
The reordering that GCC does is conditional on instruction dependencies. I
just reworked the handling of the PIC register restore in PIC code
because all the PIC register were not apparent in the initial scheduling
pass. This is in 3.3 and main, but not 3.2.
Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 2:03 ` Matthew Wilcox
@ 2003-02-09 2:18 ` John David Anglin
2003-02-09 14:55 ` James Bottomley
1 sibling, 0 replies; 26+ messages in thread
From: John David Anglin @ 2003-02-09 2:18 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: tausq, willy, grundler, parisc-linux
> I think it's even worse than that. What stops gcc reordering:
>
> typedef struct {
> spinlock_t lock;
> volatile int counter;
> } rwlock_t;
>
> static __inline__ void _raw_read_lock(rwlock_t *rw)
> {
> while (__ldcw (&(x)->lock) == 0) \
> while (((x)->lock) == 0) ; } while (0)
> rw->counter++;
> do { (x)->lock = 1; } while(0)
> }
>
> to:
>
> while (__ldcw (&(x)->lock) == 0) \
> while (((x)->lock) == 0) ; } while (0)
> do { (x)->lock = 1; } while(0)
> rw->counter++;
Nothing. The reset needs to be a volatile asm. This will stop GCC from
doing the above. On PA 2.0, the processor can do similar reordering. So,
the reset needs to be an ordered store. Well, I think the scheduling
model tends to try to do things as early as possible, consistent with
not over feeding the pipeline. However, I wouldn't rely on this to
get the instruction order that you want.
Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)
* [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-08 23:23 ` Matthew Wilcox
2003-02-09 0:35 ` John David Anglin
@ 2003-02-09 3:10 ` Grant Grundler
2003-02-09 12:29 ` Matthew Wilcox
2003-02-09 8:11 ` Grant Grundler
2 siblings, 1 reply; 26+ messages in thread
From: Grant Grundler @ 2003-02-09 3:10 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: parisc-linux
On Sat, Feb 08, 2003 at 11:23:03PM +0000, Matthew Wilcox wrote:
> Is all that's needed to take the latest binutils from debian unstable
> and rebuild it on woody?
Almost. It needs a patch too:
ftp://ftp.parisc-linux.org/patches/900_order_hppa.diff
Just drop that in debian/patches before building the debs.
That patch applies clean to the binutils for unstable.
I don't even pretend to understand how to properly build
debian packages, much less binutils or "cross release" builds.
That's why I made the tarball for "testing".
> That's interesting from an architecture PoV. From my recollection when jsm
> was debugging problems on the 710, PCX-S is the only processor which actually
> enforces the 16-byte alignment restriction on ldcw. So _practically_, we
> don't need it unless we're supporting those old processors.
I don't care to find out the hard way.
I'd rather just comply with the architecture and not worry about it.
If someone can demonstrate a perf advantage or issue, I'll be
more receptive.
> Actually... this may be a long-standing bug in our spinlocks. There's nothing
> to prevent gcc reordering writes around this assignment. We need a barrier()
> before the assignment, or maybe it'd be as well to do the assignment in an
> asm() statement.
I've read the followups to this and I gather our spinlocks are very broken.
If someone tells me what the right fix is, I'll test on PA20 32/64 bit
and commit.
> One final point.... up till now, we've been telling people it's OK to
> run kernels configured for PA1.1 on PA2.0 processors. This patch says
> to me that's not safe.
Only for SMP. I think for UP the rule still holds.
> Do we need our distros (yeah, I hear there'll
> soon be more than Debian supporting PA) to ship 5 flavours of kernel
> (PA1.1 UP & SMP, PA2.0 32-bit SMP, 64-bit UP and 64-bit SMP) rather than
> the current four?
Unfortunately yes.
OTOH, PA20 SMP still hasn't proven stable so maybe it's not worth
doing at the moment either. Once PA20 SMP is stable, we could drop
the 64-bit UP kernels since most systems that *require* 64-bit are SMP.
thanks,
grant
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 0:35 ` John David Anglin
2003-02-09 0:49 ` Randolph Chung
@ 2003-02-09 3:27 ` Grant Grundler
2003-02-09 4:06 ` John David Anglin
1 sibling, 1 reply; 26+ messages in thread
From: Grant Grundler @ 2003-02-09 3:27 UTC (permalink / raw)
To: John David Anglin; +Cc: Matthew Wilcox, parisc-linux
On Sat, Feb 08, 2003 at 07:35:18PM -0500, John David Anglin wrote:
> > +#define spin_unlock(x) \
> > + __asm__ __volatile__ ("stw,o %%sp,0(%0)" : : "r" (x) : "memory" )
>
> If you change the above to
>
> __asm__ __volatile__ ("stw,ma %%sp,0(%0)" : : "r" (x) : "memory")
>
> it should work on both PA11 and PA20. The ordered completer is only
> PA 2.
Excellent idea.
In case it's not obvious to others, "stw,o" is an alias
for "stw,ma" with a Zero index value. But PA11 assembler
will grok the stw,ma properly.
I'll look at getting this into the kernel this week if someone
else doesn't beat me to it. I'd like to test what I've got a bit
more now and get the binutils .deb availability issue resolved.
thanks,
grant
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 3:27 ` Grant Grundler
@ 2003-02-09 4:06 ` John David Anglin
0 siblings, 0 replies; 26+ messages in thread
From: John David Anglin @ 2003-02-09 4:06 UTC (permalink / raw)
To: Grant Grundler; +Cc: willy, parisc-linux
> > If you change the above to
> >
> > __asm__ __volatile__ ("stw,ma %%sp,0(%0)" : : "r" (x) : "memory")
> >
> > it should work on both PA11 and PA20. The ordered completer is only
> > PA 2.
>
> Excellent idea.
The reason I am so up on this is that in testing gcc 3.4 on hppa1.1 this
weekend I found a problem with the locking code in atomicity.h. This
was the 16-byte alignment issue. In reviewing the assembly code being
generated, I revisited how the lock reset was done and came up with the
above.
This is also relevant to locking in glibc.
Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)
* [parisc-linux] re: [parisc-linux-cvs] linux grundler
@ 2003-02-09 7:40 John Marvin
2003-02-09 8:26 ` [parisc-linux] " Grant Grundler
0 siblings, 1 reply; 26+ messages in thread
From: John Marvin @ 2003-02-09 7:40 UTC (permalink / raw)
To: parisc-linux; +Cc: grundler
> o moved disable_sr_hash() from SMP to common code path so all
> CPU's (including monarch) have this disabled.
Huh? You didn't seriously think the monarch still had sr hashing enabled
did you? Believe me, shared memory wouldn't work at all if that was the
case.
This change is wrong. init_per_cpu() is called too early for the monarch.
disable_sr_hashing relies on the boot cpu data to be initialized to
determine which code should be used to disable the sr hashing. It might
work for PA2.0, since it is the default case, but you probably broke all
of the other processors.
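The ordering problem can be shown with a toy model. Everything here is hypothetical (the enum values, the demo_ functions and the selection logic are invented for illustration and do not mirror the real kernel code); the only thing taken from the message above is that the routine picks its code path from boot CPU data and that PA2.0 is the default case.

```c
#include <assert.h>

/* Hypothetical CPU types; UNKNOWN is what the boot CPU data holds
 * before it has been collected. */
enum demo_cpu  { DEMO_CPU_UNKNOWN = 0, DEMO_CPU_PA11_A, DEMO_CPU_PA11_B, DEMO_CPU_PA20 };
enum demo_path { DEMO_PATH_PA11_A, DEMO_PATH_PA11_B, DEMO_PATH_PA20_DEFAULT };

static enum demo_cpu demo_boot_cpu_type = DEMO_CPU_UNKNOWN;

/* Stand-in for the step that fills in the boot CPU data. */
static void demo_collect_boot_cpu_data(enum demo_cpu detected)
{
    demo_boot_cpu_type = detected;
}

/* Stand-in for a routine that selects processor-specific code.
 * Called too early, it silently falls through to the default. */
static enum demo_path demo_disable_sr_hashing(void)
{
    switch (demo_boot_cpu_type) {
    case DEMO_CPU_PA11_A: return DEMO_PATH_PA11_A;
    case DEMO_CPU_PA11_B: return DEMO_PATH_PA11_B;
    default:              return DEMO_PATH_PA20_DEFAULT;
    }
}
```

In this model a PA 1.1 machine gets the PA2.0 path if the routine runs before the data is collected, and the correct path only afterwards.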
disable_sr_hashing() is called for the monarch in mm/init.c in
setup_bootmem().
John Marvin
jsm@fc.hp.com
* [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-08 23:23 ` Matthew Wilcox
2003-02-09 0:35 ` John David Anglin
2003-02-09 3:10 ` Grant Grundler
@ 2003-02-09 8:11 ` Grant Grundler
2003-02-09 12:21 ` Matthew Wilcox
2 siblings, 1 reply; 26+ messages in thread
From: Grant Grundler @ 2003-02-09 8:11 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: parisc-linux
On Sat, Feb 08, 2003 at 11:23:03PM +0000, Matthew Wilcox wrote:
> Is all that's needed to take the latest binutils from debian unstable
> and rebuild it on woody?
I've rebuilt "unstable" binutils + ldwa,o patch on "testing" and dropped
them on
ftp://ftp.parisc-linux.org/unofficial-debs/
-rw-r--r-- 1 grundler ftpadmin 478700 Feb 8 23:51 binutils-dev_2.13.90.0.16-1_hppa.deb
-rw-r--r-- 1 grundler ftpadmin 3311272 Feb 8 23:53 binutils-multiarch_2.13.90.0.16-1_hppa.deb
-rw-r--r-- 1 grundler ftpadmin 2307548 Feb 8 23:54 binutils_2.13.90.0.16-1_hppa.deb
They install fine on my c3k though TBH, I haven't tried to use them.
(I did use the tarballs before posting earlier).
If someone wants to repeat that exercise for woody, that would be
fine with me.
/me checks off an item from his TODO list
Apparently, we still have problems with timers or timer code on PA20 SMP.
Still haven't been able to isolate this to any piece of code. :^(
g'night,
grant
* [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 7:40 John Marvin
@ 2003-02-09 8:26 ` Grant Grundler
0 siblings, 0 replies; 26+ messages in thread
From: Grant Grundler @ 2003-02-09 8:26 UTC (permalink / raw)
To: John Marvin; +Cc: parisc-linux
On Sun, Feb 09, 2003 at 12:40:14AM -0700, John Marvin wrote:
> > o moved disable_sr_hash() from SMP to common code path so all
> > CPU's (including monarch) have this disabled.
>
> Huh? You didn't seriously think the monarch still had sr hashing enabled
> did you?
sorry - I did.
I could not find where it was getting cleared for the monarch.
> Believe me, shared memory wouldn't work at all if that was the case.
ok.
> This change is wrong. init_per_cpu() is called too early for the monarch.
yup - it's before collect_boot_cpu_data(). :^(
> disable_sr_hashing() is called for the monarch in mm/init.c in
> setup_bootmem().
ugh. my bad.
I was only looking in arch/parisc/kernel/ since that's where it's defined
and was being used in the SMP case.
Can I move the disable_sr_hashing() call from setup_bootmem() to
before/after cache_init() in setup_arch()?
Please feel free to correct if you have time to muck with it.
If my damage is not backed out in the morning, I'll back it out then.
thanks for pointing out this stupidity,
grant
* [parisc-linux] re: [parisc-linux-cvs] linux grundler
@ 2003-02-09 8:55 John Marvin
0 siblings, 0 replies; 26+ messages in thread
From: John Marvin @ 2003-02-09 8:55 UTC (permalink / raw)
To: parisc-linux; +Cc: grundler
> Can I move the disable_sr_hashing() call from setup_bootmem() to
> before/after cache_init() in setup_arch()?
Sure. As long as you call it before paging_init() you will be OK.
> Please feel free to correct if you have time to muck with it.
> If my damage is not backed out in the morning, I'll back it out then.
I don't have a current tree and I need to go to bed.
John
* [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 8:11 ` Grant Grundler
@ 2003-02-09 12:21 ` Matthew Wilcox
0 siblings, 0 replies; 26+ messages in thread
From: Matthew Wilcox @ 2003-02-09 12:21 UTC (permalink / raw)
To: Grant Grundler; +Cc: Matthew Wilcox, parisc-linux
On Sun, Feb 09, 2003 at 01:11:21AM -0700, Grant Grundler wrote:
> I've rebuilt "unstable" binutils + ldwa,o patch on "testing" and dropped
> them on
> ftp://ftp.parisc-linux.org/unofficial-debs/
>
> -rw-r--r-- 1 grundler ftpadmin 478700 Feb 8 23:51 binutils-dev_2.13.90.0.16-1_hppa.deb
> -rw-r--r-- 1 grundler ftpadmin 3311272 Feb 8 23:53 binutils-multiarch_2.13.90.0.16-1_hppa.deb
> -rw-r--r-- 1 grundler ftpadmin 2307548 Feb 8 23:54 binutils_2.13.90.0.16-1_hppa.deb
I've removed these. Few problems:
- Same version number as the unstable package, so an upgrade wouldn't
replace them
- Not listed in the Packages file.
- debian/changelog not updated
I'm doing a fresh build now.
--
"It's not Hollywood. War is real, war is primarily not about defeat or
victory, it is about death. I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk
* [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 3:10 ` Grant Grundler
@ 2003-02-09 12:29 ` Matthew Wilcox
2003-02-09 14:35 ` Matthew Wilcox
2003-02-09 19:14 ` Grant Grundler
0 siblings, 2 replies; 26+ messages in thread
From: Matthew Wilcox @ 2003-02-09 12:29 UTC (permalink / raw)
To: Grant Grundler; +Cc: Matthew Wilcox, parisc-linux
On Sat, Feb 08, 2003 at 08:10:14PM -0700, Grant Grundler wrote:
> Just drop that in debian/patches before building the debs.
> That patch applies clean to the binutils for unstable.
> I don't even pretend to understand how to properly build
> debian packages, much less binutils or "cross release" builds.
> That's why I made the tarball for "testing".
OK. It's not really hard, but I don't mind doing it. It's not like
this is a regular occurrence. It's building now; when I've finished
I'll upload it to unofficial-debs and people will be able to apt-get
install it from there.
> I don't care to find out the hard way.
> I'd rather just comply with the architecture and not worry about it.
> If someone can demonstrate a perf advantage or issue, I'll be
> more receptive.
Sure. When someone's trying to implement futexes, this may prove critical..
> > One final point.... up till now, we've been telling people it's OK to
> > run kernels configured for PA1.1 on PA2.0 processors. This patch says
> > to me that's not safe.
>
> Only for SMP. I think for UP the rule still holds.
Agree. That was implicit in the kernel list I gave later, but I should've
stated it explicitly.
> Unfortunately yes.
>
> OTOH, PA20 SMP still hasn't proven stable so maybe it's not worth
> doing at the moment either. Once PA20 SMP is stable, we could drop
> the 64-bit UP kernels since most systems that *require* 64-bit are SMP.
Sure, but there's a measurable performance difference if you compile out
spinlocks
--
"It's not Hollywood. War is real, war is primarily not about defeat or
victory, it is about death. I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk
* [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 12:29 ` Matthew Wilcox
@ 2003-02-09 14:35 ` Matthew Wilcox
2003-02-09 19:14 ` Grant Grundler
1 sibling, 0 replies; 26+ messages in thread
From: Matthew Wilcox @ 2003-02-09 14:35 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Grant Grundler, parisc-linux
On Sun, Feb 09, 2003 at 12:29:13PM +0000, Matthew Wilcox wrote:
> OK. It's not really hard, but I don't mind doing it. It's not like
> this is a regular occurrence. It's building now; when I've finished
> I'll upload it to unofficial-debs and people will be able to apt-get
> install it from there.
done. http://ftp.parisc-linux.org/unofficial-debs/README
untested as I don't have root on any machine that's convenient.
--
"It's not Hollywood. War is real, war is primarily not about defeat or
victory, it is about death. I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 2:03 ` Matthew Wilcox
2003-02-09 2:18 ` John David Anglin
@ 2003-02-09 14:55 ` James Bottomley
1 sibling, 0 replies; 26+ messages in thread
From: James Bottomley @ 2003-02-09 14:55 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Randolph Chung, John David Anglin, grundler, parisc-linux
On Sat, 2003-02-08 at 20:03, Matthew Wilcox wrote:
> I think it's even worse than that. What stops gcc reordering:
>
> typedef struct {
> spinlock_t lock;
> volatile int counter;
> } rwlock_t;
>
> static __inline__ void _raw_read_lock(rwlock_t *rw)
> {
> while (__ldcw (&(x)->lock) == 0) \
> while (((x)->lock) == 0) ; } while (0)
> rw->counter++;
> do { (x)->lock = 1; } while(0)
> }
>
> to:
>
> while (__ldcw (&(x)->lock) == 0) \
> while (((x)->lock) == 0) ; } while (0)
> do { (x)->lock = 1; } while(0)
> rw->counter++;
Compilers themselves have fairly strong reordering rules. For instance,
do { } while() blocks cannot be reordered like your example (that's why
the kernel uses do { } while(0); as a compiler reordering barrier).
You can if you prefer use the barrier() macro, which prevents the
compiler from reordering statements around it.
Of course, the processor can still reorder what the compiler doesn't as
part of its speculation and optimisation (that's what mb(), rmb() and
wmb() are all about).
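The distinction between a compiler barrier and a processor barrier can be sketched in portable C11. This is an analogy under stated assumptions: atomic_signal_fence() stands in for barrier() (it constrains the compiler only) and atomic_thread_fence() stands in for mb() (it also orders the hardware where the target needs it). The demo_ names are hypothetical, and only the explicit fences are relied on for ordering here.

```c
#include <assert.h>
#include <stdatomic.h>

/* Compiler-only: prevents the compiler from reordering memory accesses
 * across this point, but emits no hardware fence instruction. */
static inline void demo_compiler_barrier(void)
{
    atomic_signal_fence(memory_order_seq_cst);
}

/* Full barrier: orders accesses as seen by other processors as well. */
static inline void demo_mb(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

static int demo_payload;
static atomic_int demo_ready;

/* Writer: make the payload visible before the flag. */
static void demo_publish(int value)
{
    demo_payload = value;
    demo_mb();
    atomic_store_explicit(&demo_ready, 1, memory_order_relaxed);
}

/* Reader: returns -1 until the flag is set, then the payload. */
static int demo_consume(void)
{
    if (!atomic_load_explicit(&demo_ready, memory_order_relaxed))
        return -1;
    demo_mb();
    return demo_payload;
}
```

On a single strongly ordered processor demo_compiler_barrier() would be enough in place of demo_mb(); with multiple weakly ordered processors it would not.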
James
* [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 12:29 ` Matthew Wilcox
2003-02-09 14:35 ` Matthew Wilcox
@ 2003-02-09 19:14 ` Grant Grundler
2003-02-09 21:24 ` Aaron St. Pierre
2003-02-09 23:56 ` Randolph Chung
1 sibling, 2 replies; 26+ messages in thread
From: Grant Grundler @ 2003-02-09 19:14 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: parisc-linux
On Sun, Feb 09, 2003 at 12:29:13PM +0000, Matthew Wilcox wrote:
> I'll upload it to unofficial-debs and people will be able to apt-get
> install it from there.
cool - thanks.
> Sure, but there's a measurable performance difference if you compile out
> spinlocks
if people care about the last 5% performance, they still have several
options:
o build your own kernel
o run HPUX
I've not done or seen any lmbench perf results recently but before hpux
was 10x faster on context switch/task related stuff. And X11 server
takes advantage of HW acceleration.
(I'm associating UP systems with graphics which isn't always true)
grant
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 19:14 ` Grant Grundler
@ 2003-02-09 21:24 ` Aaron St. Pierre
2003-02-10 16:47 ` Grant Grundler
2003-02-09 23:56 ` Randolph Chung
1 sibling, 1 reply; 26+ messages in thread
From: Aaron St. Pierre @ 2003-02-09 21:24 UTC (permalink / raw)
To: Grant Grundler; +Cc: Matthew Wilcox, parisc-linux
In another life Grant Grundler wrote:
> On Sun, Feb 09, 2003 at 12:29:13PM +0000, Matthew Wilcox wrote:
> > I'll upload it to unofficial-debs and people will be able to apt-get
> > install it from there.
>
> cool - thanks.
>
> > Sure, but there's a measurable performance difference if you compile out
> > spinlocks
>
> if people care about the last 5% performance, they still have several
> options:
> o build your own kernel
As far as you know, am I the only person who has reported not being
able to boot a 2.4.20-pa?? kernel compiled either natively or cross?
By the way, I tried pa24 today, though I haven't built it natively
yet I cross compiled it to no avail...
> o run HPUX
>
> I've not done or seen any lmbench perf results recently but before hpux
> was 10x faster on context switch/task related stuff. And X11 server
> takes advantage of HW acceleration.
> (I'm associating UP systems with graphics which isn't always true)
>
> grant
--
Aaron St. Pierre tel: 978.828.6177
asp@ungod.com
Either I'm dead or my watch has stopped.
-- Groucho Marx's last words
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 19:14 ` Grant Grundler
2003-02-09 21:24 ` Aaron St. Pierre
@ 2003-02-09 23:56 ` Randolph Chung
1 sibling, 0 replies; 26+ messages in thread
From: Randolph Chung @ 2003-02-09 23:56 UTC (permalink / raw)
To: Grant Grundler; +Cc: Matthew Wilcox, parisc-linux
> I've not done or seen any lmbench perf results recently but before hpux
well, we can fix that... :) Here are the numbers for a 2x440MHz A500 running
2.4.20-pa24 SMP.
btw, if you look at
http://lists.parisc-linux.org/pipermail/parisc-linux/2002-April/015984.html
we are only 2-3x slower than hpux :-)
I'm a bit confused about the "mmap latency" numbers. why are they so
high?
Related to this, thibaut and I were experimenting with running dbench on
two A500s today running identical kernels, but one is 2x440MHz and one
is 2x550MHz. The 550MHz is almost 10x faster. Is that expected?
Memory/disk configurations (3GB RAM in 440MHz, 2.5GB in 550MHz, etc) are
not exactly the same, but I wouldn't have expected them to differ by that much.
randolph
L M B E N C H 2 . 0 S U M M A R Y
------------------------------------

Basic system parameters
----------------------------------------------------
Host                 OS Description              Mhz
--------- ------------- ----------------------- ----
ios       Linux 2.4.20-        hppa64-linux-gnu  440

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host                 OS  Mhz null null      open selct  sig  sig fork exec   sh
                             call  I/O stat clos   TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
ios       Linux 2.4.20-  440 0.73 1.85 10.2 11.8 110.8 2.17 16.9 17.K 41.K 85.K

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host                 OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
ios       Linux 2.4.20- 5.630 5.1500 4.9900 9.0800   23.4    52.1   175.8

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host                 OS 2p/0K  Pipe   AF   UDP  RPC/   TCP  RPC/  TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
ios       Linux 2.4.20- 5.630  20.4 48.4  89.2 172.5 114.1 351.5 1000

File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host                 OS    0K File      10K File        Mmap  Prot  Page
                        Create Delete Create Delete Latency Fault Fault
--------- ------------- ------ ------ ------ ------ ------- ----- -----
ios       Linux 2.4.20-  130.0   80.0  509.9  180.0 3120.0K 603.4  58.0

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host                 OS Pipe   AF  TCP   File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
ios       Linux 2.4.20- 49.1 81.0 33.6   74.0  268.4  201.8  168.8 260. 250.9

Memory latencies in nanoseconds - smaller is better
(WARNING - may not be correct, check graphs)
---------------------------------------------------
Host                 OS  Mhz  L1 $   L2 $ Main mem Guesses
--------- ------------- ---- ----- ------ -------- -------
ios       Linux 2.4.20-  440  20.0   20.0    220.0 No L1 cache?
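
[Editorial note, not part of the original message.] The summary above abbreviates large values with a trailing "K" (for example "17.K" for fork proc). A minimal sketch of turning the "Processor, Processes" row into plain microsecond values is shown below. It assumes that "K" means thousands; the helper name parse_lmbench_value and the column labels are illustrative and do not come from lmbench itself.

```python
def parse_lmbench_value(field: str) -> float:
    """Convert one lmbench summary field to a plain number.

    Assumption: a trailing "K" means the value is in thousands,
    so "17.K" is read as 17000 and "3120.0K" as 3120000.
    """
    field = field.strip()
    scale = 1.0
    if field.endswith("K"):
        scale = 1000.0
        field = field[:-1]
    return float(field) * scale


# Column labels (illustrative) and values copied from the
# "Processor, Processes" row of the summary above.
columns = ["null call", "null I/O", "stat", "open clos", "selct TCP",
           "sig inst", "sig hndl", "fork proc", "exec proc", "sh proc"]
row = "0.73 1.85 10.2 11.8 110.8 2.17 16.9 17.K 41.K 85.K".split()

latencies_us = {name: parse_lmbench_value(v) for name, v in zip(columns, row)}

for name, value in latencies_us.items():
    print(f"{name:10s} {value:10.2f} us")
```

With the values expanded this way, the fork/exec/sh figures can be compared directly against another lmbench run, provided both were parsed under the same "K" assumption.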
* [parisc-linux] Re: [parisc-linux-cvs] linux grundler
@ 2003-02-10 8:38 John Marvin
0 siblings, 0 replies; 26+ messages in thread
From: John Marvin @ 2003-02-10 8:38 UTC (permalink / raw)
To: parisc-linux
> 2.4.20-pa24 PA20 memory ordering
> Kudos to John David Anglin and Carlos O'Donnell for realizing
> PA 2.0 is not strongly ordered like PA1.x is.
> Read appendix G of PA-RISC 2.0 Architecture (Gerry Kane) for
> details on "Memory Ordering Model".
Sorry, this is wrong. I'm afraid you are wasting your time with all of
these code changes.
The problem is that the Kane book defines the architecture, i.e. it
defines what can be done, not what has been done. Of course, to know what
has been done you have to read the various processor ERS's, and I'm not
sure we've made any of the PA2.0 chip ERS's available.
Anyway, no PA2.0 processor has implemented the PSW O bit, TLB O bit or
support for the ,o completers (they are just ignored). All of the PA2.0
processors are strongly ordered, just like the PA1.x processors are. I
can pretty much guarantee that no future PA processor is going to change
that fact.
John Marvin
jsm@fc.hp.com
* Re: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
2003-02-09 21:24 ` Aaron St. Pierre
@ 2003-02-10 16:47 ` Grant Grundler
0 siblings, 0 replies; 26+ messages in thread
From: Grant Grundler @ 2003-02-10 16:47 UTC (permalink / raw)
To: Aaron St. Pierre; +Cc: parisc-linux
On Sun, Feb 09, 2003 at 04:24:54PM -0500, Aaron St. Pierre wrote:
> In another life Grant Grundler wrote:
> > if people care about the last 5% performance, they still have several
> > options:
> > o build your own kernel
>
> As far as you know, am I the only person who has reported not being
> able to boot a 2.4.20-pa?? kernel, compiled either natively or cross?
yes. So it's not an option for you.
> > o run HPUX
> >
> > I've not done or seen any lmbench perf results recently but before hpux
> > was 10x faster on context switch/task related stuff. And X11 server
Seems I remembered this partially wrong.
Linux was slightly faster in lots of areas, ~5x slower on fork/exec/sh,
and ~10x slower on mmap latency.
http://lists.parisc-linux.org/pipermail/parisc-linux/2002-March/015966.html
grant
* [parisc-linux] Re: [parisc-linux-cvs] linux grundler
[not found] <20030708022259.B62B849404E@palinux.hppa>
@ 2003-07-08 15:06 ` Carlos O'Donell
0 siblings, 0 replies; 26+ messages in thread
From: Carlos O'Donell @ 2003-07-08 15:06 UTC (permalink / raw)
To: parisc-linux; +Cc: parisc-linux-cvs
> Builds/links/boots using my c3000 .config.
Same here. Not dying under load either :)
c.
* RE: [parisc-linux] Re: [parisc-linux-cvs] linux grundler
@ 2003-07-09 10:35 Joel Soete
0 siblings, 0 replies; 26+ messages in thread
From: Joel Soete @ 2003-07-09 10:35 UTC (permalink / raw)
To: Carlos O'Donell, parisc-linux; +Cc: parisc-linux-cvs
>>
>>> Builds/links/boots using my c3000 .config.
>
>Same here. Not dying under load either :)
>
>c.
hmm, I tested NS87... successfully built in, but it failed as a module:
modprobe takes a page fault, lsmod then shows the module as 'busy', and so
the system failed to reboot (only the power-off button worked, so the
filesystems were not cleanly unmounted :( )
Cheers,
Joel
end of thread
Thread overview: 26+ messages
[not found] <20030708022259.B62B849404E@palinux.hppa>
2003-07-08 15:06 ` [parisc-linux] Re: [parisc-linux-cvs] linux grundler Carlos O'Donell
2003-07-09 10:35 Joel Soete
-- strict thread matches above, loose matches on Subject: below --
2003-02-10 8:38 John Marvin
2003-02-09 8:55 [parisc-linux] " John Marvin
2003-02-09 7:40 John Marvin
2003-02-09 8:26 ` [parisc-linux] " Grant Grundler
[not found] <20030208222242.AA3554829@dsl2.external.hp.com>
[not found] ` <20030208222746.GB19683@dsl2.external.hp.com>
2003-02-08 23:23 ` Matthew Wilcox
2003-02-09 0:35 ` John David Anglin
2003-02-09 0:49 ` Randolph Chung
2003-02-09 0:56 ` Randolph Chung
2003-02-09 2:03 ` Matthew Wilcox
2003-02-09 2:18 ` John David Anglin
2003-02-09 14:55 ` James Bottomley
2003-02-09 2:11 ` John David Anglin
2003-02-09 3:27 ` Grant Grundler
2003-02-09 4:06 ` John David Anglin
2003-02-09 3:10 ` Grant Grundler
2003-02-09 12:29 ` Matthew Wilcox
2003-02-09 14:35 ` Matthew Wilcox
2003-02-09 19:14 ` Grant Grundler
2003-02-09 21:24 ` Aaron St. Pierre
2003-02-10 16:47 ` Grant Grundler
2003-02-09 23:56 ` Randolph Chung
2003-02-09 8:11 ` Grant Grundler
2003-02-09 12:21 ` Matthew Wilcox
[not found] <20020804011707.2B9164860@dsl2.external.hp.com>
2002-08-04 2:03 ` Grant Grundler