* "Illegal instruction" traps on smp clients - 2.4.19
From: Rudy Klinksiek @ 2003-02-27 14:44 UTC
To: linuxppc-dev
Hello:
This is a message that was posted last week on linux-smp.
No responses, so I'm rewriting/reposting here.
Our configuration uses Linux 2.4.19 from Synergy (derived
from YellowDog version 2.1). We have several boards
configured in a server/client relationship. These boards
contain either 2 or 4 G4 AltiVec PPC processors. The
server has an attached disk; clients are diskless, mounting
their root file system over NFS.
I am seeing frequent "Illegal instruction" traps on clients
that run an smp kernel. Other symptoms include failure of
various daemons during startup ( syslogd, crond, sshd, etc ).
Symptoms also occur during rsh/rlogin usage.
Running a UP kernel on clients works just fine.
Smp and UP kernels work fine on the "server".
Has anyone else seen this type of problem or something similar?
This appears to me to be an smp problem.
A fix relating to page table/tlb invalidation ordering
was detailed by Sunil Saxena at
http://www.cs.helsinki.fi/linux/linux-kernel/2002-20/0756.html
for the x86 architecture, and these mods seem to have made it
into 2.4.18. The ppc arch was not addressed. I have also
noticed this problem being addressed starting in 2.5.16.
It's not really practical for me to use 2.5.xx at this point.
I am hoping that someone familiar with this code and the
ppc architecture can verify that this is indeed a problem
for 2.4.19.
And then, what can I do about it? I'm willing to try things
as my time permits. I have looked at 2.5.60 memory.c/mmap.c
and related functions, and trying to port the new methods
back to 2.4.19 seems to be a rather daunting task.
Comments, suggestions?
My background involves writing device drivers for VMS,
Solaris, and now Linux.
Any assistance or guidance would be appreciated.
Thanks
klink
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
* Re: "Illegal instruction" traps on smp clients - 2.4.19
From: Ethan Weinstein @ 2003-02-27 15:57 UTC
To: Rudy Klinksiek, linuxppc-dev
Rudy Klinksiek wrote:
| Smp and UP kernels work fine on the "server".
|
| Has anyone else seen this type of problem or something similar?
|
I have. I run a dual G4 (7400), also with CONFIG_HIGHMEM, and got plenty of
random SIGILLs, SIGABRTs and daemons crashing upon startup until I
started using Ben's bk tree. I've been using 2.4.20-ben(x) for quite
some time now without this problem.
| This appears to me to be an smp problem.
| And then, what can I do about it? I'm willing to try things
| as my time permits. I have looked at 2.5.60 memory.c/mmap.c
| and related functions, and trying to port the new methods
| back to 2.4.19 seems to be a rather daunting task.
|
| Comments, suggestions?
|
Try rsyncing Ben's bk tree and using it; it cleared up this issue for me.
rsync -avz --delete rsync.penguinppc.org::linux-2.4-benh
-Ethan Weinstein
* Re: "Illegal instruction" traps on smp clients - 2.4.19
From: Benjamin Herrenschmidt @ 2003-02-27 16:23 UTC
To: Ethan Weinstein; +Cc: Rudy Klinksiek, linuxppc-dev
On Thu, 2003-02-27 at 16:57, Ethan Weinstein wrote:
> Rudy Klinksiek wrote:
>
> | Smp and UP kernels work fine on the "server".
> |
> | Has anyone else seen this type of problem or something similar?
> |
>
> I have. I run a dual G4 (7400), also with CONFIG_HIGHMEM, and got plenty of
> random SIGILLs, SIGABRTs and daemons crashing upon startup until I
> started using Ben's bk tree. I've been using 2.4.20-ben(x) for quite
> some time now without this problem.
>
> | This appears to me to be an smp problem.
>
> | And then, what can I do about it? I'm willing to try things
> | as my time permits. I have looked at 2.5.60 memory.c/mmap.c
> | and related functions, and trying to port the new methods
> | back to 2.4.19 seems to be a rather daunting task.
> |
> | Comments, suggestions?
> |
> Try rsyncing Ben's bk tree and using it; it cleared up this issue for me.
>
> rsync -avz --delete rsync.penguinppc.org::linux-2.4-benh
This is actually my "stable" tree (2.4.20-ben7 at this time). Please
let me know if it's also fixed by the linuxppc_2_4 PPC bk tree (see
penguinppc.org for details on how to get these).
Ben.
* RE: "Illegal instruction" traps on smp clients - 2.4.19
From: David Bryan @ 2003-02-27 16:37 UTC
To: Rudy Klinksiek; +Cc: linuxppc-dev
Rudy,
We saw problems like this about a year ago when running SMP on dual PPC7400
boards.
There is an I/O signal called SHD between the processors in a multiprocessing
system. This signal is used to indicate when a reservation is held on an
address. The lwarx/stwcx. instruction pair uses reservations to guarantee
atomicity in SMP systems. (The lwarx/stwcx. instructions are used
extensively in the kernel, particularly in the spinlock routines.) To enable
use of the SHD signal, the 7400 has to be either in MESI mode or in MEI mode
with SHD explicitly enabled. These modes are controlled by two bits in
the Memory Subsystem Control Register (MSSCR0). At reset, MSSCR0
defaults to MEI mode with the SHD signal disabled. By placing the 7400 in
MESI mode at boot, we solved the problem.
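As a rough illustration of the reserve / conditional-store pattern mentioned above (this is not the kernel's spinlock code), here is a portable C11 sketch. The function name is made up for the example. On PowerPC, compilers typically implement this kind of compare-exchange retry loop with a lwarx/stwcx. pair, which is why every CPU in the system has to agree on who holds the reservation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: atomically add `delta` to *word and return the
 * new value. The loop mirrors load-reserve / store-conditional: read the
 * word, compute the update, and store it only if no other processor
 * changed the word in between; otherwise retry with the fresh value. */
static uint32_t atomic_add_return_sketch(_Atomic uint32_t *word, uint32_t delta)
{
    assert(word != NULL);

    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t updated;

    do {
        updated = old + delta;
        /* On failure, `old` is reloaded with the current contents. */
    } while (!atomic_compare_exchange_weak_explicit(
                 word, &old, updated,
                 memory_order_acq_rel, memory_order_relaxed));

    return updated;
}
```

If the conditional store could succeed on one CPU while another CPU has modified the same word, two callers could both believe they performed the update, which would be consistent with the random faults described in this thread.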
Hope this helps,
Dave
--
David Bryan, The PTR Group, Inc. (www.ThePTRGroup.com)
Embedded, Real-Time Solutions, and Training
* Re: "Illegal instruction" traps on smp clients - 2.4.19
From: Michael R. Zucca @ 2003-02-27 18:26 UTC
To: David Bryan; +Cc: Rudy Klinksiek, linuxppc-dev
David Bryan wrote:
> These modes are controlled by two bits in
> the Memory subsystem control register (MSSCR0). At reset, the MSSCR0
> defaults to MEI mode with the SHD signal disabled. By placing the 7400 in
> MESI mode at boot, we solved the problem.
Would you care to share what MSSCR0 bits these were and what you set
them to? :-)
--
----------------------------------------------
Michael Zucca - mrz5149@acm.org
----------------------------------------------
"I'm too old to use Emacs." -- Rod MacDonald
----------------------------------------------
* RE: "Illegal instruction" traps on smp clients - 2.4.19
From: David Bryan @ 2003-02-27 20:32 UTC
To: Michael R. Zucca; +Cc: Rudy Klinksiek, linuxppc-dev
Michael,
Certainly, be glad to... however, I would make sure you fully understand
the operation of these bits and their applicability to your system.
On the 7400/7410 the MSSCR0 is accessed as SPR 1014.
At reset, the MSSCR0 is initialized to all 0s.
We OR'd 0x8000 with MSSCR0.
Bit 0 (MSB) controls whether the 7400 runs MEI or MESI coherency protocol.
Setting Bit 0 (SHDEN) causes the 7400 to implement "a 4-state MESI protocol
similar to the MPC604e family of processors". Bit 1 (SHDPEN3) is valid only
in MEI mode, so its value is ignored when SHDEN is set. In this
configuration, the 7400 will drive/sample the SHD/0/1 pins depending on the
bus mode (MPX or 60x).
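To make the bit positions concrete, here is a small C sketch of the masks involved. It only manipulates values; the helper and enum names are invented for the example. It assumes MSSCR0 is treated as a 32-bit register with PowerPC bit numbering (bit 0 is the MSB), so SHDEN works out to 0x80000000. On that assumption, the 0x8000 mentioned above would be the upper-halfword immediate (as used with an `oris` instruction), but that is my reading, not something stated in the message.

```c
#include <assert.h>
#include <stdint.h>

/* PowerPC numbers bits from the most significant end: bit 0 is the MSB. */
static uint32_t ppc_bit32(unsigned bit)
{
    assert(bit < 32);
    return UINT32_C(1) << (31 - bit);
}

#define MSSCR0_SPR      1014           /* SPR number given in the message */
#define MSSCR0_SHDEN    ppc_bit32(0)   /* bit 0: select MESI (assumed mask 0x80000000) */
#define MSSCR0_SHDPEN3  ppc_bit32(1)   /* bit 1: SHD enable, only meaningful in MEI mode */

/* Hypothetical names for the three configurations described in the thread. */
enum msscr0_coherency {
    MSSCR0_MEI_SHD_DISABLED,   /* reset default: all bits zero */
    MSSCR0_MEI_SHD_ENABLED,
    MSSCR0_MESI
};

/* Return the register value with MESI mode selected. */
static uint32_t msscr0_select_mesi(uint32_t msscr0)
{
    return msscr0 | MSSCR0_SHDEN;
}

/* Decode which coherency configuration a register value describes. */
static enum msscr0_coherency msscr0_decode_mode(uint32_t msscr0)
{
    if (msscr0 & MSSCR0_SHDEN)
        return MSSCR0_MESI;            /* SHDPEN3 is ignored once SHDEN is set */
    if (msscr0 & MSSCR0_SHDPEN3)
        return MSSCR0_MEI_SHD_ENABLED;
    return MSSCR0_MEI_SHD_DISABLED;
}
```

On real hardware the value would be read and written with mfspr/mtspr on SPR 1014 from privileged boot code; that part is left out here because it cannot run outside a 7400/7410 target.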
Dave
David Bryan www.ThePTRGroup.com