linuxppc-dev.lists.ozlabs.org archive mirror
* "Illegal instruction" traps on smp clients - 2.4.19
@ 2003-02-27 14:44 Rudy Klinksiek
  2003-02-27 15:57 ` Ethan Weinstein
  2003-02-27 16:37 ` David Bryan
  0 siblings, 2 replies; 6+ messages in thread
From: Rudy Klinksiek @ 2003-02-27 14:44 UTC (permalink / raw)
  To: linuxppc-dev


Hello:
	This is a message that was posted last week on linux-smp.
	No responses, so I'm rewriting/reposting here.

	Our configuration uses Linux 2.4.19 from Synergy (derived
	from YellowDog version 2.1). We have several boards
	configured in a server/client relationship. These boards
	contain either 2 or 4 G4 (AltiVec-equipped) PPC processors.  The
	server has an attached disk; the clients are diskless and mount
	their root file system over NFS.

	I am seeing frequent "Illegal instruction" traps on clients
	that run an SMP kernel.  Other symptoms include failure of
	various daemons during startup (syslogd, crond, sshd, etc.).
	Symptoms also occur during rsh/rlogin usage.

	Running a UP kernel on clients works just fine.

	SMP and UP kernels work fine on the "server".

	Has anyone else seen this type of problem or something similar?

	This appears to me to be an SMP problem.

	A fix relating to page table/TLB invalidation ordering
	was detailed by Sunil Saxena at
	http://www.cs.helsinki.fi/linux/linux-kernel/2002-20/0756.html
	for the x86 architecture, and these mods seem to have made it
	into 2.4.18.  The ppc arch was not addressed.  I have also
	noticed this problem being addressed starting in 2.5.16.

	It's not really practical for me to use 2.5.xx at this point.

	I am hoping that someone familiar with this code and the
	ppc architecture can verify that this is indeed a problem
	for 2.4.19.

	And then, what can I do about it?  I'm willing to try things
	as my time permits.  I have looked at 2.5.60 memory.c/mmap.c
	and related functions, and trying to port the new methods
	back to 2.4.19 seems to be a rather daunting task.

	Comments, suggestions?

	My background involves writing device drivers for VMS,
	Solaris, and now Linux.

	Any assistance or guidance would be appreciated.

	Thanks
	klink


** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/


* Re: "Illegal instruction" traps on smp clients - 2.4.19
  2003-02-27 14:44 "Illegal instruction" traps on smp clients - 2.4.19 Rudy Klinksiek
@ 2003-02-27 15:57 ` Ethan Weinstein
  2003-02-27 16:23   ` Benjamin Herrenschmidt
  2003-02-27 16:37 ` David Bryan
  1 sibling, 1 reply; 6+ messages in thread
From: Ethan Weinstein @ 2003-02-27 15:57 UTC (permalink / raw)
  To: Rudy Klinksiek, linuxppc-dev



Rudy Klinksiek wrote:

| 	Smp and UP kernels work fine on the "server".
|
| 	Has anyone else seen this type of problem or something similar?
|

I have. I run a dual G4 (7400), also with CONFIG_HIGHMEM, and got plenty of
random SIGILLs, SIGABRTs and daemons crashing upon startup until I
started using Ben's bk tree.  I've been using 2.4.20-ben(x) for quite
some time now without this problem.

|         This appears to me to be an smp problem.

| 	And then, what can I do about it?  I'm willing to try things
| 	as my time permits.  I have looked at 2.5.60 memory.c/mmap.c
| 	and related functions, and trying to port the new methods
| 	back to 2.4.19 seems to be a rather daunting task.
|
|         Comments, suggestions?
|
Try rsyncing Ben's bk tree and using it; it cleared up this issue for me.

rsync -avz --delete rsync.penguinppc.org::linux-2.4-benh


-Ethan Weinstein



* Re: "Illegal instruction" traps on smp clients - 2.4.19
  2003-02-27 15:57 ` Ethan Weinstein
@ 2003-02-27 16:23   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-27 16:23 UTC (permalink / raw)
  To: Ethan Weinstein; +Cc: Rudy Klinksiek, linuxppc-dev


On Thu, 2003-02-27 at 16:57, Ethan Weinstein wrote:
> Try rsyncing ben's bk tree and using it, it cleared up this issue for me.
>
> rsync -avz --delete rsync.penguinppc.org::linux-2.4-benh

This is actually my "stable" tree (2.4.20-ben7 at this time). Please
let me know if it's also fixed by the linuxppc_2_4 PPC bk tree (see
penguinppc.org for details on how to get these).

Ben.



* RE: "Illegal instruction" traps on smp clients - 2.4.19
  2003-02-27 14:44 "Illegal instruction" traps on smp clients - 2.4.19 Rudy Klinksiek
  2003-02-27 15:57 ` Ethan Weinstein
@ 2003-02-27 16:37 ` David Bryan
  2003-02-27 18:26   ` Michael R. Zucca
  1 sibling, 1 reply; 6+ messages in thread
From: David Bryan @ 2003-02-27 16:37 UTC (permalink / raw)
  To: Rudy Klinksiek; +Cc: linuxppc-dev


Rudy,

We saw problems like this about a year ago when running SMP on dual PPC7400
boards.

There is an I/O signal called SHD between the processors in a multiprocessing
system.  This signal is used to indicate when a reservation is held on an
address.  The lwarx/stwcx. instruction pair uses reservations to guarantee
atomicity in SMP systems.  (The lwarx/stwcx. instructions are used
extensively in the kernel, particularly in the spinlock routines.)  To enable
use of the SHD signal, the 7400 has to be either in MESI mode or in MEI mode
with SHD explicitly enabled.  These modes are controlled by two bits in
the Memory Subsystem Control Register (MSSCR0).  At reset, MSSCR0
defaults to MEI mode with the SHD signal disabled.  By placing the 7400 in
MESI mode at boot, we solved the problem.
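
For readers unfamiliar with reservation-based atomics, here is a rough,
portable sketch of the retry-loop shape that a lwarx/stwcx. sequence
implements.  It is not the kernel's spinlock code: the function name
atomic_inc_return_sketch is made up for illustration, and it uses C11
<stdatomic.h> compare-exchange as a stand-in.  Compilers targeting
PowerPC typically lower such a loop to lwarx/stwcx., but that is an
assumption about the toolchain, not something taken from this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical helper (illustration only, not kernel code):
 * atomically add 1 to *counter and return the new value.
 *
 * The loop mirrors a load-reserve / store-conditional sequence:
 * read the old value, compute the new one, and store it only if no
 * other CPU changed the word in between; otherwise retry.
 */
unsigned int atomic_inc_return_sketch(atomic_uint *counter)
{
    unsigned int old;

    assert(counter != NULL);
    old = atomic_load_explicit(counter, memory_order_relaxed);

    /* On failure, 'old' is refreshed with the current value, so just loop. */
    while (!atomic_compare_exchange_weak_explicit(counter, &old, old + 1u,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed)) {
        /* another CPU won the race; try again with the updated 'old' */
    }
    return old + 1u;
}
```

If the store-conditional half of such a loop can wrongly succeed because
the other processor's reservation is not honoured, two CPUs can both
believe they own the same lock, which would fit the kind of random
corruption described in this thread.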

Hope this helps,

Dave
----------------------------------------------
     T h e   P T R   G r o u p,   I n c.
----------------------------------------------
       ->->->->  ->->->->->   ->->->->
             ->     ->               ->
     ->->->->      ->       ->->->->
    ->            ->       ->      ->
   ->            ->       ->       ->
----------------------------------------------
 Embedded, Real-Time Solutions, and Training

David Bryan                www.ThePTRGroup.com
----------------------------------------------




* Re: "Illegal instruction" traps on smp clients - 2.4.19
  2003-02-27 16:37 ` David Bryan
@ 2003-02-27 18:26   ` Michael R. Zucca
  2003-02-27 20:32     ` David Bryan
  0 siblings, 1 reply; 6+ messages in thread
From: Michael R. Zucca @ 2003-02-27 18:26 UTC (permalink / raw)
  To: David Bryan; +Cc: Rudy Klinksiek, linuxppc-dev


David Bryan wrote:
> These modes are controlled by two bits in
> the Memory subsystem control register (MSSCR0).  At reset, the MSSCR0
> defaults to MEI mode with the SHD signal disabled.  By placing the 7400 in
> MESI mode at boot, we solved the problem.

Would you care to share what MSSCR0 bits these were and what you set
them to? :-)

--
----------------------------------------------
  Michael Zucca - mrz5149@acm.org
----------------------------------------------
  "I'm too old to use Emacs." -- Rod MacDonald
----------------------------------------------



* RE: "Illegal instruction" traps on smp clients - 2.4.19
  2003-02-27 18:26   ` Michael R. Zucca
@ 2003-02-27 20:32     ` David Bryan
  0 siblings, 0 replies; 6+ messages in thread
From: David Bryan @ 2003-02-27 20:32 UTC (permalink / raw)
  To: Michael R. Zucca; +Cc: Rudy Klinksiek, linuxppc-dev


Michael,

Certainly, I'd be glad to.  However, I would make sure you fully understand
the operation of these bits and their applicability to your system.

On the 7400/7410 the MSSCR0 is accessed as SPR 1014.

At reset, the MSSCR0 is initialized to all 0s.

We OR'd 0x8000 with MSSCR0.

Bit 0 (MSB) controls whether the 7400 runs MEI or MESI coherency protocol.
Setting Bit 0 (SHDEN) causes the 7400 to implement "a 4-state MESI protocol
similar to the MPC604e family of processors".  Bit 1 (SHDPEN3) is valid only
in MEI mode, so its value is ignored when SHDEN is set.  In this
configuration, the 7400 will drive/sample the SHD/0/1 pins depending on the
bus mode (MPX or 60x).
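
A small sketch of the bit arithmetic may help here.  Under the PowerPC
convention where bit 0 is the MSB, bit 0 of a 32-bit register is the mask
0x80000000.  The 0x8000 quoted above matches that mask only if it is the
16-bit immediate of an `oris` instruction (which ORs into the upper
halfword); that reading is an assumption, not something stated in this
message.  The helper names below are hypothetical.  The SPR number 1014
is the one given above, and the sync/isync sequence around mtspr is a
guess that should be checked against the 7400 user's manual.

```c
#include <assert.h>
#include <stdint.h>

/* Mask for bit n of a 32-bit register, with bit 0 as the MSB. */
uint32_t ppc_bit32(unsigned int n)
{
    assert(n < 32u);
    return UINT32_C(1) << (31u - n);
}

/* Effect of `oris rD,rS,imm`: OR a 16-bit immediate into the upper halfword. */
uint32_t oris(uint32_t value, uint16_t imm)
{
    return value | ((uint32_t)imm << 16);
}

/* Pure helper: MSSCR0 value with SHDEN (bit 0) set, i.e. MESI mode. */
uint32_t msscr0_enable_mesi(uint32_t msscr0)
{
    return msscr0 | ppc_bit32(0);
}

#if defined(__powerpc__) && defined(__GNUC__)
/*
 * Privileged: only usable in supervisor mode, e.g. early boot code.
 * The synchronization around mtspr is an assumption, not verified
 * against the processor manual.
 */
static inline void msscr0_set_shden(void)
{
    uint32_t v;

    __asm__ volatile("mfspr %0, 1014" : "=r"(v));
    v = msscr0_enable_mesi(v);
    __asm__ volatile("sync; mtspr 1014, %0; sync; isync"
                     : : "r"(v) : "memory");
}
#endif
```

For example, msscr0_enable_mesi(0) and oris(0, 0x8000) both give
0x80000000, the value MSSCR0 would hold after the change if it really is
all zeros at reset.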


Dave

David Bryan                www.ThePTRGroup.com
----------------------------------------------

