From: Ben Gardiner
Date: Fri, 17 Oct 2008 09:12:07 -0400
To: Ganesh Kumar N M
Cc: linuxppc-dev@ozlabs.org, linuxppc-embedded@ozlabs.org
Subject: Re: Loadable module crashes at kernel stack overflow or machine check
Message-ID: <48F88F27.4040107@nanometrics.ca>
In-Reply-To: <004c01c92f92$bca991d0$0d01a8c0@signet>
List-Id: Linux on PowerPC Developers Mail List

Ganesh Kumar N M wrote:
> Hi All,
>
> I'm working on MPC860 with Montavista linux 2.4.18
> We have a Linux kernel loadable module which on loading
> panicks after some random time say 8 hours, 4 hours or so
> the oops outputs say either machine check exception or
> kernel stack overflow (randomly both show up) are as below:

I don't know for sure what could be causing your problem. I can only suggest some patches that have helped us in the past.

I'm not familiar with MontaVista's kernel versions, but I know our 2.4.24 kernel did not have the 'separate I-TLB error and miss handling' patch ( http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html ), and without it our applications segfaulted for no apparent reason.

I also suggest applying the CPU15 fix ( http://git.denx.de/?p=linuxppc_2_4_devel.git;a=commit;h=baf9a6caca75b1f338ae370669e5882809000164 and http://git.denx.de/?p=linuxppc_2_4_devel.git;a=commit;h=3ad403717f1d9c6a09ec41a5b016ac5245591122 ) and enabling it temporarily to see whether the problem is the unlucky placement of a branch instruction at the end of a page. If you are considering running production code with the patch enabled, evaluate your application's performance carefully, as the patch introduces significant overhead.
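
To make "a branch instruction at the end of a page" concrete, here is a rough sketch of a check for that placement in a block of code. It is not the CPU15 patch and does not reproduce its logic. The 4 KiB page size, the function names and the idea of scanning instruction words already loaded as host-order 32-bit values are all assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this sketch: 4 KiB pages, 4-byte instructions. */
#define PAGE_BYTES 4096u
#define INSN_BYTES 4u

/* True if the word is a PowerPC branch: b (primary opcode 18),
 * bc (16), or bclr/bcctr (primary 19, extended opcode 16 or 528). */
static bool is_branch(uint32_t insn)
{
    uint32_t primary = insn >> 26;
    uint32_t extended = (insn >> 1) & 0x3ffu;

    if (primary == 18u || primary == 16u)
        return true;
    return primary == 19u && (extended == 16u || extended == 528u);
}

/* True if addr is the last instruction slot of its page. */
static bool is_last_word_of_page(uint32_t addr)
{
    return (addr & (PAGE_BYTES - 1u)) == PAGE_BYTES - INSN_BYTES;
}

/* Count branches that sit in the last word of a page, for n
 * instruction words that would be loaded starting at address base. */
static size_t count_page_end_branches(const uint32_t *code, size_t n,
                                      uint32_t base)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t addr = base + (uint32_t)(i * INSN_BYTES);

        if (is_last_word_of_page(addr) && is_branch(code[i]))
            hits++;
    }
    return hits;
}
```

Running something like count_page_end_branches() over a module's text section at its load address would only tell you whether such a placement exists; it says nothing about whether it is the cause of the crashes, which is what temporarily enabling the fix is meant to test.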

Regards,

Ben Gardiner
Nanometrics Seismological Instruments
250 Herzberg Rd., Kanata, ON, CA, K2K 2A1