From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from nz-out-0102.google.com (nz-out-0102.google.com [64.233.162.193])
	by ozlabs.org (Postfix) with ESMTP id 6517E67BB7
	for ; Thu, 21 Sep 2006 17:11:04 +1000 (EST)
Received: by nz-out-0102.google.com with SMTP id i1so249100nzh
	for ; Thu, 21 Sep 2006 00:11:03 -0700 (PDT)
Date: Thu, 21 Sep 2006 00:11:02 -0700
From: "Manoj Sharma"
To: "Liu Dave-r63238"
Subject: Re: Hang with isync
In-Reply-To: <995B09A8299C2C44B59866F6391D263516C018@zch01exm21.fsl.freescale.net>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_1086_8588946.1158822662753"
References: <995B09A8299C2C44B59866F6391D263516C018@zch01exm21.fsl.freescale.net>
Cc: linuxppc-dev@ozlabs.org
List-Id: Linux on PowerPC Developers Mail List

------=_Part_1086_8588946.1158822662753
Content-Type: text/html; charset=ISO-2022-JP
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

The hang has triggered a watchdog timer exception. That points to a problem somewhere, but I don't know what it is.
Even with 2.4 it does not happen regularly. It occurs only once in a while and is not reproducible.
 
On 9/20/06, Liu Dave-r63238 <DaveLiu@freescale.com> wrote:
> First, you must make sure that it really is a watchdog timer exception.
> If it is, you need to pick a suitable way to fix it.
> Second, I don't believe the sync-isync instructions cause it.
> You can try 2.6; I don't know whether the 2.6 kernel will resolve your problem.
>
> -Dave


> Dave, the watchdog timeout is around one second, and no CPU activity for that long means something is wrong. Is it OK to disable it, which would only hide a problem lying somewhere else? Do you think the sync-isync instructions could be the cause, and that moving to 2.6 might resolve it?

 
> On 9/20/06, Liu Dave-r63238 <DaveLiu@freescale.com> wrote:
> >
> > No, the MSR is 00029030 and the user-mode bit is not set here.
> >
> > I had left it out of the previous mail:
> >
> > NIP: C0005DA4 XER: 20000000 LR: C0004FE4 SP: C01F3000 REGS: c01eff30 TRAP: 1020    Not tainted
> > MSR: 00029030 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
> > TASK = c01f1080[0] 'swapper' Last syscall: 120
> > last math 00000000 last altivec 00000000
> > PLB0: bear= 0x08000000 acr=   0xbb000000 besr=  0x00000000
> >
> > Dave> Look at MSR and TRAP: MSR is 00029030, which has the critical interrupt enable bit set.
> > Dave> TRAP is 1020, so a watchdog timer exception is occurring.
> > Dave> You can clear the MSR[CE] bit to block critical exceptions, or disable the WD timer.
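
The MSR value in the dump above can be decoded bit by bit as a cross-check. The following is a minimal sketch in Python, not anything posted in the thread: the bit masks are an assumption (the usual PowerPC 4xx MSR layout, which the thread never states explicitly), and `decode_msr` is a hypothetical helper name.

```python
# MSR bit masks. ASSUMPTION: PowerPC 4xx layout; the thread does not
# state which masks apply, so treat this table as illustrative.
MSR_BITS = {
    "WE": 0x00040000,  # wait state enable
    "CE": 0x00020000,  # critical interrupt enable
    "EE": 0x00008000,  # external interrupt enable
    "PR": 0x00004000,  # problem state (user mode)
    "FP": 0x00002000,  # floating point available
    "ME": 0x00001000,  # machine check enable
    "IR": 0x00000020,  # instruction relocate
    "DR": 0x00000010,  # data relocate
}


def decode_msr(value):
    """Return the names of the known bits that are set in an MSR value."""
    return [name for name, mask in MSR_BITS.items() if value & mask]


if __name__ == "__main__":
    # MSR value taken from the register dump in the mail.
    print(decode_msr(0x00029030))
```

Under those assumed masks, 00029030 decodes to CE, EE, ME, IR and DR. That agrees with the "EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11" line of the dump and with the remark about the critical interrupt enable.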

 
> > On 9/20/06, Linas Vepstas <linas@austin.ibm.com> wrote:
> > > On Thu, Sep 21, 2006 at 08:38:13AM +1000, Benjamin Herrenschmidt wrote:
> > > > On Wed, 2006-09-20 at 15:31 -0700, Manoj Sharma wrote:
> > > > > This is the stack trace.
> > > > >
> > > > > Registers:
> > > > > GPR00: 00069030
> > >
> > > This is the MSR and it has the user-mode bit set, which is surely wrong.
> > > This is not how one gets to user space.
> > >
> > > 00048000
> > >
> > > The MSR had this or'ed into it, which is setting the user-mode bit.
> > > Surely that's wrong.
> > >
> > > --linas
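
The arithmetic in this exchange can be checked with a short Python sketch. The masks for WE, EE and PR are assumptions based on the usual PowerPC 4xx MSR layout, not something stated in the thread, and the constant and function names are illustrative only.

```python
# ASSUMPTION: PowerPC 4xx MSR masks; the thread does not give them.
MSR_WE = 0x00040000  # wait state enable
MSR_EE = 0x00008000  # external interrupt enable
MSR_PR = 0x00004000  # problem state, i.e. user mode

GPR00 = 0x00069030     # GPR00 value from the quoted register dump
ORED_IN = 0x00048000   # value described as or'ed into the MSR
DUMP_MSR = 0x00029030  # MSR value from the oops output in the reply


def bits_added(before, after):
    """Return the bits that are set in `after` but not in `before`."""
    return after & ~before


if __name__ == "__main__":
    # Bits that GPR00 has beyond the MSR shown in the oops output.
    print(hex(bits_added(DUMP_MSR, GPR00)))
    # Whether the or'ed-in value is exactly WE | EE under the assumed masks.
    print(ORED_IN == (MSR_WE | MSR_EE))
    # Whether the PR (user-mode) bit appears in either value.
    print(bool(ORED_IN & MSR_PR), bool(GPR00 & MSR_PR))
```

With these assumed masks, 00048000 is WE | EE and does not include PR (0x4000), and 00029030 | 00048000 equals 00069030. That is consistent with the reply that the user-mode bit is not set, though only under the stated layout assumption.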



------=_Part_1086_8588946.1158822662753--