From: jfaslist <jfaslist@yahoo.fr>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc64-dev@ozlabs.org
Subject: Re: Maple freezing on PCI Target-Abort
Date: Wed, 03 May 2006 17:13:31 +0200
Message-ID: <4458C89B.9070505@yahoo.fr>
In-Reply-To: <1139011975.8543.4.camel@localhost.localdomain>

Hi,
Coming back to this old posting: we have made progress thanks to help 
from IBM, and the Maple platform no longer freezes on a (PIO) PCI 
target-abort. When one occurs, we now run the machine check exception 
handler, just as you said.
Here is what we got from IBM:

"...
Engineering has verified following behavior relating to Machine Check 
and Check Stop. CPC925 documentation will be updated.

   1. With APIMASK register DerrEXCP set to 1 the target abort on the
      PCI bus causes P_CSTP signal to be driven low.
   2. With APIMASK register DerrEXCP set to 0 and APIEMASK register
      DerrEXCP set to 0 the target abort on the PCI bus causes machine
      check interrupt. In this case the CHP_FAULT signal continues to be
      driven high. It appears that the EI interface has a way of
      signaling machine check since both pins P_CSTP and CHP_FAULT are
      disabled through APIMASK and APIEMASK.
   3. With APIMASK register DerrEXCP set to 0 and APIEMASK register
      DerrEXCP set to 1 the target abort on the PCI bus causes machine
      check interrupt. In this case the CHP_FAULT signal is driven low
      until the APIEXCP is read. After APIEXCP register is read the
      CHP_FAULT signal is again driven high. Since the CHP_FAULT pin is
      not connected to the PPC970FX MCP_B input pin EI bus has a way of
      signaling machine check through EI interface...."
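
For reference, the three cases IBM lists can be modeled as a small decision function. This is only a sketch of the quoted behavior, not driver code: the struct, enum and function names are made up for illustration, and the two DerrEXCP bits are represented as booleans because their actual bit positions are not given here. Item 1 does not mention APIEMASK, so the sketch assumes it does not matter in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: only the DerrEXCP bit of each register is kept.
 * The real register layout is not described in this thread. */
struct cpc925_derr_cfg {
    bool apimask_derrexcp;   /* APIMASK register, DerrEXCP bit  */
    bool apiemask_derrexcp;  /* APIEMASK register, DerrEXCP bit */
};

enum derr_outcome {
    DERR_CHECKSTOP,                   /* item 1: P_CSTP driven low            */
    DERR_MCHECK_FAULT_HIGH,           /* item 2: machine check, CHP_FAULT
                                         stays high                          */
    DERR_MCHECK_FAULT_LOW_UNTIL_READ  /* item 3: machine check, CHP_FAULT low
                                         until APIEXCP is read               */
};

/* Map a DerrEXCP configuration to the behavior IBM reported for a
 * PCI target abort. */
enum derr_outcome cpc925_target_abort_outcome(const struct cpc925_derr_cfg *cfg)
{
    assert(cfg != NULL);

    if (cfg->apimask_derrexcp)
        return DERR_CHECKSTOP;  /* assumption: APIEMASK is a don't-care here */

    return cfg->apiemask_derrexcp ? DERR_MCHECK_FAULT_LOW_UNTIL_READ
                                  : DERR_MCHECK_FAULT_HIGH;
}
```

With this model, the setting that fixed the freeze corresponds to `apimask_derrexcp = false` and `apiemask_derrexcp = true`.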


Setting up the CPC925 according to item 3 fixes the problem. I give this 
for the record, since I think the fix belongs in PIBS.
I still don't like the fact that a user process causing the condition 
makes the system enter the "mon" debugger rather than being killed 
with SIGBUS/SIGSEGV. I guess the correct fix would be to write a 
Maple-specific machine_check exception handler?
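
To make the question concrete, here is a rough sketch of the policy I have in mind. It is a standalone model, not kernel code: `struct mc_regs`, `enum mc_action` and `maple_mc_policy()` are hypothetical names, and the only input is the MSR value saved when the machine check was taken. The MSR_ME, MSR_PR and MSR_RI masks use the usual 64-bit PowerPC bit positions. The assumption is that a machine check taken from user mode with RI set can be turned into a signal for the faulting task, while anything else still goes to the debugger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_RI (UINT64_C(1) << 1)   /* recoverable interrupt          */
#define MSR_ME (UINT64_C(1) << 12)  /* machine check enable           */
#define MSR_PR (UINT64_C(1) << 14)  /* problem state, i.e. user mode  */

/* Hypothetical stand-in for the saved register state: only the MSR
 * captured at the time of the machine check is modeled. */
struct mc_regs {
    uint64_t msr;
};

enum mc_action {
    MC_SIGBUS_TASK,     /* kill the offending user process */
    MC_ENTER_DEBUGGER   /* keep today's behavior           */
};

/* True if a machine check would reach the handler at all; with ME
 * clear the CPU checkstops instead. */
bool msr_machine_check_enabled(uint64_t msr)
{
    return (msr & MSR_ME) != 0;
}

/* Decide what to do with a machine check caused by a PCI target abort. */
enum mc_action maple_mc_policy(const struct mc_regs *regs)
{
    assert(regs != NULL);

    bool from_user   = (regs->msr & MSR_PR) != 0;
    bool recoverable = (regs->msr & MSR_RI) != 0;

    if (from_user && recoverable)
        return MC_SIGBUS_TASK;

    return MC_ENTER_DEBUGGER;
}
```

A fault raised while in the kernel, or with RI clear, keeps the current behavior, which matches Ben's remark that it should SIGBUS except if the problem occurred in the kernel.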
Thanks,
-jf simon

Benjamin Herrenschmidt wrote:

>On Fri, 2006-02-03 at 16:58 +0100, jfaslist wrote:
>
>>Hi,
>>Yes, we are going to dig into all this CPC925 and Processor Interface 
>>initialization.
>>Note that I checked that both MSR_ME and MSR_RI were set prior to 
>>triggering the PCI Target-Abort.
>>
>>-MSR_ME: If not set, the CPU will "checkstop" on a machine check.
>>-MSR_RI: So that the exception is recoverable.
>>
>>Regarding MSR_RI, this should always be set, I think?
>
>Yes, MSR:RI is always set by the kernel except in the rare code path
>where taking an exception is actually unsafe (like in some of the
>exception handling code itself)
>
>Ben.



-------- Original Message --------
Subject: 	Re: Maple freezing on PCI Target-Abort
Date: 	Fri, 03 Feb 2006 12:42:37 +1100
From: 	Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: 	jfaslist <jfaslist@yahoo.fr>
CC: 	linuxppc64-dev@ozlabs.org
References: 	<43E23B4A.4020402@yahoo.fr>



> -What exception vector is taking care of a DERR excp? From what I can 
> see it seems to be the "machine check" vector. But that seems a bit 
> drastic to me. After all this is just a PCI target abort.

I would expect a machine check yes.

> -I expect that the normal behavior would be for the kernel to send a 
> termination signal to the user process which caused the PIO READ PCI 
> cycle (from a previously mmap()'ed VMA address). Is it doable on this 
> platform?  Since a READ operation is coupled by nature, I think this is 
> the only acceptable way.

It should SIGBUS except if the problem occurred in the kernel. I don't
know why it's not doing so, maybe you are hitting an issue/errata or
misconfiguration of the 925?

> I have tried to set the MSR[RI] bit before doing the PCI cycle, but it 
> didn't change anything. Also on our design we disconnect the 
> CPC925 checkstop pin from the 970 machine check pin (see page 39 of 
> cpc925 user's manual). So a DERR shouldn't cause a machine check I would 
> think.
> 
> I realize that these questions are very H/W related but couldn't find 
> the answer in IBM doc.
