From: Jesse Barnes <jbarnes@engr.sgi.com>
To: linux-ia64@vger.kernel.org
Subject: Re: [RFC] I/O error handling for userspace
Date: Mon, 06 Dec 2004 16:59:45 +0000
Message-ID: <200412060859.45051.jbarnes@engr.sgi.com>
In-Reply-To: <200412030831.25662.jbarnes@engr.sgi.com>

On Monday, December 6, 2004 4:42 am, Hidetoshi Seto wrote:
> Jesse Barnes wrote:
> > On Friday, December 3, 2004 8:31 am, Jesse Barnes wrote:
> >> This patch adds support for sending a SIGBUS to a userspace application
> >> using /proc/bus/pci to drive a device if an I/O error occurs.  We're
> >> using this in house for the X server's BIOS emulator and it seems to be
> >> working well.
> >>
> >> The idea is to track mmaped /proc/bus/pci regions so that the machine
> >> check handler is able to properly determine which process is responsible
> >> for any faults that occur (ia64 is interesting in that the error may not
> >> occur in the process context that actually generated the bad reference).
> >> If a match is found, a SIGBUS is sent to the process, along with the
> >> address that caused the fault.  The machine check record is then cleared
> >> and recovery takes place (the assumption is that the signal to userspace
> >> is a sufficient record of the error).
>
> Cool!
> BTW I have some short comments.

Does this look a little better?  I've removed the clearing of the error 
records too, in light of Keith's patch to clear them out quickly if they're 
corrected (though I'll need more additions to set the recovered flag 
correctly).

> force_sig_info() takes a spinlock internally... I don't think calling it is
> safe in MCA context.

This is the only bit I'm unsure about.  I can't just add a spin_trylock 
version, since the call path for send_sig_info calls the slab allocator, 
which takes other locks.
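
To make the problem concrete, here is a small userspace model (not kernel
code) of what a trylock-based path would have to look like. The toy_* names
and the two locks are hypothetical stand-ins for the signal lock and an
allocator lock. The point it illustrates is that every lock on the path
needs a trylock variant plus a back-out, which is why converting only the
first one doesn't help.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy lock for illustration only; real kernel spinlocks differ. */
struct toy_lock {
    atomic_bool held;
};

static bool toy_trylock(struct toy_lock *l)
{
    /* Succeeds only if the lock was previously free. */
    return !atomic_exchange_explicit(&l->held, true, memory_order_acquire);
}

static void toy_unlock(struct toy_lock *l)
{
    atomic_store_explicit(&l->held, false, memory_order_release);
}

/* Hypothetical stand-ins for the signal lock and a slab allocator lock. */
static struct toy_lock toy_siglock = { false };
static struct toy_lock toy_slab_lock = { false };

/*
 * Model of a signal send that must never spin: returns 0 on success,
 * -1 if any lock on the path is busy (after releasing what it took).
 */
static int toy_send_sig_try(void)
{
    if (!toy_trylock(&toy_siglock))
        return -1;
    if (!toy_trylock(&toy_slab_lock)) {
        toy_unlock(&toy_siglock);   /* back out so we hold nothing */
        return -1;
    }
    /* ... allocate and queue the signal here ... */
    toy_unlock(&toy_slab_lock);
    toy_unlock(&toy_siglock);
    return 0;
}
```

In this model a busy allocator lock makes the whole send fail cleanly
instead of hanging, but only because the inner lock was converted too.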

Assuming that only the CPU that caused the MCA is in the MCA handler (i.e. 
rendezvous doesn't occur), then the only time that one of the spinlocks could 
hang is if the current CPU also owned it, right?  Hmm, maybe the 
ia64_spinlock_contention routine could check for a machine check condition 
and promote the failure to an uncorrectable one in that case?  That's pretty 
ugly though...
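
Roughly what I have in mind, as a userspace sketch rather than a change to
the real ia64_spinlock_contention code. It assumes the lock records its
owner, which the real spinlocks may not do, and all names here
(owned_lock, owned_lock_acquire, LOCK_PROMOTE_UNCORRECTABLE) are made up for
illustration. If the contention path sees that we're in machine check
context and the owner is the current CPU, it gives up and reports that the
error should be promoted instead of spinning forever.

```c
#include <stdatomic.h>

#define NO_OWNER (-1)

enum lock_result {
    LOCK_ACQUIRED,
    LOCK_PROMOTE_UNCORRECTABLE   /* self-deadlock detected in MCA context */
};

/* Hypothetical owner-tracking lock; owner is a CPU id or NO_OWNER. */
struct owned_lock {
    atomic_int owner;
};

static enum lock_result owned_lock_acquire(struct owned_lock *l, int cpu,
                                           int in_mca)
{
    for (;;) {
        int expected = NO_OWNER;

        if (atomic_compare_exchange_weak_explicit(&l->owner, &expected, cpu,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return LOCK_ACQUIRED;

        /*
         * Contention path: expected now holds the current owner (or
         * NO_OWNER after a spurious failure). If this CPU interrupted
         * itself while holding the lock, spinning can never succeed.
         */
        if (in_mca && expected == cpu)
            return LOCK_PROMOTE_UNCORRECTABLE;

        /* Otherwise another CPU holds it and will release; keep spinning. */
    }
}

static void owned_lock_release(struct owned_lock *l)
{
    atomic_store_explicit(&l->owner, NO_OWNER, memory_order_release);
}
```

The MCA handler would treat LOCK_PROMOTE_UNCORRECTABLE as "can't deliver the
signal safely" and fall back to the normal uncorrectable path.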

Jesse
