From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: Re: IOMMU faults
Date: Thu, 16 Jun 2011 10:47:30 -0400
Message-ID: <20110616144730.GA6108@dumpdata.com>
References: <20110616092509.GH17634@whitby.uk.xensource.com> <201106161630.15290.wei.wang2@amd.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201106161630.15290.wei.wang2@amd.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Wei Wang2
Cc: "xen-devel@lists.xensource.com", Allen Kay, Tim Deegan, Jean Guyader
List-Id: xen-devel@lists.xenproject.org

> > I was considering just writing 0 to the faulting card's PCI command
> > register, but I'm told that's not always enough to properly deactivate
> > a card, and it might be a little over-zealous to do it on the first
> > offence.
> > Ideas?
> It seems difficult to find a generic approach to stop a device without knowing
> more device specific details...

Perhaps do something similar to the MCE fault interrupts? That is, when
the error happens, Dom0 is notified of the offending BDF and pursues
whatever actions it thinks are necessary. The action would be to tell
the device driver to turn itself off.

But how would it interact with the driver? Well, how does Linux deal
with this today? Is there an extension to the device driver API (similar
to the power management one) to notify the driver that it has done bad
things and should shut itself off? Perhaps something similar to the PCIe
AER handling?
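
To make the idea a bit more concrete, here is a rough user-space sketch
in C of the flow discussed above. It is not Xen or Linux code: every name
in it (fake_dev, handle_iommu_fault, FAULT_LIMIT, the quiesce hook) is
made up for illustration, the threshold of 3 faults is arbitrary, and the
"command register" is just a struct field. It only shows the ordering:
tolerate the first few faults, then ask the driver to quiesce the device,
and fall back to writing 0 to the command register if no driver hook is
available or the hook fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: none of these names exist in Xen or Linux. */

#define FAULT_LIMIT    3    /* assumed tolerance before taking action */
#define PCI_CMD_MASTER 0x4  /* bus-master enable bit of the PCI command register */

enum fault_action { FAULT_IGNORED, FAULT_DRIVER_QUIESCED, FAULT_CMD_CLEARED };

struct fake_dev {
    uint16_t bdf;       /* bus[15:8], device[7:3], function[2:0] */
    uint16_t command;   /* stand-in for the PCI command register */
    unsigned faults;    /* IOMMU faults seen so far */
    int (*quiesce)(struct fake_dev *dev); /* optional driver hook, 0 on success */
};

/* Pack bus/device/function into the usual 16-bit BDF encoding. */
static inline uint16_t make_bdf(unsigned bus, unsigned dev, unsigned fn)
{
    return (uint16_t)(((bus & 0xff) << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}

/* Linear lookup of the device that owns a BDF; NULL if unknown. */
static struct fake_dev *find_dev(struct fake_dev *devs, size_t n, uint16_t bdf)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (devs[i].bdf == bdf)
            return &devs[i];
    return NULL;
}

/* Example driver hook: stop DMA by clearing only the bus-master bit. */
static int example_quiesce(struct fake_dev *dev)
{
    dev->command &= (uint16_t)~PCI_CMD_MASTER;
    return 0;
}

/* What Dom0 might do when told that `bdf` caused an IOMMU fault. */
enum fault_action handle_iommu_fault(struct fake_dev *devs, size_t n, uint16_t bdf)
{
    struct fake_dev *dev = find_dev(devs, n, bdf);

    if (dev == NULL)
        return FAULT_IGNORED;          /* not a device we manage */
    if (++dev->faults < FAULT_LIMIT)
        return FAULT_IGNORED;          /* do not act on the first offence */
    if (dev->quiesce != NULL && dev->quiesce(dev) == 0)
        return FAULT_DRIVER_QUIESCED;  /* the driver shut the device off itself */

    dev->command = 0;                  /* last resort: write 0 to the command register */
    return FAULT_CMD_CLEARED;
}
```

In this sketch a device whose driver supplies a quiesce hook only loses
bus mastering, while a device without one gets the blunt "write 0"
treatment, which as noted in the quoted text may not be enough on real
hardware.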