From: Wei Wang2 <wei.wang2@amd.com>
To: Tim Deegan <Tim.Deegan@citrix.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Allen Kay <allen.m.kay@intel.com>,
	Jean Guyader <Jean.Guyader@citrix.com>
Subject: Re: IOMMU faults
Date: Thu, 16 Jun 2011 16:30:14 +0200
Message-ID: <201106161630.15290.wei.wang2@amd.com>
In-Reply-To: <20110616092509.GH17634@whitby.uk.xensource.com>

On Thursday 16 June 2011 11:25:09 Tim Deegan wrote:
> Hi, IOMMU maintainers,
>
> What should Xen do when an IOMMU fault happens?  As far as I can
> see both the AMD and Intel code clears the error in the IOMMU and
> carries on, but I suspect some more vigorous action is appropriate.
> I've seen traces from an Intel machine that seemed to be livelocked on
> IOMMU faults from a passed-through VGA card, until it was killed by the
> watchdog.  I think I can see two things that contribute to that:
>
>  - The Intel IOMMU fault handler prints quite a lot of info in interrupt
>    context, making it easier to livelock.  Still I think the general
>    problem applies on AMD too.

This info could still be useful for debugging, but we should only enable it
in debug builds.
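
As a rough illustration of keeping the verbose output out of normal builds and
bounding it in debug builds, here is a minimal sketch. It is not existing Xen
code: FAULT_VERBOSE, struct fault_log_limit and fault_log_allowed() are
hypothetical names, and the time source is just a tick count passed in by the
caller. The rate limit goes one step beyond the debug-only suggestion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical build switch standing in for the tree's real debug option. */
#ifndef FAULT_VERBOSE
#define FAULT_VERBOSE 0
#endif

/* Hypothetical limiter: at most 'burst' messages per 'window_len' ticks. */
struct fault_log_limit {
    uint64_t window_start;  /* tick at which the current window began */
    uint64_t window_len;    /* window length in ticks */
    unsigned int burst;     /* messages allowed per window */
    unsigned int printed;   /* messages already printed in this window */
};

/* Returns true if the caller may print one more fault message at 'now'. */
bool fault_log_allowed(struct fault_log_limit *l, uint64_t now)
{
    if (now - l->window_start >= l->window_len) {
        /* A new window has started: reset the budget. */
        l->window_start = now;
        l->printed = 0;
    }
    if (l->printed >= l->burst)
        return false;
    l->printed++;
    return true;
}
```

The interrupt handler would check FAULT_VERBOSE first and call
fault_log_allowed() before each detailed dump, so a flood of faults costs one
comparison per fault rather than a full print.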

>  - Domain destruction re-assigns passed through cards to dom0, but the
>    cards don't seem to get reset.  So there's nothing to stop a card
>    battering away at DMA in the meantime.  That seems like a problem
>    independent of livelock, actually.

There should be some FLR code in the tools (both xm and xl), but it might not
work well with some devices...

> In any case, it seems like it would be a good idea to stop a
> broken/malicious/deassigned card from flooding Xen with IOMMU faults.

Yes, agreed. Actually, I see a lot that could be improved in the fault handler.
When IOMMU faults come from DMA errors, we should either stop the device from
doing DMA or inject an error into the guest, if the guest driver is able to
handle I/O page faults.
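
To make the two options concrete, here is a minimal sketch of a per-device
policy. Everything in it is an assumption for illustration: the names
(struct dev_fault_state, iommu_fault_policy), the idea of a fault threshold,
and the guest_handles_io_pf flag are hypothetical and do not correspond to
existing Xen code.

```c
#include <stdbool.h>

/* Hypothetical outcomes of handling one IOMMU fault. */
enum fault_action {
    FAULT_LOG_ONLY,      /* note the fault and carry on */
    FAULT_INJECT_GUEST,  /* report the fault to the guest driver */
    FAULT_STOP_DMA,      /* stop the device from doing DMA */
};

/* Hypothetical per-device bookkeeping. */
struct dev_fault_state {
    unsigned int faults;       /* faults seen so far */
    unsigned int threshold;    /* faults tolerated before stopping DMA */
    bool guest_handles_io_pf;  /* guest driver can handle I/O page faults */
};

/* Decide what to do about one more fault from this device. */
enum fault_action iommu_fault_policy(struct dev_fault_state *s)
{
    s->faults++;
    if (s->guest_handles_io_pf)
        return FAULT_INJECT_GUEST;
    if (s->faults >= s->threshold)
        return FAULT_STOP_DMA;
    return FAULT_LOG_ONLY;
}
```

A threshold above one avoids acting on the first offence, while still bounding
how many faults a broken or malicious device can generate.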

> I was considering just writing 0 to the faulting card's PCI command
> register, but I'm told that's not always enough to properly deactivate
> a card, and it might be a little over-zealous to do it on the first
> offence.
> Ideas?

It seems difficult to find a generic approach to stopping a device without
knowing more device-specific details...
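
For reference, the generic part could look like the sketch below. It only
computes the new value of the 16-bit PCI command register (config offset
0x04). The bit values are the standard ones, but the helper names are
hypothetical and the actual config-space write is left out. As noted in the
quoted text, this may not be enough to quiesce every device.

```c
#include <stdint.h>

/* Standard PCI command register bits (config space offset 0x04). */
#define PCI_COMMAND_IO      0x0001  /* respond to I/O space accesses */
#define PCI_COMMAND_MEMORY  0x0002  /* respond to memory space accesses */
#define PCI_COMMAND_MASTER  0x0004  /* bus mastering, i.e. DMA, enabled */

/* Hypothetical helper: clear only bus mastering, leave decoding enabled. */
uint16_t pci_cmd_stop_dma(uint16_t cmd)
{
    return cmd & (uint16_t)~PCI_COMMAND_MASTER;
}

/* Hypothetical helper: the "write 0" variant, which disables everything. */
uint16_t pci_cmd_disable_all(uint16_t cmd)
{
    (void)cmd;  /* the old value is irrelevant when zeroing the register */
    return 0;
}
```

Clearing only the bus master bit is the less drastic of the two, since the
device's registers stay reachable for a later reset.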

Thanks,
Wei
> Tim.
