From: Keir Fraser <keir@xen.org>
To: Tim Deegan <Tim.Deegan@citrix.com>,
"Kay, Allen M" <allen.m.kay@intel.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
Date: Thu, 21 Jul 2011 10:01:53 +0100
Message-ID: <CA4DA991.2F9E3%keir@xen.org>
In-Reply-To: <20110721085047.GB2688@whitby.uk.xensource.com>
On 21/07/2011 09:50, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:
> At 18:08 -0700 on 20 Jul (1311185311), Kay, Allen M wrote:
>> Hi Tim,
>>
>> Can you provide the code flow that can cause this failure?
>>
>> In pci_release_devices(), pci_clean_dpci_irqs() is called before
>> "d->need_iommu = 0" in deassign_device(). If this path is taken, then
>> it should not return from !need_iommu(d) in pci_clean_dpci_irqs().
>
> The problem is that the xl toolstack has already deassigned the domain's
> devices, using a hypercall to invoke deassign_device(), so by the time
> the domain is destroyed, pci_release_devices() can't tell that it once
> had a PCI device passed through.
>
> It seems like the Right Thing[tm] would be for deassign_device() to find
> and undo the relevant IRQ plumbing but I couldn't see how. Is there a
> mapping from bdf to irq in the iommu code or are they handled entirely
> separately?
Could we make need_iommu(d) sticky? Being able to clear it doesn't seem an
important case (such a domain is probably being torn down anyway) and
clearly it can lead to fragility. The fact that presumably we'd end up doing
unnecessary IOMMU PT work for the remaining lifetime of the domain doesn't
seem a major downside to me.
-- Keir
> Tim.
>
>> Allen
>>
>> -----Original Message-----
>> From: Tim Deegan [mailto:Tim.Deegan@citrix.com]
>> Sent: Monday, July 18, 2011 9:39 AM
>> To: xen-devel@lists.xensource.com
>> Cc: keir@xen.org; Kay, Allen M
>> Subject: [PATCH] [RFC] VT-d: always clean up dpci timers.
>>
>> If a VM has all its PCI devices deassigned, need_iommu(d) becomes false
>> but it might still have DPCI EOI timers that were init_timer()d but not
>> yet kill_timer()d. That causes xen to crash later because the linked
>> list of inactive timers gets corrupted, e.g.:
>>
>> (XEN) Xen call trace:
>> (XEN) [<ffff82c480126256>] set_timer+0x1c2/0x24f
>> (XEN) [<ffff82c48011fbf8>] schedule+0x129/0x5dd
>> (XEN) [<ffff82c480122c1e>] __do_softirq+0x7e/0x89
>> (XEN) [<ffff82c480122c9d>] do_softirq+0x26/0x28
>> (XEN) [<ffff82c480153c85>] idle_loop+0x5a/0x5c
>> (XEN)
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 0:
>> (XEN) Assertion 'entry->next->prev == entry' failed at
>> /local/scratch/tdeegan/xen-unstable.hg/xen/include:172
>> (XEN) ****************************************
>>
>> The following patch makes sure that the domain destruction path always
>> cleans up the DPCI state even if !need_iommu(d).
>>
>> Although it fixes the crash for me, I'm sufficiently confused by this
>> code that I don't know whether it's enough. If the dpci timer state
>> gets freed earlier than pci_clean_dpci_irqs() then there's still a race,
>> and some other function (reassign_device_ownership() ?) needs to sort
>> out the timers when the PCI card is deassigned.
>>
>> Allen, can you comment?
>>
>> Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
>>
>> diff -r ab6551e30841 xen/drivers/passthrough/pci.c
>> --- a/xen/drivers/passthrough/pci.c Mon Jul 18 10:59:44 2011 +0100
>> +++ b/xen/drivers/passthrough/pci.c Mon Jul 18 17:22:48 2011 +0100
>> @@ -269,7 +269,7 @@ static void pci_clean_dpci_irqs(struct d
>> if ( !iommu_enabled )
>> return;
>>
>> - if ( !is_hvm_domain(d) || !need_iommu(d) )
>> + if ( !is_hvm_domain(d) )
>> return;
>>
>> spin_lock(&d->event_lock);
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel