From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Sander Eikelenboom <linux@eikelenboom.it>,
Govinda Tatti <govinda.tatti@oracle.com>
Cc: "Juergen Gross" <jgross@suse.com>,
"Christopher Clark" <christopher.w.clark@gmail.co>,
"Kyle Temkin" <temkink@ainfosec.com>,
"Jérôme Oufella" <jerome.oufella@savoirfairelinux.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: pci-passthrough loses msi-x interrupts ability after domain destroy
Date: Fri, 22 Sep 2017 15:25:00 -0400
Message-ID: <20170922192500.GD26248@char.us.oracle.com>
In-Reply-To: <373043a4-68e2-0b39-f4d3-82815e0e7767@eikelenboom.it>
On Fri, Sep 22, 2017 at 09:35:40AM +0200, Sander Eikelenboom wrote:
> On 22/09/17 04:09, Christopher Clark wrote:
> > On Thu, Sep 21, 2017 at 1:27 PM, Sander Eikelenboom
> > <linux@eikelenboom.it> wrote:
> >>
> >> On Thu, September 21, 2017, 10:39:52 AM, Roger Pau Monné wrote:
> >>
> >>> On Wed, Sep 20, 2017 at 03:50:35PM -0400, Jérôme Oufella wrote:
> >>>>
> >>>> I'm using PCI pass-through to map a PCIe (intel i210) controller into
> >>>> a HVM domain. The system uses xen-pciback to hide the appropriate PCI
> >>>> device from Dom0.
> >>>>
> >>>> When creating the HVM domain after a hypervisor cold boot, the HVM
> >>>> domain can access and use the PCIe controller without problem.
> >>>>
> >>>> However, if the HVM domain is destroyed then restarted, it won't be
> >>>> able to use the pass-through PCI device anymore. The PCI device is
> >>>> seen and can be mapped, however, the interrupts will not be passed to
> >>>> the HVM domain anymore (this is visible under a Linux guest as
> >>>> /proc/interrupts counters remain 0). The behavior on a Windows10 guest
> >>>> is the same.
> >>>>
> >>>> A few interesting hints I noticed:
> >>>>
> >>>> - On Dom0, comparing 'lspci -vv' output for that PCIe device between the
> >>>> "working" and the "muted interrupts" states, I noted a difference in the
> >>>> MSI-X caps:
> >>>>
> >>>> - Capabilities: [70] MSI-X: Enable- Count=5 Masked- <-- IRQs will work if domain started
> >>>> + Capabilities: [70] MSI-X: Enable- Count=5 Masked+ <-- IRQs won't work if domain started
> >>>> ^^^^^^^
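
To check for this state from a script rather than by eye, here is a minimal
sketch that parses the MSI-X capability line in the format shown in the
'lspci -vv' output quoted here. The name parse_msix_cap and the dictionary
keys are hypothetical, chosen only for illustration; only the line format is
taken from the report.

```python
import re

# Matches the MSI-X capability line in the format quoted in the report, e.g.
#   Capabilities: [70] MSI-X: Enable- Count=5 Masked+
MSIX_RE = re.compile(
    r"MSI-X:\s+Enable(?P<enable>[+-])\s+Count=(?P<count>\d+)\s+Masked(?P<masked>[+-])"
)


def parse_msix_cap(line):
    """Return the MSI-X flags from one lspci line, or None if it is not an MSI-X line."""
    m = MSIX_RE.search(line)
    if m is None:
        return None
    return {
        "enable": m.group("enable") == "+",
        "count": int(m.group("count")),
        "masked": m.group("masked") == "+",
    }


# The two states from the report: "Masked-" (works) and "Masked+" (no interrupts).
working = parse_msix_cap("Capabilities: [70] MSI-X: Enable- Count=5 Masked-")
muted = parse_msix_cap("Capabilities: [70] MSI-X: Enable- Count=5 Masked+")
```

A check before starting the domain could then simply test muted["masked"];
whether that bit alone is a reliable indicator on other devices is an
assumption, not something established in this thread.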
> >>
> >>> IMHO it seems that either your device is not able to perform a reset
> >>> successfully, or Linux is not correctly performing such reset. I don't
> >>> think there's a lot that can be done from the Xen side.
> >>
> >> Unfortunately, for a lot of PCI devices the simple reset performed by default isn't enough,
> >> and almost none of them support a real PCI FLR.
> >>
> >> In the distant past Konrad made a patchset that implemented a bus reset and a
> >> reset of the config space. (It piggybacked on the already existing libxl mechanism of
> >> trying to write to a sysfs "do_flr" attribute, which triggers pciback to perform
> >> the bus reset and the rewrite of config space for the device.)
> >>
> >> I have used that patchset ever since for my pci-passthrough needs and it works pretty
> >> well. I can shut down and restart VMs with PCI devices passed through (also AMD
> >> Radeon graphics cards).
> >
> > Just to confirm the utility of that piece of work: OpenXT also uses an
> > extended version of that same patch to perform device reset for
> > passthrough.
> >
> > I've attached a copy of that OpenXT patch to this message and it can
> > also be obtained from our git repository:
> > https://github.com/OpenXT/xenclient-oe/blob/f8d3b282a87231d9ae717b13d506e8e7e28c78c4/recipes-kernel/linux/4.9/patches/thorough-reset-interface-to-pciback-s-sysfs.patch
> > This version creates a sysfs node named "reset_device" and the OpenXT
> > libxl toolstack is patched to use that node instead of "do_flr".
>
> Nice to hear there are more users of this patch. On #xen on IRC there were, from time to time,
> also users who tried pci-passthrough and ran into this issue (and probably abandoned the idea,
> since having to restart your host before being able to use your passed-through device again
> defeats much of the use case).
>
> > Konrad's original work encountered pushback on upstream acceptance at
> > the time it was developed. I'm not sure I've found where that
> > discussion ended. Is there any prospect of a more comprehensive reset
> > mechanism being accepted into xen-pciback, or elsewhere in the kernel?
>
> Yeah, it was NACKed by David Vrabel and the discussion somewhat bled to death.
> From what I remember the main issue was with the naming: since it doesn't do an FLR,
> the sysfs hook shouldn't be called "do_flr".
>
> Some other, perhaps minor, issues I can think of are:
> - No way to exempt PCI devices from this new way of resetting them.
> Perhaps there could be PCI devices/topologies where this way of
> resetting causes more problems than it solves and could cause a
> regression. Unfortunately auto-detecting what works doesn't seem to
> be possible. On the other hand (though only with my n=10) I haven't encountered
> such a device yet.
>
> - The communication path between libxl and the kernel via sysfs.
> I think the preference was for either:
> a) having it use a more commonly used Xen communication channel, or
> b) having it all self-contained in pci-back (from my memory and the OpenXT patch description
> there could be some locking issue when trying to implement it this way,
> but the vfio guys had that solved for their reset implementation, if I
> go by one of the comments in their source code; patches by Alex Williamson,
> if I remember correctly).
>
> - Not an issue back when the patch was made, but as in my earlier question to Roger:
> the hypervisor seems to interfere more with PCI devices as part of the PVH dom0 work.
> If and how does that relate to pci-back and pci-passthrough and (the location of) resetting mechanisms?
>
>
> So I think David's NACK was mostly for the patchset having some hackish cosmetics.
He didn't like 'do_flr', which made sense as the patchset did not do an FLR. It did a bus reset,
covering more than one device (if those devices were assigned to pciback).
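
As a rough illustration of the toolstack side discussed in this thread (try
the pciback node first, then fall back to the kernel's generic per-device
reset), here is a minimal sketch. The function name reset_device and the
sysfs_root parameter are hypothetical, and the location of "do_flr" under
the pciback driver directory is an assumption about where the patchset puts
it; the per-device "reset" attribute is the standard PCI sysfs one.

```python
import os


def reset_device(bdf, sysfs_root="/sys"):
    """Try pciback's reset node first, then the generic per-device reset.

    Returns the path that was written to, or None if neither node exists.
    """
    # Assumed location of the node added by the patchset; it is not part of
    # mainline kernels, so it will usually be absent.
    do_flr = os.path.join(sysfs_root, "bus/pci/drivers/pciback/do_flr")
    # Generic per-device attribute; only present when the kernel knows a
    # reset method for this device.
    generic = os.path.join(sysfs_root, "bus/pci/devices", bdf, "reset")

    # pciback's node takes the device address, the generic node takes "1".
    for path, payload in ((do_flr, bdf), (generic, "1")):
        if not os.path.exists(path):
            continue
        with open(path, "w") as node:
            node.write(payload)
        return path
    return None
```

Called as reset_device("0000:01:00.0") on a real host this needs root, and a
failed write surfaces as an OSError. The sysfs_root parameter exists only so
the logic can be exercised against a scratch directory.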
>
> On the upside one can conclude that this patchset is now pretty well tested over the years ;)
>
> Since David has left, perhaps Juergen/Boris/Konrad could express their views (again)?
> (CC'ed them as well)
I've asked Govinda (CC-ed) to refresh the patchset against the latest kernel,
repost it, and see where it goes.
>
> > As noted in the original LKML threads, vfio has similar relevant pci
> > device reset retry logic. (Thanks to Rich Persaud for this pointer:)
> > http://elixir.free-electrons.com/linux/v4.14-rc1/source/drivers/vfio/pci/vfio_pci.c#L1353
> >
> > libvirt also performs similar reset logic, using a direct low level
> > interface to config space (Thanks to Marek for this pointer, libvirt
> > is used by Qubes:)
> > https://github.com/libvirt/libvirt/blob/master/src/util/virpci.c#L929
> > I think this indicates that it would be possible to extend libxl to
> > do something similar, but that seems less satisfactory compared to
> > performing the work in a kernel-provided implementation.
> >
> > Is there a way forward to providing this functionality within Xen
> > software or Linux?
> >
> > Christopher
> > --
> >
> > openxt.org
> >
>