From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: PCI passthrough problems after legacy update of xen 4.1
Date: Fri, 3 May 2013 14:57:40 +0100
Message-ID: <5183C254.3010400@citrix.com>
References: <7fe927daf168b95128153bf104715689@mail.shatteredsilicon.net> <5182708302000078000D2A11@nat28.tlf.novell.com> <518287CC02000078000D2AA4@nat28.tlf.novell.com> <5183839502000078000D2F4F@nat28.tlf.novell.com>
To: Andreas Falck
Cc: Gordan Bobic, Jan Beulich, "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On 03/05/2013 14:31, Andreas Falck wrote:
> Sorry, I'll stick to xen-devel then, even though I am not on the list (I can change the latter of course).
>
> Since I run xen 4.1 (specifically the precompiled 4.1.3-3ubuntu1.5) I guess I cannot apply Andrew's patch directly (unless the file hasn't changed between the versions)? But I could probably figure out where in the 4.1.3-3ubuntu1.5 sources to insert the code from the patch, if I compile it from source.

Attached is the same debugging patch against 4.1.  If you grab the src rpm, you should be able to add it as another patch to the specfile and use rpmbuild.  Or you can just build straight from source, whichever is easier.

~Andrew


> I can do some more testing throughout the weekend. Could you give me some directions on what I should test, which logs and info to provide, and how to maximize debug output from tools and from the hypervisor? That would maximize my chances to get the relevant information. I guess that at a minimum we want a failing and a succeeding case with and without the change in pciif.py?
>
> 'xm dmesg' didn't show me anything unusual when testing, but that was without any added debug-keys.
>
> Should I pastebin log outputs or should I include them in emails to keep them searchable?
>
> /Andreas


> 2013/5/3 Jan Beulich <JBeulich@suse.com>
> >>> On 02.05.13 at 23:07, Andreas Falck <falck.andreas.lists@gmail.com> wrote:
> > Ok, I have some progress. I tried also with the device I had always
> > managed to get through, the radeon 7790 gpu. This worked equally well with
> > both versions of pciif.py. However, it turned out that with the changed
> > version, if I pass the gpu first in the pci = [ ... ] list, the other
> > devices also get through. This was not the case with the original version
> > of pciif.py.
> >
> > If (and only if) I order the passthrough list in the config file so that it
> > says
> >
> > pci = [ '41:00.0', '41:00.1', '04:00.0' ]
> >
> > (This corresponds to GPU, HDMI audio, USB at irqs 16, 17, 19) - then
> > passthrough of all devices works with the new version of pciif.py ("if
> > dev.irq:"), but not with the old version ("if not self.vm.info.is_hvm() and
> > dev.irq:"). So the second failure seemingly has to do with some property
> > set or checked only for the first passed through device. Logs follow:
>
> Sending xend logs here is only marginally useful, as the errors
> quite certainly originate in the hypervisor. Especially considering
> that the ordering of devices matters (which is quite irritating to
> me), but also with the logs here now showing the -EEXIST error
> that your earlier mail mentioned, we have to rely on you to help
> with tracking down the root cause of this (by instrumenting the
> affected hypervisor paths, i.e. extending on the debugging
> patch that Andrew sent). And without you explicitly saying so
> we can't even be sure there aren't (when run at maximum log
> level) already messages in the hypervisor log that might provide
> some further insight.
>
> Also, please don't cross post - pick either of xen-devel or
> xen-users, but not both.
>
> Jan
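
For reference, the two pciif.py conditions quoted above can be compared in isolation. This is only a minimal sketch: the function names and the sample irq values are made up for illustration, and it says nothing about what the guarded code in xend actually does. It just enumerates the inputs for which the two checks disagree.

```python
def old_condition(is_hvm, irq):
    """Original check, as quoted: 'if not self.vm.info.is_hvm() and dev.irq:'."""
    return bool(not is_hvm and irq)


def new_condition(is_hvm, irq):
    """Changed check, as quoted: 'if dev.irq:'."""
    return bool(irq)


def differing_cases(irqs=(0, 16)):
    """Return the (is_hvm, irq) pairs where the two checks give different answers.

    The irq values are illustrative only: 0 stands for "no irq", 16 for any
    non-zero irq.
    """
    cases = []
    for is_hvm in (False, True):
        for irq in irqs:
            if old_condition(is_hvm, irq) != new_condition(is_hvm, irq):
                cases.append((is_hvm, irq))
    return cases


if __name__ == "__main__":
    print(differing_cases())
```

The only disagreement is an HVM guest with a non-zero irq: the old check skips the guarded block there, the changed one enters it. For PV guests the two are identical.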



Attachment: XSA-46-xen-4.1-debug.patch

diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index c3240db..c7c1058 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -851,6 +851,7 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domctl_t) u_domctl)
     {
         struct domain *d;
         unsigned int pirq = op->u.irq_permission.pirq;
+        int allow = op->u.irq_permission.allow_access;
 
         ret = -ESRCH;
         d = rcu_lock_domain_by_id(op->domain);
@@ -864,6 +865,8 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domctl_t) u_domctl)
         else
             ret = pirq_deny_access(d, pirq);
 
+        printk("**DBG perms { %u, %d } = %ld\n", pirq, allow, ret);
+
         rcu_unlock_domain(d);
     }
     break;

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
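
Once the debugging patch is in, the lines it prints should be easy to pick out of the hypervisor log. A rough sketch of a filter follows, assuming the output of 'xm dmesg' has been saved to a string; the regular expression is derived from the printk format in the patch, while the function names and the sample log lines are invented for illustration and are not real output.

```python
import re

# Matches the format string from the patch: "**DBG perms { %u, %d } = %ld\n"
DBG_RE = re.compile(r"\*\*DBG perms \{ (\d+), (-?\d+) \} = (-?\d+)")


def parse_dbg_perms(log_text):
    """Return a list of (pirq, allow, ret) tuples found in the log text."""
    results = []
    for match in DBG_RE.finditer(log_text):
        pirq, allow, ret = (int(group) for group in match.groups())
        results.append((pirq, allow, ret))
    return results


def failed_calls(log_text):
    """Keep only the entries whose return value is non-zero (an error)."""
    return [entry for entry in parse_dbg_perms(log_text) if entry[2] != 0]


if __name__ == "__main__":
    # Invented sample, NOT real output; -17 is -EEXIST.
    sample = (
        "(XEN) **DBG perms { 16, 1 } = 0\n"
        "(XEN) some unrelated message\n"
        "(XEN) **DBG perms { 19, 1 } = -17\n"
    )
    for pirq, allow, ret in failed_calls(sample):
        print("pirq %d allow %d failed with %d" % (pirq, allow, ret))
```

Running this once for a failing and once for a succeeding device order should make it clear which pirq the irq_permission call fails for.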