From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Sander Eikelenboom <linux@eikelenboom.it>,
	Paul Durrant <paul.durrant@citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	Jan Beulich <jbeulich@suse.com>
Subject: Re: Xen-unstable-staging: Xen BUG at iommu_map.c:455
Date: Fri, 10 Apr 2015 19:55:27 +0100
Message-ID: <55281C9F.7080409@citrix.com>
In-Reply-To: <10810464855.20150410122433@eikelenboom.it>

On 10/04/15 11:24, Sander Eikelenboom wrote:
> Hi Andrew,
>
> Finally got some time to figure this out, and I have narrowed it down to:
> git://xenbits.xen.org/staging/qemu-upstream-unstable.git
> commit 7665d6ba98e20fb05c420de947c1750fd47e5c07 "Xen: Use the ioreq-server API when available"
> A straight revert of this commit prevents the issue from happening.
>
> The reason I had a hard time figuring this out was:
> - I wasn't aware of this earlier, since git pulling the main xen tree doesn't
>   auto-update the qemu-* trees.

This has caught me out so many times.  It is very non-obvious behaviour.

> - So I happened to hit this when I cloned a fresh tree to try to figure out the
>   other issue I was seeing.
> - After that, checking out previous versions of the main xen tree didn't resolve
>   this new issue, because the qemu tree doesn't get auto-updated and is set to
>   "master".
> - Cloning a xen-stable-4.5.0 tree made it go away, because that has a specific
>   git://xenbits.xen.org/staging/qemu-upstream-unstable.git tag which is not
>   master.
>
> *sigh* 
>
> This is tested with xen main tree at last commit 3a28f760508fb35c430edac17a9efde5aff6d1d5
> (normal xen-unstable, not the staging branch)
>
> OK, so I have added some extra debug info (see attached diff), and this is the
> output when it crashes due to something the commit above triggered: the
> level is out of bounds and the pfn looks fishy too.
> Complete serial logs from both the bad and the good (specific commit reverted)
> runs are attached.

Just to confirm, you are positively identifying a qemu changeset as
causing this crash?

If so, the qemu change has discovered a pre-existing issue in the
toolstack pci-passthrough interface.  Whatever qemu is or isn't doing,
it should not be able to cause a crash like this.

With this in mind, I need to brush up on my AMD-Vi details.

In the meantime, can you run with the following patch to identify what
is going on, domctl-wise?  I assume it is the assign_device call which is
failing, but it will be nice to observe the differences between the
working and failing cases, which might offer a hint.

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 9f3413c..57eb311 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1532,6 +1532,11 @@ int iommu_do_pci_domctl(
         max_sdevs = domctl->u.get_device_group.max_sdevs;
         sdevs = domctl->u.get_device_group.sdev_array;
 
+        printk("*** %pv->d%d: get_device_group({%04x:%02x:%02x.%u, %u})\n",
+               current, d->domain_id,
+               seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+               max_sdevs);
+
         ret = iommu_get_device_group(d, seg, bus, devfn, sdevs, max_sdevs);
         if ( ret < 0 )
         {
@@ -1558,6 +1563,10 @@ int iommu_do_pci_domctl(
         bus = (domctl->u.assign_device.machine_sbdf >> 8) & 0xff;
         devfn = domctl->u.assign_device.machine_sbdf & 0xff;
 
+        printk("*** %pv->d%d: test_assign_device({%04x:%02x:%02x.%u})\n",
+               current, d->domain_id,
+               seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
+
         if ( device_assigned(seg, bus, devfn) )
         {
             printk(XENLOG_G_INFO
@@ -1582,6 +1591,10 @@ int iommu_do_pci_domctl(
         bus = (domctl->u.assign_device.machine_sbdf >> 8) & 0xff;
         devfn = domctl->u.assign_device.machine_sbdf & 0xff;
 
+        printk("*** %pv->d%d: assign_device({%04x:%02x:%02x.%u})\n",
+               current, d->domain_id,
+               seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
+
         ret = device_assigned(seg, bus, devfn) ?:
               assign_device(d, seg, bus, devfn);
         if ( ret == -ERESTART )
@@ -1604,6 +1617,10 @@ int iommu_do_pci_domctl(
         bus = (domctl->u.assign_device.machine_sbdf >> 8) & 0xff;
         devfn = domctl->u.assign_device.machine_sbdf & 0xff;
 
+        printk("*** %pv->d%d: deassign_device({%04x:%02x:%02x.%u})\n",
+               current, d->domain_id,
+               seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
+
         spin_lock(&pcidevs_lock);
         ret = deassign_device(d, seg, bus, devfn);
         spin_unlock(&pcidevs_lock);
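
For reading the resulting log lines, here is a minimal standalone sketch of how
the {seg:bus:slot.func} tuple printed above is derived from machine_sbdf.  The
bus and devfn extraction matches the context lines of the patch.  The segment
extraction (upper 16 bits) is not shown in the hunks above and is an assumption
here.  decode_sbdf(), format_sbdf() and struct sbdf are hypothetical names used
for illustration only and are not part of the Xen tree.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Conventional PCI devfn split: 5 bits of slot, 3 bits of function. */
#define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn) ((devfn) & 0x07)

/* Hypothetical container for the decoded fields (illustration only). */
struct sbdf {
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
};

/* Split a 32-bit machine_sbdf value into segment, bus and devfn. */
static struct sbdf decode_sbdf(uint32_t machine_sbdf)
{
    struct sbdf s;

    s.seg = (uint16_t)(machine_sbdf >> 16);       /* assumption: upper 16 bits */
    s.bus = (uint8_t)((machine_sbdf >> 8) & 0xff); /* as in the patch context */
    s.devfn = (uint8_t)(machine_sbdf & 0xff);      /* as in the patch context */
    return s;
}

/* Format the tuple the same way as the debug printk()s: ssss:bb:dd.f */
static int format_sbdf(char *buf, size_t len, uint32_t machine_sbdf)
{
    struct sbdf s = decode_sbdf(machine_sbdf);

    return snprintf(buf, len, "%04x:%02x:%02x.%u",
                    (unsigned int)s.seg, (unsigned int)s.bus,
                    (unsigned int)PCI_SLOT(s.devfn),
                    (unsigned int)PCI_FUNC(s.devfn));
}
```

Under these assumptions, a machine_sbdf of 0x00010519 decodes to segment 0001,
bus 05, slot 03, function 1, so the added printk()s would show it as
0001:05:03.1.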
