xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Tommi Airikka <tommi@airikka.net>
Cc: 810379@bugs.debian.org, xen-devel@lists.xen.org
Subject: Re: [BUG] pci-passthrough generates "xen:events: Failed to obtain physical IRQ" for some devices
Date: Wed, 3 Feb 2016 15:28:01 -0500	[thread overview]
Message-ID: <20160203202801.GA30532@char.us.oracle.com> (raw)
In-Reply-To: <CANQMFx5macG2AbNWtrKjs6o445_Jo7+twMaDg6ozE=0DSD_n7A@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 329 bytes --]

> > And I wonder if the XEN_DOMCTL_irq_permission aka, xc_domain_irq_permission
> > had been called. I remember that at some point we missed it for Xend..
> >

False alarm.

It was all in Linux. Attached are four patches (the first two
are for your issue you discovered).
I was wondering if there is a way for you to test them?


[-- Attachment #2: 0001-xen-pciback-Check-PF-instead-of-VF-for-PCI_COMMAND_M.patch --]
[-- Type: text/plain, Size: 1737 bytes --]

>From 9b8c92f9535ca9af672facd05360570730a33e05 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Wed, 3 Feb 2016 10:18:18 -0500
Subject: [PATCH 1/4] xen/pciback: Check PF instead of VF for
 PCI_COMMAND_MEMORY

c/s 408fb0e5aa7fda0059db282ff58c3b2a4278baa0
"xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set."
would check the device for PCI_COMMAND_MEMORY which is great.
Except that VF devices are unique - for example they have no
legacy interrupts, and also any writes to PCI_COMMAND_MEMORY
are silently ignored (by the hardware).

CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pciback_ops.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/pciback_ops.c b/drivers/xen/xen-pciback/pciback_ops.c
index 73dafdc..8c86a53 100644
--- a/drivers/xen/xen-pciback/pciback_ops.c
+++ b/drivers/xen/xen-pciback/pciback_ops.c
@@ -213,6 +213,7 @@ int xen_pcibk_enable_msix(struct xen_pcibk_device *pdev,
 	int i, result;
 	struct msix_entry *entries;
 	u16 cmd;
+	struct pci_dev *phys_dev;
 
 	if (unlikely(verbose_request))
 		printk(KERN_DEBUG DRV_NAME ": %s: enable MSI-X\n",
@@ -227,8 +228,10 @@ int xen_pcibk_enable_msix(struct xen_pcibk_device *pdev,
 	/*
 	 * PCI_COMMAND_MEMORY must be enabled, otherwise we may not be able
 	 * to access the BARs where the MSI-X entries reside.
+	 * But VF devices are unique in which the PF needs to be checked.
 	 */
-	pci_read_config_word(dev, PCI_COMMAND, &cmd);
+	phys_dev = pci_physfn(dev);
+	pci_read_config_word(phys_dev, PCI_COMMAND, &cmd);
 	if (dev->msi_enabled || !(cmd & PCI_COMMAND_MEMORY))
 		return -ENXIO;
 
-- 
2.1.0


[-- Attachment #3: 0002-xen-pciback-Save-the-number-of-MSI-X-entries-to-be-c.patch --]
[-- Type: text/plain, Size: 2010 bytes --]

>From 0f8901117800fc2f1a87cc5468f1ab7e4288cc5f Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Wed, 3 Feb 2016 10:22:02 -0500
Subject: [PATCH 2/4] xen/pciback: Save the number of MSI-X entries to be
 copied later.

c/s  8135cf8b092723dbfcc611fe6fdcb3a36c9951c5
"xen/pciback: Save xen_pci_op commands before processing it"
would copyback the processed values - which was great.

However it missed the case that xen_pcibk_enable_msix - when
completing would overwrite op->value (which had the number
of MSI-X vectors requested) with the return value (which for
success was zero). Hence the copy-back routine (which would use
op->value) would copy exactly zero MSI-X vectors back.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pciback_ops.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/xen-pciback/pciback_ops.c b/drivers/xen/xen-pciback/pciback_ops.c
index 8c86a53..2fc5880 100644
--- a/drivers/xen/xen-pciback/pciback_ops.c
+++ b/drivers/xen/xen-pciback/pciback_ops.c
@@ -335,7 +335,9 @@ void xen_pcibk_do_op(struct work_struct *data)
 	struct xen_pcibk_dev_data *dev_data = NULL;
 	struct xen_pci_op *op = &pdev->op;
 	int test_intx = 0;
-
+#ifdef CONFIG_PCI_MSI
+	unsigned int nr = 0;
+#endif
 	*op = pdev->sh_info->op;
 	barrier();
 	dev = xen_pcibk_get_pci_dev(pdev, op->domain, op->bus, op->devfn);
@@ -363,6 +365,7 @@ void xen_pcibk_do_op(struct work_struct *data)
 			op->err = xen_pcibk_disable_msi(pdev, dev, op);
 			break;
 		case XEN_PCI_OP_enable_msix:
+			nr = op->value;
 			op->err = xen_pcibk_enable_msix(pdev, dev, op);
 			break;
 		case XEN_PCI_OP_disable_msix:
@@ -385,7 +388,7 @@ void xen_pcibk_do_op(struct work_struct *data)
 	if (op->cmd == XEN_PCI_OP_enable_msix && op->err == 0) {
 		unsigned int i;
 
-		for (i = 0; i < op->value; i++)
+		for (i = 0; i < nr; i++)
 			pdev->sh_info->op.msix_entries[i].vector =
 				op->msix_entries[i].vector;
 	}
-- 
2.1.0


[-- Attachment #4: 0003-xen-pcifront-Report-the-errors-better.patch --]
[-- Type: text/plain, Size: 1856 bytes --]

>From 3a0d57b60a61eb461504f8ed1845afd5084b7889 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Wed, 3 Feb 2016 16:39:21 -0500
Subject: [PATCH 3/4] xen/pcifront: Report the errors better.

The messages should be different depending on the type of error.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/include/asm/xen/pci.h | 4 ++--
 arch/x86/pci/xen.c             | 5 ++++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/xen/pci.h b/arch/x86/include/asm/xen/pci.h
index 968d57d..f320ee3 100644
--- a/arch/x86/include/asm/xen/pci.h
+++ b/arch/x86/include/asm/xen/pci.h
@@ -57,7 +57,7 @@ static inline int xen_pci_frontend_enable_msi(struct pci_dev *dev,
 {
 	if (xen_pci_frontend && xen_pci_frontend->enable_msi)
 		return xen_pci_frontend->enable_msi(dev, vectors);
-	return -ENODEV;
+	return -ENOSYS;
 }
 static inline void xen_pci_frontend_disable_msi(struct pci_dev *dev)
 {
@@ -69,7 +69,7 @@ static inline int xen_pci_frontend_enable_msix(struct pci_dev *dev,
 {
 	if (xen_pci_frontend && xen_pci_frontend->enable_msix)
 		return xen_pci_frontend->enable_msix(dev, vectors, nvec);
-	return -ENODEV;
+	return -ENOSYS;
 }
 static inline void xen_pci_frontend_disable_msix(struct pci_dev *dev)
 {
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index ff31ab4..beac4df 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -196,7 +196,10 @@ static int xen_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	return 0;
 
 error:
-	dev_err(&dev->dev, "Xen PCI frontend has not registered MSI/MSI-X support!\n");
+	if (ret == -ENOSYS)
+		dev_err(&dev->dev, "Xen PCI frontend has not registered MSI/MSI-X support!\n");
+	else if (ret)
+		dev_err(&dev->dev, "Xen PCI frontend error: %d!\n", ret);
 free:
 	kfree(v);
 	return ret;
-- 
2.1.0


[-- Attachment #5: 0004-xen-pcifront-Fix-mysterious-crashes-when-NUMA-locali.patch --]
[-- Type: text/plain, Size: 4075 bytes --]

>From 25af39d59d2ff4a5e5bc872b8d4c451bbeffa312 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Wed, 3 Feb 2016 16:43:56 -0500
Subject: [PATCH 4/4] xen/pcifront: Fix mysterious crashes when NUMA locality
 information was extracted.

Occasionaly PV guests would crash with:

pciback 0000:00:00.1: Xen PCI mapped GSI0 to IRQ16
BUG: unable to handle kernel paging request at 0000000d1a8c0be0
.. snip..
  <ffffffff8139ce1b>] find_next_bit+0xb/0x10
  [<ffffffff81387f22>] cpumask_next_and+0x22/0x40
  [<ffffffff813c1ef8>] pci_device_probe+0xb8/0x120
  [<ffffffff81529097>] ? driver_sysfs_add+0x77/0xa0
  [<ffffffff815293e4>] driver_probe_device+0x1a4/0x2d0
  [<ffffffff813c1ddd>] ? pci_match_device+0xdd/0x110
  [<ffffffff81529657>] __device_attach_driver+0xa7/0xb0
  [<ffffffff815295b0>] ? __driver_attach+0xa0/0xa0
  [<ffffffff81527622>] bus_for_each_drv+0x62/0x90
  [<ffffffff8152978d>] __device_attach+0xbd/0x110
  [<ffffffff815297fb>] device_attach+0xb/0x10
  [<ffffffff813b75ac>] pci_bus_add_device+0x3c/0x70
  [<ffffffff813b7618>] pci_bus_add_devices+0x38/0x80
  [<ffffffff813dc34e>] pcifront_scan_root+0x13e/0x1a0
  [<ffffffff817a0692>] pcifront_backend_changed+0x262/0x60b
  [<ffffffff814644c6>] ? xenbus_gather+0xd6/0x160
  [<ffffffff8120900f>] ? put_object+0x2f/0x50
  [<ffffffff81465c1d>] xenbus_otherend_changed+0x9d/0xa0
  [<ffffffff814678ee>] backend_changed+0xe/0x10
  [<ffffffff81463a28>] xenwatch_thread+0xc8/0x190
  [<ffffffff810f22f0>] ? woken_wake_function+0x10/0x10

which was the result of two things:

When we call pci_scan_root_bus we would pass in 'sd' (sysdata)
pointer which was an 'pcifront_sd' structure. However in the
pci_device_add it expects that the 'sd' is 'struct sysdata' and
sets the dev->node to what is in sd->node (offset 4):

set_dev_node(&dev->dev, pcibus_to_node(bus));

 __pcibus_to_node(const struct pci_bus *bus)
{
        const struct pci_sysdata *sd = bus->sysdata;

        return sd->node;
}

However our structure was pcifront_sd which had nothing at that
offset:

struct pcifront_sd {
        int                        domain;    /*     0     4 */
        /* XXX 4 bytes hole, try to pack */
        struct pcifront_device *   pdev;      /*     8     8 */
}

That is an hole - filled with garbage as we used kmalloc instead of
kzalloc (the second problem).

This patch fixes the issue by:
 1) Use kzalloc to initialize to a well known state.
 2) Put 'struct pci_sysdata' at the start of 'pcifront_sd'. That
    way access to the 'node' will access the right offset.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/pci/xen-pcifront.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index c777b97..66d197d 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -53,7 +53,7 @@ struct pcifront_device {
 };
 
 struct pcifront_sd {
-	int domain;
+	struct pci_sysdata sd;
 	struct pcifront_device *pdev;
 };
 
@@ -67,7 +67,9 @@ static inline void pcifront_init_sd(struct pcifront_sd *sd,
 				    unsigned int domain, unsigned int bus,
 				    struct pcifront_device *pdev)
 {
-	sd->domain = domain;
+	/* Because we do not expose that information via XenBus. */
+	sd->sd.node = first_online_node;
+	sd->sd.domain = domain;
 	sd->pdev = pdev;
 }
 
@@ -468,8 +470,8 @@ static int pcifront_scan_root(struct pcifront_device *pdev,
 	dev_info(&pdev->xdev->dev, "Creating PCI Frontend Bus %04x:%02x\n",
 		 domain, bus);
 
-	bus_entry = kmalloc(sizeof(*bus_entry), GFP_KERNEL);
-	sd = kmalloc(sizeof(*sd), GFP_KERNEL);
+	bus_entry = kzalloc(sizeof(*bus_entry), GFP_KERNEL);
+	sd = kzalloc(sizeof(*sd), GFP_KERNEL);
 	if (!bus_entry || !sd) {
 		err = -ENOMEM;
 		goto err_out;
@@ -576,6 +578,7 @@ static void pcifront_free_roots(struct pcifront_device *pdev)
 		free_root_bus_devs(bus_entry->bus);
 
 		kfree(bus_entry->bus->sysdata);
+		bus_entry->bus->sysdata = NULL;
 
 		device_unregister(bus_entry->bus->bridge);
 		pci_remove_bus(bus_entry->bus);
-- 
2.1.0


[-- Attachment #6: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  parent reply	other threads:[~2016-02-03 20:28 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CANQMFx4YULqKctKZqeESesTQjLQun7rQ0ZjGzq96TXTtUw6VWA@mail.gmail.com>
2016-01-27 18:30 ` [BUG] pci-passthrough generates "xen:events: Failed to obtain physical IRQ" for some devices Konrad Rzeszutek Wilk
2016-01-30  2:13   ` Marek Marczykowski-Górecki
2016-01-30 13:18   ` Tommi Airikka
2016-02-01 10:43     ` Ian Campbell
2016-02-01 14:50       ` Konrad Rzeszutek Wilk
2016-02-03 14:22         ` Marek Marczykowski-Górecki
2016-02-03 15:26           ` Konrad Rzeszutek Wilk
2016-02-08 17:39             ` Marek Marczykowski-Górecki
2016-02-09  4:59               ` Konrad Rzeszutek Wilk
2016-02-03 20:28     ` Konrad Rzeszutek Wilk [this message]
2016-02-04  9:37       ` Ian Campbell
2016-02-04 23:35         ` Tommi Airikka
2016-02-12 21:36           ` Konrad Rzeszutek Wilk
2016-02-12 21:53             ` Tommi Airikka
2016-01-23 16:12 Tommi Airikka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160203202801.GA30532@char.us.oracle.com \
    --to=konrad.wilk@oracle.com \
    --cc=810379@bugs.debian.org \
    --cc=tommi@airikka.net \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).