* [PATCH] pci/msi-x: Sanity check MSI-X table offset
@ 2014-08-11 4:36 Benjamin Herrenschmidt
2014-08-11 14:18 ` Wei Yang
2014-09-05 21:13 ` Bjorn Helgaas
0 siblings, 2 replies; 4+ messages in thread
From: Benjamin Herrenschmidt @ 2014-08-11 4:36 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Linux PCI
If the hardware device mis-behaves (such as for example crashes or
gets unplugged at the wrong time) and provides us with a bogus
MSI-X table offset, all sorts of "interesting" and potentially
very hard to debug things can happen.
For example, on POWER8, such a device caused us to ioremap an area
outside of the region assigned to the PCI bus, causing subsequent
accesses to cause a PowerBus timeout and checkstop the machine.
Since this isn't a hot path, let's add a good dose of sanity
checking to msix_map_region() to flag these issues early and limit
the damage.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
drivers/pci/msi.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 5a40516..a584f590 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -666,13 +666,30 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
{
resource_size_t phys_addr;
- u32 table_offset;
+ u32 table_offset, table_end;
u8 bir;
pci_read_config_dword(dev, dev->msix_cap + PCI_MSIX_TABLE,
&table_offset);
bir = (u8)(table_offset & PCI_MSIX_TABLE_BIR);
+ if (bir >= DEVICE_COUNT_RESOURCE) {
+ dev_err(&dev->dev, "MSI-X points to non-exiting BAR %d !\n",
+ bir);
+ return NULL;
+ }
+ if ((pci_resource_flags(dev, bir) & IORESOURCE_MEM) == 0) {
+ dev_err(&dev->dev, "MSI-X points to non-memory BAR %d !\n",
+ bir);
+ return NULL;
+ }
table_offset &= PCI_MSIX_TABLE_OFFSET;
+ table_end = table_offset + nr_entries * PCI_MSIX_ENTRY_SIZE;
+ if (table_end <= table_offset ||
+ table_end > pci_resource_len(dev, bir)) {
+ dev_err(&dev->dev, "MSI-X table outside of BAR boundary !"
+ " (0x%08x..%08x)\n", table_offset, table_end);
+ return NULL;
+ }
phys_addr = pci_resource_start(dev, bir) + table_offset;
return ioremap_nocache(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] pci/msi-x: Sanity check MSI-X table offset
2014-08-11 4:36 [PATCH] pci/msi-x: Sanity check MSI-X table offset Benjamin Herrenschmidt
@ 2014-08-11 14:18 ` Wei Yang
2014-09-05 21:13 ` Bjorn Helgaas
1 sibling, 0 replies; 4+ messages in thread
From: Wei Yang @ 2014-08-11 14:18 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Bjorn Helgaas, Linux PCI
On Mon, Aug 11, 2014 at 02:36:55PM +1000, Benjamin Herrenschmidt wrote:
>If the hardware device mis-behaves (such as for example crashes or
>gets unplugged at the wrong time) and provides us with a bogus
>MSI-X table offset, all sorts of "interesting" and potentially
>very hard to debug things can happen.
>
>For example, on POWER8, such a device caused us to ioremap an area
>outside of the region assigned to the PCI bus, causing subsequent
>accesses to cause a PowerBus timeout and checkstop the machine.
>
>Since this isn't a hot path, let's add a good dose of sanity
>checking to msix_map_region() to flag these issues early and limit
>the damage.
>
>Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>---
> drivers/pci/msi.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
>index 5a40516..a584f590 100644
>--- a/drivers/pci/msi.c
>+++ b/drivers/pci/msi.c
>@@ -666,13 +666,30 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
> static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
> {
> resource_size_t phys_addr;
>- u32 table_offset;
>+ u32 table_offset, table_end;
> u8 bir;
>
> pci_read_config_dword(dev, dev->msix_cap + PCI_MSIX_TABLE,
> &table_offset);
> bir = (u8)(table_offset & PCI_MSIX_TABLE_BIR);
>+ if (bir >= DEVICE_COUNT_RESOURCE) {
could we use this?
if (bir > PCI_STD_RESOURCE_END)
my understanding is the IOV BAR and bridge resources are not proper for MSI-X
neigher.
>+ dev_err(&dev->dev, "MSI-X points to non-exiting BAR %d !\n",
>+ bir);
>+ return NULL;
>+ }
>+ if ((pci_resource_flags(dev, bir) & IORESOURCE_MEM) == 0) {
>+ dev_err(&dev->dev, "MSI-X points to non-memory BAR %d !\n",
>+ bir);
>+ return NULL;
>+ }
> table_offset &= PCI_MSIX_TABLE_OFFSET;
>+ table_end = table_offset + nr_entries * PCI_MSIX_ENTRY_SIZE;
>+ if (table_end <= table_offset ||
>+ table_end > pci_resource_len(dev, bir)) {
>+ dev_err(&dev->dev, "MSI-X table outside of BAR boundary !"
>+ " (0x%08x..%08x)\n", table_offset, table_end);
>+ return NULL;
>+ }
> phys_addr = pci_resource_start(dev, bir) + table_offset;
>
> return ioremap_nocache(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
>
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Richard Yang
Help you, Help me
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] pci/msi-x: Sanity check MSI-X table offset
2014-08-11 4:36 [PATCH] pci/msi-x: Sanity check MSI-X table offset Benjamin Herrenschmidt
2014-08-11 14:18 ` Wei Yang
@ 2014-09-05 21:13 ` Bjorn Helgaas
2014-09-05 21:33 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 4+ messages in thread
From: Bjorn Helgaas @ 2014-09-05 21:13 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Linux PCI, weiyang
[+cc Wei]
On Mon, Aug 11, 2014 at 02:36:55PM +1000, Benjamin Herrenschmidt wrote:
> If the hardware device mis-behaves (such as for example crashes or
> gets unplugged at the wrong time) and provides us with a bogus
> MSI-X table offset, all sorts of "interesting" and potentially
> very hard to debug things can happen.
>
> For example, on POWER8, such a device caused us to ioremap an area
> outside of the region assigned to the PCI bus, causing subsequent
> accesses to cause a PowerBus timeout and checkstop the machine.
>
> Since this isn't a hot path, let's add a good dose of sanity
> checking to msix_map_region() to flag these issues early and limit
> the damage.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
> drivers/pci/msi.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 5a40516..a584f590 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -666,13 +666,30 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
> static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
> {
> resource_size_t phys_addr;
> - u32 table_offset;
> + u32 table_offset, table_end;
> u8 bir;
>
> pci_read_config_dword(dev, dev->msix_cap + PCI_MSIX_TABLE,
> &table_offset);
> bir = (u8)(table_offset & PCI_MSIX_TABLE_BIR);
> + if (bir >= DEVICE_COUNT_RESOURCE) {
> + dev_err(&dev->dev, "MSI-X points to non-exiting BAR %d !\n",
> + bir);
> + return NULL;
> + }
I assume we would just get 0xffffffff if something went wrong, wouldn't we?
If it's possible to get arbitrary bad data, there's no end to the error
checking we could add when reading config space.
If we can add a minimal check for the value we get if the device has been
removed or isolated, I'd rather do that than try to validate every field in
the register. An error message along the lines of "config read returned
0xffffffff" or "can't access device" is probably easier to debug than
"MSI-X points to non-existing BAR 255" anway.
> + if ((pci_resource_flags(dev, bir) & IORESOURCE_MEM) == 0) {
> + dev_err(&dev->dev, "MSI-X points to non-memory BAR %d !\n",
> + bir);
> + return NULL;
> + }
> table_offset &= PCI_MSIX_TABLE_OFFSET;
> + table_end = table_offset + nr_entries * PCI_MSIX_ENTRY_SIZE;
> + if (table_end <= table_offset ||
> + table_end > pci_resource_len(dev, bir)) {
> + dev_err(&dev->dev, "MSI-X table outside of BAR boundary !"
> + " (0x%08x..%08x)\n", table_offset, table_end);
> + return NULL;
> + }
> phys_addr = pci_resource_start(dev, bir) + table_offset;
>
> return ioremap_nocache(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
>
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] pci/msi-x: Sanity check MSI-X table offset
2014-09-05 21:13 ` Bjorn Helgaas
@ 2014-09-05 21:33 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 4+ messages in thread
From: Benjamin Herrenschmidt @ 2014-09-05 21:33 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Linux PCI, weiyang
On Fri, 2014-09-05 at 15:13 -0600, Bjorn Helgaas wrote:
> I assume we would just get 0xffffffff if something went wrong, wouldn't we?
It's HW, better safe than sorry. 0xffffffff is the normal case of "the
device isn't talking to you anymore" but all the other cases handled in
my patch are all illegal and might catch buggy/broken devices (which can
be useful if you are just developing such a device in an FPGA for
example).
> If it's possible to get arbitrary bad data, there's no end to the error
> checking we could add when reading config space.
Yeah, though not all of them result in mapping bad addresses ... in this
case the checks are pretty easy and it's not a performance sensitive
path, so while I was initially only checking for ff's I though "screw
it, may as well sanitiy check it all) :-)
> If we can add a minimal check for the value we get if the device has been
> removed or isolated, I'd rather do that than try to validate every field in
> the register. An error message along the lines of "config read returned
> 0xffffffff" or "can't access device" is probably easier to debug than
> "MSI-X points to non-existing BAR 255" anway.
Ok.
Ben.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2014-09-05 22:48 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-08-11 4:36 [PATCH] pci/msi-x: Sanity check MSI-X table offset Benjamin Herrenschmidt
2014-08-11 14:18 ` Wei Yang
2014-09-05 21:13 ` Bjorn Helgaas
2014-09-05 21:33 ` Benjamin Herrenschmidt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).