From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-am1on0061.outbound.protection.outlook.com ([157.56.112.61]:6064
        "EHLO emea01-am1-obe.outbound.protection.outlook.com"
        rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
        id S1754584AbcDYMqW (ORCPT ); Mon, 25 Apr 2016 08:46:22 -0400
Subject: Re: [PATCH] PCI: Refine broken INTx masking for Mellanox devices
To: Alex Williamson
References: <1461083616-30099-1-git-send-email-majd@mellanox.com>
 <20160419130435.7d681f6d@t450s.home>
 <7b67ed14-516d-aae9-2b72-6dae420caa65@mellanox.com>
CC: "bhelgaas@google.com", "linux-pci@vger.kernel.org", Or Gerlitz,
 "Noa Osherovich"
From: Majd Dibbiny
Message-ID:
Date: Mon, 25 Apr 2016 15:45:37 +0300
MIME-Version: 1.0
In-Reply-To: <7b67ed14-516d-aae9-2b72-6dae420caa65@mellanox.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Sender: linux-pci-owner@vger.kernel.org
List-ID:

Hi Bjorn,

Can you please merge this?

Thanks

On 4/21/2016 10:12 AM, Majd Dibbiny wrote:
>
>
> On 4/19/2016 10:04 PM, Alex Williamson wrote:
>> On Tue, 19 Apr 2016 19:33:36 +0300
>> Majd Dibbiny wrote:
>>
>>> From: Noa Osherovich
>>>
>>> Mellanox devices were marked as having INTx masking ability broken.
>>> As a result, the VFIO driver fails to load when the device was
>>> passed-through to a VM.
>>>
>>> This patch excludes ConnectX-4, ConnectX4-Lx and Connect-IB from the
>>> list of Mellanox devices marked as having broken INTx masking:
>>>
>>> - ConnectX-4 and ConnectX4-Lx either declare INTx are not supported
>>>   (FW versions 12.14.2036 / 14.12.2036) or support them properly
>>>   (starting May 2016 firmware release). Users having earlier firmware
>>>   versions will have to update to either one.
>>> - Connect-IB does not support INTx currently so will not cause any
>>>   problem.
>> What's the user and support experience here?
>> A user gets a new kernel
>> that no longer marks INTx masking as bad (note that if the device does
>> not support INTx by returning 0 in the pin register, it doesn't matter
>> whether it's marked bad or not), they try to assign the device and get
>> an interrupt storm on the guest and go complain to support. Support
>> tries to reproduce, maybe can, maybe can't, depending on the firmware
>> version they happen to have running, everyone gets very unhappy until
>> the thing that almost never works, "are you running the latest
>> firmware", actually works. Can we instead add a new function for
>> mellanox that can probe the firmware version and continue marking INTx
>> masking bad with a warning noting that a firmware update will resolve
>> it? Thanks,
>>
>> Alex
> Hi Alex,
> I understand your concern and therefore we have documented this in our
> Release notes to ease the troubleshooting experience for everyone. We
> have also added the Firmware versions in the commit messages to make
> it clearer...
>
> What do you think?
>
> Thanks
>>> Fixes: 11e42532ada31 ('PCI: Assume all Mellanox devices have ...')
>>> Signed-off-by: Noa Osherovich
>>> Signed-off-by: Majd Dibbiny
>>> ---
>>>  drivers/pci/quirks.c    | 61 ++++++++++++++++++++++++++++++++++++++++++++++++-
>>>  include/linux/pci_ids.h | 14 ++++++++++++
>>>  2 files changed, 74 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>> index 8e67802..6a1856e 100644
>>> --- a/drivers/pci/quirks.c
>>> +++ b/drivers/pci/quirks.c
>>> @@ -3147,7 +3147,66 @@ DECLARE_PCI_FIXUP_HEADER(0x1814, 0x0601, /* Ralink RT2800 802.11n PCI */
>>>   */
>>>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_REALTEK, 0x8169,
>>>                  quirk_broken_intx_masking);
>>> -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, PCI_ANY_ID,
>>> +
>>> +/*
>>> + * Mellanox devices that fail under PCI device assignment using DisINTx masking
>>> + */
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, PCI_DEVICE_ID_MELLANOX_TAVOR,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_TAVOR_BRIDGE,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, PCI_DEVICE_ID_MELLANOX_ARBEL,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_ARBEL_COMPAT,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, PCI_DEVICE_ID_MELLANOX_SINAI,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_SINAI_OLD,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_HERMON_SDR,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_HERMON_DDR,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_HERMON_QDR,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_HERMON_DDR_GEN2,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_HERMON_QDR_GEN2,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_HERMON_EN,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_HERMON_EN_GEN2,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_CONNECTX_EN,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_CONNECTX_EN_T_GEN2,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_CONNECTX_EN_GEN2,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_CONNECTX_EN_5_GEN2,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_CONNECTX2,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_CONNECTX3,
>>> +                quirk_broken_intx_masking);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX,
>>> +                PCI_DEVICE_ID_MELLANOX_CONNECTX3_PRO,
>>>                  quirk_broken_intx_masking);
>>>  static void quirk_no_bus_reset(struct pci_dev *dev)
>>> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
>>> index 247da8c..bc8918e 100644
>>> --- a/include/linux/pci_ids.h
>>> +++ b/include/linux/pci_ids.h
>>> @@ -2262,6 +2262,20 @@
>>>  #define PCI_DEVICE_ID_MELLANOX_ARBEL              0x6282
>>>  #define PCI_DEVICE_ID_MELLANOX_SINAI_OLD          0x5e8c
>>>  #define PCI_DEVICE_ID_MELLANOX_SINAI              0x6274
>>> +#define PCI_DEVICE_ID_MELLANOX_HERMON_SDR         0x6340
>>> +#define PCI_DEVICE_ID_MELLANOX_HERMON_DDR         0x634a
>>> +#define PCI_DEVICE_ID_MELLANOX_HERMON_QDR         0x6354
>>> +#define PCI_DEVICE_ID_MELLANOX_HERMON_DDR_GEN2    0x6732
>>> +#define PCI_DEVICE_ID_MELLANOX_HERMON_QDR_GEN2    0x673c
>>> +#define PCI_DEVICE_ID_MELLANOX_HERMON_EN          0x6368
>>> +#define PCI_DEVICE_ID_MELLANOX_HERMON_EN_GEN2     0x6750
>>> +#define PCI_DEVICE_ID_MELLANOX_CONNECTX_EN        0x6372
>>> +#define PCI_DEVICE_ID_MELLANOX_CONNECTX_EN_T_GEN2 0x675a
>>> +#define PCI_DEVICE_ID_MELLANOX_CONNECTX_EN_GEN2   0x6764
>>> +#define PCI_DEVICE_ID_MELLANOX_CONNECTX_EN_5_GEN2 0x6746
>>> +#define PCI_DEVICE_ID_MELLANOX_CONNECTX2          0x676e
>>> +#define PCI_DEVICE_ID_MELLANOX_CONNECTX3          0x1003
>>> +#define PCI_DEVICE_ID_MELLANOX_CONNECTX3_PRO      0x1007
>>>  #define PCI_VENDOR_ID_DFI                         0x15bd
>
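
The quoted patch replaces one PCI_ANY_ID fixup with one fixup per device
ID. As a rough illustration of the same per-device decision, here is a
minimal userspace sketch in plain C, not kernel code and not the code
from the patch. The table name mellanox_broken_intx_devs and the helper
mellanox_intx_masking_broken() are hypothetical names invented for this
example. The device IDs are the ones that appear in the pci_ids.h hunk
above; TAVOR, TAVOR_BRIDGE and ARBEL_COMPAT are left out because their
values are not shown in this thread. The vendor ID 0x15b3 does not
appear in the thread either and is assumed here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: Mellanox PCI vendor ID; not shown in the quoted patch. */
#define PCI_VENDOR_ID_MELLANOX 0x15b3

/*
 * Hypothetical table of device IDs still treated as having broken INTx
 * masking. Values are taken from the pci_ids.h hunk in the quoted patch.
 */
static const uint16_t mellanox_broken_intx_devs[] = {
	0x6282, /* ARBEL */
	0x5e8c, /* SINAI_OLD */
	0x6274, /* SINAI */
	0x6340, /* HERMON_SDR */
	0x634a, /* HERMON_DDR */
	0x6354, /* HERMON_QDR */
	0x6732, /* HERMON_DDR_GEN2 */
	0x673c, /* HERMON_QDR_GEN2 */
	0x6368, /* HERMON_EN */
	0x6750, /* HERMON_EN_GEN2 */
	0x6372, /* CONNECTX_EN */
	0x675a, /* CONNECTX_EN_T_GEN2 */
	0x6764, /* CONNECTX_EN_GEN2 */
	0x6746, /* CONNECTX_EN_5_GEN2 */
	0x676e, /* CONNECTX2 */
	0x1003, /* CONNECTX3 */
	0x1007, /* CONNECTX3_PRO */
};

/*
 * Hypothetical helper: returns true when the (vendor, device) pair is a
 * Mellanox device listed in the table above, false for anything else.
 */
static bool mellanox_intx_masking_broken(uint16_t vendor, uint16_t device)
{
	size_t n = sizeof(mellanox_broken_intx_devs) /
		   sizeof(mellanox_broken_intx_devs[0]);

	if (vendor != PCI_VENDOR_ID_MELLANOX)
		return false;

	for (size_t i = 0; i < n; i++) {
		if (mellanox_broken_intx_devs[i] == device)
			return true;
	}
	return false;
}
```

With this table, mellanox_intx_masking_broken(0x15b3, 0x1003) reports
ConnectX-3 as broken, while any Mellanox ID missing from the table is
left alone. A firmware-version check of the kind Alex asks for would
have to be added on top of this lookup; how to read the firmware
version is not covered in this thread, so it is not sketched here.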