From: Bjorn Helgaas <helgaas@kernel.org>
To: Jian-Hong Pan <jian-hong@endlessm.com>
Cc: linux@endlesssm.com, Matthew Wilcox <willy@linux.intel.com>,
linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
linux-pci@vger.kernel.org
Subject: Re: [PATCH] PCI/MSI: Fix restoring of MSI-X vector control's mask bit
Date: Wed, 23 Oct 2019 16:43:25 -0500 [thread overview]
Message-ID: <20191023214325.GA30290@google.com> (raw)
In-Reply-To: <20191023211235.GA236419@google.com>
On Wed, Oct 23, 2019 at 04:12:35PM -0500, Bjorn Helgaas wrote:
> On Tue, Oct 08, 2019 at 11:42:39AM +0800, Jian-Hong Pan wrote:
> > MSI-X vector control's bit 0 is the mask bit, which masks the
> > corresponding interrupt request, or not. Other reserved bits might be
> > used for other purpose by device vendors. For example, the values of
> > Kingston NVMe SSD's MSI-X vector control are neither 0, nor 1, but other
> > values [1].
> >
> > The original restoring logic in system resuming uses the whole MSI-X
> > vector control value as the flag to set/clear the mask bit. However,
> > this logic conflicts the idea mentioned above. It may mislead system to
> > disable the MSI-X vector entries. That makes system get no interrupt
> > from Kingston NVMe SSD after resume and usually get NVMe I/O timeout
> > error.
> >
> > [ 174.715534] nvme nvme0: I/O 978 QID 3 timeout, completion polled
> >
> > This patch takes only the mask bit of original MSI-X vector control
> > value as the flag to fix this issue.
> >
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=204887#c8
> >
> > Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204887
> > Fixed: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code")
> > Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
> > ---
> > drivers/pci/msi.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> > index 0884bedcfc7a..deae3d5acaf6 100644
> > --- a/drivers/pci/msi.c
> > +++ b/drivers/pci/msi.c
> > @@ -433,6 +433,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
> > static void __pci_restore_msix_state(struct pci_dev *dev)
> > {
> > struct msi_desc *entry;
> > + u32 flag;
> >
> > if (!dev->msix_enabled)
> > return;
> > @@ -444,8 +445,10 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
> > PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL);
> >
> > arch_restore_msi_irqs(dev);
> > - for_each_pci_msi_entry(entry, dev)
> > - msix_mask_irq(entry, entry->masked);
> > + for_each_pci_msi_entry(entry, dev) {
> > + flag = entry->masked & PCI_MSIX_ENTRY_CTRL_MASKBIT;
> > + msix_mask_irq(entry, flag);
>
> This makes good sense: before your patch, when we restore, we set the
> mask bit if *any* bits in the Vector Control register are set.
>
> There are other paths leading to __pci_msix_desc_mask_irq(), so I
> think it would be better to do the masking there, e.g.,
>
> if (flag & PCI_MSIX_ENTRY_CTRL_MASKBIT)
> mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
>
> I think the other paths all pass either 0 or 1, so they're all safe
> today. But doing the masking in __pci_msix_desc_mask_irq() removes
> that assumption from the callers.
>
> I applied the patch below to pci/msi, let me know if it doesn't work
> for you.
>
> > + }
> >
> > pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
> > }
>
> commit 1a828a554650
> Author: Jian-Hong Pan <jian-hong@endlessm.com>
> Date: Tue Oct 8 11:42:39 2019 +0800
>
> PCI/MSI: Fix incorrect MSI-X masking on resume
>
> When a driver enables MSI-X, msix_program_entries() reads the MSI-X Vector
> Control register for each vector and saves it in desc->masked. Each
> register is 32 bits and bit 0 is the actual Mask bit.
>
> When we restored these registers during resume, we previously set the Mask
> bit if *any* bit in desc->masked was set instead of when the Mask bit
> itself was set:
>
> pci_restore_state
> pci_restore_msi_state
> __pci_restore_msix_state
> for_each_pci_msi_entry
> msix_mask_irq(entry, entry->masked) <-- entire u32 word
> __pci_msix_desc_mask_irq(desc, flag)
> mask_bits = desc->masked & ~PCI_MSIX_ENTRY_CTRL_MASKBIT
> if (flag) <-- testing entire u32, not just bit 0
> mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT
> writel(mask_bits, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL)
>
> This means that after resume, MSI-X vectors were masked when they shouldn't
> be, which leads to timeouts like this:
>
> nvme nvme0: I/O 978 QID 3 timeout, completion polled
>
> On resume, set the Mask bit only when the saved Mask bit from suspend was
> set.
>
> This should remove the need for 19ea025e1d28 ("nvme: Add quirk for Kingston
> NVME SSD running FW E8FK11.T").
>
> [bhelgaas: commit log, move fix to __pci_msix_desc_mask_irq()]
> Fixes: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=204887
> Link: https://lore.kernel.org/r/20191008034238.2503-1-jian-hong@endlessm.com
> Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
I forgot; this probably should be marked for stable, too.
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 0884bedcfc7a..771041784e64 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -213,12 +213,13 @@ u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
>
> if (pci_msi_ignore_mask)
> return 0;
> +
> desc_addr = pci_msix_desc_addr(desc);
> if (!desc_addr)
> return 0;
>
> mask_bits &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
> - if (flag)
> + if (flag & PCI_MSIX_ENTRY_CTRL_MASKBIT)
> mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
>
> writel(mask_bits, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
>
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
WARNING: multiple messages have this Message-ID (diff)
From: Bjorn Helgaas <helgaas@kernel.org>
To: Jian-Hong Pan <jian-hong@endlessm.com>
Cc: linux@endlesssm.com, Matthew Wilcox <willy@linux.intel.com>,
linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
linux-pci@vger.kernel.org
Subject: Re: [PATCH] PCI/MSI: Fix restoring of MSI-X vector control's mask bit
Date: Wed, 23 Oct 2019 16:43:25 -0500 [thread overview]
Message-ID: <20191023214325.GA30290@google.com> (raw)
In-Reply-To: <20191023211235.GA236419@google.com>
On Wed, Oct 23, 2019 at 04:12:35PM -0500, Bjorn Helgaas wrote:
> On Tue, Oct 08, 2019 at 11:42:39AM +0800, Jian-Hong Pan wrote:
> > MSI-X vector control's bit 0 is the mask bit, which masks the
> > corresponding interrupt request, or not. Other reserved bits might be
> > used for other purpose by device vendors. For example, the values of
> > Kingston NVMe SSD's MSI-X vector control are neither 0, nor 1, but other
> > values [1].
> >
> > The original restoring logic in system resuming uses the whole MSI-X
> > vector control value as the flag to set/clear the mask bit. However,
> > this logic conflicts the idea mentioned above. It may mislead system to
> > disable the MSI-X vector entries. That makes system get no interrupt
> > from Kingston NVMe SSD after resume and usually get NVMe I/O timeout
> > error.
> >
> > [ 174.715534] nvme nvme0: I/O 978 QID 3 timeout, completion polled
> >
> > This patch takes only the mask bit of original MSI-X vector control
> > value as the flag to fix this issue.
> >
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=204887#c8
> >
> > Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204887
> > Fixed: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code")
> > Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
> > ---
> > drivers/pci/msi.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> > index 0884bedcfc7a..deae3d5acaf6 100644
> > --- a/drivers/pci/msi.c
> > +++ b/drivers/pci/msi.c
> > @@ -433,6 +433,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
> > static void __pci_restore_msix_state(struct pci_dev *dev)
> > {
> > struct msi_desc *entry;
> > + u32 flag;
> >
> > if (!dev->msix_enabled)
> > return;
> > @@ -444,8 +445,10 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
> > PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL);
> >
> > arch_restore_msi_irqs(dev);
> > - for_each_pci_msi_entry(entry, dev)
> > - msix_mask_irq(entry, entry->masked);
> > + for_each_pci_msi_entry(entry, dev) {
> > + flag = entry->masked & PCI_MSIX_ENTRY_CTRL_MASKBIT;
> > + msix_mask_irq(entry, flag);
>
> This makes good sense: before your patch, when we restore, we set the
> mask bit if *any* bits in the Vector Control register are set.
>
> There are other paths leading to __pci_msix_desc_mask_irq(), so I
> think it would be better to do the masking there, e.g.,
>
> if (flag & PCI_MSIX_ENTRY_CTRL_MASKBIT)
> mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
>
> I think the other paths all pass either 0 or 1, so they're all safe
> today. But doing the masking in __pci_msix_desc_mask_irq() removes
> that assumption from the callers.
>
> I applied the patch below to pci/msi, let me know if it doesn't work
> for you.
>
> > + }
> >
> > pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
> > }
>
> commit 1a828a554650
> Author: Jian-Hong Pan <jian-hong@endlessm.com>
> Date: Tue Oct 8 11:42:39 2019 +0800
>
> PCI/MSI: Fix incorrect MSI-X masking on resume
>
> When a driver enables MSI-X, msix_program_entries() reads the MSI-X Vector
> Control register for each vector and saves it in desc->masked. Each
> register is 32 bits and bit 0 is the actual Mask bit.
>
> When we restored these registers during resume, we previously set the Mask
> bit if *any* bit in desc->masked was set instead of when the Mask bit
> itself was set:
>
> pci_restore_state
> pci_restore_msi_state
> __pci_restore_msix_state
> for_each_pci_msi_entry
> msix_mask_irq(entry, entry->masked) <-- entire u32 word
> __pci_msix_desc_mask_irq(desc, flag)
> mask_bits = desc->masked & ~PCI_MSIX_ENTRY_CTRL_MASKBIT
> if (flag) <-- testing entire u32, not just bit 0
> mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT
> writel(mask_bits, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL)
>
> This means that after resume, MSI-X vectors were masked when they shouldn't
> be, which leads to timeouts like this:
>
> nvme nvme0: I/O 978 QID 3 timeout, completion polled
>
> On resume, set the Mask bit only when the saved Mask bit from suspend was
> set.
>
> This should remove the need for 19ea025e1d28 ("nvme: Add quirk for Kingston
> NVME SSD running FW E8FK11.T").
>
> [bhelgaas: commit log, move fix to __pci_msix_desc_mask_irq()]
> Fixes: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=204887
> Link: https://lore.kernel.org/r/20191008034238.2503-1-jian-hong@endlessm.com
> Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
I forgot; this probably should be marked for stable, too.
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 0884bedcfc7a..771041784e64 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -213,12 +213,13 @@ u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
>
> if (pci_msi_ignore_mask)
> return 0;
> +
> desc_addr = pci_msix_desc_addr(desc);
> if (!desc_addr)
> return 0;
>
> mask_bits &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
> - if (flag)
> + if (flag & PCI_MSIX_ENTRY_CTRL_MASKBIT)
> mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
>
> writel(mask_bits, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
>
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme
next prev parent reply other threads:[~2019-10-23 21:43 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-08 3:42 [PATCH] PCI/MSI: Fix restoring of MSI-X vector control's mask bit Jian-Hong Pan
2019-10-08 3:42 ` Jian-Hong Pan
2019-10-23 21:12 ` Bjorn Helgaas
2019-10-23 21:12 ` Bjorn Helgaas
2019-10-23 21:43 ` Bjorn Helgaas [this message]
2019-10-23 21:43 ` Bjorn Helgaas
2019-10-24 3:35 ` Jian-Hong Pan
2019-10-24 3:35 ` Jian-Hong Pan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191023214325.GA30290@google.com \
--to=helgaas@kernel.org \
--cc=jian-hong@endlessm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux@endlesssm.com \
--cc=willy@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.