All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nirmal Patel <nirmal.patel@linux.intel.com>
To: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>,
	Keith Busch <kbusch@kernel.org>,
	jonathan.derrick@linux.dev, acelan.kao@canonical.com,
	lpieralisi@kernel.org, kw@linux.com, robh@kernel.org,
	bhelgaas@google.com, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] PCI: vmd: Delay interrupt handling on MTL VMD controller
Date: Thu, 12 Sep 2024 10:45:47 -0700	[thread overview]
Message-ID: <20240912104547.00005865@linux.intel.com> (raw)
In-Reply-To: <CAAd53p5ox-CiNd6nHb4ogL-K2wr+dNYBtRxiw8E6jf7HgLsH-A@mail.gmail.com>

On Fri, 6 Sep 2024 09:56:59 +0800
Kai-Heng Feng <kai.heng.feng@canonical.com> wrote:

> On Wed, Sep 4, 2024 at 2:22 PM Manivannan Sadhasivam
> <manivannan.sadhasivam@linaro.org> wrote:
> >
> > On Wed, Sep 04, 2024 at 09:57:08AM +0800, Kai-Heng Feng wrote:  
> > > On Tue, Sep 3, 2024 at 10:51 PM Keith Busch <kbusch@kernel.org>
> > > wrote:  
> > > >
> > > > On Tue, Sep 03, 2024 at 03:07:45PM +0800, Kai-Heng Feng wrote:  
> > > > > On Tue, Sep 3, 2024 at 12:29 PM Manivannan Sadhasivam
> > > > > <manivannan.sadhasivam@linaro.org> wrote:  
> > > > > >
> > > > > > On Tue, Sep 03, 2024 at 10:55:44AM +0800, Kai-Heng Feng
> > > > > > wrote:  
> > > > > > > Meteor Lake VMD has a bug that the IRQ raises before the
> > > > > > > DMA region is ready, so the requested IO is considered
> > > > > > > never completed: [   97.343423] nvme nvme0: I/O 259 QID 2
> > > > > > > timeout, completion polled [   97.343446] nvme nvme0: I/O
> > > > > > > 384 QID 3 timeout, completion polled [   97.343459] nvme
> > > > > > > nvme0: I/O 320 QID 4 timeout, completion polled [
> > > > > > > 97.343470] nvme nvme0: I/O 707 QID 5 timeout, completion
> > > > > > > polled
> > > > > > >
> > > > > > > The is documented as erratum MTL016 [0]. The suggested
> > > > > > > workaround is to "The VMD MSI interrupt-handler should
> > > > > > > initially perform a dummy register read to the MSI
> > > > > > > initiator device prior to any writes to ensure proper
> > > > > > > PCIe ordering." which essentially is adding a delay
> > > > > > > before the interrupt handling. 
> > > > > >
> > > > > > Why can't you add a dummy register read instead? Adding a
> > > > > > delay for PCIe ordering is not going to work always.  
> > > > >
> > > > > This can be done too. But it can take longer than 4us delay,
> > > > > so I'd like to keep it a bit faster here.  
> > > >
> > > > An added delay is just a side effect of the read. The read
> > > > flushes pending device-to-host writes, which is most likely
> > > > what the errata really requires. I think Mani is right, you
> > > > need to pay that register read penalty to truly fix this.  
> > >
> > > OK, will change the quirk to perform dummy register read.
> > >
> > > But I am not sure which is the "MSI initiator device", is it VMD
> > > controller or NVMe devices?
> > >  
> >
> > 'MSI initiator' should be the NVMe device. My understanding is that
> > the workaround suggests reading the NVMe register from the MSI
> > handler before doing any write to the device to ensures that the
> > previous writes from the device are flushed.  
> 
> Hmm, it would be really great to contain the quirk in VMD controller.
> Is there anyway to do that right before generic_handle_irq()?
> 
The bug is in hardware, I agree with Kai-Heng to contain it to VMD
controller.
 
> >
> > And this sounds like the workaround should be done in the NVMe
> > driver as it has the knowledge of the NVMe registers. But isn't the
> > NVMe driver already reading CQE status first up in the ISR?  
> 
> The VMD interrupt is fired before the CQE status update, hence the
> bug.
> 
> Kai-Heng
> 
> >
> > - Mani
> >  
> > > Kai-Heng
> > >  
> > > >  
> > > > > > > +     /* Erratum MTL016 */
> > > > > > > +     VMD_FEAT_INTERRUPT_QUIRK        = (1 << 6),
> > > > > > >  };
> > > > > > >
> > > > > > >  #define VMD_BIOS_PM_QUIRK_LTR        0x1003  /* 3145728
> > > > > > > ns */ @@ -90,6 +94,8 @@ static
> > > > > > > DEFINE_IDA(vmd_instance_ida); */
> > > > > > >  static DEFINE_RAW_SPINLOCK(list_lock);
> > > > > > >
> > > > > > > +static bool interrupt_delay;
> > > > > > > +
> > > > > > >  /**
> > > > > > >   * struct vmd_irq - private data to map driver IRQ to
> > > > > > > the VMD shared vector
> > > > > > >   * @node:    list item for parent traversal.
> > > > > > > @@ -105,6 +111,7 @@ struct vmd_irq {
> > > > > > >       struct vmd_irq_list     *irq;
> > > > > > >       bool                    enabled;
> > > > > > >       unsigned int            virq;
> > > > > > > +     bool                    delay_irq;  
> > > > > >
> > > > > > This is unused. Perhaps you wanted to use this instead of
> > > > > > interrupt_delay?  
> > > > >
> > > > > This is leftover, will scratch this.  
> > > >
> > > > Maybe you should actually use it instead of making a global?
> > > > The quirk says it is device specific, so no need to punish
> > > > every device if it doesn't need it (unlikely as it is to see
> > > > such a thing).  
> >
> > --
> > மணிவண்ணன் சதாசிவம்  


  reply	other threads:[~2024-09-12 17:45 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-03  2:55 [PATCH] PCI: vmd: Delay interrupt handling on MTL VMD controller Kai-Heng Feng
2024-09-03  4:28 ` Manivannan Sadhasivam
2024-09-03  7:07   ` Kai-Heng Feng
2024-09-03  8:00     ` Manivannan Sadhasivam
2024-09-03 14:51     ` Keith Busch
2024-09-04  1:57       ` Kai-Heng Feng
2024-09-04  6:22         ` Manivannan Sadhasivam
2024-09-06  1:56           ` Kai-Heng Feng
2024-09-12 17:45             ` Nirmal Patel [this message]
2024-09-13  5:55               ` Kai-Heng Feng
2024-09-13 11:11                 ` Manivannan Sadhasivam
2024-09-13 15:24                   ` Keith Busch
2024-09-13 16:14                     ` Manivannan Sadhasivam
2024-09-13 16:34                       ` Keith Busch
2024-09-03 15:29     ` Nirmal Patel
2024-09-03 16:17 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240912104547.00005865@linux.intel.com \
    --to=nirmal.patel@linux.intel.com \
    --cc=acelan.kao@canonical.com \
    --cc=bhelgaas@google.com \
    --cc=jonathan.derrick@linux.dev \
    --cc=kai.heng.feng@canonical.com \
    --cc=kbusch@kernel.org \
    --cc=kw@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=manivannan.sadhasivam@linaro.org \
    --cc=robh@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.