From: Nirmal Patel <nirmal.patel@linux.intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: linux-pci@vger.kernel.org
Subject: Re: [PATCH v4] PCI: vmd: Disable MSI remap only for low MSI count
Date: Mon, 6 May 2024 17:39:01 -0700 [thread overview]
Message-ID: <20240506173901.00003ec4@linux.intel.com> (raw)
In-Reply-To: <20240426223957.GA609812@bhelgaas>
On Fri, 26 Apr 2024 17:39:57 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:
> I think this refers specifically to MSI-X, so use "MSI-X" in the
> subject.
ACK.
>
> On Thu, Apr 18, 2024 at 11:31:21AM -0400, Nirmal Patel wrote:
> > VMD MSI remapping is disabled by default for all the CPUs with 28c0
> > VMD deviceID. We used to disable remapping because drives supported
> > more vectors than the VMD so the performance was better without
> > remapping. Now with CPUs that support more than 64 (128 VMD MSIx
> > vectors for gen5) we no longer need to disable this feature.
>
> "because drives supported more vectors" ... I guess you are referring
> to typical devices that might be behind a VMD? But I assume there's
> no actual requirement that those devices be "drives", right?
Yes, I am referring to PCIe device with equal or more MSI-X vectors
than VMD. If a device has lesser than minimum VMD MSI-X count (64) than
VMD will not cause any performance issue while using MSI-X remapping.
>
> "CPUs that support more than 64 ... 128 VMD vectors" Are we talking
> about *CPUs* that support more vectors, or *VMDs* that support more
> vectors?
>
> I guess you probably think of CPUs here because VMD is integrated into
> the same package, right? That would explain the "CPUs with 28c0 VMD"
> comment. But the vmd driver doesn't care about that; it just claims a
> PCI device.
Yes, I will adjust my wordings; but I meant newer VMD which still has
same deviceID (28c0) as previous generations but it has more MSI-X
vectors.(i.e. 128)
>
> s/MSI remapping/MSI-X remapping/ (I think?)
> s/MSIx/MSI-X/ to match spec usage.
I will make adjustments.
>
> A reference to ee81ee84f873 ("PCI: vmd: Disable MSI-X remapping when
> possible"), which added VMD_FEAT_CAN_BYPASS_MSI_REMAP, might be
> useful because it has nice context.
>
> IIUC this will keep MSI-X remapping enabled in more cases, e.g., on
> new devices that support more vectors. What is the benefit of keeping
> it enabled?
Sorry I took longer to respond.
VMD MSI-X remapping was a performance bottleneck in certain
situations. Starting from 28c0, VMD has a capability to disable MSI-X
remapping and improve the I/O performance. The first iteration of 28c0
VMD HW had only 64 MSI-X vectors support while the newer iterations can
support up to 128 and VMD is no longer a bottleneck. So I thought it
would be a good idea to change it to MSI-X remapping default ON.
Also upon further testings, I noticed huge boost in performance because
of this CID patch:
https://lore.kernel.org/kvm/20240423174114.526704-5-jacob.jun.pan@linux.intel.com/T/
The performance boost we get from the CID patch as follow:
Kernel 6.8.8 : 1Drive: 2000, 4Drives: 2300
6.9.0-rc6 + CID + MSI-X remap Disable: 1Drive: 2700, 4Drives: 6010
6.9.0-rc6 + CID + MSI-X remap Enabled: 1Drive: 2700, 4Drives: 6100
Since there is no significant performance difference between MSI-X
enable and disable after addition of CID patch, I think we can drop this
patch for now until we see significant change in I/O performance due to
VMD's MSI-X remapping policy.
Thanks for your time.
-nirmal
>
> The ee81ee84f873 commit log suggests two issues:
>
> - Number of vectors available to child domain is limited by size of
> VMD MSI-X table.
>
> - Remapping means child interrupts have to go through the VMD domain
> interrupt handler instead of going straight to the device handler.
>
> But this commit log suggests that with more vectors, you can enable
> remapping even without a performance penalty? Maybe the VMD domain
> interrupt handler was only needed because of vector sharing?
>
> I'm just a little confused because this commit log doesn't say what
> the actual benefit is, other than "keeping remapping enabled", and I
> don't know enough to know why that's good.
>
> > Note, pci_msix_vec_count() failure is translated to ENODEV per
> > typical expectations that drivers may return ENODEV when some
> > driver-known fundamental detail of the device is missing.
> >
> > Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> > ---
> > v1->v2: Updating commit message.
> > v2->v3: Use VMD MSI count instead of cpu count.
> > v3->v4: Updating commit message.
> > ---
> > drivers/pci/controller/vmd.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/drivers/pci/controller/vmd.c
> > b/drivers/pci/controller/vmd.c index 769eedeb8802..ba63af70bb63
> > 100644 --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -34,6 +34,8 @@
> > #define MB2_SHADOW_OFFSET 0x2000
> > #define MB2_SHADOW_SIZE 16
> >
> > +#define VMD_MIN_MSI_VECTOR_COUNT 64
> > +
> > enum vmd_features {
> > /*
> > * Device may contain registers which hint the physical
> > location of the @@ -807,13 +809,20 @@ static int
> > vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> > sd->node = pcibus_to_node(vmd->dev->bus);
> >
> > + vmd->msix_count = pci_msix_vec_count(vmd->dev);
> > + if (vmd->msix_count < 0)
> > + return -ENODEV;
> > +
> > /*
> > * Currently MSI remapping must be enabled in guest
> > passthrough mode
> > * due to some missing interrupt remapping plumbing. This
> > is probably
> > * acceptable because the guest is usually CPU-limited and
> > MSI
> > * remapping doesn't become a performance bottleneck.
>
> This part of the comment might need some updating. I don't see the
> connection with guest passthrough mode in the code.
>
> > + * Disable MSI remapping only if supported by VMD hardware
> > and when
> > + * VMD MSI count is less than or equal to minimum MSI
> > count.
>
> Add blank line between paragraphs or rewrap into a single paragraph.
>
> > */
> > if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) ||
> > + vmd->msix_count > VMD_MIN_MSI_VECTOR_COUNT ||
> > offset[0] || offset[1]) {
>
> I think this conditional might be easier to read if it were inverted:
>
> if (features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) && ...) {
> vmd_set_msi_remapping(vmd, false);
> } else {
> ret = vmd_alloc_irqs(vmd);
> ...
> }
>
> Maybe a separate patch though.
>
> > ret = vmd_alloc_irqs(vmd);
> > if (ret)
> > --
> > 2.31.1
> >
next prev parent reply other threads:[~2024-05-07 0:39 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-18 15:31 [PATCH v4] PCI: vmd: Disable MSI remap only for low MSI count Nirmal Patel
2024-04-25 21:10 ` Nirmal Patel
2024-04-26 22:39 ` Bjorn Helgaas
2024-05-07 0:39 ` Nirmal Patel [this message]
2024-05-08 1:46 ` Bjorn Helgaas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240506173901.00003ec4@linux.intel.com \
--to=nirmal.patel@linux.intel.com \
--cc=helgaas@kernel.org \
--cc=linux-pci@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox