From: "Patel, Nirmal" <nirmal.patel@linux.intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>, Xinghui Li <korantwork@gmail.com>
Cc: kbusch@kernel.org, jonathan.derrick@linux.dev,
lpieralisi@kernel.org, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org, Xinghui Li <korantli@tencent.com>
Subject: Re: [PATCH v4] PCI: vmd: Add the module param to adjust MSI mode
Date: Wed, 29 Mar 2023 23:49:03 -0700
Message-ID: <0603c75d-82d3-01d5-ffe7-b648c1f02f0e@linux.intel.com>
In-Reply-To: <20230329163107.GA3061927@bhelgaas>

On 3/29/2023 9:31 AM, Bjorn Helgaas wrote:
> On Wed, Mar 29, 2023 at 04:57:08PM +0800, Xinghui Li wrote:
>> On Wed, Mar 29, 2023 at 5:34 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>> It would also be nice to include a hint about why a user would choose
>>> "on" or "off". What is the performance effect? What sort of I/O
>>> scenario would lead you to choose "on" vs "off"?
>>>
>> Before this patch, I sent a patch named:
>> PCI: vmd: Do not disable MSI-X remapping in VMD 28C0 controller
>> (patchwork link:
>> https://patchwork.kernel.org/project/linux-pci/patch/20221222072603.1175248-1-korantwork@gmail.com/)
>> We found that 4k random read IOPS could drop by 50% if 4 NVMe drives
>> were mounted on one PCIe port with VMD MSI bypass.
>> I suppose this is because the VMD controller can aggregate interrupts.
>> But those test results are so long that I didn't add them to this
>> patch's commit log.
>> If you believe it is necessary, I will try to add some simple instructions.
> I don't think we need detailed performance numbers, but we need
> something like:
>
> - "msi_remap=off" improves interrupt handling performance by
> avoiding the VMD MSI-X domain interrupt handler
>
> - But "msi_remap=on" is needed when ...?
>
>>> ee81ee84f873 ("PCI: vmd: Disable MSI-X remapping when possible")
>>> suggests that MSI-X remapping (I assume the "msi_remap=on" case):
>>>
>>> - Limits the number of MSI-X vectors available to child devices to the
>>> number of VMD MSI-X vectors.
>>>
>>> - Reduces interrupt handling performance because child device
>>> interrupts have to go through the VMD MSI-X domain interrupt
>>> handler.
>>>
>>> So I assume "msi_remap=off" would remove that MSI-X vector limit and
>>> improve interrupt handling performance?
>>>
>>> But obviously there's more to consider because those are both good
>>> things and if we could do that all the time, we would. So there must
>>> be cases where we *have* to remap. ee81ee84f873 suggests that not all
>>> VMD devices support disabling remap. There's also a hint that some
>>> virt configs require it.
>>>
>> I originally just wanted to disable the 28C0's VMD MSI bypass by default.
>> But Nirmal suggested the current method of adjusting it via the param,
>> because he and other reviewers worry there are some other scenarios we
>> didn't consider.
>> Adding a method to adjust VMD's MSI-X mode is better.
> This commit log doesn't outline any of those other scenarios, and it
> doesn't say anything about when "msi_remap=on" or "msi_remap=off"
> would be necessary or desired, so I have no idea how users are
> supposed to figure out whether or not to use this parameter.
>
>>> This patch doesn't enforce either of those things. What happens if
>>> the user gets it wrong?
>> If I am wrong, please feel free to correct me at any time.
>> I placed "vmd_config_msi_remap_param", the helper that configures the
>> VMD MSI-X mode param, in front of
>> "vmd_enable_domain". So it will not change the logic for disabling
>> remapping from ee81ee84f873, such as
>> "Currently MSI remapping must be enabled in guest passthrough mode".
>> So, if the user configures the wrong type, it will not work, and they
>> can find that out from dmesg.
> That's kind of a problem. I'm not in favor of something failing and
> the user having to debug it via dmesg. That causes user frustration
> and problem reports.
>
> I don't know what "guest passthrough mode" is. Can you detect that
> automatically?
>
> Bjorn

How about adding a boolean flag that is set from the user's value for
the msi_remap module parameter, and adding that flag to the check at
- if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) || msi_flag
|| offset[0] || offset[1])
Correct me if I am wrong, but this way we can cover all the cases.
If the user passes msi_remap=on, msi_flag=true and remapping is enabled.
If the user passes msi_remap=off, msi_flag=false and remapping is disabled.
If the user doesn't pass anything, msi_flag=false and the decision is
made the same way as in the current implementation. This will cover the
guest OS case as well.
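
For illustration, the combined check could be modelled in user space
roughly as follows. This is only a sketch under assumptions: the bit
value used for VMD_FEAT_CAN_BYPASS_MSI_REMAP, the helper names
msi_flag_from_param() and use_msi_remap(), and passing msi_remap as a
plain string are all hypothetical and are not the driver's actual code.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the driver's feature bit; the value is illustrative. */
#define VMD_FEAT_CAN_BYPASS_MSI_REMAP (1UL << 0)

/*
 * Hypothetical helper: derive the proposed msi_flag from the msi_remap
 * parameter. Only "on" sets the flag; "off" and an unset (NULL)
 * parameter both leave it false.
 */
static bool msi_flag_from_param(const char *msi_remap)
{
	return msi_remap && strcmp(msi_remap, "on") == 0;
}

/*
 * Model of the proposed condition: returns true when MSI-X remapping
 * stays enabled, false when it may be bypassed.
 */
static bool use_msi_remap(unsigned long features, bool msi_flag,
			  unsigned long offset0, unsigned long offset1)
{
	return !(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) || msi_flag ||
	       offset0 || offset1;
}
```

In this sketch, msi_remap=on forces remapping even on hardware that can
bypass it, while msi_remap=off and an unset parameter take the same
path, so remapping is bypassed only when the existing feature and
offset checks allow it.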