linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jonathan Derrick <jonathan.derrick@linux.dev>
To: Bjorn Helgaas <helgaas@kernel.org>,
	Nirmal Patel <nirmal.patel@linux.intel.com>
Cc: linux-pci@vger.kernel.org,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	"Krzysztof Wilczyński" <kwilczynski@kernel.org>,
	"Rob Herring" <robh@kernel.org>
Subject: Re: [PATCH] PCI: vmd: VMD to control Hotplug on its rootports
Date: Thu, 13 Jul 2023 13:26:36 -0600	[thread overview]
Message-ID: <8768a272-18f8-9569-86f6-25564f8f862b@linux.dev> (raw)
In-Reply-To: <20230713165856.GA322319@bhelgaas>



On 7/13/2023 10:58 AM, Bjorn Helgaas wrote:
> [+cc Jonathan, Lorenzo, Krzysztof, Rob (from MAINTAINERS)]
> 
> Can you make the subject line say what the patch does?  Repeating
> "VMD" is probably unnecessary.
> 
> On Wed, Jul 05, 2023 at 01:20:38PM -0400, Nirmal Patel wrote:
>> The hotplug functionality is broken in various combinations of guest
>> OSes i.e. RHEL, SlES and hypervisors i.e. KVM and ESXI.
> 
> "SLES"
> 
>> VMD enabled on Intel ADL cpus caused interrupt storm for smasung
>> drives due to AER being enabled on VMD controlled root ports.
> 
> "Samsung"
> 
> Enabling AER should not cause an interrupt storm.  There must be
> something else going on in addition.  Are you saying the Samsung
> drives have some AER-related defect, like generating Correctable
> Errors when they shouldn't?  It doesn't sound like "Intel ADL" would
> be relevant here.
> 
> It's not clear to me if this is directly related to *hotplug* or if
> it's an AER issue that may happen because of hotplug and possible for
> other reasons.
> 
>> The patch 04b12ef163d10e348db664900ae7f611b83c7a0e
> 
> 12 char SHA1 is sufficient.
> 
>> ("PCI: vmd: Honor APCI _OSC on PCIe features.") was added to the VMD
>> driver to correct the issue based on the following assumption:
>> 	“Since VMD is an aperture to regular PCIe root ports, honor ACPI
>> 	_OSC to disable PCIe features accordingly to resolve the issue.”
>> 	Link: https://lore.kernel.org/r/20211203031541.1428904-1-kai.heng.feng@canonical.com
>>
>> VMD as a PCIe device is an end point(type 0), not a PCIe aperture
>> (pcie bridge). In fact VMD is a type 0 raid controller(class code).
>> When BIOS boots, all root ports under VMD is inaccessible by BIOS, and
>> as such, they maintain their power on default states. The VMD UEFI DXE
>> driver loads and configure all devices under VMD. This is how AER,
>> power management and hotplug gets enabled in UEFI, since the BIOS pci
>> driver cannot access the root ports.
> 
> s/pcie/PCIe/
> s/pci/PCI/
> s/raid/RAID/
> 
>> The patch worked around the interrupt storm by assigning the native
> 
> I assume "the patch" means 04b12ef163d1.  Sometimes people write "the
> patch does X" in the commit log for the current patch, so it's nice to
> be specific to avoid confusion.
> 
>> ACPI states to the  root ports under VMD. It assigns AER, hotplug,
>> PME, etc. These have been restored back to the power on default state
>> in guest OS, which says the root port hot plug enable is default OFF.
>> At most, the work around should have only assigned AER state.
>> An additional patch should be added to exclude hot plug from the
>> original patch.
> 
> Add blank line between paragraphs.
> 
>> This will cause hot plug to start working again in the guest, as the
>> settings implemented by the UEFI VMD DXE driver will remain in effect
>> in Linux.
So hotplug will work in the guest per UEFI VMD DXE driver, but AER will 
be determined by the host native_aer state?

> 
> This is a lot of description that doesn't seem to say what the actual
> underlying issue is.  It's basically "if we treat hotplug differently
> from other negotiated features, things work better."  And it seems
> like it depends on what the UEFI driver did, and we should try to
> avoid a dependency there.
> 
> If the issue is too many Correctable Errors being reported via AER, we
> have that problem regardless of VMD, and we should handle it by
> rate-limiting and/or suppressing them completely for particularly
> offensive devices.  We have some issues like this that are still
> outstanding:
I'm curious if Non-VMD root ports on the system see the same CE storm.
Either way, rate-limiting per device seems like a good idea.

> 
> [1] https://lore.kernel.org/linux-pci/DM6PR04MB6473197DBD89FF4643CC391F8BC19@DM6PR04MB6473.namprd04.prod.outlook.com/
> [2] https://lore.kernel.org/linux-pci/20230606035256.2886098-2-grundler@chromium.org/
> 
>> Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
>> ---
>>   drivers/pci/controller/vmd.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
>> index 769eedeb8802..52c2461b4761 100644
>> --- a/drivers/pci/controller/vmd.c
>> +++ b/drivers/pci/controller/vmd.c
>> @@ -701,8 +701,6 @@ static int vmd_alloc_irqs(struct vmd_dev *vmd)
>>   static void vmd_copy_host_bridge_flags(struct pci_host_bridge *root_bridge,
>>   				       struct pci_host_bridge *vmd_bridge)
>>   {
>> -	vmd_bridge->native_pcie_hotplug = root_bridge->native_pcie_hotplug;
>> -	vmd_bridge->native_shpc_hotplug = root_bridge->native_shpc_hotplug;
>>   	vmd_bridge->native_aer = root_bridge->native_aer;
>>   	vmd_bridge->native_pme = root_bridge->native_pme;
>>   	vmd_bridge->native_ltr = root_bridge->native_ltr;
>> -- 
>> 2.31.1
>>

  reply	other threads:[~2023-07-13 19:33 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-05 17:20 [PATCH] PCI: vmd: VMD to control Hotplug on its rootports Nirmal Patel
2023-07-13 16:58 ` Bjorn Helgaas
2023-07-13 19:26   ` Jonathan Derrick [this message]
2023-07-24 21:59     ` Patel, Nirmal
2023-07-19  2:00   ` Patel, Nirmal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8768a272-18f8-9569-86f6-25564f8f862b@linux.dev \
    --to=jonathan.derrick@linux.dev \
    --cc=helgaas@kernel.org \
    --cc=kwilczynski@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=nirmal.patel@linux.intel.com \
    --cc=robh@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).