Linux PCI subsystem development
 help / color / mirror / Atom feed
From: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: <linux-pci@vger.kernel.org>, Keith Busch <kbusch@kernel.org>,
	"Kai-Heng Feng" <kai.heng.feng@canonical.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Subject: Re: [PATCH] Documentation: PCI: add vmd documentation
Date: Thu, 25 Apr 2024 14:43:07 -0700	[thread overview]
Message-ID: <935baf43-791e-4c8e-9566-e05335e74cea@intel.com> (raw)
In-Reply-To: <20240425172447.GA511062@bhelgaas>

On 4/25/2024 10:24 AM, Bjorn Helgaas wrote:
> On Wed, Apr 24, 2024 at 02:29:16PM -0700, Paul M Stillwell Jr wrote:
>> On 4/23/2024 5:47 PM, Bjorn Helgaas wrote:
>>> On Tue, Apr 23, 2024 at 04:10:37PM -0700, Paul M Stillwell Jr wrote:
>>>> On 4/23/2024 2:26 PM, Bjorn Helgaas wrote:
>>>>> On Mon, Apr 22, 2024 at 04:39:19PM -0700, Paul M Stillwell Jr wrote:
>>>>>> On 4/22/2024 3:52 PM, Bjorn Helgaas wrote:
>>>>>>> On Mon, Apr 22, 2024 at 02:39:16PM -0700, Paul M Stillwell Jr wrote:
>>>>>>>> On 4/22/2024 1:27 PM, Bjorn Helgaas wrote:
>>>>>> ...
>>>>>
>>>>>>>>> _OSC negotiates ownership of features between platform firmware and
>>>>>>>>> OSPM.  The "native_pcie_hotplug" and similar bits mean that "IF a
>>>>>>>>> device advertises the feature, the OS can use it."  We clear those
>>>>>>>>> native_* bits if the platform retains ownership via _OSC.
>>>>>>>>>
>>>>>>>>> If BIOS doesn't enable the VMD host bridge and doesn't supply _OSC for
>>>>>>>>> the domain below it, why would we assume that BIOS retains ownership
>>>>>>>>> of the features negotiated by _OSC?  I think we have to assume the OS
>>>>>>>>> owns them, which is what happened before 04b12ef163d1.
>>>>>>>>
>>>>>>>> Sorry, this confuses me :) If BIOS doesn't enable VMD (i.e. VMD is disabled)
>>>>>>>> then all the root ports and devices underneath VMD are visible to the BIOS
>>>>>>>> and OS so ACPI would run on all of them and the _OSC bits should be set
>>>>>>>> correctly.
>>>>>>>
>>>>>>> Sorry, that was confusing.  I think there are two pieces to enabling
>>>>>>> VMD:
>>>>>>>
>>>>>>>       1) There's the BIOS "VMD enable" switch.  If set, the VMD device
>>>>>>>       appears as an RCiEP and the devices behind it are invisible to the
>>>>>>>       BIOS.  If cleared, VMD doesn't exist; the VMD RCiEP is hidden and
>>>>>>>       the devices behind it appear as normal Root Ports with devices below
>>>>>>>       them.
>>>>>>>
>>>>>>>       2) When the BIOS "VMD enable" is set, the OS vmd driver configures
>>>>>>>       the VMD RCiEP and enumerates things below the VMD host bridge.
>>>>>>>
>>>>>>>       In this case, BIOS enables the VMD RCiEP, but it doesn't have a
>>>>>>>       driver for it and it doesn't know how to enumerate the VMD Root
>>>>>>>       Ports, so I don't think it makes sense for BIOS to own features for
>>>>>>>       devices it doesn't know about.
>>>>>>
>>>>>> That makes sense to me. It sounds like VMD should own all the features, I
>>>>>> just don't know how the vmd driver would set the bits other than hotplug
>>>>>> correctly... We know leaving them on is problematic, but I'm not sure what
>>>>>> method to use to decide which of the other bits should be set or not.
>>>>>
>>>>> My starting assumption would be that we'd handle the VMD domain the
>>>>> same as other PCI domains: if a device advertises a feature, the
>>>>> kernel includes support for it, and the kernel owns it, we enable it.
>>>>
>>>> I've been poking around and it seems like some things (I was looking for
>>>> AER) are global to the platform. In my investigation (which is a small
>>>> sample size of machines) it looks like there is a single entry in the BIOS
>>>> to enable/disable AER so whatever is in one domain should be the same in all
>>>> the domains. I couldn't find settings for LTR or the other bits, but I'm not
>>>> sure what to look for in the BIOS for those.
>>>>
>>>> So it seems that there are 2 categories: platform global and device
>>>> specific. AER and probably some of the others are global and can be copied
>>>> from one domain to another, but things like hotplug are device specific and
>>>> should be handled that way.
>>>
>>> _OSC is the only mechanism for negotiating ownership of these
>>> features, and PCI Firmware r3.3, sec 4.5.1, is pretty clear that _OSC
>>> only applies to the hierarchy originated by the PNP0A03/PNP0A08 host
>>> bridge that contains the _OSC method.  AFAICT, there's no
>>> global/device-specific thing here.
>>>
>>> The BIOS may have a single user-visible setting, and it may apply that
>>> setting to all host bridge _OSC methods, but that's just part of the
>>> BIOS UI, not part of the firmware/OS interface.
>>
>> Fair, but we are still left with the question of how to set the _OSC bits
>> for the VMD bridge. This would normally happen using ACPI AFAICT and we
>> don't have that for the devices behind VMD.
> 
> In the absence of a mechanism for negotiating ownership, e.g., an ACPI
> host bridge device for the hierarchy, the OS owns all the PCIe
> features.
> 

I'm new to this space so I don't know what it means for the OS to own 
the features. In other words, how would the vmd driver figure out what 
features are supported?

>>>>> If a device advertises a feature but there's a hardware problem with
>>>>> it, the usual approach is to add a quirk to work around the problem.
>>>>> The Correctable Error issue addressed by 04b12ef163d1 ("PCI: vmd:
>>>>> Honor ACPI _OSC on PCIe features"), looks like it might be in this
>>>>> category.
>>>>
>>>> I don't think we had a hardware problem with these Samsung (IIRC) devices;
>>>> the issue was that the vmd driver were incorrectly enabling AER because
>>>> those native_* bits get set automatically.
>>>
>>> Where do all the Correctable Errors come from?  IMO they're either
>>> caused by some hardware issue or by a software error in programming
>>> AER.  It's possible we forget to clear the errors and we just see the
>>> same error reported over and over.  But I don't think the answer is
>>> to copy the AER ownership from a different domain.
>>
>> I looked back at the original bugzilla and I feel like the AER errors are a
>> red herring. AER was *supposed* to be disabled, but was incorrectly enabled
>> by VMD so we are seeing errors. Yes, they may be real errors, but my point
>> is that the user had disabled AER so they didn't care if there were errors
>> or not (i.e. if AER had been correctly disabled by VMD then the user would
>> not have AER errors in the dmesg output).
> 
> 04b12ef163d1 basically asserted "the platform knows about a hardware
> issue between VMD and this NVMe and avoided it by disabling AER in
> domain 0000; therefore we should also disable AER in the VMD domain."
> 
> Your patch at
> https://lore.kernel.org/linux-pci/20240408183927.135-1-paul.m.stillwell.jr@intel.com/
> says "vmd users *always* want hotplug enabled."  What happens when a
> platform knows about a hotplug hardware issue and avoids it by
> disabling hotplug in domain 0000?
> 

I was thinking about this also and I could look at all the root ports 
underneath vmd and see if hotplug is set for any of them. If it is then 
we could set the native_*hotplug bits based on that.

> I think 04b12ef163d1 would avoid it in the VMD domain, but your patch
> would expose the hotplug issue.
> 
>> Kai-Heng even says this in one of his responses here https://lore.kernel.org/linux-pci/CAAd53p6hATV8TOcJ9Qi2rMwVi=y_9+tQu6KhDkAm6Y8=cQ_xoA@mail.gmail.com/.
>> A quote from his reply "To be more precise, AER is disabled by the platform
>> vendor in BIOS to paper over the issue."
> 
> I suspect there's a real hardware issue between the VMD and the
> Samsung NVMe that causes these Correctable Errors.  I think disabling
> AER via _OSC is a bad way to work around it because:
> 
>    - it disables AER for *everything* in domain 0000, when other
>      devices probably work perfectly fine,
> 
>    - it assumes the OS vmd driver applies domain 0000 _OSC to the VMD
>      domain, which isn't required by any spec, and
> 
>    - it disables *all* of AER, including Uncorrectable Errors, and I'd
>      like to know about those, even if we have to mask the Correctable
>      Errors.
> 
> In https://bugzilla.kernel.org/show_bug.cgi?id=215027#c5, Kai-Heng did
> not see the Correctable Error flood when VMD was turned off and
> concluded that the issue is VMD specific.
> 
> But I think it's likely that the errors still occur even when VMD is
> turned off, and we just don't see the flood because AER is disabled.
> I suggested an experiment with "pcie_ports=native", which should
> enable AER even if _OSC doesn't grant ownership:
> https://bugzilla.kernel.org/show_bug.cgi?id=215027#c9
> 

I don't have a way to test his configuration so that would be something 
he would need to do.

> Bjorn
> 


  reply	other threads:[~2024-04-25 21:43 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-17 20:15 [PATCH] Documentation: PCI: add vmd documentation Paul M Stillwell Jr
2024-04-17 23:51 ` Keith Busch
2024-04-18 15:07   ` Paul M Stillwell Jr
2024-04-18 23:34     ` Keith Busch
2024-04-18 18:26 ` Bjorn Helgaas
2024-04-18 21:51   ` Paul M Stillwell Jr
2024-04-19 21:14     ` Bjorn Helgaas
2024-04-19 22:18       ` Paul M Stillwell Jr
2024-04-22 20:27         ` Bjorn Helgaas
2024-04-22 21:39           ` Paul M Stillwell Jr
2024-04-22 22:52             ` Bjorn Helgaas
2024-04-22 23:39               ` Paul M Stillwell Jr
2024-04-23 21:26                 ` Bjorn Helgaas
2024-04-23 23:10                   ` Paul M Stillwell Jr
2024-04-24  0:47                     ` Bjorn Helgaas
2024-04-24 21:29                       ` Paul M Stillwell Jr
2024-04-25 17:24                         ` Bjorn Helgaas
2024-04-25 21:43                           ` Paul M Stillwell Jr [this message]
2024-04-25 22:32                             ` Bjorn Helgaas
2024-04-25 23:32                               ` Paul M Stillwell Jr
2024-04-26 21:36                                 ` Bjorn Helgaas
2024-04-26 21:46                                   ` Paul M Stillwell Jr
2024-06-12 21:52                                     ` Paul M Stillwell Jr
2024-06-12 22:25                                       ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=935baf43-791e-4c8e-9566-e05335e74cea@intel.com \
    --to=paul.m.stillwell.jr@intel.com \
    --cc=helgaas@kernel.org \
    --cc=kai.heng.feng@canonical.com \
    --cc=kbusch@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=rafael.j.wysocki@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox