From: Bjorn Helgaas <helgaas@kernel.org>
To: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Cc: linux-pci@vger.kernel.org, Keith Busch <kbusch@kernel.org>,
Kai-Heng Feng <kai.heng.feng@canonical.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Subject: Re: [PATCH] Documentation: PCI: add vmd documentation
Date: Tue, 23 Apr 2024 19:47:33 -0500 [thread overview]
Message-ID: <20240424004733.GA476130@bhelgaas> (raw)
In-Reply-To: <413d99f0-0af5-477e-a3cb-84f0955407d0@intel.com>
On Tue, Apr 23, 2024 at 04:10:37PM -0700, Paul M Stillwell Jr wrote:
> On 4/23/2024 2:26 PM, Bjorn Helgaas wrote:
> > On Mon, Apr 22, 2024 at 04:39:19PM -0700, Paul M Stillwell Jr wrote:
> > > On 4/22/2024 3:52 PM, Bjorn Helgaas wrote:
> > > > On Mon, Apr 22, 2024 at 02:39:16PM -0700, Paul M Stillwell Jr wrote:
> > > > > On 4/22/2024 1:27 PM, Bjorn Helgaas wrote:
> > > ...
> >
> > > > > > _OSC negotiates ownership of features between platform firmware and
> > > > > > OSPM. The "native_pcie_hotplug" and similar bits mean that "IF a
> > > > > > device advertises the feature, the OS can use it." We clear those
> > > > > > native_* bits if the platform retains ownership via _OSC.
> > > > > >
> > > > > > If BIOS doesn't enable the VMD host bridge and doesn't supply _OSC for
> > > > > > the domain below it, why would we assume that BIOS retains ownership
> > > > > > of the features negotiated by _OSC? I think we have to assume the OS
> > > > > > owns them, which is what happened before 04b12ef163d1.
> > > > >
> > > > > Sorry, this confuses me :) If BIOS doesn't enable VMD (i.e. VMD is disabled)
> > > > > then all the root ports and devices underneath VMD are visible to the BIOS
> > > > > and OS so ACPI would run on all of them and the _OSC bits should be set
> > > > > correctly.
> > > >
> > > > Sorry, that was confusing. I think there are two pieces to enabling
> > > > VMD:
> > > >
> > > > 1) There's the BIOS "VMD enable" switch. If set, the VMD device
> > > > appears as an RCiEP and the devices behind it are invisible to the
> > > > BIOS. If cleared, VMD doesn't exist; the VMD RCiEP is hidden and
> > > > the devices behind it appear as normal Root Ports with devices below
> > > > them.
> > > >
> > > > 2) When the BIOS "VMD enable" is set, the OS vmd driver configures
> > > > the VMD RCiEP and enumerates things below the VMD host bridge.
> > > >
> > > > In this case, BIOS enables the VMD RCiEP, but it doesn't have a
> > > > driver for it and it doesn't know how to enumerate the VMD Root
> > > > Ports, so I don't think it makes sense for BIOS to own features for
> > > > devices it doesn't know about.
> > >
> > > That makes sense to me. It sounds like VMD should own all the features, I
> > > just don't know how the vmd driver would set the bits other than hotplug
> > > correctly... We know leaving them on is problematic, but I'm not sure what
> > > method to use to decide which of the other bits should be set or not.
> >
> > My starting assumption would be that we'd handle the VMD domain the
> > same as other PCI domains: if a device advertises a feature, the
> > kernel includes support for it, and the kernel owns it, we enable it.
>
> I've been poking around and it seems like some things (I was looking for
> AER) are global to the platform. In my investigation (which is a small
> sample size of machines) it looks like there is a single entry in the BIOS
> to enable/disable AER so whatever is in one domain should be the same in all
> the domains. I couldn't find settings for LTR or the other bits, but I'm not
> sure what to look for in the BIOS for those.
>
> So it seems that there are 2 categories: platform global and device
> specific. AER and probably some of the others are global and can be copied
> from one domain to another, but things like hotplug are device specific and
> should be handled that way.
_OSC is the only mechanism for negotiating ownership of these
features, and PCI Firmware r3.3, sec 4.5.1, is pretty clear that _OSC
only applies to the hierarchy originated by the PNP0A03/PNP0A08 host
bridge that contains the _OSC method. AFAICT, there's no
global/device-specific thing here.
The BIOS may have a single user-visible setting, and it may apply that
setting to all host bridge _OSC methods, but that's just part of the
BIOS UI, not part of the firmware/OS interface.
> > If a device advertises a feature but there's a hardware problem with
> > it, the usual approach is to add a quirk to work around the problem.
> > The Correctable Error issue addressed by 04b12ef163d1 ("PCI: vmd:
> > Honor ACPI _OSC on PCIe features"), looks like it might be in this
> > category.
>
> I don't think we had a hardware problem with these Samsung (IIRC) devices;
> the issue was that the vmd driver were incorrectly enabling AER because
> those native_* bits get set automatically.
Where do all the Correctable Errors come from? IMO they're either
caused by some hardware issue or by a software error in programming
AER. It's possible we forget to clear the errors and we just see the
same error reported over and over. But I don't think the answer is
to copy the AER ownership from a different domain.
Bjorn
next prev parent reply other threads:[~2024-04-24 0:47 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-17 20:15 [PATCH] Documentation: PCI: add vmd documentation Paul M Stillwell Jr
2024-04-17 23:51 ` Keith Busch
2024-04-18 15:07 ` Paul M Stillwell Jr
2024-04-18 23:34 ` Keith Busch
2024-04-18 18:26 ` Bjorn Helgaas
2024-04-18 21:51 ` Paul M Stillwell Jr
2024-04-19 21:14 ` Bjorn Helgaas
2024-04-19 22:18 ` Paul M Stillwell Jr
2024-04-22 20:27 ` Bjorn Helgaas
2024-04-22 21:39 ` Paul M Stillwell Jr
2024-04-22 22:52 ` Bjorn Helgaas
2024-04-22 23:39 ` Paul M Stillwell Jr
2024-04-23 21:26 ` Bjorn Helgaas
2024-04-23 23:10 ` Paul M Stillwell Jr
2024-04-24 0:47 ` Bjorn Helgaas [this message]
2024-04-24 21:29 ` Paul M Stillwell Jr
2024-04-25 17:24 ` Bjorn Helgaas
2024-04-25 21:43 ` Paul M Stillwell Jr
2024-04-25 22:32 ` Bjorn Helgaas
2024-04-25 23:32 ` Paul M Stillwell Jr
2024-04-26 21:36 ` Bjorn Helgaas
2024-04-26 21:46 ` Paul M Stillwell Jr
2024-06-12 21:52 ` Paul M Stillwell Jr
2024-06-12 22:25 ` Bjorn Helgaas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240424004733.GA476130@bhelgaas \
--to=helgaas@kernel.org \
--cc=kai.heng.feng@canonical.com \
--cc=kbusch@kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lorenzo.pieralisi@arm.com \
--cc=paul.m.stillwell.jr@intel.com \
--cc=rafael.j.wysocki@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox