From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Chen, Jiqian" <Jiqian.Chen@amd.com>,
"Rafael J . Wysocki" <rafael@kernel.org>,
Len Brown <lenb@kernel.org>, Juergen Gross <jgross@suse.com>,
Stefano Stabellini <sstabellini@kernel.org>,
Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Bjorn Helgaas <bhelgaas@google.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
"Hildebrand, Stewart" <Stewart.Hildebrand@amd.com>,
"Huang, Ray" <Ray.Huang@amd.com>,
"Ragiadakou, Xenia" <Xenia.Ragiadakou@amd.com>
Subject: Re: [RFC KERNEL PATCH v4 3/3] PCI/sysfs: Add gsi sysfs for pci_dev
Date: Mon, 12 Feb 2024 10:13:28 +0100 [thread overview]
Message-ID: <ZcnhOEjnTgbYFPVl@macbook> (raw)
In-Reply-To: <20240209210549.GA884438@bhelgaas>
On Fri, Feb 09, 2024 at 03:05:49PM -0600, Bjorn Helgaas wrote:
> On Thu, Feb 01, 2024 at 09:39:49AM +0100, Roger Pau Monné wrote:
> > On Wed, Jan 31, 2024 at 01:00:14PM -0600, Bjorn Helgaas wrote:
> > > On Wed, Jan 31, 2024 at 09:58:19AM +0100, Roger Pau Monné wrote:
> > > > On Tue, Jan 30, 2024 at 02:44:03PM -0600, Bjorn Helgaas wrote:
> > > > > On Tue, Jan 30, 2024 at 10:07:36AM +0100, Roger Pau Monné wrote:
> > > > > > On Mon, Jan 29, 2024 at 04:01:13PM -0600, Bjorn Helgaas wrote:
> > > > > > > On Thu, Jan 25, 2024 at 07:17:24AM +0000, Chen, Jiqian wrote:
> > > > > > > > On 2024/1/24 00:02, Bjorn Helgaas wrote:
> > > > > > > > > On Tue, Jan 23, 2024 at 10:13:52AM +0000, Chen, Jiqian wrote:
> > > > > > > > >> On 2024/1/23 07:37, Bjorn Helgaas wrote:
> > > > > > > > >>> On Fri, Jan 05, 2024 at 02:22:17PM +0800, Jiqian Chen wrote:
> > > > > > > > >>>> There is a need for some scenarios to use gsi sysfs.
> > > > > > > > >>>> For example, when Xen passes a device through to a domU, it
> > > > > > > > >>>> will use the GSI to map a pirq, but currently userspace
> > > > > > > > >>>> can't get the GSI number.
> > > > > > > > >>>> So, add gsi sysfs for that and for other potential scenarios.
> > > > > > > > >> ...
> > > > > > > > >
> > > > > > > > >>> I don't know enough about Xen to know why it needs the GSI in
> > > > > > > > >>> userspace. Is this passthrough brand new functionality that can't be
> > > > > > > > >>> done today because we don't expose the GSI yet?
> > > > > > >
> > > > > > > I assume this must be new functionality, i.e., this kind of
> > > > > > > passthrough does not work today, right?
> > > > > > >
> > > > > > > > >> has ACPI support and is responsible for detecting and controlling
> > > > > > > > >> the hardware, also it performs privileged operations such as the
> > > > > > > > >> creation of normal (unprivileged) domains DomUs. When we give to a
> > > > > > > > >> DomU direct access to a device, we need also to route the physical
> > > > > > > > >> interrupts to the DomU. In order to do so Xen needs to setup and map
> > > > > > > > >> the interrupts appropriately.
> > > > > > > > >
> > > > > > > > > What kernel interfaces are used for this setup and mapping?
> > > > > > > >
> > > > > > > > For passthrough devices, the setup and mapping of routing physical
> > > > > > > > interrupts to DomU are done on the Xen hypervisor side; the
> > > > > > > > hypervisor only needs userspace to provide the GSI info.  See the
> > > > > > > > Xen code: xc_physdev_map_pirq requires the GSI and then makes a
> > > > > > > > hypercall to pass the GSI into the hypervisor, which then does the
> > > > > > > > mapping and routing; the kernel doesn't do the setup and mapping.
> > > > > > >
> > > > > > > So we have to expose the GSI to userspace not because userspace itself
> > > > > > > uses it, but so userspace can turn around and pass it back into the
> > > > > > > kernel?
> > > > > >
> > > > > > No, the point is to pass it back to Xen, which doesn't know the
> > > > > > mapping between GSIs and PCI devices because it can't execute the ACPI
> > > > > > AML resource methods that provide such information.
> > > > > >
> > > > > > The (Linux) kernel is just a proxy that forwards the hypercalls from
> > > > > > user-space tools into Xen.
> > > > >
> > > > > But I guess Xen knows how to interpret a GSI even though it doesn't
> > > > > have access to AML?
> > > >
> > > > On x86 Xen does know how to map a GSI into an IO-APIC pin, in
> > > > order to configure the RTE as requested.
> > >
> > > IIUC, mapping a GSI to an IO-APIC pin requires information from the
> > > MADT. So I guess Xen does use the static ACPI tables, but not the AML
> > > _PRT methods that would connect a GSI with a PCI device?
> >
> > Yes, Xen can parse the static tables, and knows the base GSI of
> > IO-APICs from the MADT.
> >
> > > I guess this means Xen would not be able to deal with _MAT methods,
> > > which also contains MADT entries? I don't know the implications of
> > > this -- maybe it means Xen might not be able to use with hot-added
> > > devices?
> >
> > It's my understanding _MAT will only be present on some very specific
> > devices (IO-APIC or CPU objects). Xen doesn't support hotplug of
> > IO-APICs, but hotplug of CPUs should in principle be supported with
> > cooperation from the control domain OS (albeit it's not something
> > that we test on our CI).  However, I don't expect that a CPU object's
> > _MAT method will return I/O APIC entries.
> >
> > > The tables (including DSDT and SSDTS that contain the AML) are exposed
> > > to userspace via /sys/firmware/acpi/tables/, but of course that
> > > doesn't mean Xen knows how to interpret the AML, and even if it did,
> > > Xen probably wouldn't be able to *evaluate* it since that could
> > > conflict with the host kernel's use of AML.
> >
> > Indeed, there can only be a single OSPM, and that's the dom0 OS (Linux
> > in our context).
> >
> > Getting back to our context though, what would be a suitable place for
> > exposing the GSI assigned to each device?
>
> IIUC, the Xen hypervisor:
>
> - Interprets /sys/firmware/acpi/tables/APIC (or gets this via
> something running on the Dom0 kernel) to find the physical base
> address and GSI base, e.g., from I/O APIC, I/O SAPIC.
No, Xen parses the MADT directly from memory, before starting dom0.
That's a static table, so it's fine for Xen to parse it and obtain the
I/O APIC GSI base.
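As an aside, the reason a bare hypervisor can do this without an AML
interpreter is that the MADT I/O APIC subtable (type 1) has a fixed
binary layout defined by the ACPI spec.  A minimal sketch (illustrative
only, not Xen code; the example entry values are hypothetical):

```python
import struct

# MADT "I/O APIC" subtable (type 1), fixed layout per the ACPI spec:
#   type (u8) = 1, length (u8) = 12, ioapic_id (u8), reserved (u8),
#   ioapic_addr (u32), gsi_base (u32)
def parse_madt_ioapic(entry: bytes) -> dict:
    etype, length, apic_id, _rsvd, addr, gsi_base = struct.unpack(
        "<BBBBII", entry[:12])
    if etype != 1 or length != 12:
        raise ValueError("not an I/O APIC subtable")
    return {"id": apic_id, "addr": addr, "gsi_base": gsi_base}

# Hypothetical entry: I/O APIC 0 at the usual 0xFEC00000, GSI base 0.
entry = struct.pack("<BBBBII", 1, 12, 0, 0, 0xFEC00000, 0)
info = parse_madt_ioapic(entry)
```

This is all Xen needs to turn a GSI number into an (I/O APIC, pin)
pair; what it can't get this way is which GSI belongs to which PCI
device, since that lives in the AML _PRT methods.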
> - Needs the GSI to locate the APIC and pin within the APIC. The
> Dom0 kernel is the OSPM, so only it can evaluate the AML _PRT to
> learn the PCI device -> GSI mapping.
Yes, Xen doesn't know the PCI device -> GSI mapping. Dom0 needs to
parse the ACPI methods and signal Xen to configure a GSI with a
given trigger and polarity.
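For reference, the information dom0 forwards after evaluating the ACPI
methods is small: just the GSI plus trigger and polarity.  A sketch of
the argument layout, mirroring struct physdev_setup_gsi { int gsi;
uint8_t triggering; uint8_t polarity; } from Xen's public headers
(shown only to illustrate the flow; the example GSI is hypothetical):

```python
import struct

EDGE, LEVEL = 0, 1              # triggering encoding
ACTIVE_HIGH, ACTIVE_LOW = 0, 1  # polarity encoding

def encode_setup_gsi(gsi: int, triggering: int, polarity: int) -> bytes:
    # Pack the fields in declaration order, little-endian, no padding
    # beyond the natural layout of int + two uint8_t.
    return struct.pack("<iBB", gsi, triggering, polarity)

# e.g. a PCI INTx line: GSI 20, level-triggered, active-low
payload = encode_setup_gsi(20, LEVEL, ACTIVE_LOW)
```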
> - Has direct access to the APIC physical base address to program the
> Redirection Table.
Yes, the hardware (native) I/O APIC is owned by Xen, and not directly
accessible by dom0.
> The patch seems a little messy to me because the PCI core has to keep
> track of the GSI even though it doesn't need it itself. And the
> current patch exposes it on all arches, even non-ACPI ones or when
> ACPI is disabled (easily fixable).
>
> We only call acpi_pci_irq_enable() in the pci_enable_device() path, so
> we don't know the GSI unless a Dom0 driver has claimed the device and
> called pci_enable_device() for it, which seems like it might not be
> desirable.
I think that's always the case: on dom0, devices to be passed through
are handled by pciback, which does enable them.
I agree it might be best not to tie exposing the node to
pci_enable_device() having been called.  Is _PRT only evaluated as
part of acpi_pci_irq_enable() (or pci_enable_device())?
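From the toolstack side, consuming the attribute proposed by this patch
would look something like the sketch below (assuming the node is named
"gsi" under the device's sysfs directory, as in the patch; the BDF used
is hypothetical):

```python
from pathlib import Path

def read_pci_gsi(bdf: str, root: str = "/sys/bus/pci/devices"):
    """Return the GSI exposed for a PCI device, or None if the
    attribute is absent (unpatched kernel, or ACPI disabled)."""
    node = Path(root) / bdf / "gsi"
    try:
        return int(node.read_text())
    except FileNotFoundError:
        return None

# e.g. read_pci_gsi("0000:00:04.0") on a patched kernel
```

Returning None for a missing node matters here, since (per the above)
the attribute may not exist if the device was never enabled.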
> I was hoping we could put it in /sys/firmware/acpi/interrupts, but
> that looks like it's only for SCI statistics. I guess we could moot a
> new /sys/firmware/acpi/gsi/ directory, but then each file there would
> have to identify a device, which might not be as convenient as the
> /sys/devices/ directory that already exists. I guess there may be
> GSIs for things other than PCI devices; will you ever care about any
> of those?
We only support passthrough of PCI devices so far, but if any such
non-PCI device ever appears, uses a GSI, and Xen supports passthrough
for it, then yes, we would need to fetch that GSI somehow.
Thanks, Roger.
Thread overview:
2024-01-05 6:22 [RFC KERNEL PATCH v4 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
2024-01-05 6:22 ` [RFC KERNEL PATCH v4 1/3] xen/pci: Add xen_reset_device_state function Jiqian Chen
2024-02-23 0:18 ` Stefano Stabellini
2024-01-05 6:22 ` [RFC KERNEL PATCH v4 2/3] xen/pvh: Setup gsi for passthrough device Jiqian Chen
2024-02-23 0:23 ` Stefano Stabellini
2024-02-23 6:08 ` Chen, Jiqian
2024-01-05 6:22 ` [RFC KERNEL PATCH v4 3/3] PCI/sysfs: Add gsi sysfs for pci_dev Jiqian Chen
2024-01-22 6:36 ` Chen, Jiqian
2024-01-22 23:37 ` Bjorn Helgaas
2024-01-23 10:13 ` Chen, Jiqian
2024-01-23 16:02 ` Bjorn Helgaas
2024-01-25 7:17 ` Chen, Jiqian
2024-01-29 22:01 ` Bjorn Helgaas
2024-01-30 9:07 ` Roger Pau Monné
2024-01-30 20:44 ` Bjorn Helgaas
2024-01-31 8:58 ` Roger Pau Monné
2024-01-31 19:00 ` Bjorn Helgaas
2024-02-01 8:39 ` Roger Pau Monné
2024-02-09 21:05 ` Bjorn Helgaas
2024-02-12 9:13 ` Roger Pau Monné [this message]
2024-02-12 19:18 ` Bjorn Helgaas
2024-02-15 8:37 ` Roger Pau Monné
2024-03-01 7:57 ` Chen, Jiqian
2024-04-08 6:42 ` Chen, Jiqian
2024-04-09 20:03 ` Bjorn Helgaas