From: Thomas Gleixner <tglx@linutronix.de>
To: "Tian, Kevin" <kevin.tian@intel.com>, Jason Gunthorpe <jgg@nvidia.com>
Cc: "Jiang, Dave" <dave.jiang@intel.com>,
Logan Gunthorpe <logang@deltatee.com>,
LKML <linux-kernel@vger.kernel.org>,
Bjorn Helgaas <helgaas@kernel.org>, Marc Zygnier <maz@kernel.org>,
Alex Williamson <alex.williamson@redhat.com>,
"Dey, Megha" <megha.dey@intel.com>,
"Raj, Ashok" <ashok.raj@intel.com>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Jon Mason <jdmason@kudzu.us>, Allen Hubbe <allenbh@gmail.com>,
"linux-ntb@googlegroups.com" <linux-ntb@googlegroups.com>,
"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
Heiko Carstens <hca@linux.ibm.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
"x86@kernel.org" <x86@kernel.org>, Joerg Roedel <jroedel@suse.de>,
"iommu@lists.linux-foundation.org"
<iommu@lists.linux-foundation.org>
Subject: RE: [patch 21/32] NTB/msi: Convert to msi_on_each_desc()
Date: Fri, 10 Dec 2021 13:13:18 +0100 [thread overview]
Message-ID: <87lf0sy7xd.ffs@tglx> (raw)
In-Reply-To: <BN9PR11MB527619B099061B3814EB40408C719@BN9PR11MB5276.namprd11.prod.outlook.com>
Kevin,
On Fri, Dec 10 2021 at 07:29, Kevin Tian wrote:
>> From: Thomas Gleixner <tglx@linutronix.de>
>> 4) For the guest side we agreed that we need an hypercall because the
>> host can't trap the write to the MSI[-X] entry anymore.
>
> To be accurate I'd like to not call it "can't trap". The host still traps the
> MSI/MSI-X entry if the hypercall is not used. This is for current guest
> OS which doesn't have this hypercall mechanism. For future new guest
> OS which will support this machinery then a handshake process from
> such guest will disable the trap for MSI-X and map it for direct guest
> access in the fly.
Right. What I'm suggesting is a clear cut between the current approach,
which obviously needs to be preserved, and a consistent new approach
which handles MSI, MSI-X and IMS in the exactly same way.
> MSI has to be always trapped although the guest has acquired the right
> data/addr pair via the hypercall, since it's a PCI config capability.
>
>>
>> Aside of the fact that this creates a special case for IMS which is
>> undesirable in my opinion, it's not really obvious where the
>> hypercall should be placed to work for all scenarios so that it can
>> also solve the existing issue of silent failures.
>>
>> 5) It's not possible for the kernel to reliably detect whether it is
>> running on bare metal or not. Yes we talked about heuristics, but
>> that's something I really want to avoid.
>
> How would the hypercall mechanism avoid such heuristics?
The availability of IR remapping where the irqdomain which is provided
by the remapping unit signals that it supports this new scheme:
|--IO/APIC
|--MSI
vector -- IR --|--MSI-X
|--IMS
while the current scheme is:
|--IO/APIC
vector -- IR --|--PCI/MSI[-X]
or
|--IO/APIC
vector --------|--PCI/MSI[-X]
So in the new scheme the IR domain will advertise new features which are
not available on older kernels. The availability of these new features
is the indicator for the interrupt subsystem and subsequently for PCI
whether IMS is supported or not.
Bootup either finds an IR unit or not. In the bare metal case that's the
usual hardware/firmware detection. In the guest case it's the
availability of vIR including the required hypercall protocol.
So for the guest case the initialization would look like this:
qemu starts guest
Method of interrupt management defaults to current scheme
restricted to MSI/MSI-X
guest initializes
older guest do not check for the hypercall so everything
continues as of today
new guest initializes vIR, detects hypercall and requests
from the hypervisor to switch over to the new scheme.
The hypervisor grants or rejects the request, i.e. either both
switch to the new scheme or stay with the old one.
The new scheme means, that all variants, MSI, MSI-X, IMS are handled in
the same way. Staying on the old scheme means that IMS is not available
to the guest.
Having that clear separation is in my opinion way better than trying to
make all of that a big maze of conditionals.
I'm going to make that clear cut in the PCI/MSI management layer as
well. Trying to do that hybrid we do today for irqdomain and non
irqdomain based backends is just not feasible. The decision which way to
go will be very close to the driver exposed API, i.e.:
pci_alloc_ims_vector()
if (new_scheme())
return new_scheme_alloc_ims();
else
return -ENOTSUPP;
and new_scheme_alloc_ims() will have:
new_scheme_alloc_ims()
if (!system_supports_ims())
return -ENOTSUPP;
....
system_supports_ims() makes its decision based on the feature flags
exposed by the underlying base MSI irqdomain, i.e. either vector or IR
on x86.
Vector domain will not have that feature flag set, but IR will have on
bare metal and eventually in the guest when the vIR setup and hypercall
detection and negotiation succeeds.
> Then Qemu needs to find out the GSI number for the vIRTE handle.
> Again Qemu doesn't have such information since it doesn't know
> which MSI[-X] entry points to this handle due to no trap.
>
> This implies that we may also need carry device ID, #msi entry, etc.
> in the hypercall, so Qemu can associate the virtual routing info
> to the right [irqfd, gsi].
>
> In your model the hypercall is raised by IR domain. Do you see
> any problem of finding those information within IR domain?
IR has the following information available:
Interrupt type
- MSI: Device, index and number of vectors
- MSI-X: Device, index
- IMS: Device, index
Target APIC/vector pair
IMS: The index depends on the storage type:
For storage in device memory, e.g. IDXD, it's the array index.
For storage in system memory, the index is a software artifact.
Does that answer your question?
Thanks,
tglx
next prev parent reply other threads:[~2021-12-10 12:13 UTC|newest]
Thread overview: 184+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-27 1:22 [patch 00/32] genirq/msi, PCI/MSI: Spring cleaning - Part 2 Thomas Gleixner
2021-11-27 1:22 ` [patch 01/32] genirq/msi: Move descriptor list to struct msi_device_data Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 12:19 ` Greg Kroah-Hartman
2021-11-27 1:22 ` [patch 02/32] genirq/msi: Add mutex for MSI list protection Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 03/32] genirq/msi: Provide msi_domain_alloc/free_irqs_descs_locked() Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 04/32] genirq/msi: Provide a set of advanced MSI accessors and iterators Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-28 1:00 ` Jason Gunthorpe
2021-11-28 19:22 ` Thomas Gleixner
2021-11-29 9:26 ` Thomas Gleixner
2021-11-29 14:01 ` Jason Gunthorpe
2021-11-29 14:46 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 05/32] genirq/msi: Provide msi_alloc_msi_desc() and a simple allocator Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 06/32] genirq/msi: Provide domain flags to allocate/free MSI descriptors automatically Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 07/32] genirq/msi: Count the allocated MSI descriptors Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 12:19 ` Greg Kroah-Hartman
2021-11-27 19:22 ` Thomas Gleixner
2021-11-27 19:45 ` Thomas Gleixner
2021-11-28 11:07 ` Greg Kroah-Hartman
2021-11-28 19:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 08/32] PCI/MSI: Protect MSI operations Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 09/32] PCI/MSI: Use msi_add_msi_desc() Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 10/32] PCI/MSI: Let core code free MSI descriptors Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 11/32] PCI/MSI: Use msi_on_each_desc() Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 12/32] x86/pci/xen: Use msi_for_each_desc() Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 13/32] xen/pcifront: Rework MSI handling Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 14/32] s390/pci: Rework MSI descriptor walk Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-29 10:31 ` Niklas Schnelle
2021-11-29 13:04 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 15/32] powerpc/4xx/hsta: Rework MSI handling Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 16/32] powerpc/cell/axon_msi: Convert to msi_on_each_desc() Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 17/32] powerpc/pasemi/msi: Convert to msi_on_each_dec() Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 18/32] powerpc/fsl_msi: Use msi_for_each_desc() Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 19/32] powerpc/mpic_u3msi: Use msi_for_each-desc() Thomas Gleixner
2021-11-27 1:23 ` Thomas Gleixner
2021-11-27 1:22 ` [patch 20/32] PCI: hv: Rework MSI handling Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 1:23 ` [patch 21/32] NTB/msi: Convert to msi_on_each_desc() Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-29 18:21 ` Logan Gunthorpe
2021-11-29 20:51 ` Thomas Gleixner
2021-11-29 22:27 ` Logan Gunthorpe
2021-11-29 22:50 ` Dave Jiang
2021-11-29 23:31 ` Jason Gunthorpe
2021-11-29 23:52 ` Logan Gunthorpe
2021-11-30 0:01 ` Jason Gunthorpe
2021-11-30 0:29 ` Thomas Gleixner
2021-11-30 19:21 ` Logan Gunthorpe
2021-11-30 19:48 ` Thomas Gleixner
2021-11-30 20:14 ` Logan Gunthorpe
2021-11-30 20:28 ` Jason Gunthorpe
2021-11-30 21:23 ` Thomas Gleixner
2021-12-01 0:17 ` Jason Gunthorpe
2021-12-01 10:16 ` Thomas Gleixner
2021-12-01 13:00 ` Jason Gunthorpe
2021-12-01 17:35 ` Thomas Gleixner
2021-12-01 18:14 ` Jason Gunthorpe
2021-12-01 18:46 ` Logan Gunthorpe
2021-12-01 20:21 ` Thomas Gleixner
2021-12-02 0:01 ` Thomas Gleixner
2021-12-02 13:55 ` Jason Gunthorpe
2021-12-02 14:23 ` Greg Kroah-Hartman
2021-12-02 14:45 ` Jason Gunthorpe
2021-12-02 19:25 ` Thomas Gleixner
2021-12-02 20:00 ` Jason Gunthorpe
2021-12-02 22:31 ` Thomas Gleixner
2021-12-03 0:37 ` Jason Gunthorpe
2021-12-03 15:07 ` Thomas Gleixner
2021-12-03 16:41 ` Jason Gunthorpe
2021-12-04 14:20 ` Thomas Gleixner
2021-12-05 14:16 ` Thomas Gleixner
2021-12-06 14:43 ` Jason Gunthorpe
2021-12-06 15:47 ` Thomas Gleixner
2021-12-06 17:00 ` Jason Gunthorpe
2021-12-06 20:28 ` Thomas Gleixner
2021-12-06 21:06 ` Jason Gunthorpe
2021-12-06 22:21 ` Thomas Gleixner
2021-12-06 14:19 ` Jason Gunthorpe
2021-12-06 15:06 ` Thomas Gleixner
2021-12-09 6:26 ` Tian, Kevin
2021-12-09 9:03 ` Thomas Gleixner
2021-12-09 12:17 ` Tian, Kevin
2021-12-09 15:57 ` Thomas Gleixner
2021-12-10 7:37 ` Tian, Kevin
2021-12-09 5:41 ` Tian, Kevin
2021-12-09 5:47 ` Jason Wang
2021-12-01 16:28 ` Dave Jiang
2021-12-01 18:41 ` Thomas Gleixner
2021-12-01 18:47 ` Dave Jiang
2021-12-01 20:25 ` Thomas Gleixner
2021-12-01 21:21 ` Dave Jiang
2021-12-01 21:44 ` Thomas Gleixner
2021-12-01 21:49 ` Dave Jiang
2021-12-01 22:03 ` Thomas Gleixner
2021-12-01 22:53 ` Dave Jiang
2021-12-01 23:57 ` Thomas Gleixner
2021-12-09 5:23 ` Tian, Kevin
2021-12-09 8:37 ` Thomas Gleixner
2021-12-09 12:31 ` Tian, Kevin
2021-12-09 16:21 ` Jason Gunthorpe
2021-12-09 20:32 ` Thomas Gleixner
2021-12-09 20:58 ` Jason Gunthorpe
2021-12-09 22:09 ` Thomas Gleixner
2021-12-10 0:26 ` Thomas Gleixner
2021-12-10 7:29 ` Tian, Kevin
2021-12-10 12:13 ` Thomas Gleixner [this message]
2021-12-11 8:06 ` Tian, Kevin
2021-12-10 12:39 ` Jason Gunthorpe
2021-12-10 19:00 ` Thomas Gleixner
2021-12-11 7:44 ` Tian, Kevin
2021-12-11 13:04 ` Thomas Gleixner
2021-12-12 1:56 ` Tian, Kevin
2021-12-12 20:55 ` Thomas Gleixner
2021-12-12 23:37 ` Jason Gunthorpe
2021-12-13 7:50 ` Tian, Kevin
2022-09-15 9:24 ` Tian, Kevin
2022-09-20 14:09 ` Jason Gunthorpe
2022-09-21 7:57 ` Tian, Kevin
2022-09-21 12:48 ` Jason Gunthorpe
2022-09-22 5:11 ` Tian, Kevin
2022-09-22 12:13 ` Jason Gunthorpe
2022-09-22 22:42 ` Tian, Kevin
2022-09-23 13:26 ` Jason Gunthorpe
2021-12-11 7:52 ` Tian, Kevin
2021-12-12 0:12 ` Thomas Gleixner
2021-12-12 2:14 ` Tian, Kevin
2021-12-12 20:50 ` Thomas Gleixner
2021-12-12 23:42 ` Jason Gunthorpe
2021-12-10 7:36 ` Tian, Kevin
2021-12-10 12:30 ` Jason Gunthorpe
2021-12-12 6:44 ` Mika Penttilä
2021-12-12 23:27 ` Jason Gunthorpe
2021-12-01 14:52 ` Thomas Gleixner
2021-12-01 15:11 ` Jason Gunthorpe
2021-12-01 18:37 ` Thomas Gleixner
2021-12-01 18:47 ` Jason Gunthorpe
2021-12-01 20:26 ` Thomas Gleixner
2022-12-05 18:25 ` [tip: irq/core] PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X tip-bot2 for Thomas Gleixner
2022-12-05 21:41 ` tip-bot2 for Thomas Gleixner
2021-11-27 1:23 ` [patch 22/32] soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 1:23 ` [patch 23/32] soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs() Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 1:23 ` [patch 24/32] bus: fsl-mc-msi: Simplify MSI descriptor handling Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 1:23 ` [patch 25/32] platform-msi: Let core code handle MSI descriptors Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 1:23 ` [patch 26/32] platform-msi: Simplify platform device MSI code Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 1:23 ` [patch 27/32] genirq/msi: Make interrupt allocation less convoluted Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 1:23 ` [patch 28/32] genirq/msi: Convert to new functions Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 1:23 ` [patch 29/32] genirq/msi: Mop up old interfaces Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 1:23 ` [patch 30/32] genirq/msi: Add abuse prevention comment to msi header Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 1:23 ` [patch 31/32] genirq/msi: Simplify sysfs handling Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 12:32 ` Greg Kroah-Hartman
2021-11-27 19:31 ` Thomas Gleixner
2021-11-28 11:07 ` Greg Kroah-Hartman
2021-11-28 19:33 ` Thomas Gleixner
2021-11-27 1:23 ` [patch 32/32] genirq/msi: Convert storage to xarray Thomas Gleixner
2021-11-27 1:24 ` Thomas Gleixner
2021-11-27 12:33 ` Greg Kroah-Hartman
2021-11-27 1:23 ` [patch 00/32] genirq/msi, PCI/MSI: Spring cleaning - Part 2 Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87lf0sy7xd.ffs@tglx \
--to=tglx@linutronix.de \
--cc=alex.williamson@redhat.com \
--cc=allenbh@gmail.com \
--cc=ashok.raj@intel.com \
--cc=borntraeger@de.ibm.com \
--cc=dave.jiang@intel.com \
--cc=gregkh@linuxfoundation.org \
--cc=hca@linux.ibm.com \
--cc=helgaas@kernel.org \
--cc=iommu@lists.linux-foundation.org \
--cc=jdmason@kudzu.us \
--cc=jgg@nvidia.com \
--cc=jroedel@suse.de \
--cc=kevin.tian@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-ntb@googlegroups.com \
--cc=linux-pci@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=logang@deltatee.com \
--cc=maz@kernel.org \
--cc=megha.dey@intel.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).