linux-kernel.vger.kernel.org archive mirror
* Virtio interrupt remapping
@ 2025-06-13 17:08 Demi Marie Obenour
  2025-06-13 18:13 ` Jean-Philippe Brucker
  0 siblings, 1 reply; 5+ messages in thread
From: Demi Marie Obenour @ 2025-06-13 17:08 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Joerg Roedel, Will Deacon, Robin Murphy,
	virtualization, iommu, linux-kernel, devel, Alyssa Ross



I’m working on virtio-IOMMU interrupt remapping for Spectrum OS [1],
and am running into a problem.  All of the current interrupt remapping
drivers use __init code during initialization, and I’m not sure how to
plumb the struct virtio_device * into the IOMMU initialization code.

What is the proper way to do this, where “proper” means that it doesn’t
do something disgusting like “stuff the virtio device in a global
variable”?
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)



* Re: Virtio interrupt remapping
  2025-06-13 17:08 Virtio interrupt remapping Demi Marie Obenour
@ 2025-06-13 18:13 ` Jean-Philippe Brucker
  2025-06-13 18:50   ` Demi Marie Obenour
  0 siblings, 1 reply; 5+ messages in thread
From: Jean-Philippe Brucker @ 2025-06-13 18:13 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, virtualization, iommu,
	linux-kernel, devel, Alyssa Ross

Hi,

On Fri, Jun 13, 2025 at 01:08:07PM -0400, Demi Marie Obenour wrote:
> I’m working on virtio-IOMMU interrupt remapping for Spectrum OS [1],
> and am running into a problem.  All of the current interrupt remapping
> drivers use __init code during initialization, and I’m not sure how to
> plumb the struct virtio_device * into the IOMMU initialization code.
> 
> What is the proper way to do this, where “proper” means that it doesn’t
> do something disgusting like “stuff the virtio device in a global
> variable”?

I'm not familiar at all with interrupt remapping, but I suspect a major
hurdle will be device probing order: the PCI subsystem probes the
virtio-pci transport device relatively late during boot, and the virtio
driver probes the virtio-iommu device afterwards, at which point we can
call viommu_probe() and inspect the device features and config.  This can
be quite late in userspace if virtio and virtio-iommu get loaded as
modules (which distros tend to do).

The way we know to hold off initializing dependent devices before the
IOMMU is ready is by reading the firmware tables. In devicetree the
"msi-parent" and "msi-map" properties point to the interrupt remapping
device, so by reading those Linux knows to wait for the probe of the
remapping device before setting up those endpoints. The ACPI VIOT
describes this topology as well, although at the moment it does not have
separate graphs for MMU and interrupts, like devicetree does (could
probably be added to the spec if needed, but I'm guessing the topologies
may be the same for a VM).  If the interrupt infrastructure supports
probe deferral, then that's probably the way to go.
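
To make the "msi-map" lookup concrete, here is a minimal userspace sketch
in C of how a PCI requester ID could be resolved to the MSI controller (or
remapping device) that serves it. It follows the (rid-base, controller,
msi-base, length) entry layout of the devicetree "msi-map" binding. The
names msi_map_entry and msi_map_lookup are made up for illustration and are
not kernel APIs; the kernel's own lookup code is more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One "msi-map" entry: (rid-base, controller phandle, msi-base, length). */
struct msi_map_entry {
    uint32_t rid_base;    /* first requester ID covered by this entry */
    uint32_t controller;  /* phandle of the MSI controller / remapping device */
    uint32_t msi_base;    /* MSI specifier that rid_base maps to */
    uint32_t length;      /* number of consecutive requester IDs covered */
};

/*
 * Hypothetical helper: resolve a requester ID to (controller, msi_id).
 * 'mask' plays the role of the optional "msi-map-mask" property.
 * Returns 0 on a match, -1 if no entry covers the requester ID.
 */
static int msi_map_lookup(const struct msi_map_entry *map, size_t n,
                          uint32_t mask, uint32_t rid,
                          uint32_t *controller, uint32_t *msi_id)
{
    uint32_t masked = rid & mask;

    assert(map != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (masked < map[i].rid_base)
            continue;
        if (masked - map[i].rid_base >= map[i].length)
            continue;
        *controller = map[i].controller;
        *msi_id = masked - map[i].rid_base + map[i].msi_base;
        return 0;
    }
    return -1;
}
```

An endpoint whose requester ID matches an entry depends on that entry's
controller, which is the information Linux needs in order to hold the
endpoint back until the controller has probed.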

Thanks,
Jean



* Re: Virtio interrupt remapping
  2025-06-13 18:13 ` Jean-Philippe Brucker
@ 2025-06-13 18:50   ` Demi Marie Obenour
  2025-06-14  8:11     ` Alyssa Ross
  0 siblings, 1 reply; 5+ messages in thread
From: Demi Marie Obenour @ 2025-06-13 18:50 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, virtualization, iommu,
	linux-kernel, devel, Alyssa Ross, Thomas Gleixner, Bjorn Helgaas,
	linux-pci



On 6/13/25 14:13, Jean-Philippe Brucker wrote:
> Hi,
> 
> On Fri, Jun 13, 2025 at 01:08:07PM -0400, Demi Marie Obenour wrote:
>> I’m working on virtio-IOMMU interrupt remapping for Spectrum OS [1],
>> and am running into a problem.  All of the current interrupt remapping
>> drivers use __init code during initialization, and I’m not sure how to
>> plumb the struct virtio_device * into the IOMMU initialization code.
>>
>> What is the proper way to do this, where “proper” means that it doesn’t
>> do something disgusting like “stuff the virtio device in a global
>> variable”?
> 
> I'm not familiar at all with interrupt remapping, but I suspect a major
> hurdle will be device probing order: the PCI subsystem probes the
> virtio-pci transport device relatively late during boot, and the virtio
> driver probes the virtio-iommu device afterwards, at which point we can
> call viommu_probe() and inspect the device features and config.  This can
> be quite late in userspace if virtio and virtio-iommu get loaded as
> modules (which distros tend to do).
>
> The way we know to hold off initializing dependent devices before the
> IOMMU is ready is by reading the firmware tables. In devicetree the
> "msi-parent" and "msi-map" properties point to the interrupt remapping
> device, so by reading those Linux knows to wait for the probe of the
> remapping device before setting up those endpoints. The ACPI VIOT
> describes this topology as well, although at the moment it does not have
> separate graphs for MMU and interrupts, like devicetree does (could
> probably be added to the spec if needed, but I'm guessing the topologies
> may be the same for a VM).  If the interrupt infrastructure supports
> probe deferral, then that's probably the way to go.

I don't see any examples of probe deferral in the codebase.  Would it
instead be possible to require virtio-iommu (and thus virtio) to be
built-in rather than modules?

CCing the IRQ and PCI maintainers as well.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)


* Re: Virtio interrupt remapping
  2025-06-13 18:50   ` Demi Marie Obenour
@ 2025-06-14  8:11     ` Alyssa Ross
  2025-06-16 16:07       ` Jean-Philippe Brucker
  0 siblings, 1 reply; 5+ messages in thread
From: Alyssa Ross @ 2025-06-14  8:11 UTC (permalink / raw)
  To: Demi Marie Obenour, Jean-Philippe Brucker
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, virtualization, iommu,
	linux-kernel, devel, Thomas Gleixner, Bjorn Helgaas, linux-pci


Demi Marie Obenour <demiobenour@gmail.com> writes:

> On 6/13/25 14:13, Jean-Philippe Brucker wrote:
>> Hi,
>> 
>> On Fri, Jun 13, 2025 at 01:08:07PM -0400, Demi Marie Obenour wrote:
>>> I’m working on virtio-IOMMU interrupt remapping for Spectrum OS [1],
>>> and am running into a problem.  All of the current interrupt remapping
>>> drivers use __init code during initialization, and I’m not sure how to
>>> plumb the struct virtio_device * into the IOMMU initialization code.
>>>
>>> What is the proper way to do this, where “proper” means that it doesn’t
>>> do something disgusting like “stuff the virtio device in a global
>>> variable”?
>> 
>> I'm not familiar at all with interrupt remapping, but I suspect a major
>> hurdle will be device probing order: the PCI subsystem probes the
>> virtio-pci transport device relatively late during boot, and the virtio
>> driver probes the virtio-iommu device afterwards, at which point we can
>> call viommu_probe() and inspect the device features and config.  This can
>> be quite late in userspace if virtio and virtio-iommu get loaded as
>> modules (which distros tend to do).
>>
>> The way we know to hold off initializing dependent devices before the
>> IOMMU is ready is by reading the firmware tables. In devicetree the
>> "msi-parent" and "msi-map" properties point to the interrupt remapping
>> device, so by reading those Linux knows to wait for the probe of the
>> remapping device before setting up those endpoints. The ACPI VIOT
>> describes this topology as well, although at the moment it does not have
>> separate graphs for MMU and interrupts, like devicetree does (could
>> probably be added to the spec if needed, but I'm guessing the topologies
>> may be the same for a VM).  If the interrupt infrastructure supports
>> probe deferral, then that's probably the way to go.
>
> I don't see any examples of probe deferral in the codebase.  Would it
> instead be possible to require virtio-iommu (and thus virtio) to be
> built-in rather than modules?

It's certainly possible to have an optional feature in the kernel that
depends on a module being built in where it otherwise wouldn't have to be.


* Re: Virtio interrupt remapping
  2025-06-14  8:11     ` Alyssa Ross
@ 2025-06-16 16:07       ` Jean-Philippe Brucker
  0 siblings, 0 replies; 5+ messages in thread
From: Jean-Philippe Brucker @ 2025-06-16 16:07 UTC (permalink / raw)
  To: Alyssa Ross
  Cc: Demi Marie Obenour, Joerg Roedel, Will Deacon, Robin Murphy,
	virtualization, iommu, linux-kernel, devel, Thomas Gleixner,
	Bjorn Helgaas, linux-pci, eric.auger

[+Eric]

On Sat, Jun 14, 2025 at 10:11:52AM +0200, Alyssa Ross wrote:
> Demi Marie Obenour <demiobenour@gmail.com> writes:
> 
> > On 6/13/25 14:13, Jean-Philippe Brucker wrote:
> >> Hi,
> >> 
> >> On Fri, Jun 13, 2025 at 01:08:07PM -0400, Demi Marie Obenour wrote:
> >>> I’m working on virtio-IOMMU interrupt remapping for Spectrum OS [1],
> >>> and am running into a problem.  All of the current interrupt remapping
> >>> drivers use __init code during initialization, and I’m not sure how to
> >>> plumb the struct virtio_device * into the IOMMU initialization code.
> >>>
> >>> What is the proper way to do this, where “proper” means that it doesn’t
> >>> do something disgusting like “stuff the virtio device in a global
> >>> variable”?
> >> 
> >> I'm not familiar at all with interrupt remapping, but I suspect a major
> >> hurdle will be device probing order: the PCI subsystem probes the
> >> virtio-pci transport device relatively late during boot, and the virtio
> >> driver probes the virtio-iommu device afterwards, at which point we can
> >> call viommu_probe() and inspect the device features and config.  This can
> >> be quite late in userspace if virtio and virtio-iommu get loaded as
> >> modules (which distros tend to do).
> >>
> >> The way we know to hold off initializing dependent devices before the
> >> IOMMU is ready is by reading the firmware tables. In devicetree the
> >> "msi-parent" and "msi-map" properties point to the interrupt remapping
> >> device, so by reading those Linux knows to wait for the probe of the
> >> remapping device before setting up those endpoints. The ACPI VIOT
> >> describes this topology as well, although at the moment it does not have
> >> separate graphs for MMU and interrupts, like devicetree does (could
> >> probably be added to the spec if needed, but I'm guessing the topologies
> >> may be the same for a VM).  If the interrupt infrastructure supports
> >> probe deferral, then that's probably the way to go.
> >
> > I don't see any examples of probe deferral in the codebase.

I think the flow with VIOT is roughly:

 // Scan an endpoint
 pci_bus_add_device()
  device_attach()
   driver_probe_device()
    really_probe()
     dev->bus->dma_configure()
      pci_dma_configure()
       acpi_dma_configure()
        acpi_iommu_configure_id()
         viot_iommu_configure()
          viot_dev_iommu_init()
           acpi_iommu_fwspec_init()
            iommu_fwspec_init()
             driver_deferred_probe_check_state()
             iommu ready ? 0 : -EPROBE_DEFER

So if the IOMMU isn't available at this point, base/dd.c adds the device
to the deferred probe list, to be retried later. Later:

 // Scan the IOMMU
 ...
  virtio_dev_probe()
   viommu_probe()
    iommu_device_register()
     add to iommu_device_list
     iommu->ready = true

I believe after this probe completes, base/dd.c checks if dependent
devices can now be probed:

 driver_bound()
  driver_deferred_probe_trigger()

That should all be working and you don't need to add anything. The
question is whether the interrupt core supports starting the setup of
interrupt remapping in viommu_probe(), or if it needs to know of the
IOMMU's config and features earlier during boot. Even if the viommu driver
is built in, that information may not be available early enough, since it
can only be read once the PCI and virtio probes have run.
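
The deferral flow above can be mimicked in a few lines of userspace C. This
is only a sketch under simplifying assumptions: every sim_* name is
invented for illustration, the retry happens synchronously rather than from
a workqueue as in base/dd.c, and a single "iommu_ready" flag stands in for
the real IOMMU registration state. EPROBE_DEFER uses the same numeric value
as the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EPROBE_DEFER 517   /* same value as the kernel-internal errno */
#define SIM_MAX_DEVS 8

struct sim_dev {
    const char *name;
    bool is_iommu;      /* this device is the simulated virtio-iommu */
    bool needs_iommu;   /* firmware tables place it behind the IOMMU */
    bool bound;         /* probe has succeeded */
    bool deferred;      /* currently on the deferred probe list */
};

struct sim_bus {
    struct sim_dev *devs[SIM_MAX_DEVS];
    size_t ndevs;
    bool iommu_ready;   /* stands in for "iommu->ready" */
};

/* Roughly really_probe(): defer an endpoint until its IOMMU has probed. */
static int sim_probe(struct sim_bus *bus, struct sim_dev *dev)
{
    if (dev->needs_iommu && !bus->iommu_ready)
        return -EPROBE_DEFER;
    if (dev->is_iommu)
        bus->iommu_ready = true;   /* like iommu_device_register() */
    dev->bound = true;
    return 0;
}

/* Roughly driver_deferred_probe_trigger(): retry every deferred device. */
static void sim_deferred_trigger(struct sim_bus *bus)
{
    for (size_t i = 0; i < bus->ndevs; i++) {
        struct sim_dev *dev = bus->devs[i];

        if (!dev->deferred)
            continue;
        dev->deferred = (sim_probe(bus, dev) == -EPROBE_DEFER);
    }
}

/* Roughly device_attach(): probe now, or park the device for later. */
static int sim_attach(struct sim_bus *bus, struct sim_dev *dev)
{
    int ret;

    assert(bus->ndevs < SIM_MAX_DEVS);
    bus->devs[bus->ndevs++] = dev;

    ret = sim_probe(bus, dev);
    if (ret == -EPROBE_DEFER)
        dev->deferred = true;
    else
        sim_deferred_trigger(bus);   /* like driver_bound() */
    return ret;
}
```

Attaching an endpoint with needs_iommu set before the IOMMU returns
-EPROBE_DEFER and parks it; attaching the IOMMU afterwards binds both. What
the sketch cannot show is the open question here: whether interrupt
remapping setup can tolerate starting that late.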

> > Would it instead be possible to require virtio-iommu (and thus virtio)
> > to be built-in rather than modules?
> 
> It's certainly possible to have an optional feature in the kernel that
> depends on a module being built in where it otherwise wouldn't have to be.

Agreed, no problem requiring this as a first step, but the IOMMU probe
might still not be early enough.

Thanks,
Jean
