public inbox for linux-hyperv@vger.kernel.org
From: Easwar Hariharan <easwar.hariharan@linux.microsoft.com>
To: Michael Kelley <mhklinux@outlook.com>
Cc: easwar.hariharan@linux.microsoft.com,
	Yu Zhang <zhangyu1@linux.microsoft.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"kys@microsoft.com" <kys@microsoft.com>,
	"haiyangz@microsoft.com" <haiyangz@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	"decui@microsoft.com" <decui@microsoft.com>,
	"lpieralisi@kernel.org" <lpieralisi@kernel.org>,
	"kwilczynski@kernel.org" <kwilczynski@kernel.org>,
	"mani@kernel.org" <mani@kernel.org>,
	"robh@kernel.org" <robh@kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"arnd@arndb.de" <arnd@arndb.de>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"will@kernel.org" <will@kernel.org>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"jacob.pan@linux.microsoft.com" <jacob.pan@linux.microsoft.com>,
	"nunodasneves@linux.microsoft.com"
	<nunodasneves@linux.microsoft.com>,
	"mrathor@linux.microsoft.com" <mrathor@linux.microsoft.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Subject: Re: [RFC v1 1/5] PCI: hv: Create and export hv_build_logical_dev_id()
Date: Tue, 14 Apr 2026 12:43:48 -0700
Message-ID: <c1c86f2d-3535-45bb-86dd-90c3b05c16fb@linux.microsoft.com>
In-Reply-To: <SN6PR02MB415741EA3C27630199BCEAA3D4252@SN6PR02MB4157.namprd02.prod.outlook.com>

On 4/14/2026 11:06 AM, Michael Kelley wrote:
> From: Easwar Hariharan <easwar.hariharan@linux.microsoft.com> Sent: Tuesday, April 14, 2026 10:42 AM
>>
> 
> [snip]
>  
>>>> Thanks for that explanation, that makes sense. I didn't see any serialization
>>>> that would ensure that the VMBus path to communicate the child devices on the bus
>>>> would complete before pci_scan_device() finds and finalizes the pci_dev. I think it's
>>>
>>> FWIW, hv_pci_query_relations() should be ensuring that the communication
>>> has completed before it returns. It does a wait_for_response(), which ensures
>>> that the Hyper-V host has sent the PCI_BUS_RELATIONS[2] response. However,
>>> that message spins off work to the hbus->wq workqueue, so
>>> hv_pci_query_relations() has a flush_workqueue() to ensure everything that
>>> was queued has completed.
>>
>> Hm, I read the comment for the flush_workqueue() as addressing the "PCI_BUS_RELATIONS[2]
>> message arrived before we sent the QUERY_BUS_RELATIONS message" race case, not as an
>> "all child devices have definitely been received and processed in response to our
>> QUERY_BUS_RELATIONS message". Also, knowing very little about the VMBus contract, I
>> discounted the 100 ms timeout in wait_for_response() as a serialization guarantee.
> 
> Yeah, that timeout is so that the code can wake up every 100 ms to check
> if the device has been rescinded (i.e., removed). If the device isn't
> rescinded, wait_for_response() waits forever until a response comes in.

I don't know how I missed that. :(
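
For anyone else reading along, the loop shape described above can be modelled
in userspace roughly as follows. This is only a sketch: the struct, the
tick-based wait helper, and the -ENODEV return on rescind are assumptions made
for illustration and are not the pci-hyperv code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the channel state; not the kernel's struct. */
struct model_channel {
	bool rescind;       /* set once the host has removed the device */
	int response_tick;  /* 100 ms slice in which the response arrives, <0 = never */
	int rescind_tick;   /* 100 ms slice in which the device is rescinded, <0 = never */
};

/* Models one 100 ms timed wait: true if the response has arrived. */
static bool model_wait_100ms(const struct model_channel *ch, int tick)
{
	return ch->response_tick >= 0 && tick >= ch->response_tick;
}

/*
 * Check for rescind, then sleep for up to 100 ms, and repeat. The timeout
 * bounds a single sleep only, not the total wait. *ticks reports how many
 * 100 ms slices elapsed before the loop exited.
 */
static int model_wait_for_response(struct model_channel *ch, int *ticks)
{
	assert(ch != NULL && ticks != NULL);
	/* Guard so the model terminates; the wait in the thread has no such bound. */
	assert(ch->response_tick >= 0 || ch->rescind_tick >= 0);

	for (int tick = 0; ; tick++) {
		if (ch->rescind_tick >= 0 && tick >= ch->rescind_tick)
			ch->rescind = true;
		if (ch->rescind) {
			*ticks = tick;
			return -ENODEV;  /* assumed error code for a removed device */
		}
		if (model_wait_100ms(ch, tick)) {
			*ticks = tick;
			return 0;
		}
	}
}
```

With response_tick = 25 and no rescind, the model returns 0 after 25 slices,
i.e. well past a single 100 ms timeout, which is the behaviour I had missed.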

>>
>> Chalk it up to previous experience dealing with hardware that's *supposed* to be
>> spec-compliant and complete initialization within specified timings. :)
>>
>> I see now that the flush is sufficient though.
>>
>>>
>>> Thinking more about the "hv_pcibus_installed" case, if that path is ever
>>> triggered, I don't think anything needs to be done with the logical device ID.
>>> The vPCI device has already been fully initialized on the Linux side, and it's
>>> The vPCI device has already been fully initialized on the Linux side, and its
>>>
>>> So I think you could construct the full logical device ID once
>>> hv_pci_query_relations() returns to hv_pci_probe().
>>
>> Let me think about this more and decide between the logical ID and full bus GUID
>> options.
>>
>>>
>>>> safest to take the approach to communicate the GUID, and find the function number from
>>>> the pci_dev. This does mean that there will be an essentially identical copy of
>>>> hv_build_logical_dev_id() in the IOMMU code, but a comment can explain that.
>>>
>>> With this alternative approach, is there a need to communicate the full
>>> GUID to the pvIOMMU driver? Couldn't you just communicate bytes 4 thru
>>> 7, which would be logical device ID minus the function number?
>>
>> Yes, we could just communicate bytes 4 through 7 but the pvIOMMU version of the build logical
>> ID function would diverge from the pci-hyperv version. I figured if we say (in a comment)
>> that this is the same ID as generated in pci-hyperv, it's better for future readers to see it
>> to be clearly identical at first glance.
>>
>> It's also possible to change the pci-hyperv function to only take bytes 4 through 7 instead of the
>> full GUID, but I rather think we don't need that impedance mismatch of bytes 4 through 7 of the
>> GUID becoming bytes 0 through 3 of a u32.
>>
>>>

<snip>

>>
>>>>>>>
>>>>>>> I don't think the pci-hyperv driver even needs to tell the IOMMU driver to
>>>>>>> remove the information if a PCI pass-thru device is unbound or removed, as
>>>>>>> the logical device ID will be the same if the device ever comes back. At worst,
>>>>>>> the IOMMU driver can simply replace an existing logical device ID if a new one
>>>>>>> is provided for the same PCI domain ID.
>>>>>>
>>>>>> As above, replacing a unique GUID when a result is found for a non-unique
>>>>>> key value may be prone to failure if it happens that the device that came "back"
>>>>>> is not in fact the same device (or class of device) that went away and just happens
>>>>>> to, either due to bytes 4 and 5 being identical, or due to collision in the
>>>>>> pci_domain_nr_dynamic_ida, have the same domain number.
>>>>
>>>> Given the vPCI team's statements (above), I think we will need to handle unbind or
>>>> removal and ensure the pvIOMMU driver's data structure is invalidated when either
>>>> happens.
>>>
>>> The generic PCI code should handle detaching from the pvIOMMU. So I'm assuming
>>> your statement is specifically about the mapping from domain ID to logical device ID.
>>
>> Yes, apologies for the vagueness (again).
>>
>>> I still think removing it may be unnecessary since adding a mapping for a new vPCI
>>> device with the same domain ID but different logical device ID could just overwrite
>>> any existing mapping. And leaving a dead mapping in the pvIOMMU data structures
>>> doesn’t actually hurt anything. On the other hand, removing/invalidating it is
>>> certainly more tidy and might prevent some confusion down the road.
>>>
>>
>> Yes, if the data structure maps domain -> logical ID, we can do the overwrite as you say.
>> With my approach of informing the pvIOMMU driver of the entire (bus) GUID, we would want
>> to be careful that we don't assume the 1:1 bus<->device case and overwrite an existing
>> device entry with a new device that's on the same bus.
> 
> Yes, that's a valid point.  I was assuming that the pvIOMMU would use the
> domain ID as the lookup key, since the domain ID is directly available from the
> struct pci_dev that is an input parameter to the IOMMU functions. But in the
> not 1:1 case, that domain ID might refer to a bus with multiple functions. The
> logical device IDs for those devices will be the same except for the low order
> 3 bits that encode the function number. So maybe the domain ID maps
> to a partial logical device ID, and the pvIOMMU driver must always add in the
> function number so the not 1:1 case works.

Agreed.
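
As a rough illustration of that shape (domain ID as the key, partial logical
ID as the value, function number added at lookup time), something like the
below. All names, the fixed-size table, and the lack of locking are
assumptions for the sketch and say nothing about what the pvIOMMU driver will
actually use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_BUSES 8  /* arbitrary size for the sketch */

struct model_entry {
	bool used;
	int domain;           /* PCI domain number, the lookup key */
	uint32_t partial_id;  /* logical device ID with the function bits clear */
};

static struct model_entry model_table[MODEL_MAX_BUSES];

/* Add a mapping, or overwrite a stale one for the same domain. */
static bool model_set_partial_id(int domain, uint32_t partial_id)
{
	struct model_entry *free_slot = NULL;

	assert((partial_id & 0x7) == 0);  /* low 3 bits are reserved for the function */

	for (size_t i = 0; i < MODEL_MAX_BUSES; i++) {
		struct model_entry *e = &model_table[i];

		if (e->used && e->domain == domain) {
			e->partial_id = partial_id;
			return true;
		}
		if (!e->used && !free_slot)
			free_slot = e;
	}
	if (!free_slot)
		return false;  /* table full */

	free_slot->used = true;
	free_slot->domain = domain;
	free_slot->partial_id = partial_id;
	return true;
}

/* Look up by domain and always add in the function number. */
static bool model_lookup_dev_id(int domain, unsigned int func, uint32_t *id)
{
	assert(func <= 7 && id != NULL);

	for (size_t i = 0; i < MODEL_MAX_BUSES; i++) {
		const struct model_entry *e = &model_table[i];

		if (e->used && e->domain == domain) {
			*id = e->partial_id | func;
			return true;
		}
	}
	return false;
}
```

Because the function number is never stored, two functions on the same bus
share one entry and cannot overwrite each other, and a new device that reuses
a domain number simply replaces the old partial ID.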

> 
> Would the pvIOMMU driver do anything with the full GUID, except extract
> bytes 4 through 7? There's no way I see to use the full GUID as the lookup
> key.
> 

No, the hypercalls only use the logical ID, so the rest of the GUID is unused.
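
To spell out what "only the logical ID" means in terms of GUID bytes, here is
a minimal sketch. The exact byte order and the 0xf8 mask are my assumptions
about the packing and should be checked against pci-hyperv; the function names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Partial ID: GUID bytes 4..7 packed into a u32 with the function bits clear. */
static uint32_t model_partial_dev_id(const uint8_t guid[16])
{
	return ((uint32_t)guid[5] << 24) |
	       ((uint32_t)guid[4] << 16) |
	       ((uint32_t)guid[7] << 8) |
	       ((uint32_t)guid[6] & 0xf8);
}

/* Full logical ID: partial ID plus the PCI function number in the low 3 bits. */
static uint32_t model_logical_dev_id(const uint8_t guid[16], unsigned int func)
{
	assert(func <= 7);
	return model_partial_dev_id(guid) | func;
}
```

Bytes 0 through 3 and 8 through 15 never enter the result, so two GUIDs that
differ only in those bytes give the same logical ID.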

Thanks,
Easwar (he/him)
