Re: [PATCH 1/2] hw/i386/amd_iommu: Fix device setup failure when PT is on.

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
To: Sairaj Kodilkar <sarunkod@amd.com>, qemu-devel@nongnu.org
Cc: suravee.suthikulpanit@amd.com, joao.m.martins@oracle.com,
	philmd@linaro.org, vasant.hegde@amd.com
Subject: Re: [PATCH 1/2] hw/i386/amd_iommu: Fix device setup failure when PT is on.
Date: Tue, 15 Apr 2025 14:28:24 -0400	[thread overview]
Message-ID: <c0a214d8-f9d4-4fa3-8262-88cdd4372878@oracle.com> (raw)
In-Reply-To: <63acddc3-bc7b-47e4-9e7c-66bdc40f23d2@amd.com>



On 4/15/25 2:38 AM, Sairaj Kodilkar wrote:
> 
> 
> Hi Alejandro,
> 
> On 4/15/2025 1:56 AM, Alejandro Jimenez wrote:
> 
>> Hi Sairaj,
>>
>> I'm conflicted by the implementation of the change, so I'd like to 
>> make sure I fully understand...
>>
>> On 4/10/25 2:44 AM, Sairaj Kodilkar wrote:

>>> Fix the issue by removing pt_supported check and disabling nodma memory
>>> region. Adding pt_supported requires additional changes and we will look
>>> into it later.
>>
>> I see that you are trying to essentially block a guest from enabling 
>> an IOMMU feature that is not currently supported by the vIOMMU. 
>> Hopefully that limitation will be solved soon (shameless plug):
>> https://lore.kernel.org/qemu-devel/20250414020253.443831-1- 
>> alejandro.j.jimenez@oracle.com/
>>
>> But in the meantime, I think enabling amdvi_dev_as->iommu when DMA 
>> remapping capability is not available is likely to cause more 
>> confusion for anyone trying to understand the already convoluted 
>> details of the memory region setup. 
> 
>> To a reader of the code and the commit message, it is confusing that 
>> to support the "NO DMA" case, the nodma memory region must be 
>> disabled, which is the opposite of what it is meant to do.
>>
> 
> I dont think that I understand above statement. What do you mean by "NO
> DMA" case here ? is it iommu.passthrough=0 ?

I meant it from the point of view of the vIOMMU configuration (since we 
don't control what the guest can request). So in terms of vIOMMU modes 
and corresponding Memory Regions that must be enabled to support such 
modes, I see it as:

Passthrough(NO DMA) --> MR: iommu_nodma: enabled && iommu: disabled

DMA remap --> MR: iommu: enabled && iommu_nodma: disabled

But I recognize that view/model is probably too rigid for now, although 
it should be the "correct state" once we support DMA remapping.

> 
> Essentially, I am trying to support the "DMA" case that is
> iommu.passthrough=0 for the emulated devices, by reverting the changes> that introduced the regression.

I understand the goal is to make emulated devs to work in more scenarios.

Because of that view that I mention above, is why I don't think of 
c1f46999ef506 ("amd_iommu: Add support for pass though mode") as 
introducing a regression, but more of a prerequisite to support both PT 
and DMA modes.

> 
> If I understand correct -->
> The original intent of the flag (in case of Intel) is
> 
> 1. To turn on the optimization which will use nodma region (dynamically
>     enabling it) if guest configures the device with passthrough (pt=1)
>     for given context entry.

This is why I said I am conflicted with the implementation. Your change 
always disables the iommu_nodma region, where the default for Linux 
guests is to use passthrough mode, which "normally" would result in 
iommu_nodma being enabled. I almost suggested on my first reply that you 
instead forced x86_iommu->pt_supported = 0 in the AMDVi code, but that 
creates a similar type of contradiction.

In short, I understand what you are trying to do, but I think "the 
trick" as I called it below should probably be documented.

> 
> 2. The flag should not enable no_dma region if guest does not configure
>     device with pt.
> 
> Intel driver does this dynamically (for every context entry update while
> guest is running). But for AMD this is static and does not change with
> the DTE updates, which is causing this regression.

hopefully solved soon:
https://lore.kernel.org/qemu-devel/20250414020253.443831-15-alejandro.j.jimenez@oracle.com/

Alejandro

> 
>> To explain the "trick": this change is always enabling amdvi_dev_as- 
>>  >iommu, which is explicitly created as an IOMMU memory region (i.e. a 
>> memory region with mr->is_iommu == true), and it is meant to support 
>> DMA remapping. It is relying on the "side effect" that VFIO will try 
>> to register notifiers for memory regions that are an "IOMMU" (i.e. 
>> pass the check in memory_region_is_iommu()), and later fail when 
>> trying to register the notifier.
>>
>> If this change is merged, I think you should at least include the 
>> explanation above in the commit message, since it is not obvious to me 
>> at first reading. That being said, in my opinion, this approach adds 
>> potential confusion that is not worth the trouble, since most guests 
>> will not be using AMD vIOMMU at this point. And if they are, they 
>> would also have to be specifically requesting to enable DMA 
>> translation to hit the problem. Unfortunately, guests will always have 
>> the ability of specifying an invalid configuration if they try really 
>> hard (or not hard at all in this case).
>>
> 
> Yep, I should have explained it in details. Sorry about the confusion
> will keep in mind while sending future patches.
> 
> Regards
> Sairaj
> 
>> Alejandro

next prev parent reply	other threads:[~2025-04-15 18:29 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-10  6:44 [PATCH 0/2] amd_iommu: Fixes Sairaj Kodilkar
2025-04-10  6:44 ` [PATCH 1/2] hw/i386/amd_iommu: Fix device setup failure when PT is on Sairaj Kodilkar
2025-04-10  8:01   ` Vasant Hegde
2025-04-14 20:26   ` Alejandro Jimenez
2025-04-15  6:38     ` Sairaj Kodilkar
2025-04-15 18:28       ` Alejandro Jimenez [this message]
2025-04-17 10:21         ` Sairaj Kodilkar
2025-04-10  6:44 ` [PATCH 2/2] hw/i386/amd_iommu: Fix xtsup when vcpus < 255 Sairaj Kodilkar
2025-04-10  9:16   ` Joao Martins
2025-04-11  0:56   ` Alejandro Jimenez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c0a214d8-f9d4-4fa3-8262-88cdd4372878@oracle.com \
    --to=alejandro.j.jimenez@oracle.com \
    --cc=joao.m.martins@oracle.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=sarunkod@amd.com \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=vasant.hegde@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).