From: Sairaj Kodilkar <sarunkod@amd.com>
To: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>,
<qemu-devel@nongnu.org>
Cc: <suravee.suthikulpanit@amd.com>, <joao.m.martins@oracle.com>,
<philmd@linaro.org>, <vasant.hegde@amd.com>
Subject: Re: [PATCH 1/2] hw/i386/amd_iommu: Fix device setup failure when PT is on.
Date: Thu, 17 Apr 2025 15:51:57 +0530 [thread overview]
Message-ID: <610d0e59-4ca2-45a4-b22f-dbe0e9098200@amd.com> (raw)
In-Reply-To: <c0a214d8-f9d4-4fa3-8262-88cdd4372878@oracle.com>
On 4/15/2025 11:58 PM, Alejandro Jimenez wrote:
>
>
> On 4/15/25 2:38 AM, Sairaj Kodilkar wrote:
>>
>>
>> Hi Alejandro,
>>
>> On 4/15/2025 1:56 AM, Alejandro Jimenez wrote:
>>
>>> Hi Sairaj,
>>>
>>> I'm conflicted by the implementation of the change, so I'd like to
>>> make sure I fully understand...
>>>
>>> On 4/10/25 2:44 AM, Sairaj Kodilkar wrote:
>
>>>> Fix the issue by removing pt_supported check and disabling nodma memory
>>>> region. Adding pt_supported requires additional changes and we will
>>>> look
>>>> into it later.
>>>
>>> I see that you are trying to essentially block a guest from enabling
>>> an IOMMU feature that is not currently supported by the vIOMMU.
>>> Hopefully that limitation will be solved soon (shameless plug):
>>> https://lore.kernel.org/qemu-devel/20250414020253.443831-1-
>>> alejandro.j.jimenez@oracle.com/
>>>
>>> But in the meantime, I think enabling amdvi_dev_as->iommu when DMA
>>> remapping capability is not available is likely to cause more
>>> confusion for anyone trying to understand the already convoluted
>>> details of the memory region setup.
>>
>>> To a reader of the code and the commit message, it is confusing that
>>> to support the "NO DMA" case, the nodma memory region must be
>>> disabled, which is the opposite of what it is meant to do.
>>>
>>
>> I dont think that I understand above statement. What do you mean by "NO
>> DMA" case here ? is it iommu.passthrough=0 ?
>
> I meant it from the point of view of the vIOMMU configuration (since we
> don't control what the guest can request). So in terms of vIOMMU modes
> and corresponding Memory Regions that must be enabled to support such
> modes, I see it as:
>
> Passthrough(NO DMA) --> MR: iommu_nodma: enabled && iommu: disabled
>
> DMA remap --> MR: iommu: enabled && iommu_nodma: disabled
>
> But I recognize that view/model is probably too rigid for now, although
> it should be the "correct state" once we support DMA remapping.
>
>>
>> Essentially, I am trying to support the "DMA" case that is
>> iommu.passthrough=0 for the emulated devices, by reverting the
>> changes> that introduced the regression.
>
> I understand the goal is to make emulated devs to work in more scenarios.
>
> Because of that view that I mention above, is why I don't think of
> c1f46999ef506 ("amd_iommu: Add support for pass though mode") as
> introducing a regression, but more of a prerequisite to support both PT
> and DMA modes.
>
>>
>> If I understand correct -->
>> The original intent of the flag (in case of Intel) is
>>
>> 1. To turn on the optimization which will use nodma region (dynamically
>> enabling it) if guest configures the device with passthrough (pt=1)
>> for given context entry.
>
> This is why I said I am conflicted with the implementation. Your change
> always disables the iommu_nodma region, where the default for Linux
> guests is to use passthrough mode, which "normally" would result in
> iommu_nodma being enabled. I almost suggested on my first reply that you
> instead forced x86_iommu->pt_supported = 0 in the AMDVi code, but that
> creates a similar type of contradiction.
>
> In short, I understand what you are trying to do, but I think "the
> trick" as I called it below should probably be documented.
>
Yes, I will respin the series with better commit message. I understand
that this patch can cause the confusion.
Regards
Sairaj Kodilkar
>>
>> 2. The flag should not enable no_dma region if guest does not configure
>> device with pt.
>>
>> Intel driver does this dynamically (for every context entry update while
>> guest is running). But for AMD this is static and does not change with
>> the DTE updates, which is causing this regression.
>
> hopefully solved soon:
> https://lore.kernel.org/qemu-devel/20250414020253.443831-15-
> alejandro.j.jimenez@oracle.com/
>
> Alejandro
>
>>
>>> To explain the "trick": this change is always enabling amdvi_dev_as-
>>> >iommu, which is explicitly created as an IOMMU memory region (i.e.
>>> a memory region with mr->is_iommu == true), and it is meant to
>>> support DMA remapping. It is relying on the "side effect" that VFIO
>>> will try to register notifiers for memory regions that are an
>>> "IOMMU" (i.e. pass the check in memory_region_is_iommu()), and later
>>> fail when trying to register the notifier.
>>>
>>> If this change is merged, I think you should at least include the
>>> explanation above in the commit message, since it is not obvious to
>>> me at first reading. That being said, in my opinion, this approach
>>> adds potential confusion that is not worth the trouble, since most
>>> guests will not be using AMD vIOMMU at this point. And if they are,
>>> they would also have to be specifically requesting to enable DMA
>>> translation to hit the problem. Unfortunately, guests will always
>>> have the ability of specifying an invalid configuration if they try
>>> really hard (or not hard at all in this case).
>>>
>>
>> Yep, I should have explained it in details. Sorry about the confusion
>> will keep in mind while sending future patches.
>>
>> Regards
>> Sairaj
>>
>>> Alejandro
>
>
next prev parent reply other threads:[~2025-04-17 10:22 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-10 6:44 [PATCH 0/2] amd_iommu: Fixes Sairaj Kodilkar
2025-04-10 6:44 ` [PATCH 1/2] hw/i386/amd_iommu: Fix device setup failure when PT is on Sairaj Kodilkar
2025-04-10 8:01 ` Vasant Hegde
2025-04-14 20:26 ` Alejandro Jimenez
2025-04-15 6:38 ` Sairaj Kodilkar
2025-04-15 18:28 ` Alejandro Jimenez
2025-04-17 10:21 ` Sairaj Kodilkar [this message]
2025-04-10 6:44 ` [PATCH 2/2] hw/i386/amd_iommu: Fix xtsup when vcpus < 255 Sairaj Kodilkar
2025-04-10 9:16 ` Joao Martins
2025-04-11 0:56 ` Alejandro Jimenez
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=610d0e59-4ca2-45a4-b22f-dbe0e9098200@amd.com \
--to=sarunkod@amd.com \
--cc=alejandro.j.jimenez@oracle.com \
--cc=joao.m.martins@oracle.com \
--cc=philmd@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=suravee.suthikulpanit@amd.com \
--cc=vasant.hegde@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).