Re: [PATCH 1/2] amd_iommu: Follow root pointer before page walk and use 1-based levels

public inbox for qemu-devel@nongnu.org
 help / color / mirror / Atom feed

From: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
To: Sairaj Kodilkar <sarunkod@amd.com>, qemu-devel@nongnu.org
Cc: qemu@demindiro.com, mst@redhat.com,
	clement.mathieu--drif@eviden.com, pbonzini@redhat.com,
	richard.henderson@linaro.org, eduardo@habkost.net,
	boris.ostrovsky@oracle.com
Subject: Re: [PATCH 1/2] amd_iommu: Follow root pointer before page walk and use 1-based levels
Date: Tue, 24 Mar 2026 10:25:17 -0400	[thread overview]
Message-ID: <ba2d7018-7dcc-42d7-ba52-395d4e1d0298@oracle.com> (raw)
In-Reply-To: <2756c509-4286-444a-815c-d6266c3e3a91@amd.com>

On 3/23/26 7:01 AM, Sairaj Kodilkar wrote:
> 
> 
> On 3/12/2026 2:09 AM, Alejandro Jimenez wrote:
>> DTE[Mode] and PTE NextLevel encode page table levels as 1-based values, but
>> fetch_pte() currently uses a 0-based level counter, making the logic
>> harder to follow and requiring conversions between DTE mode and level.
>>
>> Switch the page table walk logic to use 1-based level accounting in
>> fetch_pte() and the relevant macro helpers. To further simplify the page
>> walking loop, split the root page table access from the walk i.e. rework
>> fetch_pte() to follow the DTE Page Table Root Pointer and retrieve the top
>> level pagetable entry before entering the loop, then iterate only over the
>> PDE/PTE entries.
>>
>> The reworked algorithm fixes a page walk bug where the page size was
>> calculated for the next level before checking if the current PTE was already
>> a leaf/hugepage. That caused hugepage mappings to be reported as 4K pages,
>> leading to performance degradation and failures in some setups.
>>
>> Fixes: a74bb3110a5b ("amd_iommu: Add helpers to walk AMD v1 Page Table
>> format")
>> Cc: qemu-stable@nongnu.org
>> Reported-by: David Hoppenbrouwers <qemu@demindiro.com>
>> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
>> ---
>>   hw/i386/amd_iommu.c | 132 ++++++++++++++++++++++++++++++--------------
>>   hw/i386/amd_iommu.h |  11 ++--
>>   2 files changed, 97 insertions(+), 46 deletions(-)
>>

[...]

> 
> Hi Alejandro,
> 
> amdvi_sync_shadow_page_table_range() does not check if DTE
> **valid and translation valid** bit are set. I added this check in
> my reply to david's patch. Do you think we should include that
> check as well to ensure that only DTEs with **valid and translation valid**
> bit are passed to the fetch_pte ?
> 

Right now the check for DTE[V] and DTE[TV] are implemented in
amdvi_as_to_dte(), which all callers of
amdvi_sync_shadow_page_table_range() use to extract the DTE that will be
passed as its argument. So I think if we were to add a check, an assert
would be more appropriate.

On the other hand, I am worried about sprinkling assertions around the
code, and I am actually considering removing the ones the original series
added to amdvi_sync_shadow_page_table_range(), fetch_pte(), and
amdvi_notify_iommu(), since any failures in these cases are likely to cause
immediate symptoms and I don't believe it creates any window for data
corruption.

With that in mind, to answer your question I would not add any additional
check on amdvi_sync_shadow_page_table_range(), and would rely on the
current convention to check when retrieving a DTE via the interface we
provided for it and all the code already uses i.e. amdvi_as_to_dte().

For a follow up cleanup, maybe amdvi_as_to_dte() should be renamed with a
more suggestive name that clearly indicates that it validates for those
conditions. But in context, all of its callers rely on the fact that it
does this verification and its return values also encode it.

While looking into this I noticed a chance for minor cleanup in this flow:
amdvi_do_translate()
    amdvi_as_to_dte()
    amdvi_page_walk()

Where amdvi_page_walk() does an unnecessary check for DTE[TV], that
amd_vi_as_to_dte() already ensures is set.

Thank you,
Alejandro

> Thanks
> Sairaj Kodilkar
>

next prev parent reply	other threads:[~2026-03-24 14:25 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-11 20:39 [PATCH 0/2] amd_iommu: Fix page size reporting on hugepage mappings Alejandro Jimenez
2026-03-11 20:39 ` [PATCH 1/2] amd_iommu: Follow root pointer before page walk and use 1-based levels Alejandro Jimenez
2026-03-12 17:01   ` David Hoppenbrouwers
2026-03-23 11:01   ` Sairaj Kodilkar
2026-03-24 14:25     ` Alejandro Jimenez [this message]
2026-03-26  6:52       ` Sairaj Kodilkar
2026-03-11 20:39 ` [PATCH 2/2] amd_iommu: Reject non-decreasing NextLevel in fetch_pte() Alejandro Jimenez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ba2d7018-7dcc-42d7-ba52-395d4e1d0298@oracle.com \
    --to=alejandro.j.jimenez@oracle.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=clement.mathieu--drif@eviden.com \
    --cc=eduardo@habkost.net \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu@demindiro.com \
    --cc=richard.henderson@linaro.org \
    --cc=sarunkod@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox