Re: [PATCH v8 07/15] iommupt: Add map_pages op

From: Jason Gunthorpe <jgg@nvidia.com>
To: Alexey Kardashevskiy <aik@amd.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>, Anup Patel <anup@brainfault.org>,
	Albert Ou <aou@eecs.berkeley.edu>,
	Jonathan Corbet <corbet@lwn.net>,
	iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
	Justin Stitt <justinstitt@google.com>,
	linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-riscv@lists.infradead.org, llvm@lists.linux.dev,
	Bill Wendling <morbo@google.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <nick.desaulniers+lkml@gmail.com>,
	Miguel Ojeda <ojeda@kernel.org>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Paul Walmsley <pjw@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Shuah Khan <shuah@kernel.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Will Deacon <will@kernel.org>,
	Alejandro Jimenez <alejandro.j.jimenez@oracle.com>,
	James Gowans <jgowans@amazon.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Michael Roth <michael.roth@amd.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	patches@lists.linux.dev, Samiullah Khawaja <skhawaja@google.com>,
	Vasant Hegde <vasant.hegde@amd.com>
Subject: Re: [PATCH v8 07/15] iommupt: Add map_pages op
Date: Sat, 17 Jan 2026 11:43:47 -0400
Message-ID: <20260117154347.GF1134360@nvidia.com>
In-Reply-To: <fc4f0354-4e6d-452d-abfb-fe24e53253a2@amd.com>

On Sat, Jan 17, 2026 at 03:54:52PM +1100, Alexey Kardashevskiy wrote:

> I am trying this with TEE-IO on AMD SEV and hitting problems. 

My understanding is that if you want to use SEV today you also have to
use the kernel command line parameter to force 4k IOMMU pages?

So, I think your questions are about trying to enhance this to get
larger pages in the IOMMU when possible?

> Now, from time to time the guest will share 4K pages which makes the
> host OS smash NPT's 2MB PDEs to 4K PTEs, and 2M RMP entries to 4K
> RMP entries, and since the IOMMU performs RMP checks - IOMMU PDEs
> have to use the same granularity as NPT and RMP.

IMHO this is a bad hardware choice; it is going to make for some very
troublesome software, so sigh.

> So I end up in a situation when QEMU asks to map, for example, 2GB
> of guest RAM and I want most of it to be 2MB mappings, and only
> handful of 2MB pages to be split into 4K pages. But it appears so
> that the above enforces the same page size for entire range.

> In the old IOMMU code, I handled it like this:
> 
> https://github.com/AMDESE/linux-kvm/commit/0a40130987b7b65c367390d23821cc4ecaeb94bd#diff-f22bea128ddb136c3adc56bc09de9822a53ba1ca60c8be662a48c3143c511963L341
> 
> tl;dr: I constantly re-calculate the page size while mapping.

Doing it at mapping time doesn't seem right to me. AFAICT the RMP can
change dynamically whenever the guest decides to change the
private/shared status of memory?

My expectation for AMD was that the VMM would be monitoring the RMP
granularity and use cut or "increase/decrease page size" through
iommupt to adjust the S2 mapping so it works with these RMP
limitations.

Those operations don't fully exist yet, but they are in the plans.

It assumes that the VMM is continually aware of what all the RMP PTEs
look like and when they are changing so it can make the required
adjustments.

The flow would be something like:
 1) Create an IOAS
 2) Create a HWPT. If there is some known upper bound on RMP/etc page
    size then limit the HWPT page size to the upper bound
 3) Map stuff into the ioas
 4) Build the RMP/etc and map ranges of page granularity
 5) Call iommufd to adjust the page size within ranges
 6) Guest changes encrypted state so RMP changes
 7) VMM adjusts the ranges of page granularity and calls iommufd with
    the updates
 8) iommupt code increases/decreases page size as required.
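
As a rough illustration of steps 4 and 7, the "ranges of page
granularity" could be derived from a per-2M-block view of the RMP
roughly as below. This is only a sketch under stated assumptions:
struct gran_range, build_gran_ranges() and the split_2m[] array are
hypothetical names made up for this example, not existing iommufd or
iommupt interfaces, and the call that would hand the ranges to iommufd
is left out because it does not exist yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K (UINT64_C(1) << 12)
#define SZ_2M (UINT64_C(1) << 21)

/* Hypothetical: one run of IOVA space that must share one page size. */
struct gran_range {
	uint64_t start;  /* first byte of the range */
	uint64_t length; /* length in bytes, a multiple of 2M */
	uint64_t pgsize; /* SZ_4K or SZ_2M */
};

/*
 * Hypothetical helper. split_2m[i] is true when the i-th 2M block
 * starting at 'base' has been split into 4K RMP entries. Adjacent
 * blocks with the same granularity are coalesced into one range.
 * Returns the number of ranges written to 'out' (at most 'max').
 */
static size_t build_gran_ranges(uint64_t base, const bool *split_2m,
				size_t nblocks, struct gran_range *out,
				size_t max)
{
	size_t n = 0;

	assert(split_2m && out);

	for (size_t i = 0; i < nblocks; i++) {
		uint64_t pgsize = split_2m[i] ? SZ_4K : SZ_2M;

		/* Same granularity as the previous block: extend it. */
		if (n && out[n - 1].pgsize == pgsize) {
			out[n - 1].length += SZ_2M;
			continue;
		}

		/* Output array is full: stop and report what fits. */
		if (n == max)
			break;

		out[n].start = base + i * SZ_2M;
		out[n].length = SZ_2M;
		out[n].pgsize = pgsize;
		n++;
	}
	return n;
}
```

For four 2M blocks where the guest has shared 4K pages in the middle
two, this produces three ranges: 2M granularity, then one 4M-long range
at 4K granularity, then 2M granularity again. When the RMP changes
(step 6) the VMM would recompute the ranges and pass only the changed
ones down.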

Does this seem reasonable?

> I know, ideally we would only share memory in 2MB chunks but we are
> not there yet as I do not know the early boot stage on x86 enough to

Even 2M is too small; I'd expect real scenarios to want to get up to
1GB?

Jason
