From: Jason Gunthorpe <jgg@nvidia.com>
To: Alexey Kardashevskiy <aik@amd.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>, Anup Patel <anup@brainfault.org>,
	Albert Ou <aou@eecs.berkeley.edu>,
	Jonathan Corbet <corbet@lwn.net>,
	iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
	Justin Stitt <justinstitt@google.com>,
	linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-riscv@lists.infradead.org, llvm@lists.linux.dev,
	Bill Wendling <morbo@google.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <nick.desaulniers+lkml@gmail.com>,
	Miguel Ojeda <ojeda@kernel.org>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Paul Walmsley <pjw@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Shuah Khan <shuah@kernel.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Will Deacon <will@kernel.org>,
	Alejandro Jimenez <alejandro.j.jimenez@oracle.com>,
	James Gowans <jgowans@amazon.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Michael Roth <michael.roth@amd.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	patches@lists.linux.dev, Samiullah Khawaja <skhawaja@google.com>,
	Vasant Hegde <vasant.hegde@amd.com>
Subject: Re: [PATCH v8 07/15] iommupt: Add map_pages op
Date: Tue, 27 Jan 2026 10:25:12 -0400
Message-ID: <20260127142512.GD1134360@nvidia.com>
In-Reply-To: <ddeb2bc8-5088-4d16-bea3-91d58a4403e8@amd.com>

On Tue, Jan 27, 2026 at 07:08:39PM +1100, Alexey Kardashevskiy wrote:
> > Oh so it doesn't actually check the RMP, it is just rounding down to
> > two fixed sizes?
> 
> No, it does check RMP.
> 
> If the IOMMU page walk ends at a >=2MB page - it will round down to
> 2MB (to the nearest supported RMP size) and check for 2MB RMP and if
> that check fails because of the page size - it won't try 4K (even
> though it could theoretically).
> 
> The expectation is that the host OS makes sure the IOMMU uses page
> sizes equal to or bigger than the closest smaller RMP page size, so
> there is no need for two RMP checks.

Seems dysfunctional to me.
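
For illustration, the single-lookup behaviour described in the quoted
text can be modelled roughly as below. This is only a sketch, not
driver or hardware code: rmp_lookup_size() and rmp_single_check() are
hypothetical names, and the rule that the lookup passes only when the
RMP entry size equals the lookup size is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K 0x1000ULL
#define SZ_2M 0x200000ULL

/*
 * Hypothetical model: the size at which the one RMP lookup is made,
 * given the page size where the IOMMU walk ended. Anything >= 2M is
 * rounded down to 2M.
 */
static inline uint64_t rmp_lookup_size(uint64_t walk_pgsize)
{
	return walk_pgsize >= SZ_2M ? SZ_2M : SZ_4K;
}

/*
 * Assumption: the lookup passes only if the RMP entry has the same
 * size as the lookup. There is exactly one lookup, so a 2M lookup
 * that fails on size is never retried at 4K.
 */
static inline bool rmp_single_check(uint64_t walk_pgsize, uint64_t rmp_pgsize)
{
	return rmp_lookup_size(walk_pgsize) == rmp_pgsize;
}
```

In this model a 1G IOMMU page over 4K RMP entries fails outright, which
is the case where a second lookup at 4K could in principle succeed.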

> > > > ARM is pushing a thing where encrypt/decrypt has to work on certain aligned
> > > > granule sizes > PAGE_SIZE, you could use that mechanism to select a 2M
> > > > size for AMD too and avoid this.
> > > 
> > > 2M minimum on every DMA map?
> > On every swiotlb allocation pool chunk, yeah.
> 
> Nah, it is quite easy to force 2MB on swiotlb (just do it once and
> forget) but currently any guest page can be converted to shared and
> DMA-mapped and this skips swiotlb.

Upstream Linux doesn't support that. Only SWIOTLB or special DMA
coherent memory can be DMA mapped in CC systems. You can't take a
random page, make it shared and then DMA map it.
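
As a rough sketch of the granule idea quoted above for swiotlb pool
chunks: a region to be converted is widened to whole 2M granules. The
names CC_GRANULE, struct cc_range and cc_granule_expand() are
hypothetical, and the 2M granule is the assumed value from this
discussion, not an existing kernel interface.

```c
#include <stdint.h>

/* Assumed conversion granule: 2M, as suggested in the discussion. */
#define CC_GRANULE 0x200000ULL

/* Half-open range [start, end). */
struct cc_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Widen [addr, addr + len) so that both ends fall on granule
 * boundaries: round the start down and the end up.
 */
static inline struct cc_range cc_granule_expand(uint64_t addr, uint64_t len)
{
	struct cc_range r;

	r.start = addr & ~(CC_GRANULE - 1);
	r.end = (addr + len + CC_GRANULE - 1) & ~(CC_GRANULE - 1);
	return r;
}
```

A pool chunk that is already 2M aligned and a multiple of 2M long comes
back unchanged, so doing this once at pool setup is enough.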

> > > > What happens if the guest puts 4K pages into its AMDv2 table and RMP
> > > > is 2M?
> > > 
> > > Is this AMDv2 - an NPT (then it is going to fail)? or nested IOMMU (never tried, in the works, I suspect failure)?
> > 
> > Yes, some future nested vIOMMU
> > 
> > If guest can't have a 4K page in its vIOMMU while the host is using
> > 2M RMP then the whole architecture is broken, sorry.
> 
> I re-read what I wrote and I think I was wrong, the S2 table (guest
> physical -> host physical) has to match RMP, not the S1.

Really? So the HW can fix the 4k/2M mismatch for the S1 but doesn't
bother for the S2? Seems like a crazy design to me.

What happens if you don't have a VIOMMU, have a single translation
stage and only use the S1 (AMDv2) page table in the hypervisor? Then
does the HW fix it? Or does it only fix it with two stages enabled?

> > iommufd won't deal with memory maps for IO, the secure world will
> > handle that through KVM.
> 
> Is QEMU going to skip on IOMMU mapping entirely? So when the device
> is transitioned from untrusted (when everything mapped via VFIO or
> IOMMU) to trusted - QEMU will unmap everything and then the guest
> will map everything but this time via KVM and bypassing QEMU
> entirely? Thanks,

On ARM there are different S2s for the IOMMU, one for T=1 and one for
T=0 traffic. The T=1 one is fully controlled by the secure world and is
equal to the CPU S2. The T=0 one is fully controlled by QEMU and acts
like a normal system. The T=0 one can only access guest shared memory.
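
A toy model of that split, only to make the ownership explicit. The
types and iommu_s2_for_traffic() are hypothetical and do not correspond
to any real SMMU or iommufd interface.

```c
#include <stdbool.h>

enum s2_owner {
	S2_OWNER_SECURE_WORLD,
	S2_OWNER_QEMU,
};

/* Properties of the stage 2 table used for one class of traffic. */
struct iommu_s2 {
	enum s2_owner owner;
	bool equals_cpu_s2;      /* same translation as the CPU S2 */
	bool shared_memory_only; /* limited to guest shared memory */
};

/* Pick the S2 by the T bit of the incoming transaction. */
static inline struct iommu_s2 iommu_s2_for_traffic(bool t_bit)
{
	if (t_bit)
		return (struct iommu_s2){
			.owner = S2_OWNER_SECURE_WORLD,
			.equals_cpu_s2 = true,
			.shared_memory_only = false,
		};

	return (struct iommu_s2){
		.owner = S2_OWNER_QEMU,
		.equals_cpu_s2 = false,
		.shared_memory_only = true,
	};
}
```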

Jason
