From: Jason Gunthorpe <jgg@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Vincent Chen <vincent.chen@sifive.com>,
	Alexandre Ghiti <alex@ghiti.fr>,
	Albert Ou <aou@eecs.berkeley.edu>,
	iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
	linux-riscv@lists.infradead.org,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Paul Walmsley <pjw@kernel.org>,
	Tomasz Jeznach <tjeznach@rivosinc.com>,
	Will Deacon <will@kernel.org>,
	lihangjing@bytedance.com, Xu Lu <luxu.kernel@bytedance.com>,
	patches@lists.linux.dev, xieyongji@bytedance.com
Subject: Re: [PATCH v2 0/5] Convert riscv to use the generic iommu page table
Date: Mon, 2 Feb 2026 10:37:20 -0400
Message-ID: <20260202143720.GN2223369@nvidia.com>
In-Reply-To: <8c46864e-4625-49f4-90b3-a7467cec8b7b@arm.com>

On Mon, Feb 02, 2026 at 02:00:07PM +0000, Robin Murphy wrote:

> > DMA-FQ requires two functionalities from the page table:
> > 1) use gather->freelist to avoid a HW UAF (iommupt always does this)
> 
> Nope, correct DMA API usage would almost never unmap an entire table, so
> synchronous non-leaf maintenance in that path still doesn't hurt DMA-FQ
> either (e.g. io-pgtable-arm).

Well, it certainly would hurt workloads like IB MRs, which can have
quite a lot of IOVA in a single dma_map_sg(), and we do want to see the
table levels removed to avoid the waste that Pasha has talked
about. Doing single invalidations of potentially a lot of levels in a
DMA-FQ environment is unnecessary overhead.

But I get your point that simple (say, storage) use of the DMA API
wouldn't be bothered by this, and you could still get a lot of benefit
without using the free list.

> If a pagetable implementation wanted to refcount and eagerly free empty
> tables upon leaf unmaps, then yes it would need deferred freeing, but
> frankly it would be better off just not doing that at all for DMA-FQ anyway
> (as IOVA caching would make it likely to need to repopulate the same level
> of table soon.)

Today it isn't done with refcounts: if the unmapped IOVA range fully
contains a table level, then that table level can go away too. It
does trim interior page tables for large IOVA allocations, but small
ones are unlikely to free anything.
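
As a rough illustration of the containment rule above (a standalone
userspace sketch, not the iommupt code): assuming each table at a given
level covers a naturally aligned, power-of-two window of IOVA, the
number of tables an unmap can release is the number of whole windows
that fit inside the unmapped range. The function name and its
parameters are made up for this example.

```c
#include <stdint.h>

/*
 * Hypothetical model: every table at some level spans (1 << span_shift)
 * bytes of IOVA and is aligned to that size. Returns how many such
 * tables lie entirely inside [iova, iova + len).
 *
 * Assumes span_shift < 64 and that neither iova + len nor the round-up
 * of iova overflows 64 bits.
 */
static uint64_t count_freeable_tables(uint64_t iova, uint64_t len,
				      unsigned int span_shift)
{
	uint64_t span = UINT64_C(1) << span_shift;
	uint64_t end = iova + len;				/* exclusive */
	uint64_t first = (iova + span - 1) & ~(span - 1);	/* round up */
	uint64_t last = end & ~(span - 1);			/* round down */

	/* No aligned window fits inside the range. */
	if (last <= first)
		return 0;

	return (last - first) >> span_shift;
}
```

With a 2 MiB span (span_shift = 21), a 4 KiB unmap covers no whole
window and frees nothing, while an aligned 8 MiB unmap covers four.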

> > The one call to iommu_iotlb_sync() is only for the para-virtualization
> > optimization of narrowing invalidations. It would be nonsensical for a
> > driver to enable this optimization and offer IOMMU_CAP_DEFERRED_FLUSH.
> 
> Not necessarily - in the PV case it can be desirable to minimise
> over-invalidation *if* you're trapping for targeted invalidations in strict
> mode. However, depending on the usage pattern it may also be beneficial to
> have non-strict let the FQ mechanism batch up work to minimise the number of
> traps taken - e.g. s390 is in this situation, and is precisely why we added
> IOMMU_DMA_OPTS_SINGLE_QUEUE to help optimise for that.

Okay, so if I understand you right, it should check for
iommu_iotlb_gather_queued() and disable PT_FEAT_FLUSH_RANGE_NO_GAPS
mode entirely? I.e., there is no point in doing small invalidations if
the caller is going to do a flush all?

This way the user gets to pick between DMA-FQ and DMA-strict?
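
A minimal sketch of the check being asked about, written as a
standalone userspace model rather than kernel code. struct gather_model
and use_narrow_invalidation() are hypothetical names; gather_model_queued()
only imitates what iommu_iotlb_gather_queued() does as I understand it
(a NULL-safe test of the gather's queued flag), and feat_no_gaps stands
in for PT_FEAT_FLUSH_RANGE_NO_GAPS being enabled. Whether the page
table code should behave this way is exactly the open question.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iommu_iotlb_gather; one field only. */
struct gather_model {
	bool queued;	/* true when the flush queue will invalidate later */
};

/* Imitates iommu_iotlb_gather_queued(): safe to call with NULL. */
static bool gather_model_queued(const struct gather_model *gather)
{
	return gather && gather->queued;
}

/*
 * Hypothetical policy helper: only issue narrowed, gap-free range
 * invalidations when the feature is on and the caller is not going to
 * flush everything from the flush queue anyway.
 */
static bool use_narrow_invalidation(bool feat_no_gaps,
				    const struct gather_model *gather)
{
	return feat_no_gaps && !gather_model_queued(gather);
}
```

Under this model a strict-mode unmap keeps the narrow invalidations,
and a DMA-FQ unmap skips them and leaves the work to the queued flush.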

Also Intel would probably benefit from .shadow_on_flush too?

Jason
