All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Joerg Roedel <joro@8bytes.org>,
	iommu@lists.linux.dev, Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Rob Clark <robdclark@gmail.com>,
	Gaurav Kohli <quic_gkohli@quicinc.com>,
	Steven Price <steven.price@arm.com>
Subject: Re: [PATCH v2 0/2] iommu: Allow passing custom allocators to pgtable drivers
Date: Fri, 10 Nov 2023 12:12:29 -0400	[thread overview]
Message-ID: <20231110161229.GA462657@nvidia.com> (raw)
In-Reply-To: <20231110164809.270f82bc@collabora.com>

On Fri, Nov 10, 2023 at 04:48:09PM +0100, Boris Brezillon wrote:

> > Shouldn't improving the allocator in the io page table be done
> > generically?
> 
> While most of it could be made generic, the pre-reservation is a bit
> special for VM_BIND: we need to pre-reserve page tables without knowing
> the state of the page table tree (over-reservation), because page table
> updates are executed asynchronously (the state of the VM when we
> prepare the request might differ from its state when we execute it). We
> also need to make sure no other pre-reservation requests steal pages
> from the pool of pages we reserved for requests that were not executed
> yet.
> 
> I'm not saying this is impossible to implement, but it sounds too
> specific for a generic io-pgtable cache.

It is quite easy, and indeed much better to do it internally.

struct page allocations like the io page table uses get a few pointers
of data to be used by the caller in the struct page *.

You can put a refcounter in that data per-page to count how many
callers have reserved the page. Add a new "allocate VA" API to
allocate and install page table levels that cover a VA range in the
radix tree and increment all the refcounts on all the impacted struct
pages.

Now you can be guarenteed that future map in that VA range will be
fully non-allocating, and future unmap will be fully non-freeing.

Some "unallocate VA" will decrement the refcounts and free the page
table levels within that VA range.

Precompute the number of required pages at the start of allocate and
you can trivally do batch allocations. Ditto for unallocate, it can
trivially do batch freeing.

Way better and more generically useful than allocator ops!

I'd be interested in something like this for iommufd too, we greatly
suffer from poor iommu driver performace during map, and in general we
lack a robust way to actually fully unmap all page table levels.

A new domain API to prepare all the ioptes more efficiently would be a
great general improvement!

Jason

WARNING: multiple messages have this Message-ID (diff)
From: Jason Gunthorpe <jgg@nvidia.com>
To: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Joerg Roedel <joro@8bytes.org>,
	iommu@lists.linux.dev, Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Rob Clark <robdclark@gmail.com>,
	Gaurav Kohli <quic_gkohli@quicinc.com>,
	Steven Price <steven.price@arm.com>
Subject: Re: [PATCH v2 0/2] iommu: Allow passing custom allocators to pgtable drivers
Date: Fri, 10 Nov 2023 12:12:29 -0400	[thread overview]
Message-ID: <20231110161229.GA462657@nvidia.com> (raw)
In-Reply-To: <20231110164809.270f82bc@collabora.com>

On Fri, Nov 10, 2023 at 04:48:09PM +0100, Boris Brezillon wrote:

> > Shouldn't improving the allocator in the io page table be done
> > generically?
> 
> While most of it could be made generic, the pre-reservation is a bit
> special for VM_BIND: we need to pre-reserve page tables without knowing
> the state of the page table tree (over-reservation), because page table
> updates are executed asynchronously (the state of the VM when we
> prepare the request might differ from its state when we execute it). We
> also need to make sure no other pre-reservation requests steal pages
> from the pool of pages we reserved for requests that were not executed
> yet.
> 
> I'm not saying this is impossible to implement, but it sounds too
> specific for a generic io-pgtable cache.

It is quite easy, and indeed much better to do it internally.

struct page allocations like the io page table uses get a few pointers
of data to be used by the caller in the struct page *.

You can put a refcounter in that data per-page to count how many
callers have reserved the page. Add a new "allocate VA" API to
allocate and install page table levels that cover a VA range in the
radix tree and increment all the refcounts on all the impacted struct
pages.

Now you can be guarenteed that future map in that VA range will be
fully non-allocating, and future unmap will be fully non-freeing.

Some "unallocate VA" will decrement the refcounts and free the page
table levels within that VA range.

Precompute the number of required pages at the start of allocate and
you can trivally do batch allocations. Ditto for unallocate, it can
trivially do batch freeing.

Way better and more generically useful than allocator ops!

I'd be interested in something like this for iommufd too, we greatly
suffer from poor iommu driver performace during map, and in general we
lack a robust way to actually fully unmap all page table levels.

A new domain API to prepare all the ioptes more efficiently would be a
great general improvement!

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2023-11-10 16:12 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-10  9:43 [PATCH v2 0/2] iommu: Allow passing custom allocators to pgtable drivers Boris Brezillon
2023-11-10  9:43 ` Boris Brezillon
2023-11-10  9:43 ` [PATCH v2 1/2] " Boris Brezillon
2023-11-10  9:43   ` Boris Brezillon
2023-11-22 13:08   ` Robin Murphy
2023-11-22 13:08     ` Robin Murphy
2023-11-10  9:43 ` [PATCH v2 2/2] iommu: Extend LPAE page table format to support custom allocators Boris Brezillon
2023-11-10  9:43   ` Boris Brezillon
2023-11-22 14:24   ` Robin Murphy
2023-11-22 14:24     ` Robin Murphy
2023-11-10 10:47 ` [PATCH v2 0/2] iommu: Allow passing custom allocators to pgtable drivers Gaurav Kohli
2023-11-10 10:47   ` Gaurav Kohli
2023-11-10 15:14 ` Jason Gunthorpe
2023-11-10 15:14   ` Jason Gunthorpe
2023-11-10 15:48   ` Boris Brezillon
2023-11-10 15:48     ` Boris Brezillon
2023-11-10 16:12     ` Jason Gunthorpe [this message]
2023-11-10 16:12       ` Jason Gunthorpe
2023-11-10 19:16       ` Boris Brezillon
2023-11-10 19:16         ` Boris Brezillon
2023-11-10 19:42         ` Jason Gunthorpe
2023-11-10 19:42           ` Jason Gunthorpe
2023-11-13  9:11           ` Boris Brezillon
2023-11-13  9:11             ` Boris Brezillon
2023-11-14 16:27             ` Jason Gunthorpe
2023-11-14 16:27               ` Jason Gunthorpe
2023-11-20 14:04               ` Jason Gunthorpe
2023-11-20 14:04                 ` Jason Gunthorpe
2023-11-20 14:38                 ` Boris Brezillon
2023-11-20 14:38                   ` Boris Brezillon
2023-11-20 14:46                   ` Jason Gunthorpe
2023-11-20 14:46                     ` Jason Gunthorpe
2023-11-20 15:14                     ` Boris Brezillon
2023-11-20 15:14                       ` Boris Brezillon
2023-11-20 15:45                       ` Jason Gunthorpe
2023-11-20 15:45                         ` Jason Gunthorpe
2023-11-22 17:23                         ` Robin Murphy
2023-11-22 17:23                           ` Robin Murphy
2023-11-22 17:50                           ` Jason Gunthorpe
2023-11-22 17:50                             ` Jason Gunthorpe
2023-11-23  8:51                             ` Boris Brezillon
2023-11-23  8:51                               ` Boris Brezillon
2023-11-23 13:48                               ` Jason Gunthorpe
2023-11-23 13:48                                 ` Jason Gunthorpe
2023-11-23 16:49                                 ` Robin Murphy
2023-11-23 16:49                                   ` Robin Murphy
2023-11-23 16:59                                   ` Jason Gunthorpe
2023-11-23 16:59                                     ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231110161229.GA462657@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=boris.brezillon@collabora.com \
    --cc=iommu@lists.linux.dev \
    --cc=joro@8bytes.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=quic_gkohli@quicinc.com \
    --cc=robdclark@gmail.com \
    --cc=robin.murphy@arm.com \
    --cc=steven.price@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.