From: Jens Axboe <axboe@kernel.dk>
To: io-uring@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH] io_uring: take page references for NOMMU pbuf_ring mmaps
Date: Tue, 21 Apr 2026 19:56:13 -0600
Message-ID: <f1b43e56-4724-4635-b18b-bae2add37936@kernel.dk>
In-Reply-To: <dec29d85-9e79-42df-ae3d-9af65134283c@kernel.dk>

On 4/21/26 7:17 PM, Jens Axboe wrote:
> On 4/21/26 11:39 AM, Jens Axboe wrote:
>>
>> On Tue, 21 Apr 2026 15:46:16 +0200, Greg Kroah-Hartman wrote:
>>> Under !CONFIG_MMU, io_uring_get_unmapped_area() returns the kernel
>>> virtual address of the io_mapped_region's backing pages directly;
>>> the user's VMA aliases the kernel allocation. io_uring_mmap() then
>>> just returns 0 -- it takes no page references.
>>>
>>> The CONFIG_MMU path uses vm_insert_pages(), which takes a reference on
>>> each inserted page.  Those references are released when the VMA is torn
>>> down (zap_pte_range -> put_page). io_free_region() -> release_pages()
>>> drops the io_uring-side references, but the pages survive until munmap
>>> drops the VMA-side references.
>>>
>>> [...]
>>
>> Applied, thanks!
>>
>> [1/1] io_uring: take page references for NOMMU pbuf_ring mmaps
>>       commit: d9b7b3d9c5286a786c7fe8220c55a6e012088c2e
> 
> Actually, I take that back - what prevents the io_mmap_get_region()
> in the newly added io_uring_nommu_vm_close() from getting the same
> region that we initially referenced the pages from in the nommu
> variant of io_uring_mmap()?

I think we can get rid of that lookup and simplify the code at the same
time. Rather than looking up the buffer list again, we can just iterate
the pages mapped in the vma. Since this is a file backed mapping and
io_uring doesn't allow remaps, the vma should always cover the same pages
that io_uring_mmap() initially referenced.
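
To make the lifetime argument concrete, here is a rough userspace model
of the idea, not kernel code. Every model_* name, the made-up base
address and the single-counter "page" are assumptions for illustration
only. The sketch assumes the vma still spans exactly the range that was
mapped, and shows that walking that range on close is enough to drop the
references taken at mmap time, without looking the region up again.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096UL
#define MODEL_NR_PAGES  4UL
#define MODEL_BASE      0x100000UL	/* made-up "kernel virtual address" */

/* Hypothetical stand-in for struct page: nothing but a reference count. */
struct model_page { int refcount; };

/* Hypothetical stand-in for the vm_start/vm_end part of a vma. */
struct model_vma { unsigned long vm_start, vm_end; };

static struct model_page model_pages[MODEL_NR_PAGES];

/* Toy virt_to_page(): map an address in the range to its page slot. */
static struct model_page *model_virt_to_page(unsigned long addr)
{
	return &model_pages[(addr - MODEL_BASE) / MODEL_PAGE_SIZE];
}

static void model_put_page(struct model_page *page)
{
	assert(page->refcount > 0);	/* catch a double put */
	page->refcount--;
}

/* Region setup: the io_uring side holds one reference per page. */
static void model_region_alloc(void)
{
	unsigned long i;

	for (i = 0; i < MODEL_NR_PAGES; i++)
		model_pages[i].refcount = 1;
}

/* mmap: the vma aliases the pages, so take one reference per page. */
static void model_mmap(struct model_vma *vma)
{
	unsigned long addr;

	vma->vm_start = MODEL_BASE;
	vma->vm_end = MODEL_BASE + MODEL_NR_PAGES * MODEL_PAGE_SIZE;
	for (addr = vma->vm_start; addr < vma->vm_end; addr += MODEL_PAGE_SIZE)
		model_virt_to_page(addr)->refcount++;
}

/* Unregister: drop the io_uring-side references only. */
static void model_region_free(void)
{
	unsigned long i;

	for (i = 0; i < MODEL_NR_PAGES; i++)
		model_put_page(&model_pages[i]);
}

/* vma close: walk the vma range, no region lookup needed. */
static void model_vm_close(const struct model_vma *vma)
{
	unsigned long addr;

	for (addr = vma->vm_start; addr < vma->vm_end; addr += MODEL_PAGE_SIZE)
		model_put_page(model_virt_to_page(addr));
}

/* Number of pages that still hold at least one reference. */
static unsigned long model_pages_live(void)
{
	unsigned long i, live = 0;

	for (i = 0; i < MODEL_NR_PAGES; i++)
		live += model_pages[i].refcount > 0;
	return live;
}
```

Calling model_region_alloc(), model_mmap() and model_region_free() in
that order from a small main() leaves all MODEL_NR_PAGES pages live in
this model; a following model_vm_close() brings the count to zero.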

Greg, can you test this? I will fold this in.


diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 6818e9abf3b3..e80f9eed6efc 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -367,45 +367,18 @@ unsigned long io_uring_get_unmapped_area(struct file *filp, unsigned long addr,
 #else /* !CONFIG_MMU */
 
 /*
- * Under NOMMU, get_unmapped_area returns the kernel virtual address of
- * the io_mapped_region's backing pages directly -- the user's VMA
- * aliases the kernel allocation rather than holding its own copy or
- * page-table entries. The CONFIG_MMU path's vm_insert_pages() takes
- * page references that survive until munmap; this path takes none, so
- * io_unregister_pbuf_ring() -> io_free_region() -> release_pages()
- * frees the pages while the user's VMA still maps them. The user can
- * then write into whatever the buddy allocator hands out next.
- *
- * Mirror the MMU lifetime by taking page references in io_uring_mmap()
- * and releasing them in vm_ops->close. We re-derive the region from
- * vm_pgoff (same lookup get_unmapped_area used) so we know which pages
- * to grab.
+ * Drop the pages that were initially referenced and added in
+ * io_uring_mmap(). We cannot have had a mremap() as that isn't supported,
+ * hence the vma should be identical to the one we initially referenced and
+ * mapped, and partial unmaps and splitting aren't possible on a file backed
+ * mapping.
  */
-
 static void io_uring_nommu_vm_close(struct vm_area_struct *vma)
 {
-	struct io_ring_ctx *ctx = vma->vm_file->private_data;
-	struct io_mapped_region *region;
-	unsigned long i;
+	unsigned long index;
 
-	guard(mutex)(&ctx->mmap_lock);
-	region = io_mmap_get_region(ctx, vma->vm_pgoff);
-	/*
-	 * The region may have been unregistered (memset to zero in
-	 * io_free_region()) between mmap and munmap. The page refs we
-	 * took in io_uring_mmap() are what kept the pages alive; release
-	 * them via the VMA range since the region->pages array is gone.
-	 */
-	if (region && region->pages) {
-		for (i = 0; i < region->nr_pages; i++)
-			put_page(region->pages[i]);
-	} else {
-		/* Region cleared; walk the VMA range. */
-		unsigned long a;
-
-		for (a = vma->vm_start; a < vma->vm_end; a += PAGE_SIZE)
-			put_page(virt_to_page((void *)a));
-	}
+	for (index = vma->vm_start; index < vma->vm_end; index += PAGE_SIZE)
+		put_page(virt_to_page((void *) index));
 }
 
 static const struct vm_operations_struct io_uring_nommu_vm_ops = {

-- 
Jens Axboe
