From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
To: akpm@linux-foundation.org, rppt@kernel.org, peterx@redhat.com, david@kernel.org
Cc: ljs@kernel.org, surenb@google.com, vbabka@kernel.org, Liam.Howlett@oracle.com, ziy@nvidia.com, corbet@lwn.net, skhan@linuxfoundation.org, seanjc@google.com, pbonzini@redhat.com, jthoughton@google.com, aarcange@redhat.com, sj@kernel.org, usama.arif@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, kvm@vger.kernel.org, kernel-team@meta.com, "Kiryl Shutsemau (Meta)"
Subject: [PATCH v2 13/14] selftests/mm: add userfaultfd RWP tests
Date: Fri, 8 May 2026 16:55:25 +0100
X-Mailer: git-send-email 2.51.2
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Coverage for UFFDIO_REGISTER_MODE_RWP and UFFDIO_RWPROTECT:

rwp-async		async mode — touch pages, verify permissions are
			auto-restored without a message
rwp-sync		sync mode — access blocks, handler resolves via
			UFFDIO_RWPROTECT
rwp-pagemap		PAGEMAP_SCAN reports still-cold pages via inverted
			PAGE_IS_ACCESSED
rwp-mprotect		RWP survives mprotect(PROT_NONE) ->
			mprotect(PROT_READ|PROT_WRITE) round-trip
rwp-gup			GUP walks through a protnone RWP PTE (pipe
			write/read drives the GUP path)
rwp-async-toggle	UFFDIO_SET_MODE flips between sync and async
			without re-registering
rwp-close		closing the uffd restores page permissions
rwp-fork		RWP survives fork() with EVENT_FORK; child's PTEs
			keep the uffd bit
rwp-fork-pin		RWP survives fork() on an RO-longterm-pinned anon
			page (forces copy_present_page()); child read
			auto-resolves and clears the bit, proving
			PAGE_NONE was in place
rwp-wp-exclusive	register with MODE_WP|MODE_RWP returns -EINVAL

All tests run against anon, shmem, shmem-private, hugetlb, and
hugetlb-private memory, except rwp-fork-pin, which is anon-only:
copy_present_page() is the private-anon pinned-exclusive fork path.

Signed-off-by: Kiryl Shutsemau
Assisted-by: Claude:claude-opus-4-6
---
 tools/testing/selftests/mm/uffd-unit-tests.c | 774 +++++++++++++++++++
 1 file changed, 774 insertions(+)

diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c
index 6f5e404a446c..a35fb677e4cc 100644
--- a/tools/testing/selftests/mm/uffd-unit-tests.c
+++ b/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -7,6 +7,7 @@
 #include "uffd-common.h"
+#include
 #include "../../../../mm/gup_test.h"
 
 #ifdef __NR_userfaultfd
@@ -167,6 +168,23 @@ static int test_uffd_api(bool use_dev)
 		goto out;
 	}
 
+	/* Verify returned fd-level ioctls bitmask */
+	{
+		uint64_t expected_ioctls =
+			BIT_ULL(_UFFDIO_REGISTER) |
+			BIT_ULL(_UFFDIO_UNREGISTER) |
+			BIT_ULL(_UFFDIO_API) |
+			BIT_ULL(_UFFDIO_SET_MODE);
+
+		if ((uffdio_api.ioctls & expected_ioctls) != expected_ioctls) {
+			uffd_test_fail("UFFDIO_API missing expected ioctls: "
+				       "got=0x%"PRIx64", expected=0x%"PRIx64,
+				       (uint64_t)uffdio_api.ioctls,
+				       expected_ioctls);
+			goto out;
+		}
+	}
+
 	/* Test double requests of UFFDIO_API with a random feature set */
 	uffdio_api.features = BIT_ULL(0);
 	if (ioctl(uffd, UFFDIO_API, &uffdio_api) == 0) {
@@ -623,6 +641,691 @@ void uffd_minor_collapse_test(uffd_global_test_opts_t *gopts, uffd_test_args_t *
 	uffd_minor_test_common(gopts, true, false);
 }
 
+static int uffd_register_rwp(int uffd, void *addr, uint64_t len)
+{
+	struct uffdio_register reg = {
+		.range = { .start = (unsigned long)addr, .len = len },
+		.mode = UFFDIO_REGISTER_MODE_RWP,
+	};
+
+	if (ioctl(uffd, UFFDIO_REGISTER, &reg) == -1)
+		return -errno;
+	return 0;
+}
+
+static void rwprotect_range(int uffd, __u64 start, __u64 len, bool protect)
+{
+	struct uffdio_rwprotect rwp = {
+		.range = { .start = start, .len = len },
+		.mode = protect ? UFFDIO_RWPROTECT_MODE_RWP : 0,
+	};
+
+	if (ioctl(uffd, UFFDIO_RWPROTECT, &rwp))
+		err("UFFDIO_RWPROTECT failed");
+}
+
+static void set_async_mode(int uffd, bool enable)
+{
+	struct uffdio_set_mode mode = { };
+
+	if (enable)
+		mode.enable = UFFD_FEATURE_RWP_ASYNC;
+	else
+		mode.disable = UFFD_FEATURE_RWP_ASYNC;
+
+	if (ioctl(uffd, UFFDIO_SET_MODE, &mode))
+		err("UFFDIO_SET_MODE failed");
+}
+
+/*
+ * Test async RWP faults on anonymous memory.
+ * Populate pages, register MODE_RWP with RWP_ASYNC,
+ * RW-protect, re-access, verify content preserved and no faults delivered.
+ */
+static void uffd_rwp_async_test(uffd_global_test_opts_t *gopts,
+				uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	unsigned long p;
+
+	/* Populate all pages with known content */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size);
+
+	/* Register MODE_RWP */
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+
+	/* RW-protect all pages (sets protnone) */
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	/* Access all pages — should auto-resolve, no faults */
+	for (p = 0; p < nr_pages; p++) {
+		unsigned char *page = (unsigned char *)gopts->area_dst +
+				      p * page_size;
+		unsigned char expected = p % 255 + 1;
+
+		if (page[0] != expected) {
+			uffd_test_fail("page %lu content mismatch: %u != %u",
+				       p, page[0], expected);
+			return;
+		}
+	}
+
+	uffd_test_pass();
+}
+
+/*
+ * Fault handler for RWP — unprotect the page via UFFDIO_RWPROTECT.
+ */
+static void uffd_handle_rwp_fault(uffd_global_test_opts_t *gopts,
+				  struct uffd_msg *msg,
+				  struct uffd_args *uargs)
+{
+	if (!(msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_RWP))
+		err("expected RWP fault, got 0x%llx",
+		    msg->arg.pagefault.flags);
+
+	rwprotect_range(gopts->uffd, msg->arg.pagefault.address,
+			gopts->page_size, false);
+	uargs->minor_faults++;
+}
+
+/*
+ * Test sync RWP faults on anonymous memory.
+ * Populate pages, register MODE_RWP (sync), RW-protect,
+ * access from worker thread, verify fault delivered, UFFDIO_RWPROTECT resolves.
+ */
+static void uffd_rwp_sync_test(uffd_global_test_opts_t *gopts,
+			       uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	pthread_t uffd_mon;
+	struct uffd_args uargs = { };
+	bool failed = false;
+	char c = '\0';
+	unsigned long p;
+
+	uargs.gopts = gopts;
+	uargs.handle_fault = uffd_handle_rwp_fault;
+
+	/* Populate all pages */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size);
+
+	/* Register MODE_RWP */
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+
+	/* RW-protect all pages */
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	/* Start fault handler thread */
+	if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &uargs))
+		err("uffd_poll_thread create");
+
+	/* Access all pages — triggers sync RWP faults, handler unprotects */
+	for (p = 0; p < nr_pages; p++) {
+		unsigned char *page = (unsigned char *)gopts->area_dst +
+				      p * page_size;
+
+		if (page[0] != (p % 255 + 1)) {
+			uffd_test_fail("page %lu content mismatch", p);
+			failed = true;
+			goto out;
+		}
+	}
+
+out:
+	/*
+	 * Stop the handler before reading minor_faults: the last fault
+	 * resolution rwprotect_range()s before incrementing the counter,
+	 * so the main thread can race ahead of the increment.
+	 */
+	if (write(gopts->pipefd[1], &c, sizeof(c)) != sizeof(c))
+		err("pipe write");
+	if (pthread_join(uffd_mon, NULL))
+		err("join() failed");
+
+	if (failed)
+		return;
+	if (uargs.minor_faults == 0)
+		uffd_test_fail("expected RWP faults, got 0");
+	else
+		uffd_test_pass();
+}
+
+/*
+ * Test PAGEMAP_SCAN detection of RW-protected (cold) pages.
+ */
+static void uffd_rwp_pagemap_test(uffd_global_test_opts_t *gopts,
+				  uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	unsigned long p;
+	struct page_region regions[16];
+	struct pm_scan_arg pm_arg;
+	int pagemap_fd;
+	long ret;
+
+	/* Need at least 4 pages */
+	if (nr_pages < 4) {
+		uffd_test_skip("need at least 4 pages");
+		return;
+	}
+
+	/* Populate all pages */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, 0xab, page_size);
+
+	/* Register and RW-protect */
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	/* Touch first half of pages to re-activate them (async auto-resolve) */
+	for (p = 0; p < nr_pages / 2; p++) {
+		volatile char *page = gopts->area_dst + p * page_size;
+		(void)*page;
+	}
+
+	/* Scan for cold (still RW-protected) pages */
+	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
+	if (pagemap_fd < 0)
+		err("open pagemap");
+
+	/*
+	 * PAGE_IS_ACCESSED is set once the uffd-wp bit has been cleared
+	 * (access happened, or the user resolved). Invert it to select
+	 * still-protected (cold) pages.
+	 */
+	memset(&pm_arg, 0, sizeof(pm_arg));
+	pm_arg.size = sizeof(pm_arg);
+	pm_arg.start = (uint64_t)gopts->area_dst;
+	pm_arg.end = (uint64_t)gopts->area_dst + nr_pages * page_size;
+	pm_arg.vec = (uint64_t)regions;
+	pm_arg.vec_len = 16;
+	pm_arg.category_mask = PAGE_IS_ACCESSED;
+	pm_arg.category_inverted = PAGE_IS_ACCESSED;
+	pm_arg.return_mask = PAGE_IS_ACCESSED;
+
+	ret = ioctl(pagemap_fd, PAGEMAP_SCAN, &pm_arg);
+	close(pagemap_fd);
+
+	if (ret < 0) {
+		uffd_test_fail("PAGEMAP_SCAN failed: %s", strerror(errno));
+		return;
+	}
+
+	/*
+	 * The second half of pages should be reported as RW-protected.
+	 * They may be coalesced into one region.
+ */
+	if (ret < 1) {
+		uffd_test_fail("expected cold pages, got %ld regions", ret);
+		return;
+	}
+
+	/* Verify the cold region covers the second half */
+	uint64_t cold_start = regions[0].start;
+	uint64_t expected_start = (uint64_t)gopts->area_dst +
+				  (nr_pages / 2) * page_size;
+
+	if (cold_start != expected_start) {
+		uffd_test_fail("cold region starts at 0x%lx, expected 0x%lx",
+			       (unsigned long)cold_start,
+			       (unsigned long)expected_start);
+		return;
+	}
+
+	uffd_test_pass();
+}
+
+/*
+ * Test that RWP protection survives a mprotect(PROT_NONE) ->
+ * mprotect(PROT_READ|PROT_WRITE) round-trip. The uffd-wp bit on a
+ * VM_UFFD_RWP VMA must continue to carry PROT_NONE semantics after
+ * mprotect() changes the base protection; otherwise accesses would
+ * silently succeed and the pagemap bit would stick without a fault
+ * ever clearing it.
+ */
+static void uffd_rwp_mprotect_test(uffd_global_test_opts_t *gopts,
+				   uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	unsigned long p;
+	struct page_region regions[16];
+	struct pm_scan_arg pm_arg;
+	int pagemap_fd;
+	long ret;
+
+	/* Populate all pages */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, 0xab, page_size);
+
+	/* Register and RW-protect the whole range */
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	/* Round-trip mprotect(): PROT_NONE -> PROT_READ|PROT_WRITE */
+	if (mprotect(gopts->area_dst, nr_pages * page_size, PROT_NONE))
+		err("mprotect() PROT_NONE");
+	if (mprotect(gopts->area_dst, nr_pages * page_size,
+		     PROT_READ | PROT_WRITE))
+		err("mprotect() PROT_READ|PROT_WRITE");
+
+	/* Touch every page. Async RWP must auto-resolve each fault. */
+	for (p = 0; p < nr_pages; p++) {
+		volatile char *page = gopts->area_dst + p * page_size;
+		(void)*page;
+	}
+
+	/*
+	 * After touching, no page should remain RW-protected. A stuck
+	 * uffd-wp bit would mean mprotect() silently dropped PROT_NONE and
+	 * the access never faulted.
+	 */
+	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
+	if (pagemap_fd < 0)
+		err("open pagemap");
+
+	memset(&pm_arg, 0, sizeof(pm_arg));
+	pm_arg.size = sizeof(pm_arg);
+	pm_arg.start = (uint64_t)gopts->area_dst;
+	pm_arg.end = (uint64_t)gopts->area_dst + nr_pages * page_size;
+	pm_arg.vec = (uint64_t)regions;
+	pm_arg.vec_len = 16;
+	pm_arg.category_mask = PAGE_IS_ACCESSED;
+	pm_arg.category_inverted = PAGE_IS_ACCESSED;
+	pm_arg.return_mask = PAGE_IS_ACCESSED;
+
+	ret = ioctl(pagemap_fd, PAGEMAP_SCAN, &pm_arg);
+	close(pagemap_fd);
+
+	if (ret < 0) {
+		uffd_test_fail("PAGEMAP_SCAN failed: %s", strerror(errno));
+		return;
+	}
+	if (ret != 0) {
+		uffd_test_fail("expected no cold pages after mprotect()+touch, got %ld regions",
+			       ret);
+		return;
+	}
+
+	uffd_test_pass();
+}
+
+/*
+ * Test that GUP resolves through protnone PTEs (async mode).
+ * RW-protect pages, then use a pipe to exercise GUP on the RW-protected
+ * memory. write() from RW-protected pages triggers GUP which must fault
+ * through the protnone PTE.
+ */
+static void uffd_rwp_gup_test(uffd_global_test_opts_t *gopts,
+			      uffd_test_args_t *args)
+{
+	unsigned long page_size = gopts->page_size;
+	char *buf;
+	int pipefd[2];
+
+	buf = malloc(page_size);
+	if (!buf)
+		err("malloc");
+
+	/* Populate first page with known content */
+	memset(gopts->area_dst, 0xCD, page_size);
+
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst, page_size))
+		err("register failure");
+
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, page_size, true);
+
+	if (pipe(pipefd))
+		err("pipe");
+
+	/*
+	 * write() from the RW-protected page into the pipe. This triggers
+	 * GUP on the protnone PTE; in async mode the kernel auto-restores
+	 * permissions and GUP succeeds. One byte is enough to exercise
+	 * the GUP path and avoids any concern about pipe buffer sizing on
+	 * large-page archs.
+	 */
+	if (write(pipefd[1], gopts->area_dst, 1) != 1) {
+		uffd_test_fail("write from RW-protected page failed: %s",
+			       strerror(errno));
+		goto out;
+	}
+
+	if (read(pipefd[0], buf, 1) != 1) {
+		uffd_test_fail("read from pipe failed");
+		goto out;
+	}
+
+	if (buf[0] != (char)0xCD) {
+		uffd_test_fail("content mismatch: got 0x%02x, expected 0xCD",
+			       (unsigned char)buf[0]);
+		goto out;
+	}
+
+	uffd_test_pass();
+out:
+	close(pipefd[0]);
+	close(pipefd[1]);
+	free(buf);
+}
+
+/*
+ * Test runtime toggle between async and sync modes.
+ * Start in async mode (detection), flip to sync (eviction), verify faults
+ * block, resolve them, flip back to async.
+ */
+static void uffd_rwp_async_toggle_test(uffd_global_test_opts_t *gopts,
+				       uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	struct uffd_args uargs = { };
+	pthread_t uffd_mon;
+	bool started = false;
+	char c = '\0';
+	unsigned long p;
+
+	uargs.gopts = gopts;
+	uargs.handle_fault = uffd_handle_rwp_fault;
+
+	/* Populate */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size);
+
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+
+	/* Phase 1: async detection — RW-protect, access first half */
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	for (p = 0; p < nr_pages / 2; p++) {
+		volatile char *page = gopts->area_dst + p * page_size;
+		(void)*page;	/* auto-resolves in async mode */
+	}
+
+	/* Phase 2: flip to sync for eviction */
+	set_async_mode(gopts->uffd, false);
+
+	/* Start handler — will receive faults for cold pages */
+	if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &uargs))
+		err("uffd_poll_thread create");
+	started = true;
+
+	/* Access second half (cold pages) — should trigger sync faults */
+	for (p = nr_pages / 2; p < nr_pages; p++) {
+		unsigned char *page = (unsigned char *)gopts->area_dst +
+				      p * page_size;
+		if (page[0] != (p % 255 + 1)) {
+			uffd_test_fail("page %lu content mismatch", p);
+			goto out;
+		}
+	}
+
+	/*
+	 * Stop the handler before reading minor_faults: the last fault
+	 * resolution rwprotect_range()s before incrementing the counter,
+	 * so the main thread can race ahead of the increment. Stopping
+	 * here also makes Phase 3 a clean async-only test -- with the
+	 * handler still running it would silently resolve any sync fault
+	 * the kernel erroneously delivers, masking a regression.
+	 */
+	if (write(gopts->pipefd[1], &c, sizeof(c)) != sizeof(c))
+		err("pipe write");
+	if (pthread_join(uffd_mon, NULL))
+		err("join() failed");
+	started = false;
+
+	if (uargs.minor_faults == 0) {
+		uffd_test_fail("expected sync faults, got 0");
+		goto out;
+	}
+
+	/* Phase 3: flip back to async */
+	set_async_mode(gopts->uffd, true);
+
+	/* RW-protect and access again — should auto-resolve */
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	for (p = 0; p < nr_pages; p++) {
+		volatile char *page = gopts->area_dst + p * page_size;
+		(void)*page;
+	}
+
+	uffd_test_pass();
+out:
+	if (started) {
+		if (write(gopts->pipefd[1], &c, sizeof(c)) != sizeof(c))
+			err("pipe write");
+		if (pthread_join(uffd_mon, NULL))
+			err("join() failed");
+	}
+}
+
+/*
+ * Test that RW-protected pages become accessible after closing uffd.
+ */
+static void uffd_rwp_close_test(uffd_global_test_opts_t *gopts,
+				uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	unsigned long p;
+
+	/* Populate */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size);
+
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	/* Close uffd — should restore protnone PTEs */
+	close(gopts->uffd);
+	gopts->uffd = -1;
+
+	/* All pages should be accessible with original content */
+	for (p = 0; p < nr_pages; p++) {
+		unsigned char *page = (unsigned char *)gopts->area_dst +
+				      p * page_size;
+		unsigned char expected = p % 255 + 1;
+
+		if (page[0] != expected) {
+			uffd_test_fail("page %lu not accessible after close", p);
+			return;
+		}
+	}
+
+	uffd_test_pass();
+}
+
+/*
+ * Test that RWP protection is preserved across fork() when
+ * UFFD_FEATURE_EVENT_FORK is enabled. Without preservation, the child's
+ * PTEs would lose the uffd-wp marker and RWP-protected accesses would
+ * silently fall through to do_numa_page().
+ */
+static void uffd_rwp_fork_test(uffd_global_test_opts_t *gopts,
+			       uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	int pagemap_fd;
+	uint64_t value;
+
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failed");
+
+	/* Populate + RWP-protect */
+	*gopts->area_dst = 1;
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			page_size, true);
+
+	/* Parent: verify uffd-wp bit is set before fork */
+	pagemap_fd = pagemap_open();
+	value = pagemap_get_entry(pagemap_fd, gopts->area_dst);
+	pagemap_check_wp(value, true);
+
+	/*
+	 * Fork with EVENT_FORK: child inherits VM_UFFD_RWP. Child reads
+	 * its own pagemap and must still see the uffd-wp bit set.
+	 */
+	if (pagemap_test_fork(gopts, true, false)) {
+		uffd_test_fail("RWP marker lost in child after fork");
+		goto out;
+	}
+
+	uffd_test_pass();
+out:
+	close(pagemap_fd);
+}
+
+/*
+ * Test that RWP protection on a pinned anon page is preserved across fork().
+ * Pinning forces copy_present_page() in the child path, which must restore
+ * PAGE_NONE on top of the uffd bit. Using async mode, a read in the child
+ * auto-resolves if — and only if — the PTE was actually protnone+uffd; the
+ * cleared uffd bit afterward proves the fault path ran.
+ */
+static void uffd_rwp_fork_pin_test(uffd_global_test_opts_t *gopts,
+				   uffd_test_args_t *args)
+{
+	unsigned long page_size = gopts->page_size;
+	fork_event_args fevent_args = { .gopts = gopts, .child_uffd = -1 };
+	pin_args pin_args = {};
+	int pagemap_fd, status;
+	pthread_t fevent_thread;
+	uint64_t value;
+	pid_t child;
+
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst, page_size))
+		err("register failed");
+
+	/* Populate. */
+	*gopts->area_dst = 1;
+
+	/* RO-longterm pin so fork() takes copy_present_page() for this PTE. */
+	if (pin_pages(&pin_args, gopts->area_dst, page_size)) {
+		uffd_test_skip("Possibly CONFIG_GUP_TEST missing or unprivileged");
+		uffd_unregister(gopts->uffd, gopts->area_dst, page_size);
+		return;
+	}
+
+	/* RWP-protect: PTE is now PAGE_NONE + uffd bit. */
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, page_size, true);
+
+	pagemap_fd = pagemap_open();
+	value = pagemap_get_entry(pagemap_fd, gopts->area_dst);
+	pagemap_check_wp(value, true);
+
+	/*
+	 * UFFD_FEATURE_EVENT_FORK is required so the child inherits
+	 * VM_UFFD_RWP and the marker; without it dup_userfaultfd() resets
+	 * the child VMA and the test would pass for the wrong reason.
+	 * dup_userfaultfd() blocks until the EVENT_FORK message is consumed,
+	 * so spawn a reader before the fork().
+ */
+	gopts->ready_for_fork = false;
+	if (pthread_create(&fevent_thread, NULL, fork_event_consumer,
+			   &fevent_args))
+		err("pthread_create() for fork event consumer");
+	while (!gopts->ready_for_fork)
+		; /* Wait for consumer to start polling. */
+
+	child = fork();
+	if (child < 0)
+		err("fork");
+	if (child == 0) {
+		volatile char c;
+		int cfd;
+
+		/*
+		 * Read the pinned page. Only reaches the fault path if the
+		 * child PTE is protnone + uffd; async mode auto-resolves and
+		 * clears the uffd bit. If copy_present_page() dropped
+		 * PAGE_NONE, the read would silently succeed and the bit
+		 * would still be set.
+		 */
+		c = *(volatile char *)gopts->area_dst;
+		(void)c;
+
+		cfd = pagemap_open();
+		value = pagemap_get_entry(cfd, gopts->area_dst);
+		close(cfd);
+		_exit((value & PM_UFFD_WP) ? 1 : 0);
+	}
+	if (waitpid(child, &status, 0) < 0)
+		err("waitpid");
+	if (pthread_join(fevent_thread, NULL))
+		err("pthread_join() for fork event consumer");
+	if (fevent_args.child_uffd >= 0)
+		close(fevent_args.child_uffd);
+
+	unpin_pages(&pin_args);
+	close(pagemap_fd);
+	if (uffd_unregister(gopts->uffd, gopts->area_dst, page_size))
+		err("unregister failed");
+
+	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
+		uffd_test_fail("RWP not enforced in child after pinned fork");
+		return;
+	}
+
+	uffd_test_pass();
+}
+
+/*
+ * WP and RWP share the uffd-wp PTE bit and cannot coexist in the same VMA.
+ * Registration requesting both modes must be rejected.
+ */
+static void uffd_rwp_wp_exclusive_test(uffd_global_test_opts_t *gopts,
+				       uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	struct uffdio_register reg = { };
+
+	reg.range.start = (unsigned long)gopts->area_dst;
+	reg.range.len = nr_pages * page_size;
+	reg.mode = UFFDIO_REGISTER_MODE_WP | UFFDIO_REGISTER_MODE_RWP;
+
+	if (ioctl(gopts->uffd, UFFDIO_REGISTER, &reg) == 0) {
+		uffd_test_fail("register with WP|RWP unexpectedly succeeded");
+		return;
+	}
+	if (errno != EINVAL) {
+		uffd_test_fail("register with WP|RWP: expected EINVAL, got %d",
+			       errno);
+		return;
+	}
+	uffd_test_pass();
+}
+
 static sigjmp_buf jbuf, *sigbuf;
 
 static void sighndl(int sig, siginfo_t *siginfo, void *ptr)
@@ -1625,6 +2328,77 @@ uffd_test_case_t uffd_tests[] = {
 		/* We can't test MADV_COLLAPSE, so try our luck */
 		.uffd_feature_required = UFFD_FEATURE_MINOR_SHMEM,
 	},
+	{
+		.name = "rwp-async",
+		.uffd_fn = uffd_rwp_async_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC,
+	},
+	{
+		.name = "rwp-sync",
+		.uffd_fn = uffd_rwp_sync_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required = UFFD_FEATURE_RWP,
+	},
+	{
+		.name = "rwp-pagemap",
+		.uffd_fn = uffd_rwp_pagemap_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC,
+	},
+	{
+		.name = "rwp-mprotect",
+		.uffd_fn = uffd_rwp_mprotect_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC,
+	},
+	{
+		.name = "rwp-gup",
+		.uffd_fn = uffd_rwp_gup_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC,
+	},
+	{
+		.name = "rwp-async-toggle",
+		.uffd_fn = uffd_rwp_async_toggle_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC,
+	},
+	{
+		.name = "rwp-close",
+		.uffd_fn = uffd_rwp_close_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required = UFFD_FEATURE_RWP,
+	},
+	{
+		.name = "rwp-fork",
+		.uffd_fn = uffd_rwp_fork_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_EVENT_FORK,
+	},
+	{
+		.name = "rwp-fork-pin",
+		.uffd_fn = uffd_rwp_fork_pin_test,
+		.mem_targets = MEM_ANON,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC |
+			UFFD_FEATURE_EVENT_FORK,
+	},
+	{
+		.name = "rwp-wp-exclusive",
+		.uffd_fn = uffd_rwp_wp_exclusive_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP |
+			UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+			UFFD_FEATURE_WP_HUGETLBFS_SHMEM,
+	},
 	{
 		.name = "sigbus",
 		.uffd_fn = uffd_sigbus_test,
-- 
2.51.2