From: "Kiryl Shutsemau (Meta)"
To: akpm@linux-foundation.org, rppt@kernel.org, peterx@redhat.com, david@kernel.org
Cc: ljs@kernel.org, surenb@google.com, vbabka@kernel.org, Liam.Howlett@oracle.com, ziy@nvidia.com,
corbet@lwn.net, skhan@linuxfoundation.org, seanjc@google.com, pbonzini@redhat.com, jthoughton@google.com, aarcange@redhat.com, sj@kernel.org, usama.arif@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, kvm@vger.kernel.org, kernel-team@meta.com, "Kiryl Shutsemau (Meta)"
Subject: [PATCH v2 13/14] selftests/mm: add userfaultfd RWP tests
Date: Fri, 8 May 2026 16:55:25 +0100
X-Mailing-List: linux-kselftest@vger.kernel.org

Coverage for UFFDIO_REGISTER_MODE_RWP and UFFDIO_RWPROTECT:

  rwp-async         async mode: touch pages, verify permissions are
                    auto-restored without a message
  rwp-sync          sync mode: access blocks, handler resolves via
                    UFFDIO_RWPROTECT
  rwp-pagemap       PAGEMAP_SCAN reports still-cold pages via inverted
                    PAGE_IS_ACCESSED
  rwp-mprotect      RWP survives mprotect(PROT_NONE) ->
                    mprotect(PROT_READ|PROT_WRITE) round-trip
  rwp-gup           GUP walks through a protnone RWP PTE (pipe
                    write/read drives the GUP path)
  rwp-async-toggle  UFFDIO_SET_MODE flips between sync and async
                    without re-registering
  rwp-close         closing the uffd restores page permissions
  rwp-fork          RWP survives fork() with EVENT_FORK; child's PTEs
                    keep the uffd bit
  rwp-fork-pin      RWP survives fork() on an RO-longterm-pinned anon
                    page (forces copy_present_page()); child read
                    auto-resolves and clears the bit, proving PAGE_NONE
                    was in place
  rwp-wp-exclusive  register with MODE_WP|MODE_RWP returns -EINVAL

All tests run against anon, shmem, shmem-private, hugetlb, and
hugetlb-private memory, except rwp-fork-pin, which is anon-only:
copy_present_page() is the private-anon pinned-exclusive fork path.

Signed-off-by: Kiryl Shutsemau Assisted-by: Claude:claude-opus-4-6 --- tools/testing/selftests/mm/uffd-unit-tests.c | 774 +++++++++++++++++++ 1 file changed, 774 insertions(+) diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c index 6f5e404a446c..a35fb677e4cc 100644 --- a/tools/testing/selftests/mm/uffd-unit-tests.c +++ b/tools/testing/selftests/mm/uffd-unit-tests.c @@ -7,6 +7,7 @@ #include "uffd-common.h" +#include #include "../../../../mm/gup_test.h" #ifdef __NR_userfaultfd @@ -167,6 +168,23 @@ static int test_uffd_api(bool use_dev) goto out; } + /* Verify returned fd-level ioctls bitmask */ + { + uint64_t expected_ioctls = + BIT_ULL(_UFFDIO_REGISTER) | + BIT_ULL(_UFFDIO_UNREGISTER) | + BIT_ULL(_UFFDIO_API) | + BIT_ULL(_UFFDIO_SET_MODE); + + if ((uffdio_api.ioctls & expected_ioctls) != expected_ioctls) { + uffd_test_fail("UFFDIO_API missing expected ioctls: " + "got=0x%"PRIx64", expected=0x%"PRIx64, + (uint64_t)uffdio_api.ioctls, + expected_ioctls); + goto out; + } + } + /* Test double requests of UFFDIO_API with a random feature set */ uffdio_api.features = BIT_ULL(0); if (ioctl(uffd, UFFDIO_API, &uffdio_api) == 0) { @@ -623,6 +641,691 @@ void uffd_minor_collapse_test(uffd_global_test_opts_t *gopts, uffd_test_args_t * uffd_minor_test_common(gopts, true, false); } +static int uffd_register_rwp(int uffd, void *addr, uint64_t len) +{ + struct uffdio_register reg = { + .range = { .start = (unsigned long)addr, .len = len }, + .mode = UFFDIO_REGISTER_MODE_RWP, + }; + + if (ioctl(uffd, UFFDIO_REGISTER, &reg) == -1) + return -errno; + return 0; +} + +static void rwprotect_range(int uffd, __u64 start, __u64 len, bool protect) +{ + struct uffdio_rwprotect rwp = { + .range = { .start = start, .len = len }, + .mode = protect ? 
UFFDIO_RWPROTECT_MODE_RWP : 0, + }; + + if (ioctl(uffd, UFFDIO_RWPROTECT, &rwp)) + err("UFFDIO_RWPROTECT failed"); +} + +static void set_async_mode(int uffd, bool enable) +{ + struct uffdio_set_mode mode = { }; + + if (enable) + mode.enable = UFFD_FEATURE_RWP_ASYNC; + else + mode.disable = UFFD_FEATURE_RWP_ASYNC; + + if (ioctl(uffd, UFFDIO_SET_MODE, &mode)) + err("UFFDIO_SET_MODE failed"); +} + +/* + * Test async RWP faults on anonymous memory. + * Populate pages, register MODE_RWP with RWP_ASYNC, + * RW-protect, re-access, verify content preserved and no faults delivered. + */ +static void uffd_rwp_async_test(uffd_global_test_opts_t *gopts, + uffd_test_args_t *args) +{ + unsigned long nr_pages = gopts->nr_pages; + unsigned long page_size = gopts->page_size; + unsigned long p; + + /* Populate all pages with known content */ + for (p = 0; p < nr_pages; p++) + memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size); + + /* Register MODE_RWP */ + if (uffd_register_rwp(gopts->uffd, gopts->area_dst, + nr_pages * page_size)) + err("register failure"); + + /* RW-protect all pages (sets protnone) */ + rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, + nr_pages * page_size, true); + + /* Access all pages — should auto-resolve, no faults */ + for (p = 0; p < nr_pages; p++) { + unsigned char *page = (unsigned char *)gopts->area_dst + + p * page_size; + unsigned char expected = p % 255 + 1; + + if (page[0] != expected) { + uffd_test_fail("page %lu content mismatch: %u != %u", + p, page[0], expected); + return; + } + } + + uffd_test_pass(); +} + +/* + * Fault handler for RWP — unprotect the page via UFFDIO_RWPROTECT. 
+ */ +static void uffd_handle_rwp_fault(uffd_global_test_opts_t *gopts, + struct uffd_msg *msg, + struct uffd_args *uargs) +{ + if (!(msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_RWP)) + err("expected RWP fault, got 0x%llx", + msg->arg.pagefault.flags); + + rwprotect_range(gopts->uffd, msg->arg.pagefault.address, + gopts->page_size, false); + uargs->minor_faults++; +} + +/* + * Test sync RWP faults on anonymous memory. + * Populate pages, register MODE_RWP (sync), RW-protect, + * access from worker thread, verify fault delivered, UFFDIO_RWPROTECT resolves. + */ +static void uffd_rwp_sync_test(uffd_global_test_opts_t *gopts, + uffd_test_args_t *args) +{ + unsigned long nr_pages = gopts->nr_pages; + unsigned long page_size = gopts->page_size; + pthread_t uffd_mon; + struct uffd_args uargs = { }; + bool failed = false; + char c = '\0'; + unsigned long p; + + uargs.gopts = gopts; + uargs.handle_fault = uffd_handle_rwp_fault; + + /* Populate all pages */ + for (p = 0; p < nr_pages; p++) + memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size); + + /* Register MODE_RWP */ + if (uffd_register_rwp(gopts->uffd, gopts->area_dst, + nr_pages * page_size)) + err("register failure"); + + /* RW-protect all pages */ + rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, + nr_pages * page_size, true); + + /* Start fault handler thread */ + if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &uargs)) + err("uffd_poll_thread create"); + + /* Access all pages — triggers sync RWP faults, handler unprotects */ + for (p = 0; p < nr_pages; p++) { + unsigned char *page = (unsigned char *)gopts->area_dst + + p * page_size; + + if (page[0] != (p % 255 + 1)) { + uffd_test_fail("page %lu content mismatch", p); + failed = true; + goto out; + } + } + +out: + /* + * Stop the handler before reading minor_faults: the last fault + * resolution rwprotect_range()s before incrementing the counter, + * so the main thread can race ahead of the increment. 
+ */ + if (write(gopts->pipefd[1], &c, sizeof(c)) != sizeof(c)) + err("pipe write"); + if (pthread_join(uffd_mon, NULL)) + err("join() failed"); + + if (failed) + return; + if (uargs.minor_faults == 0) + uffd_test_fail("expected RWP faults, got 0"); + else + uffd_test_pass(); +} + +/* + * Test PAGEMAP_SCAN detection of RW-protected (cold) pages. + */ +static void uffd_rwp_pagemap_test(uffd_global_test_opts_t *gopts, + uffd_test_args_t *args) +{ + unsigned long nr_pages = gopts->nr_pages; + unsigned long page_size = gopts->page_size; + unsigned long p; + struct page_region regions[16]; + struct pm_scan_arg pm_arg; + int pagemap_fd; + long ret; + + /* Need at least 4 pages */ + if (nr_pages < 4) { + uffd_test_skip("need at least 4 pages"); + return; + } + + /* Populate all pages */ + for (p = 0; p < nr_pages; p++) + memset(gopts->area_dst + p * page_size, 0xab, page_size); + + /* Register and RW-protect */ + if (uffd_register_rwp(gopts->uffd, gopts->area_dst, + nr_pages * page_size)) + err("register failure"); + + rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, + nr_pages * page_size, true); + + /* Touch first half of pages to re-activate them (async auto-resolve) */ + for (p = 0; p < nr_pages / 2; p++) { + volatile char *page = gopts->area_dst + p * page_size; + (void)*page; + } + + /* Scan for cold (still RW-protected) pages */ + pagemap_fd = open("/proc/self/pagemap", O_RDONLY); + if (pagemap_fd < 0) + err("open pagemap"); + + /* + * PAGE_IS_ACCESSED is set once the uffd-wp bit has been cleared + * (access happened, or the user resolved). Invert it to select + * still-protected (cold) pages. 
+ */ + memset(&pm_arg, 0, sizeof(pm_arg)); + pm_arg.size = sizeof(pm_arg); + pm_arg.start = (uint64_t)gopts->area_dst; + pm_arg.end = (uint64_t)gopts->area_dst + nr_pages * page_size; + pm_arg.vec = (uint64_t)regions; + pm_arg.vec_len = 16; + pm_arg.category_mask = PAGE_IS_ACCESSED; + pm_arg.category_inverted = PAGE_IS_ACCESSED; + pm_arg.return_mask = PAGE_IS_ACCESSED; + + ret = ioctl(pagemap_fd, PAGEMAP_SCAN, &pm_arg); + close(pagemap_fd); + + if (ret < 0) { + uffd_test_fail("PAGEMAP_SCAN failed: %s", strerror(errno)); + return; + } + + /* + * The second half of pages should be reported as RW-protected. + * They may be coalesced into one region. + */ + if (ret < 1) { + uffd_test_fail("expected cold pages, got %ld regions", ret); + return; + } + + /* Verify the cold region covers the second half */ + uint64_t cold_start = regions[0].start; + uint64_t expected_start = (uint64_t)gopts->area_dst + + (nr_pages / 2) * page_size; + + if (cold_start != expected_start) { + uffd_test_fail("cold region starts at 0x%lx, expected 0x%lx", + (unsigned long)cold_start, + (unsigned long)expected_start); + return; + } + + uffd_test_pass(); +} + +/* + * Test that RWP protection survives a mprotect(PROT_NONE) -> + * mprotect(PROT_READ|PROT_WRITE) round-trip. The uffd-wp bit on a + * VM_UFFD_RWP VMA must continue to carry PROT_NONE semantics after + * mprotect() changes the base protection; otherwise accesses would + * silently succeed and the pagemap bit would stick without a fault + * ever clearing it. 
+ */ +static void uffd_rwp_mprotect_test(uffd_global_test_opts_t *gopts, + uffd_test_args_t *args) +{ + unsigned long nr_pages = gopts->nr_pages; + unsigned long page_size = gopts->page_size; + unsigned long p; + struct page_region regions[16]; + struct pm_scan_arg pm_arg; + int pagemap_fd; + long ret; + + /* Populate all pages */ + for (p = 0; p < nr_pages; p++) + memset(gopts->area_dst + p * page_size, 0xab, page_size); + + /* Register and RW-protect the whole range */ + if (uffd_register_rwp(gopts->uffd, gopts->area_dst, + nr_pages * page_size)) + err("register failure"); + rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, + nr_pages * page_size, true); + + /* Round-trip mprotect(): PROT_NONE -> PROT_READ|PROT_WRITE */ + if (mprotect(gopts->area_dst, nr_pages * page_size, PROT_NONE)) + err("mprotect() PROT_NONE"); + if (mprotect(gopts->area_dst, nr_pages * page_size, + PROT_READ | PROT_WRITE)) + err("mprotect() PROT_READ|PROT_WRITE"); + + /* Touch every page. Async RWP must auto-resolve each fault. */ + for (p = 0; p < nr_pages; p++) { + volatile char *page = gopts->area_dst + p * page_size; + (void)*page; + } + + /* + * After touching, no page should remain RW-protected. A stuck + * uffd-wp bit would mean mprotect() silently dropped PROT_NONE and + * the access never faulted. 
+ */ + pagemap_fd = open("/proc/self/pagemap", O_RDONLY); + if (pagemap_fd < 0) + err("open pagemap"); + + memset(&pm_arg, 0, sizeof(pm_arg)); + pm_arg.size = sizeof(pm_arg); + pm_arg.start = (uint64_t)gopts->area_dst; + pm_arg.end = (uint64_t)gopts->area_dst + nr_pages * page_size; + pm_arg.vec = (uint64_t)regions; + pm_arg.vec_len = 16; + pm_arg.category_mask = PAGE_IS_ACCESSED; + pm_arg.category_inverted = PAGE_IS_ACCESSED; + pm_arg.return_mask = PAGE_IS_ACCESSED; + + ret = ioctl(pagemap_fd, PAGEMAP_SCAN, &pm_arg); + close(pagemap_fd); + + if (ret < 0) { + uffd_test_fail("PAGEMAP_SCAN failed: %s", strerror(errno)); + return; + } + if (ret != 0) { + uffd_test_fail("expected no cold pages after mprotect()+touch, got %ld regions", + ret); + return; + } + + uffd_test_pass(); +} + +/* + * Test that GUP resolves through protnone PTEs (async mode). + * RW-protect pages, then use a pipe to exercise GUP on the RW-protected + * memory. write() from RW-protected pages triggers GUP which must fault + * through the protnone PTE. + */ +static void uffd_rwp_gup_test(uffd_global_test_opts_t *gopts, + uffd_test_args_t *args) +{ + unsigned long page_size = gopts->page_size; + char *buf; + int pipefd[2]; + + buf = malloc(page_size); + if (!buf) + err("malloc"); + + /* Populate first page with known content */ + memset(gopts->area_dst, 0xCD, page_size); + + if (uffd_register_rwp(gopts->uffd, gopts->area_dst, page_size)) + err("register failure"); + + rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, page_size, true); + + if (pipe(pipefd)) + err("pipe"); + + /* + * write() from the RW-protected page into the pipe. This triggers + * GUP on the protnone PTE; in async mode the kernel auto-restores + * permissions and GUP succeeds. One byte is enough to exercise + * the GUP path and avoids any concern about pipe buffer sizing on + * large-page archs. 
+ */ + if (write(pipefd[1], gopts->area_dst, 1) != 1) { + uffd_test_fail("write from RW-protected page failed: %s", + strerror(errno)); + goto out; + } + + if (read(pipefd[0], buf, 1) != 1) { + uffd_test_fail("read from pipe failed"); + goto out; + } + + if (buf[0] != (char)0xCD) { + uffd_test_fail("content mismatch: got 0x%02x, expected 0xCD", + (unsigned char)buf[0]); + goto out; + } + + uffd_test_pass(); +out: + close(pipefd[0]); + close(pipefd[1]); + free(buf); +} + +/* + * Test runtime toggle between async and sync modes. + * Start in async mode (detection), flip to sync (eviction), verify faults + * block, resolve them, flip back to async. + */ +static void uffd_rwp_async_toggle_test(uffd_global_test_opts_t *gopts, + uffd_test_args_t *args) +{ + unsigned long nr_pages = gopts->nr_pages; + unsigned long page_size = gopts->page_size; + struct uffd_args uargs = { }; + pthread_t uffd_mon; + bool started = false; + char c = '\0'; + unsigned long p; + + uargs.gopts = gopts; + uargs.handle_fault = uffd_handle_rwp_fault; + + /* Populate */ + for (p = 0; p < nr_pages; p++) + memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size); + + if (uffd_register_rwp(gopts->uffd, gopts->area_dst, + nr_pages * page_size)) + err("register failure"); + + /* Phase 1: async detection — RW-protect, access first half */ + rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, + nr_pages * page_size, true); + + for (p = 0; p < nr_pages / 2; p++) { + volatile char *page = gopts->area_dst + p * page_size; + (void)*page; /* auto-resolves in async mode */ + } + + /* Phase 2: flip to sync for eviction */ + set_async_mode(gopts->uffd, false); + + /* Start handler — will receive faults for cold pages */ + if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &uargs)) + err("uffd_poll_thread create"); + started = true; + + /* Access second half (cold pages) — should trigger sync faults */ + for (p = nr_pages / 2; p < nr_pages; p++) { + unsigned char *page = (unsigned char 
*)gopts->area_dst + + p * page_size; + if (page[0] != (p % 255 + 1)) { + uffd_test_fail("page %lu content mismatch", p); + goto out; + } + } + + /* + * Stop the handler before reading minor_faults: the last fault + * resolution rwprotect_range()s before incrementing the counter, + * so the main thread can race ahead of the increment. Stopping + * here also makes Phase 3 a clean async-only test -- with the + * handler still running it would silently resolve any sync fault + * the kernel erroneously delivers, masking a regression. + */ + if (write(gopts->pipefd[1], &c, sizeof(c)) != sizeof(c)) + err("pipe write"); + if (pthread_join(uffd_mon, NULL)) + err("join() failed"); + started = false; + + if (uargs.minor_faults == 0) { + uffd_test_fail("expected sync faults, got 0"); + goto out; + } + + /* Phase 3: flip back to async */ + set_async_mode(gopts->uffd, true); + + /* RW-protect and access again — should auto-resolve */ + rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, + nr_pages * page_size, true); + + for (p = 0; p < nr_pages; p++) { + volatile char *page = gopts->area_dst + p * page_size; + (void)*page; + } + + uffd_test_pass(); +out: + if (started) { + if (write(gopts->pipefd[1], &c, sizeof(c)) != sizeof(c)) + err("pipe write"); + if (pthread_join(uffd_mon, NULL)) + err("join() failed"); + } +} + +/* + * Test that RW-protected pages become accessible after closing uffd. 
+ */ +static void uffd_rwp_close_test(uffd_global_test_opts_t *gopts, + uffd_test_args_t *args) +{ + unsigned long nr_pages = gopts->nr_pages; + unsigned long page_size = gopts->page_size; + unsigned long p; + + /* Populate */ + for (p = 0; p < nr_pages; p++) + memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size); + + if (uffd_register_rwp(gopts->uffd, gopts->area_dst, + nr_pages * page_size)) + err("register failure"); + + rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, + nr_pages * page_size, true); + + /* Close uffd — should restore protnone PTEs */ + close(gopts->uffd); + gopts->uffd = -1; + + /* All pages should be accessible with original content */ + for (p = 0; p < nr_pages; p++) { + unsigned char *page = (unsigned char *)gopts->area_dst + + p * page_size; + unsigned char expected = p % 255 + 1; + + if (page[0] != expected) { + uffd_test_fail("page %lu not accessible after close", p); + return; + } + } + + uffd_test_pass(); +} + +/* + * Test that RWP protection is preserved across fork() when + * UFFD_FEATURE_EVENT_FORK is enabled. Without preservation, the child's + * PTEs would lose the uffd-wp marker and RWP-protected accesses would + * silently fall through to do_numa_page(). + */ +static void uffd_rwp_fork_test(uffd_global_test_opts_t *gopts, + uffd_test_args_t *args) +{ + unsigned long nr_pages = gopts->nr_pages; + unsigned long page_size = gopts->page_size; + int pagemap_fd; + uint64_t value; + + if (uffd_register_rwp(gopts->uffd, gopts->area_dst, + nr_pages * page_size)) + err("register failed"); + + /* Populate + RWP-protect */ + *gopts->area_dst = 1; + rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, + page_size, true); + + /* Parent: verify uffd-wp bit is set before fork */ + pagemap_fd = pagemap_open(); + value = pagemap_get_entry(pagemap_fd, gopts->area_dst); + pagemap_check_wp(value, true); + + /* + * Fork with EVENT_FORK: child inherits VM_UFFD_RWP. 
Child reads + * its own pagemap and must still see the uffd-wp bit set. + */ + if (pagemap_test_fork(gopts, true, false)) { + uffd_test_fail("RWP marker lost in child after fork"); + goto out; + } + + uffd_test_pass(); +out: + close(pagemap_fd); +} + +/* + * Test that RWP protection on a pinned anon page is preserved across fork(). + * Pinning forces copy_present_page() in the child path, which must restore + * PAGE_NONE on top of the uffd bit. Using async mode, a read in the child + * auto-resolves if — and only if — the PTE was actually protnone+uffd; the + * cleared uffd bit afterward proves the fault path ran. + */ +static void uffd_rwp_fork_pin_test(uffd_global_test_opts_t *gopts, + uffd_test_args_t *args) +{ + unsigned long page_size = gopts->page_size; + fork_event_args fevent_args = { .gopts = gopts, .child_uffd = -1 }; + pin_args pin_args = {}; + int pagemap_fd, status; + pthread_t fevent_thread; + uint64_t value; + pid_t child; + + if (uffd_register_rwp(gopts->uffd, gopts->area_dst, page_size)) + err("register failed"); + + /* Populate. */ + *gopts->area_dst = 1; + + /* RO-longterm pin so fork() takes copy_present_page() for this PTE. */ + if (pin_pages(&pin_args, gopts->area_dst, page_size)) { + uffd_test_skip("Possibly CONFIG_GUP_TEST missing or unprivileged"); + uffd_unregister(gopts->uffd, gopts->area_dst, page_size); + return; + } + + /* RWP-protect: PTE is now PAGE_NONE + uffd bit. */ + rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, page_size, true); + + pagemap_fd = pagemap_open(); + value = pagemap_get_entry(pagemap_fd, gopts->area_dst); + pagemap_check_wp(value, true); + + /* + * UFFD_FEATURE_EVENT_FORK is required so the child inherits + * VM_UFFD_RWP and the marker; without it dup_userfaultfd() resets + * the child VMA and the test would pass for the wrong reason. + * dup_userfaultfd() blocks until the EVENT_FORK message is consumed, + * so spawn a reader before the fork(). 
+ */ + gopts->ready_for_fork = false; + if (pthread_create(&fevent_thread, NULL, fork_event_consumer, + &fevent_args)) + err("pthread_create() for fork event consumer"); + while (!gopts->ready_for_fork) + ; /* Wait for consumer to start polling. */ + + child = fork(); + if (child < 0) + err("fork"); + if (child == 0) { + volatile char c; + int cfd; + + /* + * Read the pinned page. Only reaches the fault path if the + * child PTE is protnone + uffd; async mode auto-resolves and + * clears the uffd bit. If copy_present_page() dropped + * PAGE_NONE, the read would silently succeed and the bit + * would still be set. + */ + c = *(volatile char *)gopts->area_dst; + (void)c; + + cfd = pagemap_open(); + value = pagemap_get_entry(cfd, gopts->area_dst); + close(cfd); + _exit((value & PM_UFFD_WP) ? 1 : 0); + } + if (waitpid(child, &status, 0) < 0) + err("waitpid"); + if (pthread_join(fevent_thread, NULL)) + err("pthread_join() for fork event consumer"); + if (fevent_args.child_uffd >= 0) + close(fevent_args.child_uffd); + + unpin_pages(&pin_args); + close(pagemap_fd); + if (uffd_unregister(gopts->uffd, gopts->area_dst, page_size)) + err("unregister failed"); + + if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) { + uffd_test_fail("RWP not enforced in child after pinned fork"); + return; + } + + uffd_test_pass(); +} + +/* + * WP and RWP share the uffd-wp PTE bit and cannot coexist in the same VMA. + * Registration requesting both modes must be rejected. 
+ */ +static void uffd_rwp_wp_exclusive_test(uffd_global_test_opts_t *gopts, + uffd_test_args_t *args) +{ + unsigned long nr_pages = gopts->nr_pages; + unsigned long page_size = gopts->page_size; + struct uffdio_register reg = { }; + + reg.range.start = (unsigned long)gopts->area_dst; + reg.range.len = nr_pages * page_size; + reg.mode = UFFDIO_REGISTER_MODE_WP | UFFDIO_REGISTER_MODE_RWP; + + if (ioctl(gopts->uffd, UFFDIO_REGISTER, &reg) == 0) { + uffd_test_fail("register with WP|RWP unexpectedly succeeded"); + return; + } + if (errno != EINVAL) { + uffd_test_fail("register with WP|RWP: expected EINVAL, got %d", + errno); + return; + } + uffd_test_pass(); +} + static sigjmp_buf jbuf, *sigbuf; static void sighndl(int sig, siginfo_t *siginfo, void *ptr) @@ -1625,6 +2328,77 @@ uffd_test_case_t uffd_tests[] = { /* We can't test MADV_COLLAPSE, so try our luck */ .uffd_feature_required = UFFD_FEATURE_MINOR_SHMEM, }, + { + .name = "rwp-async", + .uffd_fn = uffd_rwp_async_test, + .mem_targets = MEM_ALL, + .uffd_feature_required = + UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC, + }, + { + .name = "rwp-sync", + .uffd_fn = uffd_rwp_sync_test, + .mem_targets = MEM_ALL, + .uffd_feature_required = UFFD_FEATURE_RWP, + }, + { + .name = "rwp-pagemap", + .uffd_fn = uffd_rwp_pagemap_test, + .mem_targets = MEM_ALL, + .uffd_feature_required = + UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC, + }, + { + .name = "rwp-mprotect", + .uffd_fn = uffd_rwp_mprotect_test, + .mem_targets = MEM_ALL, + .uffd_feature_required = + UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC, + }, + { + .name = "rwp-gup", + .uffd_fn = uffd_rwp_gup_test, + .mem_targets = MEM_ALL, + .uffd_feature_required = + UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC, + }, + { + .name = "rwp-async-toggle", + .uffd_fn = uffd_rwp_async_toggle_test, + .mem_targets = MEM_ALL, + .uffd_feature_required = + UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC, + }, + { + .name = "rwp-close", + .uffd_fn = uffd_rwp_close_test, + .mem_targets = MEM_ALL, + 
.uffd_feature_required = UFFD_FEATURE_RWP, + }, + { + .name = "rwp-fork", + .uffd_fn = uffd_rwp_fork_test, + .mem_targets = MEM_ALL, + .uffd_feature_required = + UFFD_FEATURE_RWP | UFFD_FEATURE_EVENT_FORK, + }, + { + .name = "rwp-fork-pin", + .uffd_fn = uffd_rwp_fork_pin_test, + .mem_targets = MEM_ANON, + .uffd_feature_required = + UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC | + UFFD_FEATURE_EVENT_FORK, + }, + { + .name = "rwp-wp-exclusive", + .uffd_fn = uffd_rwp_wp_exclusive_test, + .mem_targets = MEM_ALL, + .uffd_feature_required = + UFFD_FEATURE_RWP | + UFFD_FEATURE_PAGEFAULT_FLAG_WP | + UFFD_FEATURE_WP_HUGETLBFS_SHMEM, + }, { .name = "sigbus", .uffd_fn = uffd_sigbus_test, -- 2.51.2