From: "Kiryl Shutsemau (Meta)"
To: akpm@linux-foundation.org, rppt@kernel.org, peterx@redhat.com, david@kernel.org
Cc: ljs@kernel.org, surenb@google.com, vbabka@kernel.org, Liam.Howlett@oracle.com, ziy@nvidia.com,
 corbet@lwn.net, skhan@linuxfoundation.org, seanjc@google.com, pbonzini@redhat.com,
 jthoughton@google.com, aarcange@redhat.com, sj@kernel.org, usama.arif@linux.dev,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kselftest@vger.kernel.org, kvm@vger.kernel.org, kernel-team@meta.com,
 "Kiryl Shutsemau (Meta)"
Subject: [PATCH 13/14] selftests/mm: add userfaultfd RWP tests
Date: Mon, 27 Apr 2026 12:46:01 +0100
Message-ID: <20260427114607.4068647-14-kas@kernel.org>
In-Reply-To: <20260427114607.4068647-1-kas@kernel.org>
References: <20260427114607.4068647-1-kas@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Coverage for UFFDIO_REGISTER_MODE_RWP and UFFDIO_RWPROTECT:

rwp-async		async mode — touch pages, verify permissions are
			auto-restored without a message
rwp-sync		sync mode — access blocks, handler resolves via
			UFFDIO_RWPROTECT
rwp-pagemap		PAGEMAP_SCAN reports still-cold pages via inverted
			PAGE_IS_ACCESSED
rwp-mprotect		RWP survives mprotect(PROT_NONE) ->
			mprotect(PROT_READ|PROT_WRITE) round-trip
rwp-gup			GUP walks through a protnone RWP PTE (pipe
			write/read drives the GUP path)
rwp-async-toggle	UFFDIO_SET_MODE flips between sync and async
			without re-registering
rwp-close		closing the uffd restores page permissions
rwp-fork		RWP survives fork() with EVENT_FORK; child's PTEs
			keep the uffd bit
rwp-fork-pin		RWP survives fork() on an RO-longterm-pinned anon
			page (forces copy_present_page()); child read
			auto-resolves and clears the bit, proving
			PAGE_NONE was in place
rwp-wp-exclusive	register with MODE_WP|MODE_RWP returns -EINVAL

All tests run against anon, shmem, shmem-private, hugetlb, and
hugetlb-private memory, except rwp-fork-pin, which is anon-only:
copy_present_page() is the private-anon pinned-exclusive fork path.
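The usage pattern the tests exercise is, roughly (a sketch mirroring the
uffd_register_rwp()/rwprotect_range() helpers below; error handling
omitted, and the UFFDIO_RWPROTECT uapi comes from this series, so this
does not build against mainline headers):

	/* register the range for RWP tracking */
	struct uffdio_register reg = {
		.range = { .start = (unsigned long)addr, .len = len },
		.mode = UFFDIO_REGISTER_MODE_RWP,
	};
	ioctl(uffd, UFFDIO_REGISTER, &reg);

	/* RW-protect it; accesses now auto-resolve (async) or fault (sync) */
	struct uffdio_rwprotect rwp = {
		.range = { .start = (unsigned long)addr, .len = len },
		.mode = UFFDIO_RWPROTECT_MODE_RWP,
	};
	ioctl(uffd, UFFDIO_RWPROTECT, &rwp);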
Signed-off-by: Kiryl Shutsemau
Assisted-by: Claude:claude-opus-4-6
---
 tools/testing/selftests/mm/uffd-unit-tests.c | 734 +++++++++++++++++++
 1 file changed, 734 insertions(+)

diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c
index 6f5e404a446c..6fdfe8bc3a70 100644
--- a/tools/testing/selftests/mm/uffd-unit-tests.c
+++ b/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -7,6 +7,7 @@
 #include "uffd-common.h"
+#include
 #include "../../../../mm/gup_test.h"

 #ifdef __NR_userfaultfd
@@ -167,6 +168,23 @@ static int test_uffd_api(bool use_dev)
 		goto out;
 	}

+	/* Verify returned fd-level ioctls bitmask */
+	{
+		uint64_t expected_ioctls =
+			BIT_ULL(_UFFDIO_REGISTER) |
+			BIT_ULL(_UFFDIO_UNREGISTER) |
+			BIT_ULL(_UFFDIO_API) |
+			BIT_ULL(_UFFDIO_SET_MODE);
+
+		if ((uffdio_api.ioctls & expected_ioctls) != expected_ioctls) {
+			uffd_test_fail("UFFDIO_API missing expected ioctls: "
+				       "got=0x%"PRIx64", expected=0x%"PRIx64,
+				       (uint64_t)uffdio_api.ioctls,
+				       expected_ioctls);
+			goto out;
+		}
+	}
+
 	/* Test double requests of UFFDIO_API with a random feature set */
 	uffdio_api.features = BIT_ULL(0);
 	if (ioctl(uffd, UFFDIO_API, &uffdio_api) == 0) {
@@ -623,6 +641,653 @@ void uffd_minor_collapse_test(uffd_global_test_opts_t *gopts, uffd_test_args_t *
 	uffd_minor_test_common(gopts, true, false);
 }

+static int uffd_register_rwp(int uffd, void *addr, uint64_t len)
+{
+	struct uffdio_register reg = {
+		.range = { .start = (unsigned long)addr, .len = len },
+		.mode = UFFDIO_REGISTER_MODE_RWP,
+	};
+
+	if (ioctl(uffd, UFFDIO_REGISTER, &reg) == -1)
+		return -errno;
+	return 0;
+}
+
+static void rwprotect_range(int uffd, __u64 start, __u64 len, bool protect)
+{
+	struct uffdio_rwprotect rwp = {
+		.range = { .start = start, .len = len },
+		.mode = protect ? UFFDIO_RWPROTECT_MODE_RWP : 0,
+	};
+
+	if (ioctl(uffd, UFFDIO_RWPROTECT, &rwp))
+		err("UFFDIO_RWPROTECT failed");
+}
+
+static void set_async_mode(int uffd, bool enable)
+{
+	struct uffdio_set_mode mode = { };
+
+	if (enable)
+		mode.enable = UFFD_FEATURE_RWP_ASYNC;
+	else
+		mode.disable = UFFD_FEATURE_RWP_ASYNC;
+
+	if (ioctl(uffd, UFFDIO_SET_MODE, &mode))
+		err("UFFDIO_SET_MODE failed");
+}
+
+/*
+ * Test async RWP faults on anonymous memory.
+ * Populate pages, register MODE_RWP with RWP_ASYNC,
+ * RW-protect, re-access, verify content preserved and no faults delivered.
+ */
+static void uffd_rwp_async_test(uffd_global_test_opts_t *gopts,
+				uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	unsigned long p;
+
+	/* Populate all pages with known content */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size);
+
+	/* Register MODE_RWP */
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+
+	/* RW-protect all pages (sets protnone) */
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	/* Access all pages — should auto-resolve, no faults */
+	for (p = 0; p < nr_pages; p++) {
+		unsigned char *page = (unsigned char *)gopts->area_dst +
+				      p * page_size;
+		unsigned char expected = p % 255 + 1;
+
+		if (page[0] != expected) {
+			uffd_test_fail("page %lu content mismatch: %u != %u",
+				       p, page[0], expected);
+			return;
+		}
+	}
+
+	uffd_test_pass();
+}
+
+/*
+ * Fault handler for RWP — unprotect the page via UFFDIO_RWPROTECT.
+ */
+static void uffd_handle_rwp_fault(uffd_global_test_opts_t *gopts,
+				  struct uffd_msg *msg,
+				  struct uffd_args *uargs)
+{
+	if (!(msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_RWP))
+		err("expected RWP fault, got 0x%llx",
+		    msg->arg.pagefault.flags);
+
+	rwprotect_range(gopts->uffd, msg->arg.pagefault.address,
+			gopts->page_size, false);
+	uargs->minor_faults++;
+}
+
+/*
+ * Test sync RWP faults on anonymous memory.
+ * Populate pages, register MODE_RWP (sync), RW-protect,
+ * access from worker thread, verify fault delivered, UFFDIO_RWPROTECT
+ * resolves.
+ */
+static void uffd_rwp_sync_test(uffd_global_test_opts_t *gopts,
+			       uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	pthread_t uffd_mon;
+	struct uffd_args uargs = { };
+	char c = '\0';
+	unsigned long p;
+
+	uargs.gopts = gopts;
+	uargs.handle_fault = uffd_handle_rwp_fault;
+
+	/* Populate all pages */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size);
+
+	/* Register MODE_RWP */
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+
+	/* RW-protect all pages */
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	/* Start fault handler thread */
+	if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &uargs))
+		err("uffd_poll_thread create");
+
+	/* Access all pages — triggers sync RWP faults, handler unprotects */
+	for (p = 0; p < nr_pages; p++) {
+		unsigned char *page = (unsigned char *)gopts->area_dst +
+				      p * page_size;
+
+		if (page[0] != (p % 255 + 1)) {
+			uffd_test_fail("page %lu content mismatch", p);
+			goto out;
+		}
+	}
+
+	if (uargs.minor_faults == 0) {
+		uffd_test_fail("expected RWP faults, got 0");
+		goto out;
+	}
+
+	uffd_test_pass();
+out:
+	if (write(gopts->pipefd[1], &c, sizeof(c)) != sizeof(c))
+		err("pipe write");
+	if (pthread_join(uffd_mon, NULL))
+		err("join() failed");
+}
+
+/*
+ * Test PAGEMAP_SCAN detection of RW-protected (cold) pages.
+ */
+static void uffd_rwp_pagemap_test(uffd_global_test_opts_t *gopts,
+				  uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	unsigned long p;
+	struct page_region regions[16];
+	struct pm_scan_arg pm_arg;
+	int pagemap_fd;
+	long ret;
+
+	/* Need at least 4 pages */
+	if (nr_pages < 4) {
+		uffd_test_skip("need at least 4 pages");
+		return;
+	}
+
+	/* Populate all pages */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, 0xab, page_size);
+
+	/* Register and RW-protect */
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	/* Touch first half of pages to re-activate them (async auto-resolve) */
+	for (p = 0; p < nr_pages / 2; p++) {
+		volatile char *page = gopts->area_dst + p * page_size;
+		(void)*page;
+	}
+
+	/* Scan for cold (still RW-protected) pages */
+	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
+	if (pagemap_fd < 0)
+		err("open pagemap");
+
+	/*
+	 * PAGE_IS_ACCESSED is set once the uffd-wp bit has been cleared
+	 * (access happened, or the user resolved). Invert it to select
+	 * still-protected (cold) pages.
+	 */
+	memset(&pm_arg, 0, sizeof(pm_arg));
+	pm_arg.size = sizeof(pm_arg);
+	pm_arg.start = (uint64_t)gopts->area_dst;
+	pm_arg.end = (uint64_t)gopts->area_dst + nr_pages * page_size;
+	pm_arg.vec = (uint64_t)regions;
+	pm_arg.vec_len = 16;
+	pm_arg.category_mask = PAGE_IS_ACCESSED;
+	pm_arg.category_inverted = PAGE_IS_ACCESSED;
+	pm_arg.return_mask = PAGE_IS_ACCESSED;
+
+	ret = ioctl(pagemap_fd, PAGEMAP_SCAN, &pm_arg);
+	close(pagemap_fd);
+
+	if (ret < 0) {
+		uffd_test_fail("PAGEMAP_SCAN failed: %s", strerror(errno));
+		return;
+	}
+
+	/*
+	 * The second half of pages should be reported as RW-protected.
+	 * They may be coalesced into one region.
+	 */
+	if (ret < 1) {
+		uffd_test_fail("expected cold pages, got %ld regions", ret);
+		return;
+	}
+
+	/* Verify the cold region covers the second half */
+	uint64_t cold_start = regions[0].start;
+	uint64_t expected_start = (uint64_t)gopts->area_dst +
+				  (nr_pages / 2) * page_size;
+
+	if (cold_start != expected_start) {
+		uffd_test_fail("cold region starts at 0x%lx, expected 0x%lx",
+			       (unsigned long)cold_start,
+			       (unsigned long)expected_start);
+		return;
+	}
+
+	uffd_test_pass();
+}
+
+/*
+ * Test that RWP protection survives a mprotect(PROT_NONE) ->
+ * mprotect(PROT_READ|PROT_WRITE) round-trip. The uffd-wp bit on a
+ * VM_UFFD_RWP VMA must continue to carry PROT_NONE semantics after
+ * mprotect() changes the base protection; otherwise accesses would
+ * silently succeed and the pagemap bit would stick without a fault
+ * ever clearing it.
+ */
+static void uffd_rwp_mprotect_test(uffd_global_test_opts_t *gopts,
+				   uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	unsigned long p;
+	struct page_region regions[16];
+	struct pm_scan_arg pm_arg;
+	int pagemap_fd;
+	long ret;
+
+	/* Populate all pages */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, 0xab, page_size);
+
+	/* Register and RW-protect the whole range */
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	/* Round-trip mprotect(): PROT_NONE -> PROT_READ|PROT_WRITE */
+	if (mprotect(gopts->area_dst, nr_pages * page_size, PROT_NONE))
+		err("mprotect() PROT_NONE");
+	if (mprotect(gopts->area_dst, nr_pages * page_size,
+		     PROT_READ | PROT_WRITE))
+		err("mprotect() PROT_READ|PROT_WRITE");
+
+	/* Touch every page. Async RWP must auto-resolve each fault. */
+	for (p = 0; p < nr_pages; p++) {
+		volatile char *page = gopts->area_dst + p * page_size;
+		(void)*page;
+	}
+
+	/*
+	 * After touching, no page should remain RW-protected. A stuck
+	 * uffd-wp bit would mean mprotect() silently dropped PROT_NONE and
+	 * the access never faulted.
+	 */
+	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
+	if (pagemap_fd < 0)
+		err("open pagemap");
+
+	memset(&pm_arg, 0, sizeof(pm_arg));
+	pm_arg.size = sizeof(pm_arg);
+	pm_arg.start = (uint64_t)gopts->area_dst;
+	pm_arg.end = (uint64_t)gopts->area_dst + nr_pages * page_size;
+	pm_arg.vec = (uint64_t)regions;
+	pm_arg.vec_len = 16;
+	pm_arg.category_mask = PAGE_IS_ACCESSED;
+	pm_arg.category_inverted = PAGE_IS_ACCESSED;
+	pm_arg.return_mask = PAGE_IS_ACCESSED;
+
+	ret = ioctl(pagemap_fd, PAGEMAP_SCAN, &pm_arg);
+	close(pagemap_fd);
+
+	if (ret < 0) {
+		uffd_test_fail("PAGEMAP_SCAN failed: %s", strerror(errno));
+		return;
+	}
+	if (ret != 0) {
+		uffd_test_fail("expected no cold pages after mprotect()+touch, got %ld regions",
+			       ret);
+		return;
+	}
+
+	uffd_test_pass();
+}
+
+/*
+ * Test that GUP resolves through protnone PTEs (async mode).
+ * RW-protect pages, then use a pipe to exercise GUP on the RW-protected
+ * memory. write() from RW-protected pages triggers GUP which must fault
+ * through the protnone PTE.
+ */
+static void uffd_rwp_gup_test(uffd_global_test_opts_t *gopts,
+			      uffd_test_args_t *args)
+{
+	unsigned long page_size = gopts->page_size;
+	char *buf;
+	int pipefd[2];
+
+	buf = malloc(page_size);
+	if (!buf)
+		err("malloc");
+
+	/* Populate first page with known content */
+	memset(gopts->area_dst, 0xCD, page_size);
+
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst, page_size))
+		err("register failure");
+
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, page_size, true);
+
+	if (pipe(pipefd))
+		err("pipe");
+
+	/*
+	 * write() from the RW-protected page into the pipe. This triggers
+	 * GUP on the protnone PTE. In async mode the kernel auto-restores
+	 * permissions and GUP succeeds.
+	 *
+	 * Use base page size for the write — hugetlb pages are larger than
+	 * the default pipe buffer, which would deadlock.
+	 */
+	if (write(pipefd[1], gopts->area_dst, getpagesize()) != getpagesize()) {
+		uffd_test_fail("write from RW-protected page failed: %s",
+			       strerror(errno));
+		goto out;
+	}
+
+	if (read(pipefd[0], buf, getpagesize()) != getpagesize()) {
+		uffd_test_fail("read from pipe failed");
+		goto out;
+	}
+
+	if (memcmp(buf, "\xCD", 1) != 0) {
+		uffd_test_fail("content mismatch: got 0x%02x, expected 0xCD",
+			       (unsigned char)buf[0]);
+		goto out;
+	}
+
+	uffd_test_pass();
+out:
+	close(pipefd[0]);
+	close(pipefd[1]);
+	free(buf);
+}
+
+/*
+ * Test runtime toggle between async and sync modes.
+ * Start in async mode (detection), flip to sync (eviction), verify faults
+ * block, resolve them, flip back to async.
+ */
+static void uffd_rwp_async_toggle_test(uffd_global_test_opts_t *gopts,
+				       uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	struct uffd_args uargs = { };
+	pthread_t uffd_mon;
+	char c = '\0';
+	unsigned long p;
+
+	uargs.gopts = gopts;
+	uargs.handle_fault = uffd_handle_rwp_fault;
+
+	/* Populate */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size);
+
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+
+	/* Phase 1: async detection — RW-protect, access first half */
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	for (p = 0; p < nr_pages / 2; p++) {
+		volatile char *page = gopts->area_dst + p * page_size;
+		(void)*page;	/* auto-resolves in async mode */
+	}
+
+	/* Phase 2: flip to sync for eviction */
+	set_async_mode(gopts->uffd, false);
+
+	/* Start handler — will receive faults for cold pages */
+	if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &uargs))
+		err("uffd_poll_thread create");
+
+	/* Access second half (cold pages) — should trigger sync faults */
+	for (p = nr_pages / 2; p < nr_pages; p++) {
+		unsigned char *page = (unsigned char *)gopts->area_dst +
+				      p * page_size;
+
+		if (page[0] != (p % 255 + 1)) {
+			uffd_test_fail("page %lu content mismatch", p);
+			goto out;
+		}
+	}
+
+	if (uargs.minor_faults == 0) {
+		uffd_test_fail("expected sync faults, got 0");
+		goto out;
+	}
+
+	/* Phase 3: flip back to async */
+	set_async_mode(gopts->uffd, true);
+
+	/* RW-protect and access again — should auto-resolve */
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	for (p = 0; p < nr_pages; p++) {
+		volatile char *page = gopts->area_dst + p * page_size;
+		(void)*page;
+	}
+
+	uffd_test_pass();
+out:
+	if (write(gopts->pipefd[1], &c, sizeof(c)) != sizeof(c))
+		err("pipe write");
+	if (pthread_join(uffd_mon, NULL))
+		err("join() failed");
+}
+
+/*
+ * Test that RW-protected pages become accessible after closing uffd.
+ */
+static void uffd_rwp_close_test(uffd_global_test_opts_t *gopts,
+				uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	unsigned long p;
+
+	/* Populate */
+	for (p = 0; p < nr_pages; p++)
+		memset(gopts->area_dst + p * page_size, p % 255 + 1, page_size);
+
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failure");
+
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			nr_pages * page_size, true);
+
+	/* Close uffd — should restore protnone PTEs */
+	close(gopts->uffd);
+	gopts->uffd = -1;
+
+	/* All pages should be accessible with original content */
+	for (p = 0; p < nr_pages; p++) {
+		unsigned char *page = (unsigned char *)gopts->area_dst +
+				      p * page_size;
+		unsigned char expected = p % 255 + 1;
+
+		if (page[0] != expected) {
+			uffd_test_fail("page %lu not accessible after close", p);
+			return;
+		}
+	}
+
+	uffd_test_pass();
+}
+
+/*
+ * Test that RWP protection is preserved across fork() when
+ * UFFD_FEATURE_EVENT_FORK is enabled. Without preservation, the child's
+ * PTEs would lose the uffd-wp marker and RWP-protected accesses would
+ * silently fall through to do_numa_page().
+ */
+static void uffd_rwp_fork_test(uffd_global_test_opts_t *gopts,
+			       uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	int pagemap_fd;
+	uint64_t value;
+
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst,
+			      nr_pages * page_size))
+		err("register failed");
+
+	/* Populate + RWP-protect */
+	*gopts->area_dst = 1;
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst,
+			page_size, true);
+
+	/* Parent: verify uffd-wp bit is set before fork */
+	pagemap_fd = pagemap_open();
+	value = pagemap_get_entry(pagemap_fd, gopts->area_dst);
+	pagemap_check_wp(value, true);
+
+	/*
+	 * Fork with EVENT_FORK: child inherits VM_UFFD_RWP. Child reads
+	 * its own pagemap and must still see the uffd-wp bit set.
+	 */
+	if (pagemap_test_fork(gopts, true, false)) {
+		uffd_test_fail("RWP marker lost in child after fork");
+		goto out;
+	}
+
+	uffd_test_pass();
+out:
+	close(pagemap_fd);
+}
+
+/*
+ * Test that RWP protection on a pinned anon page is preserved across fork().
+ * Pinning forces copy_present_page() in the child path, which must restore
+ * PAGE_NONE on top of the uffd bit. Using async mode, a read in the child
+ * auto-resolves if — and only if — the PTE was actually protnone+uffd; the
+ * cleared uffd bit afterward proves the fault path ran.
+ */
+static void uffd_rwp_fork_pin_test(uffd_global_test_opts_t *gopts,
+				   uffd_test_args_t *args)
+{
+	unsigned long page_size = gopts->page_size;
+	pin_args pin_args = {};
+	int pagemap_fd, status;
+	uint64_t value;
+	pid_t child;
+
+	if (uffd_register_rwp(gopts->uffd, gopts->area_dst, page_size))
+		err("register failed");
+
+	/* Populate. */
+	*gopts->area_dst = 1;
+
+	/* RO-longterm pin so fork() takes copy_present_page() for this PTE. */
+	if (pin_pages(&pin_args, gopts->area_dst, page_size)) {
+		uffd_test_skip("Possibly CONFIG_GUP_TEST missing or unprivileged");
+		uffd_unregister(gopts->uffd, gopts->area_dst, page_size);
+		return;
+	}
+
+	/* RWP-protect: PTE is now PAGE_NONE + uffd bit. */
+	rwprotect_range(gopts->uffd, (uint64_t)gopts->area_dst, page_size, true);
+
+	pagemap_fd = pagemap_open();
+	value = pagemap_get_entry(pagemap_fd, gopts->area_dst);
+	pagemap_check_wp(value, true);
+
+	child = fork();
+	if (child < 0)
+		err("fork");
+	if (child == 0) {
+		volatile char c;
+		int cfd;
+
+		/*
+		 * Read the pinned page. Only reaches the fault path if the
+		 * child PTE is protnone + uffd; async mode auto-resolves and
+		 * clears the uffd bit. If copy_present_page() dropped
+		 * PAGE_NONE, the read would silently succeed and the bit
+		 * would still be set.
+		 */
+		c = *(volatile char *)gopts->area_dst;
+		(void)c;
+
+		cfd = pagemap_open();
+		value = pagemap_get_entry(cfd, gopts->area_dst);
+		close(cfd);
+		_exit((value & PM_UFFD_WP) ? 1 : 0);
+	}
+	if (waitpid(child, &status, 0) < 0)
+		err("waitpid");
+
+	unpin_pages(&pin_args);
+	close(pagemap_fd);
+	if (uffd_unregister(gopts->uffd, gopts->area_dst, page_size))
+		err("unregister failed");
+
+	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
+		uffd_test_fail("RWP not enforced in child after pinned fork");
+		return;
+	}
+
+	uffd_test_pass();
+}
+
+/*
+ * WP and RWP share the uffd-wp PTE bit and cannot coexist in the same VMA.
+ * Registration requesting both modes must be rejected.
+ */
+static void uffd_rwp_wp_exclusive_test(uffd_global_test_opts_t *gopts,
+				       uffd_test_args_t *args)
+{
+	unsigned long nr_pages = gopts->nr_pages;
+	unsigned long page_size = gopts->page_size;
+	struct uffdio_register reg = { };
+
+	reg.range.start = (unsigned long)gopts->area_dst;
+	reg.range.len = nr_pages * page_size;
+	reg.mode = UFFDIO_REGISTER_MODE_WP | UFFDIO_REGISTER_MODE_RWP;
+
+	if (ioctl(gopts->uffd, UFFDIO_REGISTER, &reg) == 0) {
+		uffd_test_fail("register with WP|RWP unexpectedly succeeded");
+		return;
+	}
+	if (errno != EINVAL) {
+		uffd_test_fail("register with WP|RWP: expected EINVAL, got %d",
+			       errno);
+		return;
+	}
+	uffd_test_pass();
+}
+
 static sigjmp_buf jbuf, *sigbuf;

 static void sighndl(int sig, siginfo_t *siginfo, void *ptr)
@@ -1625,6 +2290,75 @@ uffd_test_case_t uffd_tests[] = {
 		/* We can't test MADV_COLLAPSE, so try our luck */
 		.uffd_feature_required = UFFD_FEATURE_MINOR_SHMEM,
 	},
+	{
+		.name = "rwp-async",
+		.uffd_fn = uffd_rwp_async_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC,
+	},
+	{
+		.name = "rwp-sync",
+		.uffd_fn = uffd_rwp_sync_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required = UFFD_FEATURE_RWP,
+	},
+	{
+		.name = "rwp-pagemap",
+		.uffd_fn = uffd_rwp_pagemap_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC,
+	},
+	{
+		.name = "rwp-mprotect",
+		.uffd_fn = uffd_rwp_mprotect_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC,
+	},
+	{
+		.name = "rwp-gup",
+		.uffd_fn = uffd_rwp_gup_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC,
+	},
+	{
+		.name = "rwp-async-toggle",
+		.uffd_fn = uffd_rwp_async_toggle_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC,
+	},
+	{
+		.name = "rwp-close",
+		.uffd_fn = uffd_rwp_close_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required = UFFD_FEATURE_RWP,
+	},
+	{
+		.name = "rwp-fork",
+		.uffd_fn = uffd_rwp_fork_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_EVENT_FORK,
+	},
+	{
+		.name = "rwp-fork-pin",
+		.uffd_fn = uffd_rwp_fork_pin_test,
+		.mem_targets = MEM_ANON,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP | UFFD_FEATURE_RWP_ASYNC,
+	},
+	{
+		.name = "rwp-wp-exclusive",
+		.uffd_fn = uffd_rwp_wp_exclusive_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required =
+			UFFD_FEATURE_RWP |
+			UFFD_FEATURE_PAGEFAULT_FLAG_WP,
+	},
 	{
 		.name = "sigbus",
 		.uffd_fn = uffd_sigbus_test,
--
2.51.2