Date: Wed, 18 Jan 2023 17:16:11 -0800
To: mm-commits@vger.kernel.org, zokeefe@google.com, vbabka@suse.cz, shuah@kernel.org, peterx@redhat.com, nadav.amit@gmail.com, hughd@google.com, aarcange@redhat.com, david@redhat.com, akpm@linux-foundation.org
From: Andrew Morton
Subject: [merged mm-stable] selftests-vm-cow-add-cow-tests-for-collapsing-of-pte-mapped-anon-thp.patch removed from -mm tree
Message-Id: <20230119011611.F388AC433D2@smtp.kernel.org>
Precedence: bulk
Reply-To: linux-kernel@vger.kernel.org
List-ID:
X-Mailing-List: mm-commits@vger.kernel.org


The quilt patch titled
     Subject: selftests/vm: cow: add COW tests for collapsing of PTE-mapped anon THP
has been removed from the -mm tree.  Its filename was
     selftests-vm-cow-add-cow-tests-for-collapsing-of-pte-mapped-anon-thp.patch

This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: David Hildenbrand
Subject: selftests/vm: cow: add COW tests for collapsing of PTE-mapped anon THP
Date: Wed, 4 Jan 2023 15:49:05 +0100

Currently, anonymous PTE-mapped THPs cannot be collapsed in-place:
collapsing (e.g., via MADV_COLLAPSE) implies allocating a fresh THP and
mapping that new THP via a PMD.  As it's a fresh anon THP, it will get the
exclusive flag set on the head page and everybody is happy.

However, if the kernel ever supported in-place collapse of anonymous THPs
(replacing a page table mapping each sub-page of a THP via PTEs with a
single PMD mapping the complete THP), exclusivity information stored for
each sub-page would have to be collapsed accordingly:

(1) All PTEs map !exclusive anon sub-pages: the in-place collapsed THP
    must not have the exclusive flag set on the head page mapped by
    the PMD. This is the easiest case to handle ("simply don't set any
    exclusive flags").

(2) All PTEs map exclusive anon sub-pages: when collapsing, we have to
    clear the exclusive flag from all tail pages and only leave the
    exclusive flag set for the head page. Otherwise, fork() after
    collapse would not clear the exclusive flags from the tail pages
    and we'd be in trouble once PTE-mapping the shared THP when writing
    to shared tail pages that still have the exclusive flag set. This
    would effectively revert what the PTE-mapping code does when
    propagating the exclusive flag to all sub-pages.

(3) PTEs map a mixture of exclusive and !exclusive anon sub-pages (can
    happen, e.g., due to MADV_DONTFORK before fork()). We must not
    collapse the THP in-place, otherwise bad things may happen:
    the exclusive flags of sub-pages would get ignored and the
    exclusive flag of the head page would get used instead.

Now that we have MADV_COLLAPSE in place to trigger collapsing a THP,
let's add some test cases that would bail out early if we voluntarily or
accidentally unlocked in-place collapse for anon THPs and forgot to take
proper care of exclusive flags.

Running the test on a kernel with MADV_COLLAPSE support:
  # [INFO] Anonymous THP tests
  # [RUN] Basic COW after fork() when collapsing before fork()
  ok 169 No leak from parent into child
  # [RUN] Basic COW after fork() when collapsing after fork() (fully shared)
  ok 170 # SKIP MADV_COLLAPSE failed: Invalid argument
  # [RUN] Basic COW after fork() when collapsing after fork() (lower shared)
  ok 171 No leak from parent into child
  # [RUN] Basic COW after fork() when collapsing after fork() (upper shared)
  ok 172 No leak from parent into child

For now, MADV_COLLAPSE always seems to fail if all PTEs map shared
sub-pages.
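
For illustration only, the three exclusivity cases from the changelog can
be modelled in plain userspace C. This is a minimal sketch and not kernel
code: the enum, the function name fold_exclusive_flags() and the bool
array standing in for the per-sub-page exclusive flags are all
hypothetical. Index 0 plays the role of the head page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of an in-place collapse attempt; not a kernel type. */
enum collapse_decision {
	COLLAPSE_HEAD_SHARED,		/* case (1): no exclusive flag on the head */
	COLLAPSE_HEAD_EXCLUSIVE,	/* case (2): exclusive flag on the head only */
	COLLAPSE_REFUSE,		/* case (3): mixed, must not collapse in-place */
};

/*
 * Fold the per-sub-page exclusive flags into a single head-page flag.
 * excl[0] models the head page, excl[1..nr-1] the tail pages.
 * On refusal the flags are left untouched; otherwise all tail flags are
 * cleared and the head keeps the (uniform) value it already had.
 */
static enum collapse_decision fold_exclusive_flags(bool *excl, size_t nr)
{
	size_t i, nr_excl = 0;

	assert(excl != NULL || nr == 0);

	for (i = 0; i < nr; i++)
		nr_excl += excl[i];

	/* Case (3): a mixture of exclusive and !exclusive sub-pages. */
	if (nr_excl != 0 && nr_excl != nr)
		return COLLAPSE_REFUSE;

	/* Cases (1) and (2): only the head page may carry the flag. */
	for (i = 1; i < nr; i++)
		excl[i] = false;

	return nr_excl ? COLLAPSE_HEAD_EXCLUSIVE : COLLAPSE_HEAD_SHARED;
}
```

In this model, the "lower shared" and "upper shared" tests correspond to
the mixed input that has to be refused, while the "fully shared" and
"collapse before fork()" tests correspond to the two uniform inputs.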

Link: https://lkml.kernel.org/r/20230104144905.460075-1-david@redhat.com
Signed-off-by: David Hildenbrand
Cc: Shuah Khan
Cc: Hugh Dickins
Cc: Peter Xu
Cc: Vlastimil Babka
Cc: Nadav Amit
Cc: Zach O'Keefe
Cc: Andrea Arcangeli
Signed-off-by: Andrew Morton
---

 tools/testing/selftests/vm/cow.c |  228 +++++++++++++++++++++++++++++
 1 file changed, 228 insertions(+)

--- a/tools/testing/selftests/vm/cow.c~selftests-vm-cow-add-cow-tests-for-collapsing-of-pte-mapped-anon-thp
+++ a/tools/testing/selftests/vm/cow.c
@@ -30,6 +30,10 @@
 #include "../kselftest.h"
 #include "vm_util.h"
 
+#ifndef MADV_COLLAPSE
+#define MADV_COLLAPSE 25
+#endif
+
 static size_t pagesize;
 static int pagemap_fd;
 static size_t thpsize;
@@ -1178,6 +1182,228 @@ static int tests_per_anon_test_case(void
 	return tests;
 }
 
+enum anon_thp_collapse_test {
+	ANON_THP_COLLAPSE_UNSHARED,
+	ANON_THP_COLLAPSE_FULLY_SHARED,
+	ANON_THP_COLLAPSE_LOWER_SHARED,
+	ANON_THP_COLLAPSE_UPPER_SHARED,
+};
+
+static void do_test_anon_thp_collapse(char *mem, size_t size,
+				      enum anon_thp_collapse_test test)
+{
+	struct comm_pipes comm_pipes;
+	char buf;
+	int ret;
+
+	ret = setup_comm_pipes(&comm_pipes);
+	if (ret) {
+		ksft_test_result_fail("pipe() failed\n");
+		return;
+	}
+
+	/*
+	 * Trigger PTE-mapping the THP by temporarily mapping a single subpage
+	 * R/O, such that we can try collapsing it later.
+	 */
+	ret = mprotect(mem + pagesize, pagesize, PROT_READ);
+	if (ret) {
+		ksft_test_result_fail("mprotect() failed\n");
+		goto close_comm_pipes;
+	}
+	ret = mprotect(mem + pagesize, pagesize, PROT_READ | PROT_WRITE);
+	if (ret) {
+		ksft_test_result_fail("mprotect() failed\n");
+		goto close_comm_pipes;
+	}
+
+	switch (test) {
+	case ANON_THP_COLLAPSE_UNSHARED:
+		/* Collapse before actually COW-sharing the page. */
+		ret = madvise(mem, size, MADV_COLLAPSE);
+		if (ret) {
+			ksft_test_result_skip("MADV_COLLAPSE failed: %s\n",
+					      strerror(errno));
+			goto close_comm_pipes;
+		}
+		break;
+	case ANON_THP_COLLAPSE_FULLY_SHARED:
+		/* COW-share the full PTE-mapped THP. */
+		break;
+	case ANON_THP_COLLAPSE_LOWER_SHARED:
+		/* Don't COW-share the upper part of the THP. */
+		ret = madvise(mem + size / 2, size / 2, MADV_DONTFORK);
+		if (ret) {
+			ksft_test_result_fail("MADV_DONTFORK failed\n");
+			goto close_comm_pipes;
+		}
+		break;
+	case ANON_THP_COLLAPSE_UPPER_SHARED:
+		/* Don't COW-share the lower part of the THP. */
+		ret = madvise(mem, size / 2, MADV_DONTFORK);
+		if (ret) {
+			ksft_test_result_fail("MADV_DONTFORK failed\n");
+			goto close_comm_pipes;
+		}
+		break;
+	default:
+		assert(false);
+	}
+
+	ret = fork();
+	if (ret < 0) {
+		ksft_test_result_fail("fork() failed\n");
+		goto close_comm_pipes;
+	} else if (!ret) {
+		switch (test) {
+		case ANON_THP_COLLAPSE_UNSHARED:
+		case ANON_THP_COLLAPSE_FULLY_SHARED:
+			exit(child_memcmp_fn(mem, size, &comm_pipes));
+			break;
+		case ANON_THP_COLLAPSE_LOWER_SHARED:
+			exit(child_memcmp_fn(mem, size / 2, &comm_pipes));
+			break;
+		case ANON_THP_COLLAPSE_UPPER_SHARED:
+			exit(child_memcmp_fn(mem + size / 2, size / 2,
+					     &comm_pipes));
+			break;
+		default:
+			assert(false);
+		}
+	}
+
+	while (read(comm_pipes.child_ready[0], &buf, 1) != 1)
+		;
+
+	switch (test) {
+	case ANON_THP_COLLAPSE_UNSHARED:
+		break;
+	case ANON_THP_COLLAPSE_UPPER_SHARED:
+	case ANON_THP_COLLAPSE_LOWER_SHARED:
+		/*
+		 * Revert MADV_DONTFORK such that we merge the VMAs and are
+		 * able to actually collapse.
+		 */
+		ret = madvise(mem, size, MADV_DOFORK);
+		if (ret) {
+			ksft_test_result_fail("MADV_DOFORK failed\n");
+			write(comm_pipes.parent_ready[1], "0", 1);
+			wait(&ret);
+			goto close_comm_pipes;
+		}
+		/* FALLTHROUGH */
+	case ANON_THP_COLLAPSE_FULLY_SHARED:
+		/* Collapse before anyone modified the COW-shared page. */
+		ret = madvise(mem, size, MADV_COLLAPSE);
+		if (ret) {
+			ksft_test_result_skip("MADV_COLLAPSE failed: %s\n",
+					      strerror(errno));
+			write(comm_pipes.parent_ready[1], "0", 1);
+			wait(&ret);
+			goto close_comm_pipes;
+		}
+		break;
+	default:
+		assert(false);
+	}
+
+	/* Modify the page. */
+	memset(mem, 0xff, size);
+	write(comm_pipes.parent_ready[1], "0", 1);
+
+	wait(&ret);
+	if (WIFEXITED(ret))
+		ret = WEXITSTATUS(ret);
+	else
+		ret = -EINVAL;
+
+	ksft_test_result(!ret, "No leak from parent into child\n");
+close_comm_pipes:
+	close_comm_pipes(&comm_pipes);
+}
+
+static void test_anon_thp_collapse_unshared(char *mem, size_t size)
+{
+	do_test_anon_thp_collapse(mem, size, ANON_THP_COLLAPSE_UNSHARED);
+}
+
+static void test_anon_thp_collapse_fully_shared(char *mem, size_t size)
+{
+	do_test_anon_thp_collapse(mem, size, ANON_THP_COLLAPSE_FULLY_SHARED);
+}
+
+static void test_anon_thp_collapse_lower_shared(char *mem, size_t size)
+{
+	do_test_anon_thp_collapse(mem, size, ANON_THP_COLLAPSE_LOWER_SHARED);
+}
+
+static void test_anon_thp_collapse_upper_shared(char *mem, size_t size)
+{
+	do_test_anon_thp_collapse(mem, size, ANON_THP_COLLAPSE_UPPER_SHARED);
+}
+
+/*
+ * Test cases that are specific to anonymous THP: pages in private mappings
+ * that may get shared via COW during fork().
+ */
+static const struct test_case anon_thp_test_cases[] = {
+	/*
+	 * Basic COW test for fork() without any GUP when collapsing a THP
+	 * before fork().
+	 *
+	 * Re-mapping a PTE-mapped anon THP using a single PMD ("in-place
+	 * collapse") might easily get COW handling wrong when not collapsing
+	 * exclusivity information properly.
+	 */
+	{
+		"Basic COW after fork() when collapsing before fork()",
+		test_anon_thp_collapse_unshared,
+	},
+	/* Basic COW test, but collapse after COW-sharing a full THP. */
+	{
+		"Basic COW after fork() when collapsing after fork() (fully shared)",
+		test_anon_thp_collapse_fully_shared,
+	},
+	/*
+	 * Basic COW test, but collapse after COW-sharing the lower half of a
+	 * THP.
+	 */
+	{
+		"Basic COW after fork() when collapsing after fork() (lower shared)",
+		test_anon_thp_collapse_lower_shared,
+	},
+	/*
+	 * Basic COW test, but collapse after COW-sharing the upper half of a
+	 * THP.
+	 */
+	{
+		"Basic COW after fork() when collapsing after fork() (upper shared)",
+		test_anon_thp_collapse_upper_shared,
+	},
+};
+
+static void run_anon_thp_test_cases(void)
+{
+	int i;
+
+	if (!thpsize)
+		return;
+
+	ksft_print_msg("[INFO] Anonymous THP tests\n");
+
+	for (i = 0; i < ARRAY_SIZE(anon_thp_test_cases); i++) {
+		struct test_case const *test_case = &anon_thp_test_cases[i];
+
+		ksft_print_msg("[RUN] %s\n", test_case->desc);
+		do_run_with_thp(test_case->fn, THP_RUN_PMD);
+	}
+}
+
+static int tests_per_anon_thp_test_case(void)
+{
+	return thpsize ? 1 : 0;
+}
+
 typedef void (*non_anon_test_fn)(char *mem, const char *smem, size_t size);
 
 static void test_cow(char *mem, const char *smem, size_t size)
@@ -1518,6 +1744,7 @@ int main(int argc, char **argv)
 
 	ksft_print_header();
 	ksft_set_plan(ARRAY_SIZE(anon_test_cases) * tests_per_anon_test_case() +
+		      ARRAY_SIZE(anon_thp_test_cases) * tests_per_anon_thp_test_case() +
 		      ARRAY_SIZE(non_anon_test_cases) * tests_per_non_anon_test_case());
 
 	gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
@@ -1526,6 +1753,7 @@ int main(int argc, char **argv)
 		ksft_exit_fail_msg("opening pagemap failed\n");
 
 	run_anon_test_cases();
+	run_anon_thp_test_cases();
 	run_non_anon_test_cases();
 
 	err = ksft_get_fail_cnt();
_

Patches currently in -mm which might be from david@redhat.com are

mm-debug_vm_pgtable-more-pte_swp_exclusive-sanity-checks.patch
mm-debug_vm_pgtable-more-pte_swp_exclusive-sanity-checks-fix.patch
alpha-mm-support-__have_arch_pte_swp_exclusive.patch
arc-mm-support-__have_arch_pte_swp_exclusive.patch
arm-mm-support-__have_arch_pte_swp_exclusive.patch
csky-mm-support-__have_arch_pte_swp_exclusive.patch
hexagon-mm-support-__have_arch_pte_swp_exclusive.patch
ia64-mm-support-__have_arch_pte_swp_exclusive.patch
loongarch-mm-support-__have_arch_pte_swp_exclusive.patch
m68k-mm-remove-dummy-__swp-definitions-for-nommu.patch
m68k-mm-support-__have_arch_pte_swp_exclusive.patch
microblaze-mm-support-__have_arch_pte_swp_exclusive.patch
mips-mm-support-__have_arch_pte_swp_exclusive.patch
nios2-mm-refactor-swap-pte-layout.patch
nios2-mm-support-__have_arch_pte_swp_exclusive.patch
openrisc-mm-support-__have_arch_pte_swp_exclusive.patch
parisc-mm-support-__have_arch_pte_swp_exclusive.patch
powerpc-mm-support-__have_arch_pte_swp_exclusive-on-32bit-book3s.patch
powerpc-nohash-mm-support-__have_arch_pte_swp_exclusive.patch
riscv-mm-support-__have_arch_pte_swp_exclusive.patch
sh-mm-support-__have_arch_pte_swp_exclusive.patch
sparc-mm-support-__have_arch_pte_swp_exclusive-on-32bit.patch
sparc-mm-support-__have_arch_pte_swp_exclusive-on-64bit.patch
um-mm-support-__have_arch_pte_swp_exclusive.patch
x86-mm-support-__have_arch_pte_swp_exclusive-also-on-32bit.patch
xtensa-mm-support-__have_arch_pte_swp_exclusive.patch
mm-remove-__have_arch_pte_swp_exclusive.patch