Date: Fri, 26 Aug 2022 15:03:19 -0700
Message-ID: <20220826220329.1495407-1-zokeefe@google.com>
From: "Zach O'Keefe"
To: linux-mm@kvack.org
Cc: Andrew Morton, linux-api@vger.kernel.org, Axel Rasmussen,
 James Houghton, Hugh Dickins, Yang Shi, Miaohe Lin, David Hildenbrand,
 David Rientjes, Matthew Wilcox, Pasha Tatashin, Peter Xu, Rongwei Wang,
 SeongJae Park, Song Liu, Vlastimil Babka, Chris Kennelly,
 "Kirill A. Shutemov", Minchan Kim, Patrick Xia, "Zach O'Keefe"
Subject: [PATCH mm-unstable v2 0/9] mm: add file/shmem support to MADV_COLLAPSE

v2 Foreword

Mostly a RESEND: rebased on latest mm-unstable, plus minor fixes for
bugs reported by the kernel test robot.

--------------------------------

This series builds on top of the previous "mm: userspace hugepage
collapse" series, which introduced the MADV_COLLAPSE madvise mode and
added support for private, anonymous mappings[1], by adding support for
file- and shmem-backed memory to CONFIG_READ_ONLY_THP_FOR_FS=y kernels.

File and shmem support have been added with an effort to align with
existing MADV_COLLAPSE semantics and policy decisions[2]. Collapse of
shmem-backed memory ignores kernel-guiding directives and heuristics,
including all sysfs settings (transparent_hugepage/shmem_enabled) and
tmpfs huge= mount options (shmem always supports large folios). As with
anonymous mappings, when MADV_COLLAPSE returns successfully on
file/shmem memory, the memory mapped at the provided addresses has
already been collapsed into pmd-mapped THPs; the collapse is
synchronous.

This functionality unlocks two important uses:

(1) Immediately back executable text by THPs. The existing support
    provided by CONFIG_READ_ONLY_THP_FOR_FS may take a long time on a
    large system, which might prevent services from serving at their
    full rated load after (re)starting. Tricks like mremap(2)'ing text
    onto anonymous memory to immediately realize iTLB performance
    prevent page sharing and demand paging, both of which increase
    steady-state memory footprint. Now we can have the best of both
    worlds: peak upfront performance and lower RAM footprints.

(2) userfaultfd-based live migration of virtual machines satisfies UFFD
    faults by fetching native-sized pages over the network (to avoid
    the latency of transferring an entire hugepage). However, after
    guest memory has been fully copied to the new host, MADV_COLLAPSE
    can be used to immediately increase guest performance.

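For illustration only (this is not code from the series): a minimal
userspace sketch of how a caller might request a collapse over a mapped
range, e.g. a file-backed text mapping as in use (1). The helper names
(hpage_round_up, hpage_round_down, collapse_aligned) are hypothetical,
and the 2 MiB huge page size is an assumption that holds for x86-64
with 4 KiB base pages but not for every configuration. The helper trims
the range to whole huge-page-aligned extents before calling madvise(2),
so the request only ever covers complete huge pages.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* MADV_COLLAPSE is 25 in the Linux uapi headers; older libc headers may lack it. */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* Assumption: 2 MiB PMD-sized huge pages (x86-64 with 4 KiB base pages). */
#define HPAGE_SIZE ((uintptr_t)2 << 20)

/* First huge-page boundary at or above addr. */
static uintptr_t hpage_round_up(uintptr_t addr)
{
	return (addr + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
}

/* Last huge-page boundary at or below addr. */
static uintptr_t hpage_round_down(uintptr_t addr)
{
	return addr & ~(HPAGE_SIZE - 1);
}

/*
 * Hypothetical helper: request a synchronous collapse of every whole
 * huge-page-aligned extent inside [addr, addr + len).
 *
 * Returns 0 on success, -errno if madvise() fails, and -EINVAL without
 * calling madvise() if the range contains no whole aligned huge page.
 */
static int collapse_aligned(void *addr, size_t len)
{
	uintptr_t start = hpage_round_up((uintptr_t)addr);
	uintptr_t end = hpage_round_down((uintptr_t)addr + len);

	if (start >= end)
		return -EINVAL;
	if (madvise((void *)start, end - start, MADV_COLLAPSE) != 0)
		return -errno;
	return 0;
}
```

A caller would mmap(2) the file (or shmem object) first and pass the
resulting address and length to collapse_aligned(). Kernels that do not
recognise MADV_COLLAPSE reject the advice with EINVAL, so callers
should treat a negative return as "no collapse happened" and carry on
with native-sized pages.
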
khugepaged has received a small improvement by association and can now
detect and collapse pte-mapped THPs. However, there is still work to be
done along the file collapse path. Compound pages of arbitrary order
still need to be supported, and THP collapse needs to be converted to
using folios in general. Eventually, we'd like to move away from the
read-only and executable-mapped constraints currently imposed on
eligible files and support any inode claiming huge folio support. That
said, I think the series as-is covers enough to claim that
MADV_COLLAPSE supports file/shmem memory.

Patches 1-3	Implement the guts of the series.
Patch 4		Is a tracepoint for debugging.
Patches 5-8	Refactor existing khugepaged selftests to work with new
		memory types.
Patch 9		Adds a userfaultfd selftest mode to mimic a functional
		test of UFFDIO_REGISTER_MODE_MINOR+MADV_COLLAPSE live
		migration.

Applies against mm-unstable.

[1] https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
[2] https://lore.kernel.org/linux-mm/YtBmhaiPHUTkJml8@google.com/

v1 -> v2:
- Add missing definition for khugepaged_add_pte_mapped_thp() in
  !CONFIG_SHMEM builds, in "mm/khugepaged: attempt to map
  file/shmem-backed pte-mapped THPs by pmds"
- Minor bugfixes in "mm/madvise: add file and shmem support to
  MADV_COLLAPSE" for !CONFIG_SHMEM, !CONFIG_TRANSPARENT_HUGEPAGE and
  some compiler settings.
- Rebased on latest mm-unstable

Zach O'Keefe (9):
  mm/shmem: add flag to enforce shmem THP in hugepage_vma_check()
  mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by
    pmds
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  selftests/vm: dedup THP helpers
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory

 include/linux/khugepaged.h               |  13 +-
 include/linux/shmem_fs.h                 |  10 +-
 include/trace/events/huge_memory.h       |  36 +
 kernel/events/uprobes.c                  |   2 +-
 mm/huge_memory.c                         |   2 +-
 mm/khugepaged.c                          | 289 ++++--
 mm/shmem.c                               |  18 +-
 tools/testing/selftests/vm/Makefile      |   2 +
 tools/testing/selftests/vm/khugepaged.c  | 828 ++++++++++++------
 tools/testing/selftests/vm/soft-dirty.c  |   2 +-
 .../selftests/vm/split_huge_page_test.c  |  12 +-
 tools/testing/selftests/vm/userfaultfd.c | 171 +++-
 tools/testing/selftests/vm/vm_util.c     |  36 +-
 tools/testing/selftests/vm/vm_util.h     |   5 +-
 14 files changed, 1040 insertions(+), 386 deletions(-)

-- 
2.37.2.672.g94769d06f0-goog