From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Axel Rasmussen, Nadav Amit, Mike Rapoport, Hugh Dickins, Mike Kravetz,
    "Kirill A . Shutemov", Alistair Popple, Jerome Glisse, Matthew Wilcox,
    Andrew Morton, peterx@redhat.com, David Hildenbrand, Andrea Arcangeli
Subject: [PATCH v6 01/23] mm: Introduce PTE_MARKER swap entry
Date: Mon, 15 Nov 2021 15:55:00 +0800
Message-Id: <20211115075522.73795-2-peterx@redhat.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20211115075522.73795-1-peterx@redhat.com>
References: <20211115075522.73795-1-peterx@redhat.com>

This patch introduces a new swap entry type called PTE_MARKER.  It can be
installed for any pte that maps file-backed memory when the pte is
temporarily zapped, so as to maintain per-pte information.

The information kept in the pte is called a "marker".  Here we define the
marker as "unsigned long" just to match pgoff_t, however it will only work
if it still fits in swp_offset(), which is e.g. currently 58 bits on x86_64.

A new config CONFIG_PTE_MARKER is introduced too; it's by default off.  A
bunch of helpers are defined together to serve the rest of the pte marker
code.

Signed-off-by: Peter Xu
---
 include/asm-generic/hugetlb.h |  9 ++++
 include/linux/swap.h          | 15 ++++++-
 include/linux/swapops.h       | 78 +++++++++++++++++++++++++++++++++++
 mm/Kconfig                    |  7 ++++
 4 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 8e1e6244a89d..f39cad20ffc6 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -2,6 +2,9 @@
 #ifndef _ASM_GENERIC_HUGETLB_H
 #define _ASM_GENERIC_HUGETLB_H
 
+#include
+#include
+
 static inline pte_t mk_huge_pte(struct page *page, pgprot_t pgprot)
 {
 	return mk_pte(page, pgprot);
@@ -80,6 +83,12 @@ static inline int huge_pte_none(pte_t pte)
 }
 #endif
 
+/* Please refer to comments above pte_none_mostly() for the usage */
+static inline int huge_pte_none_mostly(pte_t pte)
+{
+	return huge_pte_none(pte) || is_pte_marker(pte);
+}
+
 #ifndef __HAVE_ARCH_HUGE_PTE_WRPROTECT
 static inline pte_t huge_pte_wrprotect(pte_t pte)
 {
diff --git a/include/linux/swap.h b/include/linux/swap.h
index d1ea44b31f19..cc9adcfd666f 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -55,6 +55,19 @@ static inline int current_is_kswapd(void)
  * actions on faults.
  */
 
+/*
+ * PTE markers are used to persist information onto PTEs that are mapped with
+ * file-backed memories.  As its name "PTE" hints, it should only be applied to
+ * the leaves of pgtables.
+ */
+#ifdef CONFIG_PTE_MARKER
+#define SWP_PTE_MARKER_NUM 1
+#define SWP_PTE_MARKER     (MAX_SWAPFILES + SWP_HWPOISON_NUM + \
+			    SWP_MIGRATION_NUM + SWP_DEVICE_NUM)
+#else
+#define SWP_PTE_MARKER_NUM 0
+#endif
+
 /*
  * Unaddressable device memory support. See include/linux/hmm.h and
  * Documentation/vm/hmm.rst.  Short description is we need struct pages for
@@ -100,7 +113,7 @@ static inline int current_is_kswapd(void)
 
 #define MAX_SWAPFILES \
 	((1 << MAX_SWAPFILES_SHIFT) - SWP_DEVICE_NUM - \
-	SWP_MIGRATION_NUM - SWP_HWPOISON_NUM)
+	SWP_MIGRATION_NUM - SWP_HWPOISON_NUM - SWP_PTE_MARKER_NUM)
 
 /*
  * Magic header for a swap area. The first part of the union is
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index d356ab4047f7..5103d2a4ae38 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -247,6 +247,84 @@ static inline int is_writable_migration_entry(swp_entry_t entry)
 
 #endif
 
+typedef unsigned long pte_marker;
+
+#define PTE_MARKER_MASK (0)
+
+#ifdef CONFIG_PTE_MARKER
+
+static inline swp_entry_t make_pte_marker_entry(pte_marker marker)
+{
+	return swp_entry(SWP_PTE_MARKER, marker);
+}
+
+static inline bool is_pte_marker_entry(swp_entry_t entry)
+{
+	return swp_type(entry) == SWP_PTE_MARKER;
+}
+
+static inline pte_marker pte_marker_get(swp_entry_t entry)
+{
+	return swp_offset(entry) & PTE_MARKER_MASK;
+}
+
+static inline bool is_pte_marker(pte_t pte)
+{
+	return is_swap_pte(pte) && is_pte_marker_entry(pte_to_swp_entry(pte));
+}
+
+#else /* CONFIG_PTE_MARKER */
+
+static inline swp_entry_t make_pte_marker_entry(pte_marker marker)
+{
+	/* This should never be called if !CONFIG_PTE_MARKER */
+	WARN_ON_ONCE(1);
+	return swp_entry(0, 0);
+}
+
+static inline bool is_pte_marker_entry(swp_entry_t entry)
+{
+	return false;
+}
+
+static inline pte_marker pte_marker_get(swp_entry_t entry)
+{
+	return 0;
+}
+
+static inline bool is_pte_marker(pte_t pte)
+{
+	return false;
+}
+
+#endif /* CONFIG_PTE_MARKER */
+
+static inline pte_t make_pte_marker(pte_marker marker)
+{
+	return swp_entry_to_pte(make_pte_marker_entry(marker));
+}
+
+/*
+ * This is a special version to check pte_none() just to cover the case when
+ * the pte is a pte marker.  It existed because in many cases the pte marker
+ * should be seen as a none pte; it's just that we have stored some information
+ * onto the none pte so it becomes not-none any more.
+ *
+ * It should be used when the pte is file-backed, ram-based and backing
+ * userspace pages, like shmem.  It is not needed upon pgtables that do not
+ * support pte markers at all.  For example, it's not needed on anonymous
+ * memory, kernel-only memory (including when the system is during-boot),
+ * non-ram based generic file-system.  It's fine to be used even there, but the
+ * extra pte marker check will be pure overhead.
+ *
+ * For systems configured with !CONFIG_PTE_MARKER this will be automatically
+ * optimized to pte_none().
+ */
+static inline int pte_none_mostly(pte_t pte)
+{
+	return pte_none(pte) || is_pte_marker(pte);
+}
+
 static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
 {
 	struct page *p = pfn_to_page(swp_offset(entry));
diff --git a/mm/Kconfig b/mm/Kconfig
index 068ce591a13a..66f23c6c2032 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -897,6 +897,13 @@ config IO_MAPPING
 config SECRETMEM
 	def_bool ARCH_HAS_SET_DIRECT_MAP && !EMBEDDED
 
+config PTE_MARKER
+	def_bool n
+	bool "Marker PTEs support"
+
+	help
+	  Allows to create marker PTEs for file-backed memory.
+
 source "mm/damon/Kconfig"
 
 endmenu
-- 
2.32.0
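
For readers who want to experiment with the encoding outside the kernel, the
following is a minimal userspace sketch of the idea described in the commit
message: a marker is stored in the offset part of a swap entry whose type
field identifies it as a pte marker, so the marker must fit in the offset
bits.  This is only a model, not the kernel's actual bit layout.  Every
MODEL_* and model_* name is hypothetical, the 58-bit offset width is taken
from the x86_64 figure quoted above, and the 5-bit type width and the type
number 27 are assumptions made purely for illustration.  Note also that the
patch defines PTE_MARKER_MASK as (0), so the real pte_marker_get() returns 0
at this point in the series; the model returns the whole offset instead so
that the round trip is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All MODEL_* names are hypothetical; they only mimic the general idea. */
#define MODEL_OFFSET_BITS	58u	/* "58 bits on x86_64" per the commit message */
#define MODEL_TYPE_BITS		5u	/* assumed width of the type field */
#define MODEL_OFFSET_MASK	((UINT64_C(1) << MODEL_OFFSET_BITS) - 1)
#define MODEL_SWP_PTE_MARKER	27u	/* assumed type number for pte markers */

typedef struct { uint64_t val; } model_swp_entry_t;
typedef uint64_t model_pte_marker;

/* Pack a type and an offset into one word, with the type in the high bits. */
static inline model_swp_entry_t model_swp_entry(unsigned type, uint64_t offset)
{
	model_swp_entry_t e;

	assert(type < (1u << MODEL_TYPE_BITS));
	e.val = ((uint64_t)type << MODEL_OFFSET_BITS) |
		(offset & MODEL_OFFSET_MASK);
	return e;
}

static inline unsigned model_swp_type(model_swp_entry_t e)
{
	return (unsigned)(e.val >> MODEL_OFFSET_BITS);
}

static inline uint64_t model_swp_offset(model_swp_entry_t e)
{
	return e.val & MODEL_OFFSET_MASK;
}

/* A marker is only representable if it fits in the offset bits. */
static inline bool model_marker_fits(model_pte_marker m)
{
	return (m & ~MODEL_OFFSET_MASK) == 0;
}

static inline model_swp_entry_t model_make_pte_marker_entry(model_pte_marker m)
{
	return model_swp_entry(MODEL_SWP_PTE_MARKER, m);
}

static inline bool model_is_pte_marker_entry(model_swp_entry_t e)
{
	return model_swp_type(e) == MODEL_SWP_PTE_MARKER;
}

/* Unlike the patch (mask is 0 for now), return the whole stored offset. */
static inline model_pte_marker model_pte_marker_get(model_swp_entry_t e)
{
	return model_swp_offset(e);
}
```

In this model, a marker built with model_make_pte_marker_entry(0x1) is
recognised by model_is_pte_marker_entry() and read back unchanged by
model_pte_marker_get(), while any marker with bits set at or above bit 58 is
rejected by model_marker_fits() and would be silently truncated if it were
stored anyway.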