public inbox for linux-s390@vger.kernel.org
From: Zi Yan <ziy@nvidia.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Janosch Frank <frankja@linux.ibm.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>,
	David Hildenbrand <david@redhat.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Sven Schnelle <svens@linux.ibm.com>, Peter Xu <peterx@redhat.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Arnd Bergmann <arnd@arndb.de>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Barry Song <baohua@kernel.org>, Lance Yang <lance.yang@linux.dev>,
	Muchun Song <muchun.song@linux.dev>,
	Oscar Salvador <osalvador@suse.de>,
	Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Matthew Brost <matthew.brost@intel.com>,
	Joshua Hahn <joshua.hahnjy@gmail.com>,
	Rakie Kim <rakie.kim@sk.com>, Byungchul Park <byungchul@sk.com>,
	Gregory Price <gourry@gourry.net>,
	Ying Huang <ying.huang@linux.alibaba.com>,
	Alistair Popple <apopple@nvidia.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	Kemeng Shi <shikemeng@huaweicloud.com>,
	Kairui Song <kasong@tencent.com>, Nhat Pham <nphamcs@gmail.com>,
	Baoquan He <bhe@redhat.com>, Chris Li <chrisl@kernel.org>,
	SeongJae Park <sj@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Jason Gunthorpe <jgg@ziepe.ca>, Leon Romanovsky <leon@kernel.org>,
	Xu Xin <xu.xin16@zte.com.cn>,
	Chengming Zhou <chengming.zhou@linux.dev>,
	Jann Horn <jannh@google.com>, Miaohe Lin <linmiaohe@huawei.com>,
	Naoya Horiguchi <nao.horiguchi@gmail.com>,
	Pedro Falcato <pfalcato@suse.de>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Rik van Riel <riel@surriel.com>, Harry Yoo <harry.yoo@oracle.com>,
	Hugh Dickins <hughd@google.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-arch@vger.kernel.org,
	damon@lists.linux.dev
Subject: Re: [PATCH v3 03/16] mm: avoid unnecessary uses of is_swap_pte()
Date: Wed, 12 Nov 2025 11:03:24 -0500
Message-ID: <D80CF897-63BE-4C11-8363-57ABD5303DA1@nvidia.com>
In-Reply-To: <c69f57ff-c4b1-4fb9-8954-c5687dc2d904@lucifer.local>

On 12 Nov 2025, at 10:59, Lorenzo Stoakes wrote:

> On Tue, Nov 11, 2025 at 09:58:36PM -0500, Zi Yan wrote:
>> On 10 Nov 2025, at 17:21, Lorenzo Stoakes wrote:
>>
>>> There's an established convention in the kernel that we treat PTEs as
>>> containing swap entries (and the unfortunately named non-swap swap entries)
>>> should they be neither empty (i.e. pte_none() evaluating true) nor present
>>> (i.e. pte_present() evaluating true).
>>>
>>> However, there is some inconsistency in how this is applied, as we also
>>> have the is_swap_pte() helper which explicitly performs this check:
>>>
>>> 	/* check whether a pte points to a swap entry */
>>> 	static inline int is_swap_pte(pte_t pte)
>>> 	{
>>> 		return !pte_none(pte) && !pte_present(pte);
>>> 	}
>>>
>>> As this represents a predicate, it is logical to assume that it must first be
>>> checked in order to establish that a PTE entry can correctly be manipulated
>>> as a swap/non-swap entry.
>>>
>>> But instead, we far more often utilise the established convention of
>>> checking pte_none() / pte_present() before operating on entries as if they
>>> were swap/non-swap.
>>>
>>> This patch works towards correcting this inconsistency by removing all uses
>>> of is_swap_pte() where we are already in a position where we perform
>>> pte_none()/pte_present() checks anyway or otherwise it is clearly logical
>>> to do so.
>>>
>>> We also take advantage of the fact that pte_swp_uffd_wp() is only set on
>>> swap entries.
>>>
>>> Additionally, update comments referencing is_swap_pte() and
>>> non_swap_entry().
>>>
>>> No functional change intended.
>>>
>>> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>>> ---
>>>  fs/proc/task_mmu.c            | 49 ++++++++++++++++++++++++-----------
>>>  include/linux/userfaultfd_k.h |  3 +--
>>>  mm/hugetlb.c                  |  6 ++---
>>>  mm/internal.h                 |  6 ++---
>>>  mm/khugepaged.c               | 29 +++++++++++----------
>>>  mm/migrate.c                  |  2 +-
>>>  mm/mprotect.c                 | 43 ++++++++++++++----------------
>>>  mm/mremap.c                   |  7 +++--
>>>  mm/page_table_check.c         | 13 ++++++----
>>>  mm/page_vma_mapped.c          | 31 +++++++++++-----------
>>>  10 files changed, 104 insertions(+), 85 deletions(-)
>>>
>>
>> <snip>
>>
>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>> index be20468fb5a9..a4e23818f37f 100644
>>> --- a/mm/page_vma_mapped.c
>>> +++ b/mm/page_vma_mapped.c
>>> @@ -16,6 +16,7 @@ static inline bool not_found(struct page_vma_mapped_walk *pvmw)
>>>  static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>>>  		    spinlock_t **ptlp)
>>>  {
>>> +	bool is_migration;
>>>  	pte_t ptent;
>>>
>>>  	if (pvmw->flags & PVMW_SYNC) {
>>> @@ -26,6 +27,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>>>  		return !!pvmw->pte;
>>>  	}
>>>
>>> +	is_migration = pvmw->flags & PVMW_MIGRATION;
>>>  again:
>>>  	/*
>>>  	 * It is important to return the ptl corresponding to pte,
>>> @@ -41,11 +43,14 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>>>
>>>  	ptent = ptep_get(pvmw->pte);
>>>
>>> -	if (pvmw->flags & PVMW_MIGRATION) {
>>> -		if (!is_swap_pte(ptent))
>>
>> Here, with is_migration = true, map_pte() returns false if either
>> pte_none() or pte_present() is true, and ...
>>
>>> +	if (pte_none(ptent)) {
>>> +		return false;
>>> +	} else if (pte_present(ptent)) {
>>> +		if (is_migration)
>>>  			return false;
>>> -	} else if (is_swap_pte(ptent)) {
>>> +	} else if (!is_migration) {
>>>  		swp_entry_t entry;
>>> +
>>>  		/*
>>>  		 * Handle un-addressable ZONE_DEVICE memory.
>>>  		 *
>>> @@ -66,8 +71,6 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>>>  		if (!is_device_private_entry(entry) &&
>>>  		    !is_device_exclusive_entry(entry))
>>>  			return false;
>>> -	} else if (!pte_present(ptent)) {
>>> -		return false;
>>
>> ... with is_migration = false, this !pte_present() is actually pte_none(),
>> because the preceding is_swap_pte() branch has already handled swap entries.
>> So map_pte() should return false for pte_none() regardless of is_migration.
>
> I guess you were working this through :) Well, I decided to also work through
> it just to double-check I got it right, maybe useful for you also :P -
>
> Previously:
>
> 	if (is_migration) {
> 		if (!is_swap_pte(ptent))
> 			return false;
> 	} else if (is_swap_pte(ptent)) {
> 		... ZONE_DEVICE blah ...
> 	} else if (!pte_present(ptent)) {
> 		return false;
> 	}
>
> But is_swap_pte() is the same as !pte_none() && !pte_present(), so
> !is_swap_pte() is pte_none() || pte_present() by De Morgan's law:
>
> 	if (is_migration) {
> 		if (pte_none(ptent) || pte_present(ptent))
> 			return false;
> 	} else if (!pte_none(ptent) && !pte_present(ptent)) {
> 		... ZONE_DEVICE blah ...
> 	} else if (!pte_present(ptent)) {
> 		return false;
> 	}
>
> In the last branch, we know (again by De Morgan's law) that either
> pte_none(ptent) or pte_present(ptent) holds. But we explicitly check for
> !pte_present(ptent), so this becomes:
>
> 	if (is_migration) {
> 		if (pte_none(ptent) || pte_present(ptent))
> 			return false;
> 	} else if (!pte_none(ptent) && !pte_present(ptent)) {
> 		... ZONE_DEVICE blah ...
> 	} else if (pte_none(ptent)) {
> 		return false;
> 	}
>
> So we can generalise - regardless of is_migration, a pte_none() entry
> results in returning false:
>
> 	if (pte_none(ptent)) {
> 		return false;
> 	} else if (is_migration) {
> 		if (pte_none(ptent) || pte_present(ptent))
> 			return false;
> 	} else if (!pte_none(ptent) && !pte_present(ptent)) {
> 		... ZONE_DEVICE blah ...
> 	}
>
> Since we already check for pte_none() ahead of time, we can simplify again:
>
> 	if (pte_none(ptent)) {
> 		return false;
> 	} else if (is_migration) {
> 		if (pte_present(ptent))
> 			return false;
> 	} else if (!pte_present(ptent)) {
> 		... ZONE_DEVICE blah ...
> 	}
>
> We can then put the pte_present() check in the outer branch:
>
> 	if (pte_none(ptent)) {
> 		return false;
> 	} else if (pte_present(ptent)) {
> 		if (is_migration)
> 			return false;
> 	} else if (!is_migration) {
> 		... ZONE_DEVICE blah ...
> 	}
>
> Because previously an is_migration && !pte_present() case would result in no
> action here.
>
> Which is the code in this patch :)

Thanks again for spelling out the whole process.
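
The reduction quoted above can also be checked by brute force. What follows is a minimal userspace sketch, not kernel code: struct model_pte, enum outcome, map_pte_old(), map_pte_new() and count_map_pte_mismatches() are hypothetical names invented for illustration, and the model assumes pte_none() and pte_present() are never both true for the same entry. It reduces each version of the branch in map_pte() to one of three outcomes and compares them for every combination of PTE state and is_migration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a PTE: only the two predicates matter here. */
struct model_pte {
	bool none;
	bool present;
};

/* What the branch does: bail out, run the ZONE_DEVICE check, or do nothing. */
enum outcome { RETURN_FALSE, DEVICE_CHECK, FALL_THROUGH };

static bool pte_none(struct model_pte pte) { return pte.none; }
static bool pte_present(struct model_pte pte) { return pte.present; }

static bool is_swap_pte(struct model_pte pte)
{
	return !pte_none(pte) && !pte_present(pte);
}

/* Shape of the logic before the patch. */
static enum outcome map_pte_old(struct model_pte ptent, bool is_migration)
{
	if (is_migration) {
		if (!is_swap_pte(ptent))
			return RETURN_FALSE;
	} else if (is_swap_pte(ptent)) {
		return DEVICE_CHECK;
	} else if (!pte_present(ptent)) {
		return RETURN_FALSE;
	}
	return FALL_THROUGH;
}

/* Shape of the logic after the patch. */
static enum outcome map_pte_new(struct model_pte ptent, bool is_migration)
{
	if (pte_none(ptent)) {
		return RETURN_FALSE;
	} else if (pte_present(ptent)) {
		if (is_migration)
			return RETURN_FALSE;
	} else if (!is_migration) {
		return DEVICE_CHECK;
	}
	return FALL_THROUGH;
}

/* Compare both versions over all valid states; returns the mismatch count. */
int count_map_pte_mismatches(void)
{
	/* Assumption: none and present are mutually exclusive. */
	const struct model_pte states[] = {
		{ .none = true,  .present = false },	/* empty */
		{ .none = false, .present = true  },	/* present */
		{ .none = false, .present = false },	/* swap-like */
	};
	int mismatches = 0;

	for (size_t i = 0; i < sizeof(states) / sizeof(states[0]); i++) {
		for (int mig = 0; mig <= 1; mig++) {
			if (map_pte_old(states[i], mig) !=
			    map_pte_new(states[i], mig))
				mismatches++;
		}
	}
	return mismatches;
}

/* Convenience wrapper: aborts if the two versions ever disagree. */
void assert_map_pte_equivalent(void)
{
	assert(count_map_pte_mismatches() == 0);
}
```

Calling assert_map_pte_equivalent() from a small main() should pass silently. The mutual-exclusion assumption matters: if a state with both predicates true were added to the table, the two versions would disagree in the non-migration case.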

>
>>
>> This is a nice cleanup. Thanks.
>>
>>>  	}
>>>  	spin_lock(*ptlp);
>>>  	if (unlikely(!pmd_same(*pmdvalp, pmdp_get_lockless(pvmw->pmd)))) {
>>> @@ -113,21 +116,17 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>>>  			return false;
>>>
>>>  		pfn = softleaf_to_pfn(entry);
>>> -	} else if (is_swap_pte(ptent)) {
>>> -		swp_entry_t entry;
>>> +	} else if (pte_present(ptent)) {
>>> +		pfn = pte_pfn(ptent);
>>> +	} else {
>>> +		const softleaf_t entry = softleaf_from_pte(ptent);
>>>
>>>  		/* Handle un-addressable ZONE_DEVICE memory */
>>> -		entry = pte_to_swp_entry(ptent);
>>> -		if (!is_device_private_entry(entry) &&
>>> -		    !is_device_exclusive_entry(entry))
>>> -			return false;
>>> -
>>> -		pfn = swp_offset_pfn(entry);
>>> -	} else {
>>> -		if (!pte_present(ptent))
>>
>> This !pte_present() is pte_none(). It seems that there should be
>
> Well this should be fine though as:
>
> 		const softleaf_t entry = softleaf_from_pte(ptent);
>
> 		/* Handle un-addressable ZONE_DEVICE memory */
> 		if (!softleaf_is_device_private(entry) &&
> 		    !softleaf_is_device_exclusive(entry))
> 			return false;
>
> Still correctly handles none - as softleaf_from_pte() in case of pte_none() will
> be a none softleaf entry which will fail both of these tests.
>
> So excluding pte_none() as an explicit test here was part of the rework - we no
> longer have to do that.

Got it. Now my RB is yours. :)
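
The same kind of check applies to the check_pte() point above. This is only a sketch under stated assumptions: enum model_kind and the helper names below are hypothetical, only the branches shown in the quoted hunk are modelled (the earlier branches of check_pte() are left out), and the none case relies on the statement in the thread that softleaf_from_pte() on a none PTE gives a none entry that is neither device-private nor device-exclusive. Each function returns true when a PFN would be looked up and false when check_pte() would return false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical entry kinds; the real kernel encodes these in PTE bits. */
enum model_kind {
	KIND_NONE,		/* pte_none() */
	KIND_PRESENT,		/* pte_present() */
	KIND_SWAP,		/* ordinary swap entry */
	KIND_MIGRATION,		/* migration entry */
	KIND_DEVICE_PRIVATE,
	KIND_DEVICE_EXCLUSIVE,
	KIND_COUNT
};

static bool is_device_kind(enum model_kind kind)
{
	return kind == KIND_DEVICE_PRIVATE || kind == KIND_DEVICE_EXCLUSIVE;
}

/* Old shape: is_swap_pte() first, explicit !pte_present() in the else. */
static bool check_pte_old(enum model_kind kind)
{
	bool none = (kind == KIND_NONE);
	bool present = (kind == KIND_PRESENT);

	if (!none && !present) {	/* is_swap_pte() */
		if (!is_device_kind(kind))
			return false;
		return true;		/* pfn taken from the entry */
	} else {
		if (!present)		/* only none can reach this */
			return false;
		return true;		/* pfn taken from the PTE */
	}
}

/* New shape: pte_present() first, everything else goes via the entry. */
static bool check_pte_new(enum model_kind kind)
{
	if (kind == KIND_PRESENT)
		return true;		/* pfn taken from the PTE */

	/* Assumed: a none PTE yields a none entry, which is not a device kind. */
	if (!is_device_kind(kind))
		return false;
	return true;			/* pfn taken from the entry */
}

/* Compare both versions for every modelled kind; returns mismatch count. */
int count_check_pte_mismatches(void)
{
	int mismatches = 0;

	for (int kind = 0; kind < KIND_COUNT; kind++) {
		if (check_pte_old((enum model_kind)kind) !=
		    check_pte_new((enum model_kind)kind))
			mismatches++;
	}
	return mismatches;
}

/* Convenience wrapper: aborts if the two versions ever disagree. */
void assert_check_pte_equivalent(void)
{
	assert(count_check_pte_mismatches() == 0);
}
```

In this model the none case is rejected by the device-kind test alone, which is why the explicit pte_none() branch suggested earlier is not needed.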

>
>>
>> } else if (pte_none(ptent)) {
>> 	return false;
>> }
>>
>> before the above "} else {".
>>
>>> +		if (!softleaf_is_device_private(entry) &&
>>> +		    !softleaf_is_device_exclusive(entry))
>>>  			return false;
>>>
>>> -		pfn = pte_pfn(ptent);
>>> +		pfn = softleaf_to_pfn(entry);
>>>  	}
>>>
>>>  	if ((pfn + pte_nr - 1) < pvmw->pfn)
>>> --
>>> 2.51.0
>>
>> Otherwise, LGTM. With the above issue addressed, feel free to
>> add Reviewed-by: Zi Yan <ziy@nvidia.com>
>
> Thanks!
>
>>
>> --
>> Best Regards,
>> Yan, Zi
>
> Cheers, Lorenzo


Best Regards,
Yan, Zi

