All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alistair Popple <apopple@nvidia.com>
To: Peter Xu <peterx@redhat.com>
Cc: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	David Hildenbrand <david@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Hugh Dickins <hughd@google.com>,
	Tiberiu Georgescu <tiberiu.georgescu@nutanix.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Jerome Glisse <jglisse@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Miaohe Lin <linmiaohe@huawei.com>
Subject: Re: [PATCH v5 05/26] mm/swap: Introduce the idea of special swap ptes
Date: Fri, 16 Jul 2021 15:50:52 +1000	[thread overview]
Message-ID: <6116877.MhgVfB7NV9@nvdebian> (raw)
In-Reply-To: <20210715201422.211004-6-peterx@redhat.com>

Hi Peter,

[...]

> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ae1f5d0cb581..4b46c099ad94 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5738,7 +5738,7 @@ static enum mc_target_type get_mctgt_type(struct vm_area_struct *vma,
>  
>  	if (pte_present(ptent))
>  		page = mc_handle_present_pte(vma, addr, ptent);
> -	else if (is_swap_pte(ptent))
> +	else if (pte_has_swap_entry(ptent))
>  		page = mc_handle_swap_pte(vma, ptent, &ent);
>  	else if (pte_none(ptent))
>  		page = mc_handle_file_pte(vma, addr, ptent, &ent);

As I understand things pte_none() == False for a special swap pte, but
shouldn't this be treated as pte_none() here? Ie. does this need to be
pte_none(ptent) || is_swap_special_pte() here?

> diff --git a/mm/memory.c b/mm/memory.c
> index 0e0de08a2cd5..998a4f9a3744 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3491,6 +3491,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  	if (!pte_unmap_same(vmf))
>  		goto out;
>  
> +	/*
> +	 * We should never call do_swap_page upon a swap special pte; just be
> +	 * safe to bail out if it happens.
> +	 */
> +	if (WARN_ON_ONCE(is_swap_special_pte(vmf->orig_pte)))
> +		goto out;
> +
>  	entry = pte_to_swp_entry(vmf->orig_pte);
>  	if (unlikely(non_swap_entry(entry))) {
>  		if (is_migration_entry(entry)) {

Are there other changes required here? Because we can end up with stale special
pte's and a special pte is !pte_none don't we need to fix some of the !pte_none
checks in these functions:

insert_pfn() -> checks for !pte_none
remap_pte_range() -> BUG_ON(!pte_none)
apply_to_pte_range() -> didn't check further but it tests for !pte_none

In general it feels like I might be missing something here though. There are
plenty of checks in the kernel for pte_none() which haven't been updated. Is
there some rule that says none of those paths can see a special pte?

> diff --git a/mm/migrate.c b/mm/migrate.c
> index 23cbd9de030b..b477d0d5f911 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -294,7 +294,7 @@ void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
>  
>  	spin_lock(ptl);
>  	pte = *ptep;
> -	if (!is_swap_pte(pte))
> +	if (!pte_has_swap_entry(pte))
>  		goto out;
>  
>  	entry = pte_to_swp_entry(pte);
> @@ -2276,7 +2276,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  
>  		pte = *ptep;
>  
> -		if (pte_none(pte)) {
> +		if (pte_none(pte) || is_swap_special_pte(pte)) {

I was wondering if we can loose the special pte information here? However I see
that in migrate_vma_insert_page() we check again and fail the migration if
!pte_none() so I think this is ok.

I think it would be better if this check was moved below so the migration fails
early. Ie:

		if (pte_none(pte)) {
 			if (vma_is_anonymous(vma) && !is_swap_special_pte(pte)) {

Also how does this work for page migration in general? I can see in
page_vma_mapped_walk() that we skip special pte's, but doesn't this mean we
loose the special pte in that instance? Or is that ok for some reason?

>  			if (vma_is_anonymous(vma)) {
>  				mpfn = MIGRATE_PFN_MIGRATE;
>  				migrate->cpages++;
> diff --git a/mm/mincore.c b/mm/mincore.c
> index 9122676b54d6..5728c3e6473f 100644
> --- a/mm/mincore.c
> +++ b/mm/mincore.c
> @@ -121,7 +121,7 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>  	for (; addr != end; ptep++, addr += PAGE_SIZE) {
>  		pte_t pte = *ptep;
>  
> -		if (pte_none(pte))
> +		if (pte_none(pte) || is_swap_special_pte(pte))
>  			__mincore_unmapped_range(addr, addr + PAGE_SIZE,
>  						 vma, vec);
>  		else if (pte_present(pte))
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 883e2cc85cad..4b743394afbe 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -139,7 +139,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>  			}
>  			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
>  			pages++;
> -		} else if (is_swap_pte(oldpte)) {
> +		} else if (pte_has_swap_entry(oldpte)) {
>  			swp_entry_t entry = pte_to_swp_entry(oldpte);
>  			pte_t newpte;
>  
> diff --git a/mm/mremap.c b/mm/mremap.c
> index 5989d3990020..122b279333ee 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -125,7 +125,7 @@ static pte_t move_soft_dirty_pte(pte_t pte)
>  #ifdef CONFIG_MEM_SOFT_DIRTY
>  	if (pte_present(pte))
>  		pte = pte_mksoft_dirty(pte);
> -	else if (is_swap_pte(pte))
> +	else if (pte_has_swap_entry(pte))
>  		pte = pte_swp_mksoft_dirty(pte);
>  #endif
>  	return pte;
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index f7b331081791..ff57b67426af 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -36,7 +36,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw)
>  			 * For more details on device private memory see HMM
>  			 * (include/linux/hmm.h or mm/hmm.c).
>  			 */
> -			if (is_swap_pte(*pvmw->pte)) {
> +			if (pte_has_swap_entry(*pvmw->pte)) {
>  				swp_entry_t entry;
>  
>  				/* Handle un-addressable ZONE_DEVICE memory */
> @@ -90,7 +90,7 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
>  
>  	if (pvmw->flags & PVMW_MIGRATION) {
>  		swp_entry_t entry;
> -		if (!is_swap_pte(*pvmw->pte))
> +		if (!pte_has_swap_entry(*pvmw->pte))
>  			return false;
>  		entry = pte_to_swp_entry(*pvmw->pte);
>  
> @@ -99,7 +99,7 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
>  			return false;
>  
>  		pfn = swp_offset(entry);
> -	} else if (is_swap_pte(*pvmw->pte)) {
> +	} else if (pte_has_swap_entry(*pvmw->pte)) {
>  		swp_entry_t entry;
>  
>  		/* Handle un-addressable ZONE_DEVICE memory */
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 1e07d1c776f2..4993b4454c13 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1951,7 +1951,7 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>  	si = swap_info[type];
>  	pte = pte_offset_map(pmd, addr);
>  	do {
> -		if (!is_swap_pte(*pte))
> +		if (!pte_has_swap_entry(*pte))
>  			continue;
>  
>  		entry = pte_to_swp_entry(*pte);
> 






  reply	other threads:[~2021-07-16  5:51 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-15 20:13 [PATCH v5 00/26] userfaultfd-wp: Support shmem and hugetlbfs Peter Xu
2021-07-15 20:13 ` [PATCH v5 01/26] mm/shmem: Unconditionally set pte dirty in mfill_atomic_install_pte Peter Xu
2021-07-15 20:13 ` [PATCH v5 02/26] shmem/userfaultfd: Take care of UFFDIO_COPY_MODE_WP Peter Xu
2021-07-15 20:13 ` [PATCH v5 03/26] mm: Clear vmf->pte after pte_unmap_same() returns Peter Xu
2021-07-15 20:14 ` [PATCH v5 04/26] mm/userfaultfd: Introduce special pte for unmapped file-backed mem Peter Xu
2021-07-15 20:14 ` [PATCH v5 05/26] mm/swap: Introduce the idea of special swap ptes Peter Xu
2021-07-16  5:50   ` Alistair Popple [this message]
2021-07-16 19:11     ` Peter Xu
2021-07-21 11:28       ` Alistair Popple
2021-07-21 21:35         ` Peter Xu
2021-07-22  1:08           ` Alistair Popple
2021-07-22 15:21             ` Peter Xu
2021-07-15 20:14 ` [PATCH v5 06/26] shmem/userfaultfd: Handle uffd-wp special pte in page fault handler Peter Xu
2021-07-15 20:14 ` [PATCH v5 07/26] mm: Drop first_index/last_index in zap_details Peter Xu
2021-07-15 20:14 ` [PATCH v5 08/26] mm: Introduce zap_details.zap_flags Peter Xu
2021-07-15 20:14 ` [PATCH v5 09/26] mm: Introduce ZAP_FLAG_SKIP_SWAP Peter Xu
2021-07-15 20:14 ` [PATCH v5 10/26] shmem/userfaultfd: Persist uffd-wp bit across zapping for file-backed Peter Xu
2021-07-15 20:15 ` [PATCH v5 11/26] shmem/userfaultfd: Allow wr-protect none pte for file-backed mem Peter Xu
2021-07-15 20:16 ` [PATCH v5 12/26] shmem/userfaultfd: Allows file-back mem to be uffd wr-protected on thps Peter Xu
2021-07-15 20:16 ` [PATCH v5 13/26] shmem/userfaultfd: Handle the left-overed special swap ptes Peter Xu
2021-07-15 20:16 ` [PATCH v5 14/26] shmem/userfaultfd: Pass over uffd-wp special swap pte when fork() Peter Xu
2021-07-15 20:16 ` [PATCH v5 15/26] mm/hugetlb: Drop __unmap_hugepage_range definition from hugetlb.h Peter Xu
2021-07-15 20:16 ` [PATCH v5 16/26] mm/hugetlb: Introduce huge pte version of uffd-wp helpers Peter Xu
2021-07-15 20:16 ` [PATCH v5 17/26] hugetlb/userfaultfd: Hook page faults for uffd write protection Peter Xu
2021-07-20 15:37   ` kernel test robot
2021-07-20 15:37     ` kernel test robot
2021-07-21 21:50     ` Peter Xu
2021-07-21 21:50       ` Peter Xu
2021-07-15 20:16 ` [PATCH v5 18/26] hugetlb/userfaultfd: Take care of UFFDIO_COPY_MODE_WP Peter Xu
2021-07-20 23:59   ` kernel test robot
2021-07-20 23:59     ` kernel test robot
2021-07-15 20:16 ` [PATCH v5 19/26] hugetlb/userfaultfd: Handle UFFDIO_WRITEPROTECT Peter Xu
2021-07-21  8:24   ` kernel test robot
2021-07-21  8:24     ` kernel test robot
2021-07-15 20:16 ` [PATCH v5 20/26] mm/hugetlb: Introduce huge version of special swap pte helpers Peter Xu
2021-07-15 20:16 ` [PATCH v5 21/26] hugetlb/userfaultfd: Handle uffd-wp special pte in hugetlb pf handler Peter Xu
2021-07-15 20:16 ` [PATCH v5 22/26] hugetlb/userfaultfd: Allow wr-protect none ptes Peter Xu
2021-07-15 20:16 ` [PATCH v5 23/26] hugetlb/userfaultfd: Only drop uffd-wp special pte if required Peter Xu
2021-07-15 20:16 ` [PATCH v5 24/26] mm/pagemap: Recognize uffd-wp bit for shmem/hugetlbfs Peter Xu
2021-07-19  9:53   ` Tiberiu Georgescu
2021-07-19 16:03     ` Peter Xu
2021-07-19 17:23       ` Tiberiu Georgescu
2021-07-19 17:56         ` Peter Xu
2021-07-21 14:38           ` Ivan Teterevkov
2021-07-21 16:19             ` David Hildenbrand
2021-07-21 19:54               ` Ivan Teterevkov
2021-07-21 22:28                 ` Peter Xu
2021-07-21 22:57                   ` Peter Xu
2021-07-22  6:27                     ` David Hildenbrand
2021-07-22 16:08                       ` Peter Xu
2021-07-15 20:16 ` [PATCH v5 25/26] mm/userfaultfd: Enable write protection for shmem & hugetlbfs Peter Xu
2021-07-15 20:16 ` [PATCH v5 26/26] userfaultfd/selftests: Enable uffd-wp for shmem/hugetlbfs Peter Xu
2021-07-19 19:21 ` [PATCH v5 00/26] userfaultfd-wp: Support shmem and hugetlbfs David Hildenbrand
2021-07-19 20:12   ` Peter Xu
2021-07-22 18:30 ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6116877.MhgVfB7NV9@nvdebian \
    --to=apopple@nvidia.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=david@redhat.com \
    --cc=hughd@google.com \
    --cc=jgg@ziepe.ca \
    --cc=jglisse@redhat.com \
    --cc=kirill@shutemov.name \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=nadav.amit@gmail.com \
    --cc=peterx@redhat.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=tiberiu.georgescu@nutanix.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.