From: Alistair Popple <apopple@nvidia.com>
To: Miaohe Lin <linmiaohe@huawei.com>
Cc: akpm@linux-foundation.org, willy@infradead.org, vbabka@suse.cz,
	dhowells@redhat.com, neilb@suse.de, david@redhat.com,
	surenb@google.com, minchan@kernel.org, peterx@redhat.com,
	sfr@canb.auug.org.au, rcampbell@nvidia.com,
	naoya.horiguchi@nec.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/swapfile: unuse_pte can map random data if swap read fails
Date: Tue, 19 Apr 2022 13:51:26 +1000
Message-ID: <87tuapk9n7.fsf@nvdebian.thelocal>
In-Reply-To: <20220416030549.60559-1-linmiaohe@huawei.com>

Miaohe Lin <linmiaohe@huawei.com> writes:

> There is a bug in unuse_pte(): when a swap page happens to be unreadable,
> a page filled with random data is mapped into the user address space. In
> case of error, a special swap entry indicating that the swap read failed is
> set in the page table. The swapcache page can then be freed, and the user
> won't end up with a permanently mounted swap because a sector is bad. If
> the page is accessed later, the user process will be killed so that
> corrupted data is never consumed. On the other hand, if the page is never
> accessed, the user won't even notice it.

Hi Miaohe,

It seems we're not actually using the pfn that gets stored in the special swap
entry here. Is my understanding correct? If so, I think it would be better to
use the new PTE markers Peter introduced[1] rather than adding another swap
entry type.

[1] - <https://lore.kernel.org/linux-mm/20220405014833.14015-1-peterx@redhat.com/>

> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
> v2:
>   use special swap entry to avoid permanently mounted swap
>   free the bad page in swapcache
> ---
>  include/linux/swap.h    |  7 ++++++-
>  include/linux/swapops.h | 10 ++++++++++
>  mm/memory.c             |  5 ++++-
>  mm/swapfile.c           | 11 +++++++++++
>  4 files changed, 31 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index d112434f85df..03c576111737 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -55,6 +55,10 @@ static inline int current_is_kswapd(void)
>   * actions on faults.
>   */
>
> +#define SWAP_READ_ERROR_NUM 1
> +#define SWAP_READ_ERROR     (MAX_SWAPFILES + SWP_HWPOISON_NUM + \
> +			     SWP_MIGRATION_NUM + SWP_DEVICE_NUM + \
> +			     SWP_PTE_MARKER_NUM)
>  /*
>   * PTE markers are used to persist information onto PTEs that are mapped with
>   * file-backed memories.  As its name "PTE" hints, it should only be applied to
> @@ -120,7 +124,8 @@ static inline int current_is_kswapd(void)
>
>  #define MAX_SWAPFILES \
>  	((1 << MAX_SWAPFILES_SHIFT) - SWP_DEVICE_NUM - \
> -	SWP_MIGRATION_NUM - SWP_HWPOISON_NUM - SWP_PTE_MARKER_NUM)
> +	SWP_MIGRATION_NUM - SWP_HWPOISON_NUM - \
> +	SWP_PTE_MARKER_NUM - SWAP_READ_ERROR_NUM)
>
>  /*
>   * Magic header for a swap area. The first part of the union is
> diff --git a/include/linux/swapops.h b/include/linux/swapops.h
> index fffbba0036f6..d1093384de9f 100644
> --- a/include/linux/swapops.h
> +++ b/include/linux/swapops.h
> @@ -108,6 +108,16 @@ static inline void *swp_to_radix_entry(swp_entry_t entry)
>  	return xa_mk_value(entry.val);
>  }
>
> +static inline swp_entry_t make_swapin_error_entry(struct page *page)
> +{
> +	return swp_entry(SWAP_READ_ERROR, page_to_pfn(page));
> +}
> +
> +static inline int is_swapin_error_entry(swp_entry_t entry)
> +{
> +	return swp_type(entry) == SWAP_READ_ERROR;
> +}
> +
>  #if IS_ENABLED(CONFIG_DEVICE_PRIVATE)
>  static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset)
>  {
> diff --git a/mm/memory.c b/mm/memory.c
> index e6434b824009..34d1d66a05bd 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1476,7 +1476,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
>  			/* Only drop the uffd-wp marker if explicitly requested */
>  			if (!zap_drop_file_uffd_wp(details))
>  				continue;
> -		} else if (is_hwpoison_entry(entry)) {
> +		} else if (is_hwpoison_entry(entry) ||
> +			   is_swapin_error_entry(entry)) {
>  			if (!should_zap_cows(details))
>  				continue;
>  		} else {
> @@ -3724,6 +3725,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  			ret = vmf->page->pgmap->ops->migrate_to_ram(vmf);
>  		} else if (is_hwpoison_entry(entry)) {
>  			ret = VM_FAULT_HWPOISON;
> +		} else if (is_swapin_error_entry(entry)) {
> +			ret = VM_FAULT_SIGBUS;
>  		} else if (is_pte_marker_entry(entry)) {
>  			ret = handle_pte_marker(vmf);
>  		} else {
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 9398e915b36b..95b63f69f388 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1797,6 +1797,17 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>  		goto out;
>  	}
>
> +	if (unlikely(!PageUptodate(page))) {
> +		pte_t pteval;
> +
> +		dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
> +		pteval = swp_entry_to_pte(make_swapin_error_entry(page));
> +		set_pte_at(vma->vm_mm, addr, pte, pteval);
> +		swap_free(entry);
> +		ret = 0;
> +		goto out;
> +	}
> +
>  	/* See do_swap_page() */
>  	BUG_ON(!PageAnon(page) && PageMappedToDisk(page));
>  	BUG_ON(PageAnon(page) && PageAnonExclusive(page));
