From: Minchan Kim <minchan@kernel.org>
To: Dave Hansen <dave.hansen@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>, Jann Horn <jannh@google.com>,
	Linux-MM <linux-mm@kvack.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	Daniel Colascione <dancol@google.com>,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings?
Date: Thu, 12 Mar 2020 19:00:18 -0700
Message-ID: <20200313020018.GC68817@google.com>
In-Reply-To: <bd35c17d-8766-cba5-09b3-87970de4c731@intel.com>

On Thu, Mar 12, 2020 at 02:41:07PM -0700, Dave Hansen wrote:
> One other fun thing.  I have a "victim" thread sitting in a loop doing:
> 
> 	sleep(1)
> 	memcpy(&garbage, buffer, sz);
> 
> The "attacker" is doing
> 
> 	madvise(buffer, sz, MADV_PAGEOUT);
> 
> in a loop.  That, oddly enough doesn't cause the victim to page fault.
> But, if I do:
> 
> 	memcpy(&garbage, buffer, sz);
> 	madvise(buffer, sz, MADV_PAGEOUT);
> 
> It *does* cause the memory to get paged out.  The MADV_PAGEOUT code
> actually has a !pte_present() check.  It will punt on a PTE if it sees
> it.  In other words, if a page is in the swap cache but not mapped by a
> pte_present() PTE, MADV_PAGEOUT won't touch it.
> 
> Shouldn't MADV_PAGEOUT be able to find and reclaim those pages?  Patch
> attached.

> 
> 
> ---
> 
>  b/mm/madvise.c |   38 +++++++++++++++++++++++++++++++-------
>  1 file changed, 31 insertions(+), 7 deletions(-)
> 
> diff -puN mm/madvise.c~madv-pageout-find-swap-cache mm/madvise.c
> --- a/mm/madvise.c~madv-pageout-find-swap-cache	2020-03-12 14:24:45.178775035 -0700
> +++ b/mm/madvise.c	2020-03-12 14:35:49.706773378 -0700
> @@ -248,6 +248,36 @@ static void force_shm_swapin_readahead(s
>  #endif		/* CONFIG_SWAP */
>  
>  /*
> + * Given a PTE, find the corresponding 'struct page'.  Also handles
> + * non-present swap PTEs.
> + */
> +struct page *pte_to_reclaim_page(struct vm_area_struct *vma,
> +				 unsigned long addr, pte_t ptent)
> +{
> +	swp_entry_t entry;
> +
> +	/* Totally empty PTE: */
> +	if (pte_none(ptent))
> +		return NULL;
> +
> +	/* A normal, present page is mapped: */
> +	if (pte_present(ptent))
> +		return vm_normal_page(vma, addr, ptent);
> +

Please check is_swap_pte first.
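
To illustrate the ordering, here is a minimal user-space model, not
kernel code. It reflects one possible reading of the suggestion: test
for a swap PTE explicitly before decoding the entry. All names below
(mock_pte_kind, reclaim_source, classify_pte and the mock_* helpers)
are hypothetical stand-ins. The only kernel fact assumed is that
is_swap_pte() is true for a PTE that is neither none nor present.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the PTE states the helper must tell apart. */
enum mock_pte_kind {
	MOCK_PTE_NONE,		/* models pte_none(): nothing mapped */
	MOCK_PTE_PRESENT,	/* models pte_present(): a page is mapped */
	MOCK_PTE_SWAP,		/* models a true swap entry */
	MOCK_PTE_NON_SWAP,	/* models a "swap PTE" that is not really swap */
};

/* Where the caller would look for a page to reclaim (hypothetical). */
enum reclaim_source {
	RECLAIM_SKIP,		/* nothing to reclaim for this PTE */
	RECLAIM_MAPPED_PAGE,	/* use the normally mapped page */
	RECLAIM_SWAP_CACHE,	/* look the entry up in the swap cache */
};

/* Models is_swap_pte(): neither empty nor present. */
static bool mock_is_swap_pte(enum mock_pte_kind kind)
{
	return kind != MOCK_PTE_NONE && kind != MOCK_PTE_PRESENT;
}

/* Models non_swap_entry() on an already decoded entry. */
static bool mock_non_swap_entry(enum mock_pte_kind kind)
{
	return kind == MOCK_PTE_NON_SWAP;
}

static enum reclaim_source classify_pte(enum mock_pte_kind kind)
{
	/* Swap check first: only then is it valid to decode the entry. */
	if (mock_is_swap_pte(kind)) {
		if (mock_non_swap_entry(kind))
			return RECLAIM_SKIP;
		return RECLAIM_SWAP_CACHE;
	}

	/* Not a swap PTE, so it is either present or empty. */
	if (kind == MOCK_PTE_PRESENT)
		return RECLAIM_MAPPED_PAGE;

	return RECLAIM_SKIP;
}
```

With this shape, the entry is only decoded on the branch where the
PTE is known to be a swap PTE.
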

> +	entry = pte_to_swp_entry(ptent);
> +	/* Is it one of the "swap PTEs" that's not really swap? */
> +	if (non_swap_entry(entry))
> +		return NULL;
> +
> +	/*
> +	 * The PTE was a true swap entry.  The page may be in the
> +	 * swap cache.  If so, find it and return it so it may be
> +	 * reclaimed.
> +	 */
> +	return lookup_swap_cache(entry, vma, addr);

If we go with handling only exclusively owned pages for anon,
I think we should apply the same rule to the swap cache, too.

Do you mind posting it as a formal patch?

Thanks for explaining the vulnerability and for the patch, Dave!

> +}
> +
> +/*
>   * Schedule all required I/O operations.  Do not wait for completion.
>   */
>  static long madvise_willneed(struct vm_area_struct *vma,
> @@ -389,13 +419,7 @@ regular_page:
>  	for (; addr < end; pte++, addr += PAGE_SIZE) {
>  		ptent = *pte;
>  
> -		if (pte_none(ptent))
> -			continue;
> -
> -		if (!pte_present(ptent))
> -			continue;
> -
> -		page = vm_normal_page(vma, addr, ptent);
> +		page = pte_to_reclaim_page(vma, addr, ptent);
>  		if (!page)
>  			continue;
>  
> _
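
For anyone who wants to try the madvise() call from the experiment
quoted at the top of this message, the following is a rough user-space
sketch under stated assumptions. It is not the original test program.
The helper names (resident_pages, try_pageout), the result struct, and
the use of mincore() to observe residency are all assumptions made for
illustration. MADV_PAGEOUT needs Linux 5.4 or later, and anonymous
pages can only be paged out if swap is configured, so no particular
"after" count is implied.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21		/* Linux uapi value, for older libc headers */
#endif

/* Hypothetical result record for one pageout attempt. */
struct pageout_result {
	long before;	/* resident pages after touching the buffer */
	long after;	/* resident pages after madvise(MADV_PAGEOUT) */
	int err;	/* errno from madvise(), or 0 on success */
};

/* Count the pages in [buf, buf + len) that mincore() reports as resident. */
static long resident_pages(void *buf, size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t n = (len + page - 1) / page;
	unsigned char *vec = malloc(n);
	long count = 0;

	if (!vec)
		return -1;
	if (mincore(buf, len, vec) != 0) {
		free(vec);
		return -1;
	}
	for (size_t i = 0; i < n; i++)
		count += vec[i] & 1;	/* bit 0 set means resident */
	free(vec);
	return count;
}

/* Touch an anonymous buffer, ask for pageout, record residency both times. */
static int try_pageout(size_t npages, struct pageout_result *res)
{
	size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return -1;

	memset(buf, 0xaa, len);		/* fault every page in */
	res->before = resident_pages(buf, len);

	res->err = 0;
	if (madvise(buf, len, MADV_PAGEOUT) != 0)
		res->err = errno;	/* e.g. EINVAL on kernels before 5.4 */

	res->after = resident_pages(buf, len);
	munmap(buf, len);
	return 0;
}
```

Calling try_pageout(16, &res) and comparing res.before with res.after
gives a quick view of how many pages were reclaimed. Running the
madvise() step without touching the buffer first would approximate
the "attacker only" case from the quoted text.
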