From: Minchan Kim <minchan@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Rik van Riel <riel@redhat.com>, Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH v2] rmap: fix theoretical race between do_wp_page and shrink_active_list
Date: Wed, 13 May 2015 10:21:07 +0900
Message-ID: <20150513012106.GB8267@blaptop>
In-Reply-To: <20150512152840.20805775ae82c69b9a8f3028@linux-foundation.org>

Hello Andrew,

On Tue, May 12, 2015 at 03:28:40PM -0700, Andrew Morton wrote:
> On Tue, 12 May 2015 13:18:39 +0300 Vladimir Davydov <vdavydov@parallels.com> wrote:
> 
> > As noted by Paul the compiler is free to store a temporary result in a
> > variable on stack, heap or global unless it is explicitly marked as
> > volatile, see:
> > 
> >   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html#sample-optimizations
> > 
> > This can result in a race between do_wp_page() and shrink_active_list()
> > as follows.
> > 
> > In do_wp_page() we can call page_move_anon_rmap(), which sets
> > page->mapping as follows:
> > 
> >   anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
> >   page->mapping = (struct address_space *) anon_vma;
> > 
> > The page in question may be on an LRU list, because nowhere in
> > do_wp_page() we remove it from the list, neither do we take any LRU
> > related locks. Although the page is locked, shrink_active_list() can
> > still call page_referenced() on it concurrently, because the latter does
> > not require an anonymous page to be locked:
> > 
> >   CPU0                          CPU1
> >   ----                          ----
> >   do_wp_page                    shrink_active_list
> >    lock_page                     page_referenced
> >                                   PageAnon->yes, so skip trylock_page
> >    page_move_anon_rmap
> >     page->mapping = anon_vma
> >                                   rmap_walk
> >                                    PageAnon->no
> >                                    rmap_walk_file
> >                                     BUG
> >     page->mapping += PAGE_MAPPING_ANON
> > 
> > This patch fixes this race by explicitly forbidding the compiler to
> > split page->mapping store in page_move_anon_rmap() with the aid of
> > WRITE_ONCE.
> > 
> > ...
> >
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -950,7 +950,7 @@ void page_move_anon_rmap(struct page *page,
> >  	VM_BUG_ON_PAGE(page->index != linear_page_index(vma, address), page);
> >  
> >  	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
> > -	page->mapping = (struct address_space *) anon_vma;
> > +	WRITE_ONCE(page->mapping, (struct address_space *) anon_vma);
> 
> Please let's not put things like WRITE_ONCE() in there without
> documenting them - otherwise it's terribly hard for readers to work out
> why it was added.
> 
> How's this look?
> 
> --- a/mm/rmap.c~rmap-fix-theoretical-race-between-do_wp_page-and-shrink_active_list-fix
> +++ a/mm/rmap.c
> @@ -950,6 +950,11 @@ void page_move_anon_rmap(struct page *pa
>  	VM_BUG_ON_PAGE(page->index != linear_page_index(vma, address), page);
>  
>  	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
> +	/*
> +	 * Ensure that anon_vma and the PAGE_MAPPING_ANON bit are written
> +	 * simultaneously, so a concurrent reader (eg shrink_active_list) will

IMHO, the comment would be better if it mentioned PageAnon in page_referenced
rather than shrink_active_list.
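
For anyone following the thread, the single-store idea can be sketched in
userspace roughly as below. This is only an illustration under stated
assumptions, not the kernel code: struct page_sketch, WRITE_ONCE_SKETCH,
READ_ONCE_SKETCH and the helper functions are hypothetical stand-ins, and
__typeof__ is a GCC/Clang extension. The point is that the pointer and the
PAGE_MAPPING_ANON bit are combined in a local first and then published with
one volatile store, so a reader doing the PageAnon-style check should see
either the old value or the fully tagged new one.

```c
#include <assert.h>
#include <stdint.h>

/* Low bit of ->mapping marks an anonymous page (mirrors the quoted patch). */
#define PAGE_MAPPING_ANON ((uintptr_t)1)

/*
 * Hypothetical userspace stand-ins for the kernel's WRITE_ONCE/READ_ONCE.
 * The volatile access asks the compiler to emit exactly one store or load
 * for an aligned, word-sized object instead of splitting or repeating it.
 */
#define WRITE_ONCE_SKETCH(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE_SKETCH(x)       (*(volatile __typeof__(x) *)&(x))

/* Hypothetical, simplified page: only the field discussed in the thread. */
struct page_sketch {
	void *mapping;
};

/* Reader side: the PageAnon-style test on the tag bit. */
static int page_anon(struct page_sketch *page)
{
	uintptr_t m = (uintptr_t)READ_ONCE_SKETCH(page->mapping);

	return (m & PAGE_MAPPING_ANON) != 0;
}

/* Reader side: strip the tag bit to recover the original pointer. */
static void *page_anon_vma(struct page_sketch *page)
{
	uintptr_t m = (uintptr_t)READ_ONCE_SKETCH(page->mapping);

	return (void *)(m & ~PAGE_MAPPING_ANON);
}

/* Writer side: build the tagged value locally, then publish it once. */
static void move_anon_rmap(struct page_sketch *page, void *anon_vma)
{
	uintptr_t tagged;

	/* The tag bit must be free, i.e. the pointer is at least 2-byte aligned. */
	assert(((uintptr_t)anon_vma & PAGE_MAPPING_ANON) == 0);

	tagged = (uintptr_t)anon_vma + PAGE_MAPPING_ANON;
	WRITE_ONCE_SKETCH(page->mapping, (void *)tagged);
}
```

A plain assignment would give the same result in a single-threaded test; the
difference only matters when another CPU reads ->mapping concurrently, which
is the case described in the changelog.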

-- 
Kind regards,
Minchan Kim

