All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
To: Minchan Kim <minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>,
	Greg Thelen <gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Michel Lespinasse
	<walken-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
	Cyrill Gorcunov
	<gorcunov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>,
	Jonathan Corbet <corbet-T1hC0tSOHrs@public.gmane.org>,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Rik van Riel <riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Christoph Lameter
	<cl-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	"Paul E. McKenney"
	<paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	Peter Zijlstra
	<a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org>
Subject: Re: [PATCH v3 3/3] proc: add kpageidle file
Date: Fri, 8 May 2015 12:56:04 +0300	[thread overview]
Message-ID: <20150508095604.GO31732@esperanza> (raw)
In-Reply-To: <20150504105459.GA19384@blaptop>

On Mon, May 04, 2015 at 07:54:59PM +0900, Minchan Kim wrote:
> So, I guess once below compiler optimization happens in __page_set_anon_rmap,
> it could be corrupt in page_refernced.
> 
> __page_set_anon_rmap:
>         page->mapping = (struct address_space *) anon_vma;
>         page->mapping = (struct address_space *)((void *)page_mapping + PAGE_MAPPING_ANON);
> 
> Because page_referenced checks it with PageAnon which has no memory barrier.
> So if above compiler optimization happens, page_referenced can pass the anon
> page in rmap_walk_file, not ramp_walk_anon. It's my theory. :)

FWIW

If such splits were possible, we would have bugs all over the kernel
IMO. An example is do_wp_page() vs shrink_active_list(). In do_wp_page()
we can call page_move_anon_rmap(), which sets page->mapping in exactly
the same fashion as above-mentioned __page_set_anon_rmap():

	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
	page->mapping = (struct address_space *) anon_vma;

The page in question may be on an LRU list, because nowhere in
do_wp_page() we remove it from the list, neither do we take any LRU
related locks. The page is locked, that's true, but shrink_active_list()
calls page_referenced() on an unlocked page, so according to your logic
they can race with the latter receiving a page with page->mapping equal
to anon_vma w/o PAGE_MAPPING_ANON bit set:

CPU0				CPU1
----				----
do_wp_page			shrink_active_list
 lock_page			 page_referenced
				  PageAnon->yes, so skip trylock_page
 page_move_anon_rmap
  page->mapping = anon_vma
				  rmap_walk
				   PageAnon->no
				   rmap_walk_file
				    BUG
  page->mapping = page->mapping+PAGE_MAPPING_ANON

However, this does not happen.

Thanks,
Vladimir

WARNING: multiple messages have this Message-ID (diff)
From: Vladimir Davydov <vdavydov@parallels.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>, Greg Thelen <gthelen@google.com>,
	Michel Lespinasse <walken@google.com>,
	David Rientjes <rientjes@google.com>,
	Pavel Emelyanov <xemul@parallels.com>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Jonathan Corbet <corbet@lwn.net>,
	linux-api@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, Rik van Riel <riel@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	Christoph Lameter <cl@linux-foundation.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH v3 3/3] proc: add kpageidle file
Date: Fri, 8 May 2015 12:56:04 +0300	[thread overview]
Message-ID: <20150508095604.GO31732@esperanza> (raw)
In-Reply-To: <20150504105459.GA19384@blaptop>

On Mon, May 04, 2015 at 07:54:59PM +0900, Minchan Kim wrote:
> So, I guess once below compiler optimization happens in __page_set_anon_rmap,
> it could be corrupt in page_refernced.
> 
> __page_set_anon_rmap:
>         page->mapping = (struct address_space *) anon_vma;
>         page->mapping = (struct address_space *)((void *)page_mapping + PAGE_MAPPING_ANON);
> 
> Because page_referenced checks it with PageAnon which has no memory barrier.
> So if above compiler optimization happens, page_referenced can pass the anon
> page in rmap_walk_file, not ramp_walk_anon. It's my theory. :)

FWIW

If such splits were possible, we would have bugs all over the kernel
IMO. An example is do_wp_page() vs shrink_active_list(). In do_wp_page()
we can call page_move_anon_rmap(), which sets page->mapping in exactly
the same fashion as above-mentioned __page_set_anon_rmap():

	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
	page->mapping = (struct address_space *) anon_vma;

The page in question may be on an LRU list, because nowhere in
do_wp_page() we remove it from the list, neither do we take any LRU
related locks. The page is locked, that's true, but shrink_active_list()
calls page_referenced() on an unlocked page, so according to your logic
they can race with the latter receiving a page with page->mapping equal
to anon_vma w/o PAGE_MAPPING_ANON bit set:

CPU0				CPU1
----				----
do_wp_page			shrink_active_list
 lock_page			 page_referenced
				  PageAnon->yes, so skip trylock_page
 page_move_anon_rmap
  page->mapping = anon_vma
				  rmap_walk
				   PageAnon->no
				   rmap_walk_file
				    BUG
  page->mapping = page->mapping+PAGE_MAPPING_ANON

However, this does not happen.

Thanks,
Vladimir

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Vladimir Davydov <vdavydov@parallels.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>, Greg Thelen <gthelen@google.com>,
	Michel Lespinasse <walken@google.com>,
	David Rientjes <rientjes@google.com>,
	Pavel Emelyanov <xemul@parallels.com>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Jonathan Corbet <corbet@lwn.net>, <linux-api@vger.kernel.org>,
	<linux-doc@vger.kernel.org>, <linux-mm@kvack.org>,
	<cgroups@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Rik van Riel <riel@redhat.com>, Hugh Dickins <hughd@google.com>,
	Christoph Lameter <cl@linux-foundation.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH v3 3/3] proc: add kpageidle file
Date: Fri, 8 May 2015 12:56:04 +0300	[thread overview]
Message-ID: <20150508095604.GO31732@esperanza> (raw)
In-Reply-To: <20150504105459.GA19384@blaptop>

On Mon, May 04, 2015 at 07:54:59PM +0900, Minchan Kim wrote:
> So, I guess once below compiler optimization happens in __page_set_anon_rmap,
> it could be corrupt in page_refernced.
> 
> __page_set_anon_rmap:
>         page->mapping = (struct address_space *) anon_vma;
>         page->mapping = (struct address_space *)((void *)page_mapping + PAGE_MAPPING_ANON);
> 
> Because page_referenced checks it with PageAnon which has no memory barrier.
> So if above compiler optimization happens, page_referenced can pass the anon
> page in rmap_walk_file, not ramp_walk_anon. It's my theory. :)

FWIW

If such splits were possible, we would have bugs all over the kernel
IMO. An example is do_wp_page() vs shrink_active_list(). In do_wp_page()
we can call page_move_anon_rmap(), which sets page->mapping in exactly
the same fashion as above-mentioned __page_set_anon_rmap():

	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
	page->mapping = (struct address_space *) anon_vma;

The page in question may be on an LRU list, because nowhere in
do_wp_page() we remove it from the list, neither do we take any LRU
related locks. The page is locked, that's true, but shrink_active_list()
calls page_referenced() on an unlocked page, so according to your logic
they can race with the latter receiving a page with page->mapping equal
to anon_vma w/o PAGE_MAPPING_ANON bit set:

CPU0				CPU1
----				----
do_wp_page			shrink_active_list
 lock_page			 page_referenced
				  PageAnon->yes, so skip trylock_page
 page_move_anon_rmap
  page->mapping = anon_vma
				  rmap_walk
				   PageAnon->no
				   rmap_walk_file
				    BUG
  page->mapping = page->mapping+PAGE_MAPPING_ANON

However, this does not happen.

Thanks,
Vladimir

  reply	other threads:[~2015-05-08  9:56 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-28 12:24 [PATCH v3 0/3] idle memory tracking Vladimir Davydov
2015-04-28 12:24 ` Vladimir Davydov
2015-04-28 12:24 ` [PATCH v3 1/3] memcg: add page_cgroup_ino helper Vladimir Davydov
2015-04-28 12:24   ` Vladimir Davydov
2015-04-28 12:24 ` [PATCH v3 2/3] proc: add kpagecgroup file Vladimir Davydov
2015-04-28 12:24   ` Vladimir Davydov
2015-04-28 12:24   ` Vladimir Davydov
2015-04-28 12:24 ` [PATCH v3 3/3] proc: add kpageidle file Vladimir Davydov
2015-04-28 12:24   ` Vladimir Davydov
2015-04-29  4:35   ` Minchan Kim
2015-04-29  4:35     ` Minchan Kim
2015-04-29  9:12     ` Vladimir Davydov
2015-04-29  9:12       ` Vladimir Davydov
2015-04-30  8:25       ` Minchan Kim
2015-04-30  8:25         ` Minchan Kim
2015-04-30 14:50         ` Vladimir Davydov
2015-04-30 14:50           ` Vladimir Davydov
2015-04-30 14:50           ` Vladimir Davydov
2015-05-04  3:17           ` Minchan Kim
2015-05-04  3:17             ` Minchan Kim
2015-05-04  9:49             ` Vladimir Davydov
2015-05-04  9:49               ` Vladimir Davydov
2015-05-04  9:49               ` Vladimir Davydov
2015-05-04 10:54               ` Minchan Kim
2015-05-04 10:54                 ` Minchan Kim
2015-05-04 10:54                 ` Minchan Kim
2015-05-08  9:56                 ` Vladimir Davydov [this message]
2015-05-08  9:56                   ` Vladimir Davydov
2015-05-08  9:56                   ` Vladimir Davydov
2015-05-09 15:12                   ` Minchan Kim
2015-05-09 15:12                     ` Minchan Kim
2015-05-10 10:34                     ` Vladimir Davydov
2015-05-10 10:34                       ` Vladimir Davydov
2015-05-10 10:34                       ` Vladimir Davydov
2015-05-12  9:41                       ` Vladimir Davydov
2015-05-12  9:41                         ` Vladimir Davydov
     [not found]   ` <4c24a6bf2c9711dd4dbb72a43a16eba6867527b7.1430217477.git.vdavydov-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2015-04-29  4:57     ` Minchan Kim
2015-04-29  4:57       ` Minchan Kim
2015-04-29  4:57       ` Minchan Kim
2015-04-29  8:31       ` Vladimir Davydov
2015-04-29  8:31         ` Vladimir Davydov
2015-04-29  8:31         ` Vladimir Davydov
2015-04-30  6:55         ` Minchan Kim
2015-04-30  6:55           ` Minchan Kim
2015-04-30  6:55           ` Minchan Kim
2015-04-29  3:57 ` [PATCH v3 0/3] idle memory tracking Minchan Kim
2015-04-29  3:57   ` Minchan Kim
2015-04-29  7:58   ` Vladimir Davydov
2015-04-29  7:58     ` Vladimir Davydov
     [not found] ` <cover.1430217477.git.vdavydov-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2015-04-29  5:02   ` Minchan Kim
2015-04-29  5:02     ` Minchan Kim
2015-04-29  5:02     ` Minchan Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150508095604.GO31732@esperanza \
    --to=vdavydov-bzqdu9zft3wakbo8gow8eq@public.gmane.org \
    --cc=a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org \
    --cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=cl-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=corbet-T1hC0tSOHrs@public.gmane.org \
    --cc=gorcunov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
    --cc=gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org \
    --cc=hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
    --cc=mhocko-AlSwsSmVLrQ@public.gmane.org \
    --cc=minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
    --cc=riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=walken-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.