From: Vladimir Davydov <vdavydov@parallels.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>, Greg Thelen <gthelen@google.com>,
	Michel Lespinasse <walken@google.com>,
	David Rientjes <rientjes@google.com>,
	Pavel Emelyanov <xemul@parallels.com>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Jonathan Corbet <corbet@lwn.net>,
	linux-api@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, Rik van Riel <riel@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	Christoph Lameter <cl@linux-foundation.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH v3 3/3] proc: add kpageidle file
Date: Fri, 8 May 2015 12:56:04 +0300
Message-ID: <20150508095604.GO31732@esperanza>
In-Reply-To: <20150504105459.GA19384@blaptop>
On Mon, May 04, 2015 at 07:54:59PM +0900, Minchan Kim wrote:
> So, I guess once below compiler optimization happens in __page_set_anon_rmap,
> it could be corrupt in page_referenced.
>
> __page_set_anon_rmap:
>	page->mapping = (struct address_space *) anon_vma;
>	page->mapping = (struct address_space *)((void *)page->mapping + PAGE_MAPPING_ANON);
>
> Because page_referenced checks it with PageAnon which has no memory barrier.
> So if above compiler optimization happens, page_referenced can pass the anon
> page in rmap_walk_file, not rmap_walk_anon. It's my theory. :)

FWIW

If such splits were possible, we would have bugs all over the kernel
IMO. An example is do_wp_page() vs shrink_active_list(). In do_wp_page()
we can call page_move_anon_rmap(), which sets page->mapping in exactly
the same fashion as the above-mentioned __page_set_anon_rmap():

	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
	page->mapping = (struct address_space *) anon_vma;

The page in question may be on an LRU list, because nowhere in
do_wp_page() do we remove it from the list, nor do we take any
LRU-related locks. The page is locked, that's true, but
shrink_active_list() calls page_referenced() on an unlocked page, so
according to your logic they can race, with the latter receiving a page
whose page->mapping equals anon_vma w/o the PAGE_MAPPING_ANON bit set:

CPU0                                    CPU1
----                                    ----
do_wp_page                              shrink_active_list
 lock_page                               page_referenced
                                          PageAnon->yes, so skip trylock_page
 page_move_anon_rmap
  page->mapping = anon_vma
                                          rmap_walk
                                           PageAnon->no
                                            rmap_walk_file
                                             BUG
  page->mapping = page->mapping+PAGE_MAPPING_ANON

However, this does not happen.
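
To make the single-store pattern above concrete, here is a minimal userspace sketch, not kernel code. The stub types and the helpers set_anon_mapping(), page_is_anon() and mapping_to_anon_vma() are hypothetical names made up for illustration; only PAGE_MAPPING_ANON and the "compute the tagged value first, then store it once" idea come from the discussion. The volatile store is an assumption, written in the spirit of the kernel's WRITE_ONCE(), to show one way a compiler could be told not to split the store:

```c
#include <assert.h>
#include <stdint.h>

/* Low-bit tag marking an anonymous mapping (value assumed to be 1 here). */
#define PAGE_MAPPING_ANON ((uintptr_t)1)

/* Hypothetical stand-ins for the kernel structures. */
struct anon_vma { int dummy; };
struct page { void *mapping; };

/* Reader side: analogous to a PageAnon() check, looks only at the tag bit. */
static inline int page_is_anon(const struct page *page)
{
	return ((uintptr_t)page->mapping & PAGE_MAPPING_ANON) != 0;
}

/*
 * Writer side: build the tagged pointer in a local variable, then publish
 * it with one store. The volatile access asks the compiler not to break
 * the assignment into "store anon_vma" followed by "add the tag bit".
 */
static inline void set_anon_mapping(struct page *page, struct anon_vma *anon_vma)
{
	void *tagged;

	/* The tag only works if the pointer's low bit is free. */
	assert(((uintptr_t)anon_vma & PAGE_MAPPING_ANON) == 0);

	tagged = (void *)((uintptr_t)anon_vma + PAGE_MAPPING_ANON);
	*(void * volatile *)&page->mapping = tagged;
}

/* Strip the tag to recover the original anon_vma pointer. */
static inline struct anon_vma *mapping_to_anon_vma(const struct page *page)
{
	return (struct anon_vma *)((uintptr_t)page->mapping & ~PAGE_MAPPING_ANON);
}
```

With this shape, a concurrent reader calling page_is_anon() sees either the old mapping or the fully tagged new one, never the untagged anon_vma pointer from the CPU0 column of the diagram.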
Thanks,
Vladimir