From: Oleg Nesterov <oleg@redhat.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Colin Cross <ccross@android.com>,
Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
"Hampson, Steven T" <steven.t.hampson@intel.com>,
lkml <linux-kernel@vger.kernel.org>,
Kyungmin Park <kmpark@infradead.org>,
Christoph Hellwig <hch@infradead.org>,
John Stultz <john.stultz@linaro.org>,
Rob Landley <rob@landley.net>, Arnd Bergmann <arnd@arndb.de>,
Cyrill Gorcunov <gorcunov@openvz.org>,
David Rientjes <rientjes@google.com>,
Davidlohr Bueso <dave@gnu.org>, Kees Cook <keescook@chromium.org>,
Al Viro <viro@zeniv.linux.org.uk>, Mel Gorman <mgorman@suse.de>,
Michel Lespinasse <walken@google.com>,
Rik van Riel <riel@redhat.com>,
Konstantin Khlebnikov <khlebnikov@openvz.org>,
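Peter Zijlstra <a.p.zijlstra@chello.nl>,
Rusty Russell <rusty@rustcorp.com.au>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Michal Hocko <mhocko@suse.cz>,
Anton Vorontsov <anton.vorontsov@linaro.org>,
Pekka Enberg <penberg@kernel.org>, Shaohua Li <shli@fusionio.com>,
Sasha Levin <sasha.levin@oracle.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Ingo Molnar <mingo@kernel.org>,
"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>,
"open list:GENERIC INCLUDE/A..." <linux-arch@vger.kernel.org>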
Subject: Re: [PATCH 1/1] mm: mempolicy: fix mbind_range() && vma_adjust() interaction
Date: Tue, 9 Jul 2013 17:28:36 +0200
Message-ID: <20130709152836.GA10033@redhat.com>
In-Reply-To: <CAHGf_=qPuzH_R1Jfztnhj4JEAX9xfD37461LRKrhHgL4nq-eHg@mail.gmail.com>

On 07/08, KOSAKI Motohiro wrote:
>
> On Mon, Jul 8, 2013 at 2:05 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> > vma_adjust() does vma_set_policy(vma, vma_policy(next)) and this
> > is doubly wrong:
> >
> > 1. This leaks vma->vm_policy if it is not NULL and not equal to
> > next->vm_policy.
> >
> > This can happen if vma_merge() expands "area", not prev (case 8).
> >
> > 2. This sets the wrong policy if vma_merge() joins prev and area,
> > area is the vma the caller needs to update and it still has the
> > old policy.
> >
> > Revert 1444f92c "mm: merging memory blocks resets mempolicy" which
> > introduced these problems.
>
> Yes, I believe 1444f92c is wrong and should be reverted.

Yes, but the problem it tried to solve is real; we just can't rely
on vma_adjust().

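To make point 1 of the quoted changelog concrete, here is a minimal
userspace sketch of the leak. It is a toy model, not kernel code: the
types and the helpers toy_put(), take_over_overwrite() and
take_over_put_old() are hypothetical names, and it assumes the policy is
handed over by plain pointer assignment, as the changelog's description
of the leak implies.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, not the kernel's struct mempolicy / vm_area_struct. */
struct toy_policy { int refcnt; };
struct toy_vma { struct toy_policy *policy; };

/* Drop one reference; a real implementation would free at zero. */
static void toy_put(struct toy_policy *pol)
{
	if (pol) {
		assert(pol->refcnt > 0);
		pol->refcnt--;
	}
}

/*
 * Hand next's reference to vma by plain assignment. Whatever vma
 * pointed at before keeps its reference count but loses its pointer.
 */
static void take_over_overwrite(struct toy_vma *vma, struct toy_vma *next)
{
	vma->policy = next->policy;
	next->policy = NULL;
}

/* Same hand-over, but release vma's previous reference as well. */
static void take_over_put_old(struct toy_vma *vma, struct toy_vma *next)
{
	struct toy_policy *old = vma->policy;

	vma->policy = next->policy;
	next->policy = NULL;
	toy_put(old);
}
```

With take_over_overwrite() the old policy's count stays at 1 although
nothing points at it any more, which is the leak. take_over_put_old()
only shows what a balanced hand-over would look like in this model; the
patch under discussion takes a different route and reverts the
assignment.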
> > Change mbind_range() to recheck mpol_equal() after vma_merge() to
> > fix the problem 1444f92c tried to address.
> >
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> > Cc: <stable@vger.kernel.org>
> > ---
> > mm/mempolicy.c | 6 +++++-
> > mm/mmap.c | 2 +-
> > 2 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 7431001..4baf12e 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -732,7 +732,10 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
> >  		if (prev) {
> >  			vma = prev;
> >  			next = vma->vm_next;
> > -			continue;
> > +			if (mpol_equal(vma_policy(vma), new_pol))
> > +				continue;
> > +			/* vma_merge() joined vma && vma->next, case 8 */
>
> case 3 makes the same scenario?

Not really, afaics. "case 3" is when vma_merge() "merges" a hole with
a vma, whereas mbind_range() works only on already-mmapped regions.

More precisely, unless I misread this code, "case 3" means area == next,
so vma_adjust(area) actually sets next->vm_start = addr.

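The difference between the two cases can be shown with a small userspace
model. This is only a sketch: struct toy_vma and toy_expand() are
hypothetical, and toy_expand() merely imitates the effect of stretching
"area" over [addr, next->end) that the reply describes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy vma: a half-open range [start, end) in a singly linked list. */
struct toy_vma {
	unsigned long start, end;
	struct toy_vma *next;
};

/*
 * Stretch `area` so that it covers [addr, next->end).
 *
 * "case 3": area == next, so only next's start moves down over the hole.
 * "case 8": area != next, so area grows over next and next is unlinked.
 */
static struct toy_vma *toy_expand(struct toy_vma *area, struct toy_vma *next,
				  unsigned long addr)
{
	assert(addr <= area->start && area->end <= next->end);

	area->start = addr;
	area->end = next->end;
	if (area != next)
		area->next = next->next;	/* next has been absorbed */
	return area;
}
```

In the "case 3" call nothing that was already mapped changes owner, so
no existing policy can end up on the wrong range. In the "case 8" call
an existing vma swallows its neighbour, which is the situation the
mbind_range() recheck is meant to catch.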
I could easily be wrong, but to me vma_adjust() and its usage look a bit
overcomplicated. Perhaps it makes sense to distinguish the mmapped and
hole cases: mbind_range/madvise/etc need vma_join(vma, ...), not
prev/anon_vma/file.

Perhaps. Not sure.

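Coming back to the hunk quoted earlier, the recheck it adds can be
sketched as a toy model. Everything here is an assumption made for
illustration: policies are plain integers, toy_merge_case8() stands in
for a "case 8" merge that keeps the surviving vma's own policy, and
toy_bind() with its `recheck` flag is a hypothetical name for the
caller-side logic.

```c
#include <assert.h>
#include <stddef.h>

struct toy_vma {
	int policy;		/* policy id, stands in for vm_policy */
	struct toy_vma *next;
};

/* Toy "case 8": vma absorbs vma->next but keeps its own policy. */
static struct toy_vma *toy_merge_case8(struct toy_vma *vma)
{
	assert(vma->next != NULL);
	vma->next = vma->next->next;
	return vma;
}

/*
 * Apply new_pol to vma after a merge.
 * recheck == 0: trust the merge and skip the update (old behaviour).
 * recheck != 0: skip only if the merged vma already has new_pol.
 */
static void toy_bind(struct toy_vma *vma, int new_pol, int recheck)
{
	struct toy_vma *merged = toy_merge_case8(vma);

	if (!recheck || merged->policy == new_pol)
		return;
	/* the merged vma still carries the old policy, so update it */
	merged->policy = new_pol;
}
```

Without the recheck the merged vma keeps the stale policy in this model;
with it, the caller notices the mismatch and applies the new policy
itself instead of relying on the merge to have done so.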
> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Thanks!

Oleg.