From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kernel-team@fb.com
Subject: Re: [PATCH 1/6] mm: khugepaged: fix radix tree node leak in shmem collapse error path
Date: Fri, 11 Nov 2016 19:37:53 +0300
Message-ID: <20161111163753.GH19382@node.shutemov.name>
In-Reply-To: <20161111122224.GA5090@quack2.suse.cz>

On Fri, Nov 11, 2016 at 01:22:24PM +0100, Jan Kara wrote:
> On Fri 11-11-16 13:59:21, Kirill A. Shutemov wrote:
> > On Tue, Nov 08, 2016 at 11:12:45AM -0500, Johannes Weiner wrote:
> > > On Tue, Nov 08, 2016 at 10:53:52AM +0100, Jan Kara wrote:
> > > > On Mon 07-11-16 14:07:36, Johannes Weiner wrote:
> > > > > The radix tree counts valid entries in each tree node. Entries stored
> > > > > > in the tree cannot be removed by simply storing NULL in the slot or
> > > > > the internal counters will be off and the node never gets freed again.
> > > > >
> > > > > When collapsing a shmem page fails, restore the holes that were filled
> > > > > with radix_tree_insert() with a proper radix tree deletion.
> > > > >
> > > > > Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
> > > > > Reported-by: Jan Kara <jack@suse.cz>
> > > > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > > > > ---
> > > > > mm/khugepaged.c | 3 ++-
> > > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > > > index 728d7790dc2d..eac6f0580e26 100644
> > > > > --- a/mm/khugepaged.c
> > > > > +++ b/mm/khugepaged.c
> > > > > @@ -1520,7 +1520,8 @@ static void collapse_shmem(struct mm_struct *mm,
> > > > > if (!nr_none)
> > > > > break;
> > > > > /* Put holes back where they were */
> > > > > - radix_tree_replace_slot(slot, NULL);
> > > > > + radix_tree_delete(&mapping->page_tree,
> > > > > + iter.index);
> > > >
> > > > Hum, but this is inside radix_tree_for_each_slot() iteration. And
> > > > radix_tree_delete() may end up freeing nodes resulting in invalidating
> > > > current slot pointer and the iteration code will do use-after-free.
> > >
> > > Good point, we need to do another tree lookup after the deletion.
> > >
> > > But there are other instances in the code, where we drop the lock
> > > temporarily and somebody else could delete the node from under us.
> > >
> > > In the main collapse path, I *think* this is prevented by the fact
> > > that when we drop the tree lock we still hold the page lock of the
> > > regular page that's in the tree while we isolate and unmap it, thus
> > > pin the node. Even so, it would seem a little hairy to rely on that.
> > >
> > > Kirill?
> >
> > [ sorry for delay ]
> >
> > > Yes, we make sure that the locked page still belongs to the radix tree
> > > and bail out if it does not. A locked page cannot be removed from the
> > > radix tree, so we should be fine.
>
> Well, it cannot be removed from the radix tree but radix tree code is still
> free to collapse / expand the tree nodes as it sees fit (currently the only
> real case is when changing direct page pointer in the tree root to a node
> pointer or vice versa but still...). So code should not really assume that
> the node page is referenced from does not change once tree_lock is dropped.
> It leads to subtle bugs...

Hm. Okay.

What is the right way to re-validate that the slot is still valid? Do I need
a full lookup again? Can I pin the node explicitly?

--
Kirill A. Shutemov
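
For readers following the thread, here is a toy userspace sketch of the two
hazards discussed above. It is not kernel code and does not use the kernel
radix tree API: every toy_* name is hypothetical, and the "tree" is a single
node that is allocated on first insert and freed when its entry count drops
to zero. It only illustrates (a) why storing NULL straight into a slot leaves
the per-node count stale, and (b) why a slot pointer cached before a deletion
has to be looked up again by index, as suggested earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_SLOTS 4

/* Hypothetical one-level "tree": a single node, allocated on demand. */
struct toy_node {
	unsigned int count;		/* number of non-NULL slots */
	void *slots[TOY_SLOTS];
};

struct toy_tree {
	struct toy_node *node;		/* NULL while the tree is empty */
};

/* Insert an item and account for it. Returns 0 on success, -1 on error. */
static int toy_insert(struct toy_tree *tree, unsigned int index, void *item)
{
	if (index >= TOY_SLOTS || !item)
		return -1;
	if (!tree->node) {
		tree->node = calloc(1, sizeof(*tree->node));
		if (!tree->node)
			return -1;
	}
	if (tree->node->slots[index])
		return -1;		/* slot already occupied */
	tree->node->slots[index] = item;
	tree->node->count++;
	return 0;
}

/* Fresh lookup by index: NULL if the node is gone or the slot is empty. */
static void **toy_lookup_slot(struct toy_tree *tree, unsigned int index)
{
	if (!tree->node || index >= TOY_SLOTS || !tree->node->slots[index])
		return NULL;
	return &tree->node->slots[index];
}

/* Raw slot store: changes the slot but never touches node->count. */
static void toy_replace_slot(void **slot, void *item)
{
	*slot = item;
}

/* Proper deletion: fixes up the count and frees the node once it is empty. */
static void *toy_delete(struct toy_tree *tree, unsigned int index)
{
	void *item;

	if (!tree->node || index >= TOY_SLOTS)
		return NULL;
	item = tree->node->slots[index];
	if (!item)
		return NULL;
	tree->node->slots[index] = NULL;
	if (--tree->node->count == 0) {
		/* Any slot pointer cached by the caller now dangles. */
		free(tree->node);
		tree->node = NULL;
	}
	return item;
}
```

In this toy model, toy_replace_slot(slot, NULL) leaves count at its old value,
so the node is never freed. toy_delete() keeps the count correct, but the
caller must then drop any cached slot pointer and call toy_lookup_slot()
again before touching the tree.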