From: Minchan Kim <minchan@kernel.org>
To: Eric Biggers <ebiggers3@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, Dmitry Vyukov <dvyukov@google.com>,
Hugh Dickins <hughd@google.com>, Michal Hocko <mhocko@suse.com>,
stable@vger.kernel.org, David Rientjes <rientjes@google.com>,
Eric Biggers <ebiggers@google.com>
Subject: Re: [PATCH] mm/madvise: fix freeing of locked page with MADV_FREE
Date: Thu, 24 Aug 2017 07:19:59 +0900
Message-ID: <20170823221959.GA11202@bgram>
In-Reply-To: <20170823205235.132061-1-ebiggers3@gmail.com>
On Wed, Aug 23, 2017 at 01:52:35PM -0700, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> If madvise(..., MADV_FREE) split a transparent hugepage, it called
> put_page() before unlock_page(). This was wrong because put_page() can
> free the page, e.g. if a concurrent madvise(..., MADV_DONTNEED) has
> removed it from the memory mapping. put_page() then rightfully
> complained about freeing a locked page.
>
> Fix this by moving the unlock_page() before put_page().
>
> This bug was found by syzkaller, which encountered the following splat:
>
> BUG: Bad page state in process syzkaller412798 pfn:1bd800
> page:ffffea0006f60000 count:0 mapcount:0 mapping: (null) index:0x20a00
> flags: 0x200000000040019(locked|uptodate|dirty|swapbacked)
> raw: 0200000000040019 0000000000000000 0000000000020a00 00000000ffffffff
> raw: ffffea0006f60020 ffffea0006f60020 0000000000000000 0000000000000000
> page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> bad because of flags: 0x1(locked)
> Modules linked in:
> CPU: 1 PID: 3037 Comm: syzkaller412798 Not tainted 4.13.0-rc5+ #35
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:16 [inline]
> dump_stack+0x194/0x257 lib/dump_stack.c:52
> bad_page+0x230/0x2b0 mm/page_alloc.c:565
> free_pages_check_bad+0x1f0/0x2e0 mm/page_alloc.c:943
> free_pages_check mm/page_alloc.c:952 [inline]
> free_pages_prepare mm/page_alloc.c:1043 [inline]
> free_pcp_prepare mm/page_alloc.c:1068 [inline]
> free_hot_cold_page+0x8cf/0x12b0 mm/page_alloc.c:2584
> __put_single_page mm/swap.c:79 [inline]
> __put_page+0xfb/0x160 mm/swap.c:113
> put_page include/linux/mm.h:814 [inline]
> madvise_free_pte_range+0x137a/0x1ec0 mm/madvise.c:371
> walk_pmd_range mm/pagewalk.c:50 [inline]
> walk_pud_range mm/pagewalk.c:108 [inline]
> walk_p4d_range mm/pagewalk.c:134 [inline]
> walk_pgd_range mm/pagewalk.c:160 [inline]
> __walk_page_range+0xc3a/0x1450 mm/pagewalk.c:249
> walk_page_range+0x200/0x470 mm/pagewalk.c:326
> madvise_free_page_range.isra.9+0x17d/0x230 mm/madvise.c:444
> madvise_free_single_vma+0x353/0x580 mm/madvise.c:471
> madvise_dontneed_free mm/madvise.c:555 [inline]
> madvise_vma mm/madvise.c:664 [inline]
> SYSC_madvise mm/madvise.c:832 [inline]
> SyS_madvise+0x7d3/0x13c0 mm/madvise.c:760
> entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> Here is a C reproducer:
>
> #define _GNU_SOURCE
> #include <pthread.h>
> #include <sys/mman.h>
> #include <unistd.h>
>
> #define MADV_FREE 8
> #define PAGE_SIZE 4096
>
> static void *mapping;
> static const size_t mapping_size = 0x1000000;
>
> static void *madvise_thrproc(void *arg)
> {
> 	madvise(mapping, mapping_size, (long)arg);
> 	return NULL;
> }
>
> int main(void)
> {
> 	pthread_t t[2];
>
> 	for (;;) {
> 		mapping = mmap(NULL, mapping_size, PROT_WRITE,
> 			       MAP_POPULATE|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
>
> 		munmap(mapping + mapping_size / 2, PAGE_SIZE);
>
> 		pthread_create(&t[0], 0, madvise_thrproc, (void*)MADV_DONTNEED);
> 		pthread_create(&t[1], 0, madvise_thrproc, (void*)MADV_FREE);
> 		pthread_join(t[0], NULL);
> 		pthread_join(t[1], NULL);
> 		munmap(mapping, mapping_size);
> 	}
> }
>
> Note: to see the splat, CONFIG_TRANSPARENT_HUGEPAGE=y and
> CONFIG_DEBUG_VM=y are needed.
>
> Fixes: 854e9ed09ded ("mm: support madvise(MADV_FREE)")
> Google-Bug-Id: 64696096
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: stable@vger.kernel.org # v4.5+
> Acked-by: David Rientjes <rientjes@google.com>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Thanks!
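
For anyone reading this thread later, the ordering problem described in the
patch can be illustrated with a small user-space toy model. This is only a
sketch under stated assumptions: struct toy_page, toy_put_page(),
toy_unlock_page(), buggy_order() and fixed_order() are hypothetical names
made up for illustration, not kernel code, and the real put_page() and
unlock_page() do considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; user-space illustration only. */
struct toy_page {
	int refcount;
	bool locked;
	bool freed;
	bool bad_state;	/* set when the page is "freed" while still locked */
};

static void toy_unlock_page(struct toy_page *p)
{
	p->locked = false;
}

/* Drop one reference; the last put "frees" the page and checks its flags. */
static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0) {
		p->freed = true;
		if (p->locked)
			p->bad_state = true;	/* models "Bad page state" */
	}
}

/* Ordering before the patch: drop the reference, then unlock. */
static bool buggy_order(struct toy_page *p)
{
	toy_put_page(p);
	toy_unlock_page(p);	/* too late if the put above was the last one */
	return p->bad_state;
}

/* Ordering after the patch: unlock first, then drop the reference. */
static bool fixed_order(struct toy_page *p)
{
	toy_unlock_page(p);
	toy_put_page(p);
	return p->bad_state;
}
```

In this model, buggy_order() only reports a bad state when the caller holds
the last reference (refcount == 1), which corresponds to the case where a
concurrent MADV_DONTNEED has already dropped the mapping's reference. With
refcount == 2 both orderings look fine, which is roughly why the bug needed
a race to show up.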