From: Sasha Levin <sasha.levin@oracle.com>
To: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Dave Jones <davej@redhat.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
David Rientjes <rientjes@google.com>
Subject: Re: 3.15-rc8 oops in copy_page_rep after page fault.
Date: Tue, 17 Jun 2014 16:31:43 -0400
Message-ID: <53A0A5AF.3000509@oracle.com>
In-Reply-To: <alpine.LSU.2.11.1406151957560.5820@eggly.anvils>
On 06/15/2014 11:01 PM, Hugh Dickins wrote:
> On Fri, 6 Jun 2014, Sasha Levin wrote:
>> > On 06/06/2014 02:49 PM, Kirill A. Shutemov wrote:
>>> > > On Fri, Jun 06, 2014 at 11:26:14AM -0700, Linus Torvalds wrote:
>>>>> > >> > On Fri, Jun 6, 2014 at 10:43 AM, Dave Jones <davej@redhat.com> wrote:
>>>>>>> > >>> > >
>>>>>>> > >>> > > RIP: 0010:[<ffffffff8b3287b5>] [<ffffffff8b3287b5>] copy_page_rep+0x5/0x10
>>>>> > >> >
>>>>> > >> > Ok, it's the first iteration of "rep movsq" (%rcx is still 0x200) for
>>>>> > >> > copying a page, and the pages are
>>>>> > >> >
>>>>> > >> > RSI: ffff880052766000
>>>>> > >> > RDI: ffff880014efe000
>>>>> > >> >
>>>>> > >> > which both look like reasonable kernel addresses. So I'm assuming it's
>>>>> > >> > DEBUG_PAGEALLOC that makes this trigger, and since the error code is
>>>>> > >> > 0, and the CR2 value matches RSI, it's the source page that seems to
>>>>> > >> > have been freed.
>>>>> > >> >
>>>>> > >> > And I see absolutely _zero_ reason for why your 64k mmap_min_addr
>>>>> > >> > should make any difference what-so-ever. That's just odd.
>>>>> > >> >
>>>>> > >> > Anyway, can you try to figure out _which_ copy_user_highpage() it is
>>>>> > >> > (by looking at what is around the call-site at
>>>>> > >> > "handle_mm_fault+0x1e0"). The fact that we have a stale
>>>>> > >> > do_huge_pmd_wp_page() on the stack makes me suspect that we have hit
>>>>> > >> > that VM_FAULT_FALLBACK case and this is related to splitting. Adding a
>>>>> > >> > few more people explicitly to the cc in case anybody sees anything
>>>>> > >> > (original email on lkml and linux-mm for context, guys).
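
For anyone following the register analysis above, here is a rough userspace sketch in C of the two facts it relies on: an x86 page-fault error code of 0 means a kernel-mode read of a not-present page, and a "rep movsq" count of 0x200 covers exactly one 4 KiB page. The bit positions follow the x86 architecture; the names decode_pf(), struct pf_info and rep_movsq_bytes() are made up for illustration and are not kernel functions.

```c
#include <stdbool.h>

/* x86 page-fault error code bits (architectural). */
#define PF_PROT  (1u << 0)   /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1)   /* 0: read access,      1: write access */
#define PF_USER  (1u << 2)   /* 0: kernel mode,      1: user mode */

/* Hypothetical helper type, for illustration only. */
struct pf_info {
    bool present;   /* fault hit a present page (protection fault) */
    bool write;     /* faulting access was a write */
    bool user;      /* fault happened in user mode */
};

/* Decode the low three bits of a page-fault error code. */
static struct pf_info decode_pf(unsigned int error_code)
{
    struct pf_info info = {
        .present = (error_code & PF_PROT) != 0,
        .write   = (error_code & PF_WRITE) != 0,
        .user    = (error_code & PF_USER) != 0,
    };
    return info;
}

/* "rep movsq" moves 8 bytes per iteration, %rcx iterations in total. */
static unsigned long rep_movsq_bytes(unsigned long rcx)
{
    return rcx * 8;
}
```

With error code 0, decode_pf() reports not-present, read, kernel mode, which is consistent with DEBUG_PAGEALLOC having unmapped the source page; rep_movsq_bytes(0x200) gives 4096, so no quadword had been copied yet when the fault hit.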
>>> > > Looks like a known false positive from DEBUG_PAGEALLOC:
>>> > >
>>> > > https://lkml.org/lkml/2013/3/29/103
>>> > >
>>> > > We copy the huge page in do_huge_pmd_wp_page() without ptl taken and the page
>>> > > can be split and freed under us. Once page is copied we take ptl again
>>> > > and recheck that PMD is not changed. If changed, we don't use new page.
>>> > > Not a bug, never triggered with DEBUG_PAGEALLOC disabled.
>>> > >
>>> > > It would be nice to have a way to mark this kind of speculative access.
>> >
>> > FWIW, this issue makes fuzzing with DEBUG_PAGEALLOC nearly impossible since
>> > this thing is so common we never get to do anything "fun" before this issue
>> > triggers.
>> >
>> > A fix would be more than welcome.
> Please give this a try: I think it's right, but I could easily be wrong.
>
>
> [PATCH] thp: fix DEBUG_PAGEALLOC oops in copy_page_rep
>
> Trinity has for over a year been reporting a CONFIG_DEBUG_PAGEALLOC
> oops in copy_page_rep() called from copy_user_huge_page() called from
> do_huge_pmd_wp_page().
>
> I believe this is a DEBUG_PAGEALLOC false positive, due to the source
> page being split, and a tail page freed, while copy is in progress; and
> not a problem without DEBUG_PAGEALLOC, since the pmd_same() check will
> prevent a miscopy from being made visible.
>
> Fix by adding get_user_huge_page() and put_user_huge_page(): reducing
> to the usual get_page() and put_page() on head page in the usual config;
> but get and put references to all of the tail pages when DEBUG_PAGEALLOC.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>
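
To make the idea in the description above concrete, here is a minimal userspace model in C, not the patch itself. It assumes a 2 MiB huge page made of 512 base pages (the usual x86-64 case) and uses plain integer refcounts; struct model_page, model_get_huge_page() and model_put_huge_page() are hypothetical names, and the real get_user_huge_page()/put_user_huge_page() operate on struct page with the kernel's own reference helpers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed: 2 MiB huge page / 4 KiB base pages = 512 subpages. */
#define MODEL_HPAGE_NR 512

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct model_page {
    int refcount;
};

/*
 * Usual config: pin only the head page.
 * DEBUG_PAGEALLOC config: also pin every tail page, so that a tail
 * freed after a concurrent split cannot be unmapped mid-copy.
 */
static void model_get_huge_page(struct model_page *head, bool debug_pagealloc)
{
    size_t n = debug_pagealloc ? MODEL_HPAGE_NR : 1;

    for (size_t i = 0; i < n; i++)
        head[i].refcount++;
}

/* Drop exactly the references taken by model_get_huge_page(). */
static void model_put_huge_page(struct model_page *head, bool debug_pagealloc)
{
    size_t n = debug_pagealloc ? MODEL_HPAGE_NR : 1;

    for (size_t i = 0; i < n; i++)
        head[i].refcount--;
}
```

In this model the copy would sit between the get and the put. With debug_pagealloc false, only head[0] is pinned, matching the plain get_page()/put_page() behaviour the description mentions; with it true, all 512 entries hold a reference for the duration of the copy.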
Works great, thanks Hugh!
Thanks,
Sasha