From: Rik van Riel <riel@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Steven Noonan <steven@uplinklabs.net>,
Linux Kernel mailing List <linux-kernel@vger.kernel.org>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Mel Gorman <mgorman@suse.de>, Alex Thorlton <athorlton@sgi.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [BISECTED] Linux 3.12.7 introduces page map handling regression
Date: Wed, 22 Jan 2014 13:07:44 -0500
Message-ID: <52E008F0.3060602@redhat.com>
In-Reply-To: <CA+55aFw7fTFJtOAa+RETGSL7ZXZE4Ysk9+Xmg6_5yyLkwRtcTw@mail.gmail.com>
On 01/21/2014 09:47 PM, Linus Torvalds wrote:
> On Tue, Jan 21, 2014 at 5:49 PM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
>>
>> Odds are this also shows up in 3.13, right?
>
> Probably. I don't have a Xen PV setup to test with (and very little
> interest in setting one up).. And I have a suspicion that it might not
> be so much about Xen PV, as perhaps about the kind of hardware.
>
> I suspect the issue has something to do with the magic _PAGE_NUMA
> tie-in with _PAGE_PRESENT. And then mprotect(PROT_NONE) ends up
> removing the _PAGE_PRESENT bit, and now the crazy numa code is
> confused.
>
> The whole _PAGE_NUMA thing is a f*cking horrible hack, and shares the
> bit with _PAGE_PROTNONE, which is why it then has that tie-in to
> _PAGE_PRESENT.

The numa balancing code should clear _PAGE_PRESENT and
set _PAGE_NUMA / _PAGE_PROTNONE.

The difference between a numa pte and a protnone pte is
the VMA permissions.

When the VMA is protnone, do_page_fault will kill the
app with a segfault. When the VMA has proper permissions,
handle_pte_fault will call do_numa_page, and numa-y things
are done.

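As a rough illustration of the distinction described above, here is a
userspace sketch, not kernel code. The bit values, the VMA_* flags, the
enum, and the function names are all made up for this example; only the
idea that _PAGE_NUMA and _PAGE_PROTNONE share a bit, and that the VMA
permissions decide which path is taken, comes from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative pte bits (assumed values, not taken from kernel headers). */
#define PTE_PRESENT   (1u << 0)
#define PTE_PROTNONE  (1u << 8)
#define PTE_NUMA      PTE_PROTNONE   /* the two markers share one bit */

/* Hypothetical VMA permission flags for this sketch. */
#define VMA_READ   (1u << 0)
#define VMA_WRITE  (1u << 1)
#define VMA_EXEC   (1u << 2)

enum fault_outcome {
    FAULT_SEGV,       /* protnone VMA: the app gets a segfault */
    FAULT_NUMA_HINT,  /* accessible VMA + numa pte: do_numa_page-style path */
    FAULT_OTHER       /* anything else; not modelled here */
};

/* A VMA with no read/write/exec permission is treated as protnone. */
bool vma_is_accessible(uint32_t vma_flags)
{
    return (vma_flags & (VMA_READ | VMA_WRITE | VMA_EXEC)) != 0;
}

/*
 * Decide what a fault on this pte means. The pte bit alone cannot tell
 * a numa pte from a protnone pte; the VMA permissions break the tie.
 */
enum fault_outcome classify_fault(uint32_t pte, uint32_t vma_flags)
{
    if (pte & PTE_PRESENT)
        return FAULT_OTHER;          /* present ptes are out of scope */
    if (!(pte & PTE_NUMA))
        return FAULT_OTHER;          /* e.g. never mapped or swapped out */
    if (!vma_is_accessible(vma_flags))
        return FAULT_SEGV;           /* genuine PROT_NONE mapping */
    return FAULT_NUMA_HINT;          /* numa hinting fault */
}
```

With this model, classify_fault(PTE_NUMA, VMA_READ | VMA_WRITE) yields
FAULT_NUMA_HINT, while the same pte bits under a VMA with no permissions
yield FAULT_SEGV.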
>
> Adding Andrea to the Cc, because he's the author of that horridness.
> Putting Steven's test-case here as an attachement for Andrea, maybe
> that makes him go "Ahh, yes, silly case".
>
> Also added Kirill, because he was involved the last _PAGE_NUMA debacle.
>
> Andrea, you can find the thread on lkml, but it boils down to commit
> 1667918b6483 (backported to 3.12.7 as 3d792d616ba4) breaking the
> attached test-case (but apparently only under Xen PV). There it
> apparently causes a "BUG: Bad page map .." error.
>
> And I suspect this is another of those "this bug is only visible on
> real numa machines, because _PAGE_NUMA isn't actually ever set
> otherwise". That has pretty much guaranteed that it gets basically
> zero testing, which is not a great idea when coupled with that subtle
> sharing of the _PAGE_PROTNONE bit..
>
> It may be that the whole "Xen PV" thing is a red herring, and that
> Steven only sees it on that one machine because the one he runs as a
> PV guest under is a real NUMA machine, and all the other machines he
> has tried it on haven't been numa. So it *may* be that that "only
> under Xen PV" is a red herring. But that's just a possible guess.
>
> Christ, how I hate that _PAGE_NUMA bit. Andrea: the fact that it gets
> no testing on any normal machines is a major problem. If it was simple
> and straightforward and the code was "obviously correct", it wouldn't
> be such a problem, but the _PAGE_NUMA code definitely does not fall
> under that "simple and obviously correct" heading.
>
> Guys, any ideas?
>
> Linus
>