From: Dave Hansen <dave@linux.vnet.ibm.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"David S. Miller" <davem@davemloft.net>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	stable@vger.kernel.org,
	"Alexander Duyck" <alexander.h.duyck@intel.com>,
	"Andrea Arcangeli" <aarcange@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Andrzej Pietrasiewicz" <andrzej.p@samsung.com>,
	"Arnd Bergmann" <arnd@arndb.de>, "Borislav Petkov" <bp@alien8.de>,
	"Borislav Petkov" <bp@suse.de>,
	"Christoph Lameter" <cl@linux.com>,
	"Daniel J Blueman" <daniel@numascale-asia.com>,
	"Eric Biederman" <ebiederm@xmission.com>,
	"Fenghua Yu" <fenghua.yu@intel.com>,
	"Frederic Weisbecker" <fweisbec@gmail.com>,
	"Gleb Natapov" <gleb@redhat.com>,
	"Gokul Caushik" <caushik1@gmail.com>,
	"H. J. Lu" <hjl.tools@gmail.com>,
	"Hugh Dickins" <hughd@google.com>, "Ingo Molnar" <mingo@elte.hu>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Jacob Shin" <jacob.shin@amd.com>,
	"Jamie Lokier" <jamie@shareable.org>,
	"Jarkko Sakkinen" <jarkko.sakkinen@intel.com>,
	"Jeremy Fitzhardinge" <jeremy@goop.org>,
	"Joe Millenbach" <jmillenbach@gmail.com>,
	"Joerg Roedel" <joro@8bytes.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Josh Triplett" <josh@joshtriplett.org>,
	"Kyungmin Park" <kyungmin.park@samsung.com>,
	"Lee Schermerhorn" <Lee.Schermerhorn@hp.com>,
	"Len Brown" <len.brown@intel.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Marcelo Tosatti" <mtosatti@redhat.com>,
	"Marek Szyprowski" <m.szyprowski@samsung.com>,
	"Matt Fleming" <matt.fleming@intel.com>,
	"Mel Gorman" <mgorman@suse.de>, "Paul Turner" <pjt@google.com>,
	"Pavel Machek" <pavel@ucw.cz>,
	"Pekka Enberg" <penberg@kernel.org>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Ralf Baechle" <ralf@linux-mips.org>,
	"Rik van Riel" <riel@redhat.com>, "Rob Landley" <rob@landley.net>,
	"Russell King" <linux@arm.linux.org.uk>,
	"Rusty Russell" <rusty@rustcorp.com.au>,
	"Shuah Khan" <shuah.khan@hp.com>,
	"Shuah Khan" <shuahkhan@gmail.com>,
	"Stefano Stabellini" <stefano.stabellini@eu.citrix.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ville Syrjälä" <ville.syrjala@linux.intel.com>,
	"Yasuaki Ishimatsu" <isimatu.yasuaki@jp.fujitsu.com>,
	"Yinghai Lu" <yinghai@kernel.org>,
	"Zachary Amsden" <zamsden@gmail.com>,
	avi@redhat.com, linux-mips@linux-mips.org,
	linux-pm@vger.kernel.org, mst@redhat.com,
	sparclinux@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	xen-devel@lists.xensource.com
Subject: Re: [GIT PULL] x86/mm changes for v3.9-rc1
Date: Fri, 22 Feb 2013 09:30:28 -0800
Message-ID: <5127AB34.8090406@linux.vnet.ibm.com>
In-Reply-To: <20130222165531.GA29308@phenom.dumpdata.com>

On 02/22/2013 08:55 AM, Konrad Rzeszutek Wilk wrote:
> On Thu, Feb 21, 2013 at 04:34:06PM -0800, H. Peter Anvin wrote:
>> Hi Linus,
>>
>> This is a huge set of several partly interrelated (and concurrently
>> developed) changes, which is why the branch history is messier than
>> one would like.
>>
>> The *really* big items are two humonguous patchsets mostly developed
>> by Yinghai Lu at my request, which completely revamps the way we
>> create initial page tables.  In particular, rather than estimating how
>> much memory we will need for page tables and then build them into that
>> memory -- a calculation that has shown to be incredibly fragile -- we
>> now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
>> a #PF handler which creates temporary page tables on demand.
>>
>> This has several advantages:
>>
>> 1. It makes it much easier to support things that need access to
>>    data very early (a followon patchset uses this to load microcode
>>    way early in the kernel startup).
>>
>> 2. It allows the kernel and all the kernel data objects to be invoked
>>    from above the 4 GB limit.  This allows kdump to work on very large
>>    systems.
>>
>> 3. It greatly reduces the difference between Xen and native (Xen's
>>    equivalent of the #PF handler are the temporary page tables created
>>    by the domain builder), eliminating a bunch of fragile hooks.
>>
>> The patch series also gets us a bit closer to W^X.
>>
>> Additional work in this pull is the 64-bit get_user() work which you
>> were also involved with, and a bunch of cleanups/speedups to
>> __phys_addr()/__pa().
> 
> Looking at figuring out which of the patches in the branch did this, but
> with this merge I am getting a crash with a very simple PV guest (booted with
> one 1G):
> 
> Call Trace:
>   [<ffffffff8103feba>] xen_get_user_pgd+0x5a  <--
>   [<ffffffff8103feba>] xen_get_user_pgd+0x5a 
>   [<ffffffff81042d27>] xen_write_cr3+0x77 
>   [<ffffffff81ad2d21>] init_mem_mapping+0x1f9 
>   [<ffffffff81ac293f>] setup_arch+0x742 
>   [<ffffffff81666d71>] printk+0x48 
>   [<ffffffff81abcd62>] start_kernel+0x90 
>   [<ffffffff8109416b>] __add_preferred_console.clone.1+0x9b 
>   [<ffffffff81abc5f7>] x86_64_start_reservations+0x2a 
>   [<ffffffff81abf0c7>] xen_start_kernel+0x564 

Do you have CONFIG_DEBUG_VIRTUAL on?

You're probably hitting the new BUG_ON() in __phys_addr().  It's
intended to detect places where someone is doing a __pa()/__phys_addr()
on an address that's outside the kernel's identity mapping.
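
For illustration only, here is a minimal user-space sketch of the kind of
range check described above. It is not the kernel's __phys_addr(): the
names sketch_phys_addr(), SKETCH_PAGE_OFFSET and SKETCH_DIRECT_SIZE are
hypothetical, the layout constants are assumptions, and it models a single
mapped window and returns false where a debug kernel would BUG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout constants, chosen for illustration only. */
#define SKETCH_PAGE_OFFSET 0xffff880000000000ULL /* assumed start of the mapping */
#define SKETCH_DIRECT_SIZE (1ULL << 30)          /* assume 1 GiB is mapped */

/*
 * Translate a virtual address inside the mapped window to a physical
 * address. Returns false, leaving *paddr untouched, when vaddr lies
 * outside the window; this is the case a debug check would flag.
 */
static bool sketch_phys_addr(uint64_t vaddr, uint64_t *paddr)
{
	uint64_t off;

	assert(paddr != NULL);

	/* Unsigned subtraction wraps to a huge value if vaddr is below the window. */
	off = vaddr - SKETCH_PAGE_OFFSET;

	if (off >= SKETCH_DIRECT_SIZE)
		return false;

	*paddr = off;
	return true;
}
```

With these assumed constants, an address one page into the window
translates to physical 0x1000, while a low (userspace-looking) address or
one past the end of the window is rejected.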

There are a lot of __pa() calls around there, but from the looks of it,
it's this code:

static pgd_t *xen_get_user_pgd(pgd_t *pgd)
{
...
        if (offset < pgd_index(USER_LIMIT)) {
                struct page *page = virt_to_page(pgd_page);

I'm a bit fuzzy on exactly what the code is trying to do here.  It could
mean either that the identity mapping isn't set up enough yet, or that
__pa() is getting called on a bogus address.

I'm especially fuzzy on why we'd be calling anything that's looking at
userspace pagetables (xen_get_user_pgd() ??) this early in boot.
