From: "H. Peter Anvin" <hpa@zytor.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	stable <stable@vger.kernel.org>,
	"Alexander Duyck" <alexander.h.duyck@intel.com>,
	"Andrea Arcangeli" <aarcange@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Andrzej Pietrasiewicz" <andrzej.p@samsung.com>,
	"Arnd Bergmann" <arnd@arndb.de>, "Borislav Petkov" <bp@alien8.de>,
	"Borislav Petkov" <bp@suse.de>,
	"Christoph Lameter" <cl@linux.com>,
	"Daniel J Blueman" <daniel@numascale-asia.com>,
	"Dave Hansen" <dave@linux.vnet.ibm.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"Fenghua Yu" <fenghua.yu@intel.com>,
	"Frederic Weisbecker" <fweisbec@gmail.com>,
	"Gleb Natapov" <gleb@redhat.com>,
	"Gokul Caushik" <caushik1@gmail.com>,
	"H. J. Lu" <hjl.tools@gmail.com>,
	"Hugh Dickins" <hughd@google.com>, "Ingo Molnar" <mingo@elte.hu>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Jacob Shin" <jacob.shin@amd.com>,
	"Jamie Lokier" <jamie@shareable.org>,
	"Jarkko Sakkinen" <jarkko.sakkinen@intel.com>,
	"Jeremy Fitzhardinge" <jeremy@goop.org>,
	"Joe Millenbach" <jmillenbach@gmail.com>,
	"Joerg Roedel" <joro@8bytes.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Josh Triplett" <josh@joshtriplett.org>,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
	"Kyungmin Park" <kyungmin.park@samsung.com>,
	"Lee Schermerhorn" <Lee.Schermerhorn@hp.com>,
	"Len Brown" <len.brown@intel.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Marcelo Tosatti" <mtosatti@redhat.com>,
	"Marek Szyprowski" <m.szyprowski@samsung.com>,
	"Matt Fleming" <matt.fleming@intel.com>,
	"Mel Gorman" <mgorman@suse.de>, "Paul Turner" <pjt@google.com>,
	"Pavel Machek" <pavel@ucw.cz>,
	"Pekka Enberg" <penberg@kernel.org>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Ralf Baechle" <ralf@linux-mips.org>,
	"Rik van Riel" <riel@redhat.com>, "Rob Landley" <rob@landley.net>,
	"Russell King" <linux@arm.linux.org.uk>,
	"Rusty Russell" <rusty@rustcorp.com.au>,
	"Shuah Khan" <shuah.khan@hp.com>,
	"Shuah Khan" <shuahkhan@gmail.com>,
	"Stefano Stabellini" <stefano.stabellini@eu.citrix.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ville Syrjälä" <ville.syrjala@linux.intel.com>,
	"Yasuaki Ishimatsu" <isimatu.yasuaki@jp.fujitsu.com>,
	"Yinghai Lu" <yinghai@kernel.org>,
	"Zachary Amsden" <zamsden@gmail.com>,
	"Avi Kivity" <avi@redhat.com>,
	linux-mips <linux-mips@linux-mips.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	sparclinux@vger.kernel.org,
	virtualization <virtualization@lists.linux-foundation.org>,
	"Xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: [GIT PULL] x86/mm changes for v3.9-rc1
Date: Fri, 22 Feb 2013 09:31:27 -0800
Message-ID: <5127AB6F.3060806@zytor.com>
In-Reply-To: <CA+55aFy=tW2X4O-qKLh_YQjSFX7aBaBme4uy8kxawn1koKdt-g@mail.gmail.com>

On 02/22/2013 08:22 AM, Linus Torvalds wrote:
>
> Ugh. So I've tried to walk through this, and it's painful. If this
> results in problems, we're going to be *so* screwed. Is it bisectable?
>

I can't tell you for sure that it is bisectable at every point.  There 
are definite bisection points in there, though, as this is several 
pieces of work from two kernel cycles that were independently tested.

> I also don't understand how "early_idt_handler" could *possibly* work.
> In particular, it seems to rely on the trap number being set up in the
> stack frame:
>
>          cmpl $14,72(%rsp)       # Page fault?
>
> but that's not even *true*. Why? Because we export both the
> early_idt_handlers[] array (that sets up the trap number and makes the
> stack frame be reliable) and the single early_idt_handler function
> (that relies on the trap number and the reliable stack frame), AND
> AFAIK WE USE THE LATTER!
>
> See x86_64_start_kernel():
>
>          for (i = 0; i < NUM_EXCEPTION_VECTORS; i++) {
> #ifdef CONFIG_EARLY_PRINTK
>                  set_intr_gate(i, &early_idt_handlers[i]);
> #else
>                  set_intr_gate(i, early_idt_handler);
> #endif
>          }
>
> so unless you have CONFIG_EARLY_PRINTK, the interrupt gate will point
> to that raw early_idt_handler function that doesn't *work* on its own,
> afaik.
>

This is a (pre-existing!) bug that absolutely needs to be fixed, and it
ought to break other things too (early use of *msr_safe, for example, or
anything else that relies on an early exception entry, of which there
aren't a lot so far).  The fix is simple and obvious.

But you're right... what the heck is going on here?

My own testing would probably not have caught this, as I consider
EARLY_PRINTK a must-have, but Ingo's test machines definitely would have.

> Btw, it's not just the page fault index testing that is wrong. The whole
>
>          cmpl $__KERNEL_CS,96(%rsp)
>          jne 11f
>
> also relies on the stack frame being set up the same way for all
> exceptions - which again is only true if we ran through the
> early_idt_handlers[] prologue that added the extra stack entry.
>
> How does this even work for me? I don't have EARLY_PRINTK enabled.
>
> What am I missing?

I just ran a simulation without EARLY_PRINTK.  Presumably depending on
the memory layout, we can apparently go through the entire bootup
sequence without ever actually taking an early trap.  It is a bug,
though, and it is a bug even without this patchset.  I will submit a
fix.  However, the Xen "we tested this, this worked, now it doesn't"
worries me a lot.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.
