public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>,
	Brian Gerst <brgerst@gmail.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>, Pavel Machek <pavel@ucw.cz>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>, X86 ML <x86@kernel.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	Borislav Petkov <bp@alien8.de>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] x86: General protection fault after STR (32 bit systems only)
Date: Sat, 13 Jun 2015 09:00:46 +0200	[thread overview]
Message-ID: <20150613070046.GA26502@gmail.com> (raw)
In-Reply-To: <1434133883.2353.25.camel@spandruv-DESK3.jf.intel.com>


* Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> wrote:

> On Fri, 2015-06-12 at 11:11 -0700, Andy Lutomirski wrote:
> > On Fri, Jun 12, 2015 at 8:48 AM, Brian Gerst <brgerst@gmail.com> wrote:
> > > On Fri, Jun 12, 2015 at 4:36 AM, Ingo Molnar <mingo@kernel.org> wrote:
> > >>
> > >> * H. Peter Anvin <hpa@zytor.com> wrote:
> > >>
> > >>> %es is used implicitly by string instructions.
> > >>
> > >> Ok, so we are probably better off reloading ES as well early, right
> > >> when we return from the firmware, just in case something does
> > >> a copy before we hit the ES restore in restore_processor_state(),
> > >> which is a generic C function?
> > >>
> > >> Something like the patch below?
> > >>
> > >> I also added FS/GS/SS reloading to make it complete. If this (or a variant
> > >> thereof, it's still totally untested) works then we can remove the segment
> > >> save/restore layer in __save/restore_processor_state().
> > >>
> > >> Thanks,
> > >>
> > >>         Ingo
> > >>
> > >> ===========>
> > >>  arch/x86/kernel/acpi/wakeup_32.S | 13 +++++++++++++
> > >>  1 file changed, 13 insertions(+)
> > >>
> > >> diff --git a/arch/x86/kernel/acpi/wakeup_32.S b/arch/x86/kernel/acpi/wakeup_32.S
> > >> index 665c6b7d2ea9..1376a7fc21b7 100644
> > >> --- a/arch/x86/kernel/acpi/wakeup_32.S
> > >> +++ b/arch/x86/kernel/acpi/wakeup_32.S
> > >> @@ -61,6 +61,19 @@ ENTRY(wakeup_pmode_return)
> > >>
> > >>
> > >>  restore_registers:
> > >> +       /*
> > >> +        * In case the BIOS corrupted our segment descriptors,
> > >> +        * reload them to clear out any shadow descriptor
> > >> +        * state:
> > >> +        */
> > >> +       movl    $__USER_DS, %eax
> > >> +       movl    %eax, %ds
> > >> +       movl    %eax, %es
> > >> +       movl    %eax, %fs
> > >> +       movl    %eax, %gs
> > >> +       movl    $__KERNEL_DS, %eax
> > >> +       movl    %eax, %ss
> > >> +
> > >>         movl    saved_context_ebp, %ebp
> > >>         movl    saved_context_ebx, %ebx
> > >>         movl    saved_context_esi, %esi
> > >
> > > If you follow the convoluted flow of the calls in this file,
> > > wakeup_pmode_return is the first thing called from the trampoline on
> > > resume, and that loads the data segments with __KERNEL_DS.  The better
> > > fix would be to set ds/es to __USER_DS there.  Also since we are in
> > > the kernel, fs is fixed at __KERNEL_PERCPU, and gs is either
> > > __KERNEL_STACK_CANARY or user's gs.
> > 
> > So why is this issue rare?  Is it just that the first entry to
> > userspace after resume is only very rarely via SYSEXIT?  Or is there
> > some other code path that usually, but not quite always, fixes up the
> > segment registers?
>
> We spent a quite a bit of time debugging this. Any small debug code or printk 
> causes problem to disappear. We noticed that we can reproduce this more 
> frequently when the we run just one user space application to just do 
> suspend/resume test, which supports yours theory.

So what happens if you add a couple of CPUID (or NOP) calls? If it's timing 
related, not segment register related, then that might 'fix' the bug too.

Which would suggest that it might be some hardware problem, or a race with 
something (which could not really be on the Linux side, as we are the only thread 
of execution at this point)?

Thanks,

	Ingo

  reply	other threads:[~2015-06-13  7:01 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-11 23:45 [PATCH] x86: General protection fault after STR (32 bit systems only) Srinivas Pandruvada
2015-06-12  6:07 ` Ingo Molnar
2015-06-12  6:48   ` Andy Lutomirski
2015-06-12  7:15     ` Ingo Molnar
2015-06-12  7:41   ` Andy Lutomirski
2015-06-12  7:50     ` Ingo Molnar
2015-06-12  8:15       ` H. Peter Anvin
2015-06-12  8:36         ` Ingo Molnar
2015-06-12 15:48           ` Brian Gerst
2015-06-12 18:11             ` Andy Lutomirski
2015-06-12 18:31               ` Srinivas Pandruvada
2015-06-13  7:00                 ` Ingo Molnar [this message]
2015-06-12 22:45             ` Denys Vlasenko
2015-06-13 14:20               ` Pavel Machek
2015-06-13  7:03             ` Ingo Molnar
2015-06-13 18:23               ` Andy Lutomirski
2015-06-13 21:30                 ` Brian Gerst
2015-06-14  6:56                   ` [PATCH] x86: Load __USER_DS into DS/ES after resume Ingo Molnar
2015-06-14  7:03                     ` Pavel Machek
2015-06-12 16:15   ` [PATCH] x86: General protection fault after STR (32 bit systems only) Srinivas Pandruvada
2015-06-13  7:15     ` [PATCH, DEBUG] x86/32: Add small delay after resume Ingo Molnar
2015-06-15 16:10       ` Srinivas Pandruvada
2015-06-16 21:33         ` H. Peter Anvin
2015-06-16 22:25           ` Srinivas Pandruvada
2015-06-17 16:33           ` Konrad Rzeszutek Wilk
2015-06-17 17:22             ` H. Peter Anvin
2015-06-17 18:29               ` Konrad Rzeszutek Wilk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150613070046.GA26502@gmail.com \
    --to=mingo@kernel.org \
    --cc=bp@alien8.de \
    --cc=brgerst@gmail.com \
    --cc=dvlasenk@redhat.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mingo@redhat.com \
    --cc=pavel@ucw.cz \
    --cc=rjw@rjwysocki.net \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox