From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752109AbbCRWij (ORCPT <rfc822;w@1wt.eu>);
	Wed, 18 Mar 2015 18:38:39 -0400
Received: from mail-wg0-f47.google.com ([74.125.82.47]:35959 "EHLO
	mail-wg0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750822AbbCRWih (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Wed, 18 Mar 2015 18:38:37 -0400
Message-ID: <5509FE69.3060002@message-id.googlemail.com>
Date: Wed, 18 Mar 2015 23:38:33 +0100
From: Stefan Seyfried <stefan.seyfried@googlemail.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: Andy Lutomirski <luto@amacapital.net>, Jiri Kosina <jkosina@suse.cz>
CC: Denys Vlasenko <dvlasenk@redhat.com>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Takashi Iwai <tiwai@suse.de>, X86 ML <x86@kernel.org>,
        LKML <linux-kernel@vger.kernel.org>, Tejun Heo <tj@kernel.org>
Subject: Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
References: <5505400B.8050300@message-id.googlemail.com> <s5hr3smfl11.wl-tiwai@suse.de> <s5hy4mukxpj.wl-tiwai@suse.de> <s5h4mpi5hbv.wl-tiwai@suse.de> <CALCETrVCMEcCOHZ35LneCU6uGH+W5SF0groKbUGp2zTjWpzB0w@mail.gmail.com> <5509CBF7.3040602@message-id.googlemail.com> <CALCETrU2R020HVniX2sczxexPO2qhEPbS++9DXzcxeycgxoGQg@mail.gmail.com> <CA+55aFwT4BJVR10i2Cm8pMH0UGd-J3EwnEUYKf3BWTM0awebbA@mail.gmail.com> <5509F161.3010101@redhat.com> <CALCETrXZvSiT41+AYAPizSsGZ_=O=7wmb+Lwo_ChEZySxUnH-A@mail.gmail.com> <alpine.LRH.2.00.1503182320490.13021@twin.jikos.cz> <CALCETrVnJHXhz81QCr7qmm0uwdw2t0EWe_zUw4E7bZB2WXQNTQ@mail.gmail.com>
In-Reply-To: <CALCETrVnJHXhz81QCr7qmm0uwdw2t0EWe_zUw4E7bZB2WXQNTQ@mail.gmail.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 18.03.2015 at 23:29, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 3:22 PM, Jiri Kosina <jkosina@suse.cz> wrote:
>> On Wed, 18 Mar 2015, Andy Lutomirski wrote:
>>
>>> sysret64 can only fail with #GP, and we're totally screwed if that
>>> happens,
>>
>> But what if the GPF handler pagefaults afterwards? It'd be operating on
>> user stack already.
> 
> Good point.
> 
> Stefan, can you try changing the first "jne
> opportunistic_sysret_failed" to "jmp opportunistic_sysret_failed" in
> entry_64.S and seeing if you can reproduce this?  (Is it easy enough
> to reproduce that this would tell us anything?)

I have no good way of reproducing the issue (it happens once per week...),
but apparently Takashi does, so I'd like to hand this task over to him.

> It's a shame that double_fault doesn't record what gs was on entry.
> If we did sysret -> general_protection -> page_fault -> double_fault,
> then we'd enter double_fault with usergs, whereas syscall ->
> page_fault -> double_fault would enter double_fault with kernelgs.
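
To make the two scenarios above easier to compare, here is a toy model of
the GS state along each path. It is only a sketch: it assumes the usual rule
that a non-paranoid exception entry executes swapgs only when the interrupted
context was user mode, and every function name in it is hypothetical, not
taken from entry_64.S.

```python
USERGS, KERNELGS = "usergs", "kernelgs"


def swapgs(gs: str) -> str:
    """Toggle between the user and kernel GS base, like the swapgs instruction."""
    return KERNELGS if gs == USERGS else USERGS


def exception_entry(gs: str, from_user_mode: bool) -> str:
    """Assumed entry rule: swap GS only if the exception interrupted user mode."""
    return swapgs(gs) if from_user_mode else gs


def gs_entering_double_fault(path: str) -> str:
    """Return the GS state at the moment double_fault is entered."""
    if path == "sysret":
        gs = KERNELGS                    # kernel is finishing a syscall
        gs = swapgs(gs)                  # exit path switches to user GS before sysret
        gs = exception_entry(gs, False)  # #GP from sysret is raised in ring 0
        gs = exception_entry(gs, False)  # page_fault inside the #GP handler
        return gs
    if path == "syscall":
        gs = USERGS                      # user space executes syscall
        gs = swapgs(gs)                  # entry path switches to kernel GS
        gs = exception_entry(gs, False)  # page_fault while in kernel mode
        return gs
    raise ValueError(f"unknown path: {path}")


for path in ("sysret", "syscall"):
    print(path, "->", gs_entering_double_fault(path))
```

The model only tracks which GS base is active; it says nothing about which
stack is in use or why the page fault happens.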
> 
> Hmm.  We may be able to answer this more directly.  Stefan, can you
> dump a couple hundred bytes starting at 0x00007fffa55eafb8 (i.e. your
> page_fault stack at the time of the failure)?  That will tell us the
> faulting address.  If that fails, try starting at 00007fffa55eb000
> instead.

Unfortunately not. Is this userspace memory? It's not in the dump I have.
This is the first issue I have seen where having a full dump would be
really helpful, and not just for cosmetic reasons...
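
For whoever does have such memory in a dump, a rough sketch of how the raw
bytes could be decoded. It assumes the dump starts exactly at the error-code
slot of a hardware-pushed x86-64 exception frame (error code, RIP, CS,
RFLAGS, RSP, SS, from low to high address), which may not hold for the
addresses quoted above. The helper names and the sample values are made up
for illustration.

```python
import struct
from typing import NamedTuple


class ExceptionFrame(NamedTuple):
    error_code: int
    rip: int
    cs: int
    rflags: int
    rsp: int
    ss: int


def decode_exception_frame(raw: bytes, offset: int = 0) -> ExceptionFrame:
    """Decode six little-endian 64-bit words starting at `offset`."""
    return ExceptionFrame(*struct.unpack_from("<6Q", raw, offset))


def interrupted_user_mode(frame: ExceptionFrame) -> bool:
    """The low two bits of the saved CS give the privilege level (3 = user)."""
    return (frame.cs & 3) == 3


# Synthetic bytes standing in for a real dump; none of these values come
# from the crash discussed in this thread.
sample = struct.pack(
    "<6Q",
    0x0,                 # error code
    0xFFFFFFFF81000000,  # RIP of the faulting instruction (made up)
    0x10,                # CS (ring 0 selector, made up)
    0x10046,             # RFLAGS (made up)
    0x00007FFFA55EB000,  # RSP at the time of the fault (made up)
    0x18,                # SS (made up)
)

frame = decode_exception_frame(sample)
print(f"rip={frame.rip:#x} rsp={frame.rsp:#x} user={interrupted_user_mode(frame)}")
```

Note that the frame records the instruction pointer and stack pointer at the
time of the fault; the faulting linear address of a page fault lives in CR2
and is not part of this frame.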
-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
Managing Director: Ralph Dehner / Registered office: Vohburg / District court: Ingolstadt, HRB 3537