From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753171AbbCWSyo (ORCPT ); Mon, 23 Mar 2015 14:54:44 -0400
Received: from mail-we0-f182.google.com ([74.125.82.182]:35624 "EHLO
	mail-we0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752552AbbCWSyl (ORCPT ); Mon, 23 Mar 2015 14:54:41 -0400
Message-ID: <5510616C.4060206@message-id.googlemail.com>
Date: Mon, 23 Mar 2015 19:54:36 +0100
From: Stefan Seyfried
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: Andy Lutomirski , Denys Vlasenko
CC: Takashi Iwai , Denys Vlasenko , Jiri Kosina , Linus Torvalds ,
	X86 ML , LKML , Tejun Heo
Subject: Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
References: <5505400B.8050300@message-id.googlemail.com> <5509F161.3010101@redhat.com>
	<550AABCB.9040502@redhat.com> <550C6415.9050402@redhat.com>
	<55103A33.1060704@redhat.com>
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 23.03.2015 at 19:38, Andy Lutomirski wrote:
> I bet I see it.  I have the advantage of having stared at KVM code and
> cursed at it more recently than you, I suspect.  KVM does awful, awful
> things to CPU state, and, as an optimization, it allows kernel code to
> run with CPU state that would be totally invalid in user mode.  This
> happens through a bunch of hooks, including this bit in __switch_to:
>
> 	/*
> 	 * Now maybe reload the debug registers and handle I/O bitmaps
> 	 */
> 	if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT ||
> 		     task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
> 		__switch_to_xtra(prev_p, next_p, tss);
>
> IOW, we *change* tif during context switches.
>
>
> The race looks like this:
>
> 	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP)
> 	jnz   int_ret_from_sys_call_fixup	/* Go to the slow path */
>
> --- preempted here, switch to KVM guest ---
>
> KVM guest enters and screws up, say, MSR_SYSCALL_MASK.  This wouldn't
> happen to be a *32-bit* KVM guest, perhaps?

Not in my case (Penryn CPU); those were 64-bit guests.

> Now KVM schedules, calling __switch_to.  __switch_to sets
> _TIF_USER_RETURN_NOTIFY.  We IRET back to the syscall exit code, turn
> off interrupts, and do sysret.  We are now screwed.
>
> I don't know why this manifests in this particular failure, but any
> number of terrible things could happen now.
>
> FWIW, this will affect things other than KVM.  For example, SIGKILL
> sent while a process is sleeping in that two-instruction window won't
> work.
>
> Takashi, can you re-send your patch so we can review it for real in
> light of this race?

-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537
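
For readers following the thread, the check-then-act window described above
can be modelled in a few lines of user-space C. This is only a sketch under
stated assumptions: every sim_* name and flag value is hypothetical, nothing
here is kernel code, and the "fixed" variant shows just one possible shape of
a fix (test the flags only once preemption is no longer possible), not the
patch under discussion.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag bits; the values are illustrative, not the kernel's. */
#define SIM_TIF_USER_RETURN_NOTIFY (1u << 0)
#define SIM_TIF_ALLWORK_MASK       (SIM_TIF_USER_RETURN_NOTIFY)

/* Stands in for thread_info; only the flags word matters here. */
struct sim_task {
	unsigned int flags;
};

/* Models being scheduled away and back, with a flag set on the way. */
static void sim_preempt(struct sim_task *t)
{
	t->flags |= SIM_TIF_USER_RETURN_NOTIFY;
}

/*
 * Racy exit: test the flags, get preempted, then act on the stale result.
 * Returns true if the fast path ran although work was pending (the bug).
 */
static bool sim_racy_exit(struct sim_task *t, bool preempt_in_window)
{
	bool work_pending = (t->flags & SIM_TIF_ALLWORK_MASK) != 0; /* "testl" */

	if (preempt_in_window)
		sim_preempt(t);		/* lands between "testl" and "jnz" */

	if (work_pending)
		return false;		/* slow path handles the work */

	/* Fast path ("sysret"): is there work we never looked at? */
	return (t->flags & SIM_TIF_ALLWORK_MASK) != 0;
}

/*
 * Assumed fix shape: preemption can only happen before the test, so the
 * decision and the return see the same flags. Same return convention.
 */
static bool sim_fixed_exit(struct sim_task *t, bool preempt_before_test)
{
	bool work_pending;

	if (preempt_before_test)
		sim_preempt(t);		/* last point where preemption is allowed */

	/* From here on, nothing in this model can change t->flags. */
	work_pending = (t->flags & SIM_TIF_ALLWORK_MASK) != 0;
	if (work_pending)
		return false;		/* slow path handles the work */

	return (t->flags & SIM_TIF_ALLWORK_MASK) != 0;
}

/* Small driver; call it from a main() of your own. */
void sim_demo(void)
{
	struct sim_task a = { 0 }, b = { 0 };

	printf("racy:  fast path with pending work = %d\n",
	       sim_racy_exit(&a, true));
	printf("fixed: fast path with pending work = %d\n",
	       sim_fixed_exit(&b, true));
}
```

The model is deliberately single-threaded: the "preemption" is an explicit
call, which keeps the outcome deterministic and makes the two-instruction
window visible as the gap between reading the flags and acting on them.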