From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752621AbbC1JjX (ORCPT ); Sat, 28 Mar 2015 05:39:23 -0400
Received: from mail-wi0-f171.google.com ([209.85.212.171]:35934 "EHLO
	mail-wi0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751272AbbC1JjV (ORCPT );
	Sat, 28 Mar 2015 05:39:21 -0400
Date: Sat, 28 Mar 2015 10:39:16 +0100
From: Ingo Molnar
To: Linus Torvalds
Cc: Brian Gerst , Andy Lutomirski , Denys Vlasenko , Borislav Petkov ,
	"linux-kernel@vger.kernel.org" , X86 ML
Subject: Re: ia32_sysenter_target does not preserve EFLAGS
Message-ID: <20150328093916.GA9900@gmail.com>
References: <5515686B.3080204@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Linus Torvalds wrote:

> On Fri, Mar 27, 2015 at 1:53 PM, Brian Gerst wrote:
> >> <-- IRQ. Boom
> >
> > The sti will delay interrupts for one instruction, and that should include NMIs.
>
> Nope. Intel explicitly documents the NMI case only for mov->ss and popss.
>
> > The Intel SDM states for STI:
> > "The IF flag and the STI and CLI instructions do not prohibit the
> > generation of exceptions and NMI interrupts. NMI
> > interrupts (and SMIs) may be blocked for one macroinstruction following an STI."
>
> Note the *may*. For movss and popss the software developer guide
> explicitly says that NMI's are also blocked.
>
> For plain sti, it seems to be dependent on microarchitecture.

Well, how about 'STI+HLT' aka safe_halt()? If an NMI is allowed after
that STI then we might lose wakeups, or in extreme cases (with
full-dynticks) might lock up for a long time until the next irq comes,
even with runnable tasks around? Arguably that's a race condition that
is not very easy to notice on a typical system.

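
The STI+HLT concern above can be sketched as a toy user-space model.
This is an illustration only, not a description of real hardware: the
struct, the function names, and in particular the assumption that an
NMI taken inside the STI shadow ends the shadow early are all
hypothetical and exist only to make the race visible.

```c
#include <stdbool.h>

/* Toy CPU state. Every field and function here is hypothetical. */
struct toy_cpu {
	bool irq_enabled;	/* models EFLAGS.IF */
	bool sti_shadow;	/* one-instruction shadow opened by STI */
	bool irq_pending;	/* a wakeup IRQ is waiting for delivery */
	bool need_resched;	/* set by the modelled IRQ handler */
	bool halted;		/* CPU is sitting in HLT */
};

/* Deliver the pending IRQ if IF is set and no shadow is active. */
static void maybe_deliver_irq(struct toy_cpu *c)
{
	if (c->irq_pending && c->irq_enabled && !c->sti_shadow) {
		c->irq_pending = false;
		c->need_resched = true;	/* handler wakes a task */
		c->halted = false;	/* an IRQ ends HLT */
	}
}

/*
 * Model one pass through an idle path that found nothing to run with
 * interrupts off and then executes STI; HLT.
 * Returns true if the CPU ends up halted although work is pending.
 */
bool run_safe_halt(bool nmi_in_shadow)
{
	/* The wakeup IRQ was raised while interrupts were still off. */
	struct toy_cpu c = { .irq_pending = true };

	/* STI: set IF and open the shadow over the next instruction. */
	c.irq_enabled = true;
	c.sti_shadow = true;
	maybe_deliver_irq(&c);		/* blocked by the shadow */

	if (nmi_in_shadow) {
		/* ASSUMPTION: the NMI runs here and the shadow is gone
		 * when it returns, so the IRQ is taken before HLT. */
		c.sti_shadow = false;
		maybe_deliver_irq(&c);
	}

	/* HLT: the CPU halts and the shadow ends. */
	c.halted = true;
	c.sti_shadow = false;
	maybe_deliver_irq(&c);		/* a still-pending IRQ wakes it */

	return c.halted && c.need_resched;
}
```

In this model run_safe_halt(false) ends with the CPU awake, because the
IRQ is held until HLT has started and then wakes it. run_safe_halt(true)
ends with the CPU halted and need_resched set, which is the lost-wakeup
case.
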
Random hypothesis: maybe Intel just messed up their STI shadow in a
single (possibly ancient) microarchitecture in some rare situations and
figured it could fix it cheaply via updating the documentation to match
the breakage, not via actually fixing the CPU?

Might be useful if someone from Intel could chime in.

Thanks,

	Ingo