Message-Id: <20111222031949.957412785@goodmis.org>
Date: Wed, 21 Dec 2011 22:13:58 -0500
From: Steven Rostedt
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
    Frederic Weisbecker, Linus Torvalds, "H. Peter Anvin",
    Mathieu Desnoyers, Andi Kleen, Paul Turner
Subject: [PATCH 1/6 v3] x86: Do not schedule while still in NMI context
References: <20111222031357.120841629@goodmis.org>

From: Linus Torvalds

The NMI handler uses the paranoid_exit routine that checks the
NEED_RESCHED flag, and if it is set and the return is for userspace,
then interrupts are enabled, the stack is swapped to the thread's stack,
and schedule is called. The problem with this is that we are still in an
NMI context until an iret is executed. This means that any new NMIs are
now starved until an interrupt or exception occurs and does the iret.

As NMIs can not be masked and can interrupt any location, they are
treated as a special case. NEED_RESCHED should not be set in an NMI
handler. The interruption by the NMI should not disturb the work flow
for scheduling. Any IPI sent to a processor after sending the
NEED_RESCHED would have to wait for the NMI anyway, and after the IPI
finishes the schedule would be called as required.

There is no reason to do anything special leaving an NMI. Remove
the call to paranoid_exit and do a simple return. This not only fixes the
bug of starved NMIs, but it also cleans up the code.

Link: http://lkml.kernel.org/r/CA+55aFzgM55hXTs4griX5e9=v_O+=ue+7Rj0PTD=M7hFYpyULQ@mail.gmail.com

Acked-by: Andi Kleen
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: "H. Peter Anvin"
Cc: Frederic Weisbecker
Cc: Thomas Gleixner
Cc: Paul Turner
Signed-off-by: Linus Torvalds
Signed-off-by: Steven Rostedt
---
 arch/x86/kernel/entry_64.S |   32 --------------------------------
 1 files changed, 0 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index faf8d5e..3819ea9 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1489,46 +1489,14 @@ ENTRY(nmi)
 	movq %rsp,%rdi
 	movq $-1,%rsi
 	call do_nmi
-#ifdef CONFIG_TRACE_IRQFLAGS
-	/* paranoidexit; without TRACE_IRQS_OFF */
-	/* ebx:	no swapgs flag */
-	DISABLE_INTERRUPTS(CLBR_NONE)
 	testl %ebx,%ebx				/* swapgs needed? */
 	jnz nmi_restore
-	testl $3,CS(%rsp)
-	jnz nmi_userspace
 nmi_swapgs:
 	SWAPGS_UNSAFE_STACK
 nmi_restore:
 	RESTORE_ALL 8
 	jmp irq_return
-nmi_userspace:
-	GET_THREAD_INFO(%rcx)
-	movl TI_flags(%rcx),%ebx
-	andl $_TIF_WORK_MASK,%ebx
-	jz nmi_swapgs
-	movq %rsp,%rdi			/* &pt_regs */
-	call sync_regs
-	movq %rax,%rsp			/* switch stack for scheduling */
-	testl $_TIF_NEED_RESCHED,%ebx
-	jnz nmi_schedule
-	movl %ebx,%edx			/* arg3: thread flags */
-	ENABLE_INTERRUPTS(CLBR_NONE)
-	xorl %esi,%esi			/* arg2: oldset */
-	movq %rsp,%rdi			/* arg1: &pt_regs */
-	call do_notify_resume
-	DISABLE_INTERRUPTS(CLBR_NONE)
-	jmp nmi_userspace
-nmi_schedule:
-	ENABLE_INTERRUPTS(CLBR_ANY)
-	call schedule
-	DISABLE_INTERRUPTS(CLBR_ANY)
-	jmp nmi_userspace
 	CFI_ENDPROC
-#else
-	jmp paranoid_exit
-	CFI_ENDPROC
-#endif
 END(nmi)
 
 ENTRY(ignore_sysret)
-- 
1.7.7.3
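
For readers who want to see the starvation described in the changelog in
isolation, here is a minimal userspace sketch in C. It is not kernel code:
struct toy_cpu, raise_nmi(), do_iret(), old_exit_path() and new_exit_path()
are hypothetical names made up for this illustration, and the model only
assumes what the changelog states, namely that a CPU stays in NMI context
until an iret is executed. Real hardware latches a blocked NMI and delivers
it later; the sketch merely counts NMIs that could not be delivered at the
moment they were raised.

```c
#include <stdbool.h>

/* Toy model of one CPU's NMI state (hypothetical, for illustration only). */
struct toy_cpu {
	bool in_nmi;	/* set when an NMI is delivered, cleared only by iret */
	int delivered;	/* NMIs whose handler started immediately */
	int held_off;	/* NMIs raised while the CPU was still in NMI context */
};

/* An NMI source fires. It is delivered only if no NMI is in progress. */
static void raise_nmi(struct toy_cpu *cpu)
{
	if (cpu->in_nmi) {
		cpu->held_off++;
		return;
	}
	cpu->in_nmi = true;
	cpu->delivered++;
}

/* Executing iret is the only thing that ends NMI context in this model. */
static void do_iret(struct toy_cpu *cpu)
{
	cpu->in_nmi = false;
}

/*
 * Old exit path: the NMI return sees NEED_RESCHED and calls schedule()
 * before any iret, so the next task runs with NMIs still blocked.
 */
static int old_exit_path(void)
{
	struct toy_cpu cpu = { false, 0, 0 };

	raise_nmi(&cpu);	/* first NMI, handler runs */
	/* schedule() would switch tasks here, with no iret executed */
	raise_nmi(&cpu);	/* second NMI arrives while another task runs */
	do_iret(&cpu);		/* a later interrupt or exception finally irets */
	return cpu.held_off;
}

/* New exit path: the NMI simply returns, so the iret happens right away. */
static int new_exit_path(void)
{
	struct toy_cpu cpu = { false, 0, 0 };

	raise_nmi(&cpu);	/* first NMI, handler runs */
	do_iret(&cpu);		/* simple return from the NMI */
	raise_nmi(&cpu);	/* second NMI is delivered immediately */
	do_iret(&cpu);
	return cpu.held_off;
}
```

Calling old_exit_path() and new_exit_path() from a small main() and printing
the results shows one held-off NMI for the old path and none for the new
one, which is the difference the patch is after.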