From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933060Ab0I0ORY (ORCPT );
	Mon, 27 Sep 2010 10:17:24 -0400
Received: from mx1.redhat.com ([209.132.183.28]:30191 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932938Ab0I0ORX (ORCPT );
	Mon, 27 Sep 2010 10:17:23 -0400
Message-ID: <4CA0A76B.6000803@redhat.com>
Date: Mon, 27 Sep 2010 16:17:15 +0200
From: Avi Kivity
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.9) Gecko/20100921 Fedora/3.1.4-1.fc13 Lightning/1.0b3pre Thunderbird/3.1.4
MIME-Version: 1.0
To: Joerg Roedel
CC: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
References: <1284913699-14986-1-git-send-email-avi@redhat.com> <20100927103128.GO15338@8bytes.org>
In-Reply-To: <20100927103128.GO15338@8bytes.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/27/2010 12:31 PM, Joerg Roedel wrote:
> On Sun, Sep 19, 2010 at 06:28:19PM +0200, Avi Kivity wrote:
> > On machines without monitor/mwait we use an sti; hlt sequence to atomically
> > enable interrupts and put the cpu to sleep.  The sequence uses the "interrupt
> > shadow" property of the sti instruction: interrupts are enabled only after
> > the instruction following sti has been executed.  This means an interrupt
> > cannot happen in the middle of the sequence, which would leave us with
> > the interrupt processed but the cpu halted.
> >
> > The interrupt shadow, however, can be broken by an nmi; the following
> > sequence
> >
> >     sti
> >       nmi ... iret
> >       # interrupt shadow disabled
> >       intr ... iret
> >     hlt
> >
> > puts the cpu to sleep, even though the interrupt may need additional
> > processing after the hlt (like scheduling a task).
>
> Doesn't the interrupt return path check for a re-schedule condition
> before iret? So to my believe the handler would not jump back to the
> idle task if something else becomes running in the interrupt handler,
> no?
>

Perhaps on preemptible kernels?  But at least on non-preemptible
kernels, you can't just switch tasks while running kernel code.

void cpu_idle(void)
{
	current_thread_info()->status |= TS_POLLING;

	/*
	 * If we're the non-boot CPU, nothing set the stack canary up
	 * for us.  CPU0 already has it initialized but no harm in
	 * doing it again.  This is a good place for updating it, as
	 * we wont ever return from this function (so the invalid
	 * canaries already on the stack wont ever trigger).
	 */
	boot_init_stack_canary();

	/* endless idle loop with no priority at all */
	while (1) {
		tick_nohz_stop_sched_tick(1);
		while (!need_resched()) {

			rmb();

			if (cpu_is_offline(smp_processor_id()))
				play_dead();
			/*
			 * Idle routines should keep interrupts disabled
			 * from here on, until they go to idle.
			 * Otherwise, idle callbacks can misfire.
			 */
			local_irq_disable();
			enter_idle();
			/* Don't trace irqs off for idle */
			stop_critical_timings();
			pm_idle();
			start_critical_timings();

			trace_power_end(smp_processor_id());

			/* In many cases the interrupt that ended idle
			   has already called exit_idle. But some idle
			   loops can be woken up without interrupt. */
			__exit_idle();
		}

		tick_nohz_restart_sched_tick();
		preempt_enable_no_resched();
		schedule();
		preempt_disable();
	}
}

Looks like we rely on an explicit schedule() - pm_idle() is called with
preemption disabled.

(pm_idle eventually calls safe_halt() if no other idle method is used)

-- 
error compiling committee.c: too many arguments to function
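
To make the race concrete, here is a minimal user-space sketch in C. It is a toy model only, not kernel code: `struct toy_cpu`, `toy_intr()` and `toy_idle_pass()` are hypothetical names invented for illustration. It assumes a non-preemptible kernel, so the interrupt handler always returns to the idle task, and it models only the ordering of the interrupt relative to hlt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_cpu {
	bool need_resched;	/* set by the simulated interrupt handler */
	bool halted;		/* true while the simulated cpu sits in hlt */
};

/* Simulated interrupt handler: it wakes a task, so a reschedule is needed. */
static void toy_intr(struct toy_cpu *cpu)
{
	cpu->need_resched = true;
}

/*
 * One pass through a simplified idle loop body.
 *
 * nmi_breaks_shadow == true models "sti; nmi ... iret; intr ... iret; hlt":
 * the interrupt is handled before hlt and returns to the idle task.
 * nmi_breaks_shadow == false models the intended case, where the interrupt
 * can only be delivered once the cpu is in hlt, and therefore wakes it.
 *
 * Returns true if the cpu ends up halted with a reschedule pending.
 */
static bool toy_idle_pass(struct toy_cpu *cpu, bool nmi_breaks_shadow)
{
	assert(cpu != NULL);

	if (cpu->need_resched)
		return false;	/* the idle loop would leave and call schedule() */

	/* sti: the shadow should cover the next instruction (hlt) */
	if (nmi_breaks_shadow)
		toy_intr(cpu);	/* handled too early; nothing rechecks the flag */

	cpu->halted = true;	/* hlt */

	if (!nmi_breaks_shadow) {
		toy_intr(cpu);	/* delivered while halted */
		cpu->halted = false;	/* so the cpu wakes up */
	}

	return cpu->halted && cpu->need_resched;
}
```

Called on a zero-initialized `struct toy_cpu`, `toy_idle_pass(&cpu, true)` reports the lost wakeup, while `toy_idle_pass(&cpu, false)` does not, because in the second case the interrupt itself takes the cpu out of hlt and the loop gets to see need_resched.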