From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ross Lagerwall Subject: Re: [PATCH v2] x86: Fixup IRQs when CPUs go down during shutdown Date: Wed, 2 Dec 2015 15:09:29 +0000 Message-ID: <565F09A9.9000408@citrix.com> References: <1449063981-31486-1-git-send-email-ross.lagerwall@citrix.com> <565F081502000078000BB4A1@prv-mh.provo.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <565F081502000078000BB4A1@prv-mh.provo.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: Andrew Cooper , xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org On 12/02/2015 02:02 PM, Jan Beulich wrote: >>>> On 02.12.15 at 14:46, wrote: >> Commit fc0c3fa2ad5c ("x86/IO-APIC: fix setup of Xen internally used IRQs >> (take 2)") introduced a regression on some hardware where Xen would hang >> during shutdown, repeating the following message: >> APIC error on CPU0: 08(08), Receive accept error >> >> This appears to be because an interrupt (in this case from the serial >> console) destined for a CPU other than the boot CPU is left unhandled so >> an APIC error on CPU 0 is generated instead. >> >> To fix this, before taking down the non-boot CPUs, call fixup_irqs() >> with a CPU mask of only the boot CPU to reset the IRQ affinities >> correctly. >> >> Signed-off-by: Ross Lagerwall >> --- > > Even though in this case interested people may know, missing info > on changes from previous version here. > >> +/* CPU(s) have been removed from mask. Re-set irq affinities. */ >> +void fixup_irqs(const cpumask_t *mask, bool_t verbose) > > The comment doesn't match reality. And I wonder whether it > wouldn't be reasonable to imply "verbose" (either from mask > equaling &cpu_online_map, or by introducing SYS_STATE_shutdown > and/or SYS_STATE_reboot). I considered introducing a new SYS_STATE but decided that a function parameter was clearer rather than implying it from some other global state. > >> @@ -2385,16 +2382,27 @@ void fixup_irqs(void) >> >> spin_unlock(&desc->lock); >> >> - if ( break_affinity && set_affinity ) >> - printk("Broke affinity for irq %i\n", irq); >> - else if ( !set_affinity ) >> - printk("Cannot set affinity for irq %i\n", irq); >> + if ( verbose ) >> + { >> + if ( break_affinity && set_affinity ) >> + printk("Broke affinity for irq %i\n", irq); >> + else if ( !set_affinity ) >> + printk("Cannot set affinity for irq %i\n", irq); >> + } > > How about > > if ( !verbose ) > continue; > > limiting churn on code? OK. > >> --- a/xen/arch/x86/smp.c >> +++ b/xen/arch/x86/smp.c >> @@ -286,6 +286,7 @@ void __stop_this_cpu(void) >> >> static void stop_this_cpu(void *dummy) >> { >> + fixup_eoi(); >> __stop_this_cpu(); > > Is this really needed during shutdown? Possibly not, but I think it's cleaner to do the same as what is used for CPU down. > >> @@ -298,6 +299,13 @@ static void stop_this_cpu(void *dummy) >> void smp_send_stop(void) >> { >> int timeout = 10; >> + cpumask_t online; >> + >> + cpumask_clear(&online); >> + cpumask_set_cpu(smp_processor_id(), &online); > > That's what we have cpumask_of() for. > OK. -- Ross Lagerwall