From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dave Winchell Subject: Re: [PATCH] add memory barriers to IPI related code Date: Mon, 27 Oct 2008 10:28:02 -0400 Message-ID: <4905CFF2.3010102@virtualiron.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: xen-devel Cc: "Tian, Kevin" , Dave Winchell , Keir Fraser , "Nakajima, Jun" List-Id: xen-devel@lists.xenproject.org After discussions with Kevin and other folks at Intel, and some experimentation here, I am convinced that this patch is not needed. The reason for my hangs was that I had the initial cs for x2apic, but not the second one, which had the crucial mb() in the send_IPI_mask_x2apic. The mb()s I added to fix the problem were more than the minimum required. Indeed it appears that taking an interrupt implies a mb() on ia32. Thanks to everyone who helped me with this, especially Kevin Tian. -Dave Keir Fraser wrote: >I'd like some advice from Intel on the need for all these barriers. I'm >pretty sure that interrupts are supposed to imply memory barriers on the x86 >architecture. It's not beyond the realm of possibility that you're running >on a buggy SDP. > >You certainly should not need barriers on the sending side, since (apart >from x2apic, which already has an explicit mb()) the send is performed by a >memory write, and stores are guaranteed to occur in-order on x86 >architecture. So e.g., call_data must be visible before the IPI occurs in >smp_call_function(). > >And on the receiving side, it's hard to see how a read would appear to occur >before the interrupt which transfers control to the IPI handler. Unless >there's really weird cache synchronisation magic going on. > > -- Keir > >On 23/10/08 19:23, "Dave Winchell" wrote: > > > >>Hi Keir, >> >>We have been seeing some hangs in call function and >>tlb shootdown. In our case the machine was an Intel Tylersburg. >> >>One failure mode is this early return in smp_call_function_interrupt: >> >> if ( !cpu_isset(smp_processor_id(), call_data->selected) ) >> return; >> >>Without a mb(), the target processor takes this path as it doesn't yet >>see the update >>to call_data->selected from the initiating processor. Then the initiator >>waits for ever >>to get an acknowledgment from this processor. >> >>On the initiating side, I added the mb()'s to the send_IPI functions, >>x2apic already >>having one. >> >>On the target side, I added the mb()'s in the individual target functions. >> >>Regards, >>Dave >> >>diff -r 4129f0f2f2ba xen/arch/x86/smp.c >>--- a/xen/arch/x86/smp.c Fri Oct 17 14:15:37 2008 +0100 >>+++ b/xen/arch/x86/smp.c Mon Oct 20 10:53:24 2008 -0400 >>@@ -93,6 +93,8 @@ void send_IPI_mask_flat(cpumask_t cpumas >> /* An IPI with no target generates a send accept error from P5/P6 APICs. >>*/ >> WARN_ON(mask == 0); >> >>+ mb(); >>+ >> local_irq_save(flags); >> >> /* >>@@ -123,6 +125,8 @@ void send_IPI_mask_phys(cpumask_t mask, >> { >> unsigned long cfg, flags; >> unsigned int query_cpu; >>+ >>+ mb(); >> >> local_irq_save(flags); >> >>@@ -160,6 +164,7 @@ static unsigned int flush_flags; >> >> fastcall void smp_invalidate_interrupt(void) >> { >>+ mb(); >> ack_APIC_irq(); >> perfc_incr(ipis); >> irq_enter(); >>@@ -337,17 +342,22 @@ void smp_send_stop(void) >> >> fastcall void smp_event_check_interrupt(struct cpu_user_regs *regs) >> { >>+ mb(); >> ack_APIC_irq(); >> perfc_incr(ipis); >> } >> >> fastcall void smp_call_function_interrupt(struct cpu_user_regs *regs) >> { >>- void (*func)(void *info) = call_data->func; >>- void *info = call_data->info; >>- >>+ void (*func)(void *info); >>+ void *info; >>+ >>+ mb(); >> ack_APIC_irq(); >> perfc_incr(ipis); >>+ >>+ func = call_data->func; >>+ info = call_data->info; >> >> if ( !cpu_isset(smp_processor_id(), call_data->selected) ) >> return; >> >> > > > >