Date: Thu, 31 May 2007 13:00:08 -0700
From: "Siddha, Suresh B"
To: "Eric W. Biederman"
Cc: Andrew Morton, Ingo Molnar, "Siddha, Suresh B", ak@suse.de, linux-kernel@vger.kernel.org, nanhai.zou@intel.com, asit.k.mallick@intel.com, keith.packard@intel.com, Yinghai Lu
Subject: Re: [PATCH] x86_64 irq: check remote IRR bit before migrating level triggered irq (v3)
Message-ID: <20070531200008.GA17143@linux-os.sc.intel.com>
References: <20070517230324.GB8089@linux-os.sc.intel.com> <20070531133426.GA30616@elte.hu>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 31, 2007 at 07:50:58AM -0600, Eric W. Biederman wrote:
>
> On x86_64 kernels, level triggered irq migration is initiated in the context
> of that interrupt (after executing the irq handler), and the following steps
> are taken to do the irq migration.
>
> 1. mask IOAPIC RTE entry;       // write to IOAPIC RTE
> 2. EOI;                         // processor EOI write
> 3. reprogram IOAPIC RTE entry   // write to IOAPIC RTE with new destination
>                                 // and interrupt vector due to per cpu vector
>                                 // allocation.
> 4. unmask IOAPIC RTE entry;     // write to IOAPIC RTE
>
> Because of the per cpu vector allocation in x86_64 kernels, when the irq
> migrates to a different cpu, a new vector (corresponding to the new cpu) will
> get allocated.
>
> An EOI write to the local APIC has a side effect of generating an EOI write
> for level triggered interrupts (normally this is a broadcast to all IOAPICs).
> The EOI broadcast generated as a side effect of the EOI write to the processor
> may be delayed while the other IOAPIC writes (steps 3 and 4) can go through.
>
> Normally, the EOI generated by the local APIC for a level triggered interrupt
> contains the vector number. The IOAPIC will take this vector number, search
> the IOAPIC RTE entries for an entry with a matching vector number, and
> clear the remote IRR bit (indicating EOI). However, if the vector number is
> changed (as in step 3), the IOAPIC will not find the RTE entry when the EOI
> is received later. This will cause the remote IRR to get stuck, causing the
> interrupt to hang (no more interrupts from this RTE).
>
> The current x86_64 kernel assumes that the remote IRR bit is cleared by the
> time the IOAPIC RTE is reprogrammed. Fix this assumption by checking the
> remote IRR bit and, if it is still set, delaying the irq migration to the
> next interrupt arrival event (hopefully, next time the remote IRR bit will
> get cleared before the IOAPIC RTE is reprogrammed).
>
> Initial analysis and patch from Nanhai.
>
> Clean up patch from Suresh.
>
> Rewritten to be less intrusive, and to contain a big fat comment by Eric.

Acked-by: Suresh Siddha

Thanks Eric.
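
For anyone following along, the race and the check described in the quoted changelog can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the code from the patch: `struct rte`, `struct irq_state`, and every function name below are hypothetical, and the IOAPIC is reduced to a single redirection table entry held in memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one IOAPIC redirection table entry (RTE). */
struct rte {
    uint8_t vector;      /* interrupt vector programmed into the RTE */
    uint8_t dest;        /* destination cpu */
    bool    masked;      /* RTE mask bit */
    bool    remote_irr;  /* set on level irq delivery, cleared by a matching EOI */
};

/* Hypothetical per-irq bookkeeping: the RTE plus a pending migration request. */
struct irq_state {
    struct rte rte;
    bool       move_pending;
    uint8_t    new_vector;
    uint8_t    new_dest;
};

/* A level triggered interrupt is delivered: the IOAPIC sets remote IRR. */
void deliver_level_irq(struct rte *e)
{
    if (!e->masked)
        e->remote_irr = true;
}

/* The EOI broadcast reaches the IOAPIC: remote IRR is cleared only if the
 * vector carried by the EOI matches the vector currently in the RTE. */
void ioapic_receive_eoi(struct rte *e, uint8_t vector)
{
    if (e->vector == vector)
        e->remote_irr = false;
}

/* Old behaviour (steps 1, 3, 4): reprogram without looking at remote IRR. */
void migrate_unchecked(struct irq_state *s)
{
    s->rte.masked = true;
    s->rte.vector = s->new_vector;
    s->rte.dest   = s->new_dest;
    s->rte.masked = false;
    s->move_pending = false;
}

/* Checked behaviour: if remote IRR is still set, leave the RTE alone and
 * keep the move pending for the next interrupt. Returns true if migrated. */
bool try_migrate(struct irq_state *s)
{
    if (!s->move_pending)
        return false;

    s->rte.masked = true;
    if (s->rte.remote_irr) {
        /* EOI broadcast has not arrived yet: defer the migration. */
        s->rte.masked = false;
        return false;
    }
    s->rte.vector = s->new_vector;
    s->rte.dest   = s->new_dest;
    s->rte.masked = false;
    s->move_pending = false;
    return true;
}
```

In this model, calling `migrate_unchecked()` while the EOI for the old vector is still in flight leaves `remote_irr` set for good, because the late EOI no longer matches the RTE. `try_migrate()` instead returns false in that window and succeeds on a later interrupt once the EOI has been seen.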