From: ebiederm@xmission.com (Eric W. Biederman)
Subject: Re: [PATCH v3] enable x2APIC without interrupt remapping under KVM
Date: Wed, 01 Jul 2009 17:17:57 -0700
References: <20090629132926.GB20289@redhat.com> <20090630092623.GI20289@redhat.com> <4A4A476C.2070305@redhat.com> <4A4A6499.9000406@redhat.com> <1246488808.27006.10691.camel@localhost.localdomain>
In-Reply-To: <1246488808.27006.10691.camel@localhost.localdomain> (Suresh Siddha's message of "Wed, 01 Jul 2009 15:53:28 -0700")
To: suresh.b.siddha@intel.com
Cc: Avi Kivity, Gleb Natapov, linux-kernel@vger.kernel.org, Sheng Yang, kvm@vger.kernel.org, garyhade@us.ibm.com

Suresh Siddha writes:

> On Tue, 2009-06-30 at 12:36 -0700, Eric W. Biederman wrote:
>> Dropped irqs.. Driver hangs because it is waiting for an irq. Hardware
>> hangs because it is waiting for the cpu to process the irq.
>>
>> Potentially we get a level triggered irq that is never acked by
>> the cpu that won't arm until the cpu send an ack, and we can't
>> send an ack from another cpu.
>
> Eric,
>
> Among number of experiments you have tried in the past to fix this, have
> you tried the experiment of explicitly clearing the remoteIRR by
> changing the trigger mode to edge and then back to level.
>
> Is there a problem with this?

The problem I had wasn't remoteIRR getting stuck, but the symptoms
were largely the same. I did try changing the trigger mode to edge
and back and that did not unstick the ioapic in all cases.

> We can send a spurious IPI (after the RTE migration) with the new vector
> to another cpu and handler which services the level interrupt will check
> if we saw the edge mode for a level interrupt and then the handler can
> explicitly restore the level trigger and reset the remote IRR by mask
> +edge and unmask+level.
>
> We might have to work with some rough edges but do you recollect any
> major issue with this approach..

This has come up often enough recently that I expect it is time to
cook up a patch that does the ioapic migration in process context,
plus some user space code that stress tests things, so that people
can repeat my experiments and see what I am trying to avoid.

Eric
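
For readers following the thread, here is a minimal sketch of the mask+edge, unmask+level sequence Suresh describes. It is a software model only, not kernel code. The bit positions (Remote IRR at bit 14, trigger mode at bit 15, mask at bit 16) follow the standard I/O APIC redirection table entry layout. The names model_rte, model_write_rte and clear_remote_irr are hypothetical. The model also assumes that writing an entry in edge mode clears Remote IRR, which, as noted above, did not unstick the ioapic in all cases on real hardware.

```c
#include <stdint.h>

/* Bits in the low 32 bits of an I/O APIC redirection table entry (RTE). */
#define RTE_REMOTE_IRR  (1u << 14)  /* read-only: set while a level irq awaits EOI */
#define RTE_TRIGGER_LVL (1u << 15)  /* 1 = level triggered, 0 = edge triggered */
#define RTE_MASKED      (1u << 16)  /* 1 = interrupt masked */

/* Hypothetical software model of one RTE; not a real hardware accessor. */
struct model_rte {
    uint32_t low;
};

/*
 * Model of a register write. ASSUMPTION: switching the entry to edge mode
 * clears Remote IRR as a side effect. Real chipsets differ on this point.
 */
static void model_write_rte(struct model_rte *rte, uint32_t val)
{
    /* Software cannot set or clear Remote IRR directly, so carry it over. */
    uint32_t irr = rte->low & RTE_REMOTE_IRR;

    if (!(val & RTE_TRIGGER_LVL))
        irr = 0;                    /* assumed side effect of edge mode */

    rte->low = (val & ~RTE_REMOTE_IRR) | irr;
}

/* The two-step sequence from the thread: mask+edge, then unmask+level. */
static void clear_remote_irr(struct model_rte *rte)
{
    uint32_t val = rte->low;

    model_write_rte(rte, (val | RTE_MASKED) & ~RTE_TRIGGER_LVL);
    model_write_rte(rte, (val & ~RTE_MASKED) | RTE_TRIGGER_LVL);
}
```

Starting from a level-triggered, unmasked entry with Remote IRR set, calling clear_remote_irr() on the model leaves the entry level-triggered and unmasked with Remote IRR clear. Whether a given chipset behaves like the model is exactly the open question in this thread.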