From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Yang, Sheng" <sheng.yang@intel.com>
Subject: Re: Remaining passthrough/VT-d tasks list
Date: Sun, 28 Sep 2008 13:54:55 +0800
Message-ID: <200809281354.56553.sheng.yang@intel.com>
References: <0122C7C995D32147B66BF4F440D3016301C49E61@pdsmsx415.ccr.corp.intel.com> <D8078B8B3B09934AA9F8F2D5FB3F28CE08873AF34F@pdsmsx502.ccr.corp.intel.com> <48DF1046.1050102@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain;
  charset="gb2312"
Content-Transfer-Encoding: 7bit
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	"Han, Weidong" <weidong.han@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Amit Shah <amit.shah@redhat.com>,
	"benami@il.ibm.com" <benami@il.ibm.com>,
	"muli@il.ibm.com" <muli@il.ibm.com>,
	"Kay, Allen M" <allen.m.kay@intel.com>,
	"Zhang, Xiantao" <xiantao.zhang@intel.com>,
	Eddie Dong <eddie.dong@intel.com>
To: Avi Kivity <avi@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mga11.intel.com ([192.55.52.93]:8897 "EHLO mga11.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750925AbYI1FyH (ORCPT <rfc822;kvm@vger.kernel.org>);
	Sun, 28 Sep 2008 01:54:07 -0400
In-Reply-To: <48DF1046.1050102@redhat.com>
Content-Disposition: inline
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Sunday 28 September 2008 13:04:06 Avi Kivity wrote:
> Tian, Kevin wrote:
> >> No. Maybe the Neocleus polarity trick (which also reduces performance).
> >
> > To my knowledge, Neocleus polarity trick can't solve this isolation
> > issue, which just provides one effecient way to track
> > assertion/deassertion transition on the irq line. For example, reverse
> > polarity when receiving an instance, and then a new irq instance would
> > occur when all devices de- assert on shared irq line, and then recover
> > the polarity. In your concerned case where guest driver misbehaves, this
> > polarity trick can't work neither as one device always asserts the line.
>
> You're right, I didn't think it through.
>
> If there was a standard way to mask pci irqs, it might have worked, but
> there isn't, unfortunately.
>
One proposal:

If we suffer an IRQ storm on a level-triggered IRQ line, there are two
possibilities: a host issue or a guest issue.

If it's a host issue, the host should try to stop it. If it can't, the IRQ
line will be disabled, and the guest device stops functioning as well.

If it's a guest issue, the guest should try to stop it and prevent it from
causing trouble in the host. KVM should do its best to enforce this, up to and
including disabling the guest device. So the guest device won't be functional
in this case either.

Based on the above, we can assume that the IRQ storm is caused by the assigned
guest device and try to stop the device from doing this. (Yeah, either way,
the guest device won't survive.)

I think we can bring in a little QoS concept here (stolen from Eddie :) ). The
assumption is that the normal rate at which a device delivers interrupts is
much lower than the rate seen on a continuously asserted level-triggered line
when the EOI is written immediately. So we can do something with the gap.

Measure the calling rate of our irq handler. If it exceeds some reasonable
threshold, KVM would try to stop the guest device for a while (even though it
doesn't know whether the guest device caused this).
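
To make the rate check concrete, here is a rough sketch of a fixed-window
counter. It is only an illustration: struct irq_rate, irq_rate_hit(), the
window length and the threshold are all made-up names and values, not existing
KVM code, and a real version would take its timestamps from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-IRQ bookkeeping; all names are illustrative only. */
struct irq_rate {
    uint64_t window_start_ns; /* when the current counting window began */
    uint64_t window_ns;       /* length of one counting window */
    uint32_t count;           /* handler calls seen in this window */
    uint32_t threshold;       /* calls per window treated as a storm */
};

/*
 * Call once per irq handler invocation with the current time.
 * Returns true once the calls in the current window exceed the threshold.
 */
static bool irq_rate_hit(struct irq_rate *r, uint64_t now_ns)
{
    assert(r->window_ns > 0);

    /* Start a fresh window when the old one has expired. */
    if (now_ns - r->window_start_ns >= r->window_ns) {
        r->window_start_ns = now_ns;
        r->count = 0;
    }
    r->count++;
    return r->count > r->threshold;
}
```

The threshold would have to sit well above what any working device generates,
which is exactly the "gap" assumption above.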

First, try setting the Interrupt Disable bit in the PCI Command register, wait
for a period of time, then check again.

If the IRQ storm can't be stopped, KVM tries a more aggressive way: do a
Function Level Reset. That should be the end of the device's life...

Oh, of course, if even FLR doesn't stop the IRQ storm, it's a host issue.
Let's wait for the host to disable the IRQ line. Of course, the guest device
can't be recovered in that case either.
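
The escalation order could be sketched as a tiny state machine like the one
below. Again this is only a sketch under my own assumptions: the enum and
storm_next() are hypothetical names, and the caller is expected to perform the
action that matches the returned step (mask the interrupt, issue the FLR, or
do nothing) and then wait before checking again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical escalation steps; names are illustrative only. */
enum storm_step {
    STORM_NONE,      /* nothing done to the device yet */
    STORM_INTX_OFF,  /* interrupt disable bit set, waiting to re-check */
    STORM_FLR_DONE,  /* Function Level Reset issued, waiting to re-check */
    STORM_GIVE_UP    /* host issue: leave it to the host to disable the line */
};

/*
 * Decide the next step after a wait period.
 * still_storming says whether the handler rate is still above the threshold.
 */
static enum storm_step storm_next(enum storm_step cur, bool still_storming)
{
    assert(cur <= STORM_GIVE_UP);

    if (!still_storming)
        return cur; /* storm stopped: stay where we are, device stays off */

    switch (cur) {
    case STORM_NONE:
        return STORM_INTX_OFF;  /* first, the cheap and reversible step */
    case STORM_INTX_OFF:
        return STORM_FLR_DONE;  /* masking didn't help, reset the function */
    default:
        return STORM_GIVE_UP;   /* even FLR didn't help */
    }
}
```

Keeping the decision separate from the actions makes the order easy to check
without touching any hardware.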

It's just an initial proposal, but I think it may work. The question is
whether the gap is easy to catch... But at least, I think a physically
continuous assertion should look very different from any working device...

--
regards
Yang, Sheng