Date: Tue, 26 Nov 2013 18:05:37 +0200
From: "Michael S. Tsirkin"
To: Gleb Natapov
Cc: "Huangweidong (C)", KVM, "Jinxin (F)", "Zhanghaoyu (A)", Luonengjun,
 "qemu-devel@nongnu.org", Zanghongyong, Paolo Bonzini
Subject: Re: [Qemu-devel] [RFC] create a single workqueue for each vm to update vm irq routing table
Message-ID: <20131126160536.GA23007@redhat.com>
In-Reply-To: <20131126125610.GM959@redhat.com>
References: <52949847.6020908@redhat.com> <20131126125610.GM959@redhat.com>

On Tue, Nov 26, 2013 at 02:56:10PM +0200, Gleb Natapov wrote:
> On Tue, Nov 26, 2013 at 01:47:03PM +0100, Paolo Bonzini wrote:
> > On 26/11/2013 13:40, Zhanghaoyu (A) wrote:
> > > When the guest sets an irq's smp_affinity, a VMEXIT occurs and the
> > > vcpu thread's ioctl returns to QEMU from the hypervisor; the vcpu
> > > thread then asks the hypervisor to update the irq routing table.
> > > In kvm_set_irq_routing, synchronize_rcu is called, so the vcpu
> > > thread is blocked for a long time waiting for the RCU grace period
> > > to elapse.  During this period the vcpu cannot service its VM, so
> > > interrupts delivered to this vcpu cannot be handled in time, and
> > > the apps running on this vcpu cannot be serviced either.  This is
> > > unacceptable in some real-time scenarios, e.g. telecom.
> > >
> > > So I want to create a single workqueue for each VM, to perform the
> > > RCU synchronization for the irq routing table asynchronously, and
> > > let the vcpu thread return and VMENTRY to service the VM
> > > immediately, with no need to block waiting for the RCU grace
> > > period.  I have implemented a rough patch and tested it in our
> > > telecom environment; the problem described above disappeared.
> >
> > I don't think a workqueue is even needed.  You just need to use
> > call_rcu to free "old" after releasing kvm->irq_lock.
> >
> > What do you think?
> >
> It should be rate limited somehow, since it is guest triggerable: a
> guest may cause the host to allocate a lot of memory this way.

The checks in __call_rcu() should handle this, I think.  These keep a
per-CPU counter, adjustable via rcutree.blimit, which defaults to
taking evasive action if more than 10K callbacks are waiting on a
given CPU.  (A sketch of what the call_rcu variant could look like is
appended below the quote.)

> Is this about MSI interrupt affinity?  IIRC changing INT interrupt
> affinity should not trigger a kvm_set_irq_routing update.  If this is
> about MSI only, then what about changing userspace to use
> KVM_SIGNAL_MSI for MSI injection?
>
> --
> 			Gleb.
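
For concreteness, a minimal sketch of the call_rcu variant Paolo
suggests, assuming an rcu_head is embedded in struct
kvm_irq_routing_table (the field and the callback name are made up
here for illustration, not existing code):

struct kvm_irq_routing_table {
	/* ... existing fields ... */
	struct rcu_head rcu;	/* assumed new field */
};

/* Runs from softirq context once a grace period has elapsed. */
static void free_irq_routing_table_rcu(struct rcu_head *head)
{
	struct kvm_irq_routing_table *rt =
		container_of(head, struct kvm_irq_routing_table, rcu);
	kfree(rt);
}

/* In kvm_set_irq_routing(), after publishing the new table: */
	mutex_lock(&kvm->irq_lock);
	old = kvm->irq_routing;
	rcu_assign_pointer(kvm->irq_routing, new);
	mutex_unlock(&kvm->irq_lock);

	/* Instead of "synchronize_rcu(); kfree(old);", queue the free
	 * and let the vcpu thread re-enter the guest immediately.
	 */
	call_rcu(&old->rcu, free_irq_routing_table_rcu);

The vcpu thread no longer waits for the grace period; readers that
still hold a reference to the old table under rcu_read_lock() keep it
alive until the callback runs.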
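
And the userspace side of Gleb's KVM_SIGNAL_MSI idea might look
roughly like this (illustrative only; vmfd is the VM file descriptor
from KVM_CREATE_VM, and the address/data values come from wherever the
virtual device keeps its MSI configuration):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int inject_msi(int vmfd, uint32_t addr_lo, uint32_t addr_hi,
		      uint32_t data)
{
	struct kvm_msi msi = {
		.address_lo = addr_lo,
		.address_hi = addr_hi,
		.data       = data,
	};

	/* > 0: delivered, 0: blocked by the guest (e.g. masked), < 0: error */
	return ioctl(vmfd, KVM_SIGNAL_MSI, &msi);
}

Since the MSI is described directly in the ioctl, no routing table
entry is consumed, so an affinity change needs no kvm_set_irq_routing
update and hence no synchronize_rcu.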