From mboxrd@z Thu Jan 1 00:00:00 1970 From: Avi Kivity Subject: Re: [PATCH 15/15] KVM: Fix racy in kvm_free_assigned_irq Date: Mon, 29 Dec 2008 15:37:52 +0200 Message-ID: <4958D2B0.6050904@redhat.com> References: <20081225115609.GA10087@syang10-desktop> <20081228112402.GB2610@syang10-desktop> <20081229054222.GB13166@shell.devel.redhat.com> <200812292023.29565.sheng@linux.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Amit Shah , Marcelo Tosatti , kvm@vger.kernel.org, Amit Shah , "Han, Weidong" To: Sheng Yang Return-path: Received: from mx2.redhat.com ([66.187.237.31]:60139 "EHLO mx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752822AbYL2Nhk (ORCPT ); Mon, 29 Dec 2008 08:37:40 -0500 In-Reply-To: <200812292023.29565.sheng@linux.intel.com> Sender: kvm-owner@vger.kernel.org List-ID: Sheng Yang wrote: > The free assigned device in the destroy path of VM, so as free irq. And we got > cancel_work_sync() in free irq which can sync with the execution of scheduled > work. And now before cancel_work_sync(), we disable the interrupt so that no > more schedule work happen again. So after cancel_work_sync(), everything(I > think it's irq handler and schedule work here) asynchronously should quiet > down. > > Or I miss something? > Suppose the work_struct gets scheduled, but is delayed somewhere in the scheduler. Some kill -9s the VM, and it starts getting destroyed. cancel_work_sync() can no longer truly cancel the work, so it has to schedule and wait for its completion. So now we have kvm_assigned_dev_interrupt_work_handler() running in a partially destroyed VM. It may work or not, but it's a fragile situation (changing the order of destruction of components will likely break things) and it's easy to avoid by keeping the reference count elevated. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.