From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michael S. Tsirkin"
Subject: Re: [KVM PATCH v5 3/4] KVM: Fix races in irqfd using new eventfd_kref_get interface
Date: Sun, 28 Jun 2009 23:07:51 +0300
Message-ID: <20090628200751.GA14993@redhat.com>
References: <20090625132441.26748.641.stgit@dev.haskins.net> <20090625132826.26748.15607.stgit@dev.haskins.net> <20090628114846.GA11764@redhat.com> <4A4767C2.3010503@novell.com> <20090628125612.GA11866@redhat.com> <20090628125730.GB11866@redhat.com> <20090628132035.GD11866@redhat.com> <4A479A23.1010804@novell.com> <20090628190710.GB14136@redhat.com> <4A47CA82.4040108@novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org, avi@redhat.com
To: Gregory Haskins
Return-path:
Received: from mx2.redhat.com ([66.187.237.31]:37278 "EHLO mx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753070AbZF1UIT (ORCPT ); Sun, 28 Jun 2009 16:08:19 -0400
Content-Disposition: inline
In-Reply-To: <4A47CA82.4040108@novell.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Sun, Jun 28, 2009 at 03:54:42PM -0400, Gregory Haskins wrote:
> Michael S. Tsirkin wrote:
> > On Sun, Jun 28, 2009 at 12:28:19PM -0400, Gregory Haskins wrote:
> >
> >> Michael S. Tsirkin wrote:
> >>
> >>> On Sun, Jun 28, 2009 at 03:57:30PM +0300, Michael S. Tsirkin wrote:
> >>>
> >>>> On Sun, Jun 28, 2009 at 03:56:12PM +0300, Michael S. Tsirkin wrote:
> >>>>
> >>>>> On Sun, Jun 28, 2009 at 08:53:22AM -0400, Gregory Haskins wrote:
> >>>>>
> >>>>>> Michael S. Tsirkin wrote:
> >>>>>>
> >>>>>>> On Thu, Jun 25, 2009 at 09:28:27AM -0400, Gregory Haskins wrote:
> >>>>>>>
> >>>>>>>> eventfd currently emits a POLLHUP wakeup on f_ops->release() to generate a
> >>>>>>>> "release" callback. This lets eventfd clients know if the eventfd is about
> >>>>>>>> to go away and is very useful particularly for in-kernel clients. However,
> >>>>>>>> until recently it was not possible to use this feature of eventfd in a
> >>>>>>>> race-free way. This patch utilizes a new eventfd interface to rectify
> >>>>>>>> the problem.
> >>>>>>>>
> >>>>>>>> Note that one final race is known to exist: the slow-work thread may race
> >>>>>>>> with module removal. We are currently working with slow-work upstream
> >>>>>>>> to fix this issue as well. Since the code prior to this patch also
> >>>>>>>> races with module_put(), we are not making anything worse, but rather
> >>>>>>>> shifting the cause of the race. Once the slow-work code is patched we
> >>>>>>>> will be fixing the last remaining issue.
> >>>>>>>>
> >>>>>>> By the way, why are we using slow-work here? Wouldn't a regular
> >>>>>>> workqueue do just as well, with less code, and avoid the race?
> >>>>>>>
> >>>>>> I believe it will cause a problem if you do a "flush_work()" from inside
> >>>>>> a work-item. I could be wrong, of course, but it looks like a recipe to
> >>>>>> deadlock.
> >>>>>>
> >>>>>> -Greg
> >>>>>>
> >>>>> Sure, but the idea is to only flush on kvm close, never from work item.
> >>>>>
> >>>> To clarify, you don't flush slow works from a work-item,
> >>>> so you shouldn't need to flush workqueue either.
> >>>>
> >>> I guess my question is - why is slow work different? It's still
> >>> a thread pool underneath ...
> >>>
> >> It's not interdependent. Flush-work blocks the thread... if the thread
> >> happens to be the work-queue thread you may deadlock, preventing it from
> >> processing further jobs like the inject. In reality it shouldn't be
> >> possible, but it's just a bad idea to assume it's ok. Slow work, on the
> >> other hand, will just make a new thread.
> >>
> >> -Greg
> >>
> >
> > But if you create your own workqueue, and all you do there is destroy
> > irqfds, things are ok I think. Right?
> >
>
> Yep, creating your own queue works too. I picked slow-work as an
> alternative to generating a dedicated resource, but I agree either method
> would work fine. Do you have a preference?
>
> Regards,
> -Greg

It's not something I lose sleep about, but I think workqueue might be less
code: for example, you can just flush it instead of using your own
counter. And possibly things can be further simplified by making the
workqueue single-threaded and always doing deassign from there.

-- 
MST
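
For illustration only, here is a rough userspace model of the
"dedicated single-threaded queue plus flush" idea discussed above. It
uses pthreads rather than the kernel workqueue API, and every name in it
(wq_create, wq_queue, wq_flush, wq_destroy, struct fake_irqfd,
fake_irqfd_shutdown) is hypothetical and made up for this sketch; it is
not the KVM irqfd code and not the kernel implementation. The point it
tries to show is that the caller only queues teardown jobs and then
flushes, without keeping its own outstanding-job counter.

```c
#include <pthread.h>
#include <stdlib.h>

/* One queued job. All names here are hypothetical, for illustration. */
struct work {
    void (*fn)(void *arg);
    void *arg;
    struct work *next;
};

struct workqueue {
    pthread_mutex_t lock;
    pthread_cond_t more;     /* signalled on new work or stop */
    pthread_cond_t idle;     /* signalled when nothing is pending */
    struct work *head, *tail;
    int pending;             /* queued + currently running jobs */
    int stop;
    pthread_t thread;        /* the single worker thread */
};

static void *wq_worker(void *p)
{
    struct workqueue *wq = p;

    pthread_mutex_lock(&wq->lock);
    for (;;) {
        while (!wq->head && !wq->stop)
            pthread_cond_wait(&wq->more, &wq->lock);
        if (!wq->head)
            break;                      /* stop requested, queue drained */
        struct work *w = wq->head;
        wq->head = w->next;
        if (!wq->head)
            wq->tail = NULL;
        pthread_mutex_unlock(&wq->lock);
        w->fn(w->arg);                  /* run the job without the lock */
        free(w);
        pthread_mutex_lock(&wq->lock);
        if (--wq->pending == 0)
            pthread_cond_broadcast(&wq->idle);
    }
    pthread_mutex_unlock(&wq->lock);
    return NULL;
}

struct workqueue *wq_create(void)
{
    struct workqueue *wq = calloc(1, sizeof(*wq));

    if (!wq)
        return NULL;
    pthread_mutex_init(&wq->lock, NULL);
    pthread_cond_init(&wq->more, NULL);
    pthread_cond_init(&wq->idle, NULL);
    if (pthread_create(&wq->thread, NULL, wq_worker, wq)) {
        pthread_cond_destroy(&wq->idle);
        pthread_cond_destroy(&wq->more);
        pthread_mutex_destroy(&wq->lock);
        free(wq);
        return NULL;
    }
    return wq;
}

int wq_queue(struct workqueue *wq, void (*fn)(void *), void *arg)
{
    struct work *w = malloc(sizeof(*w));

    if (!w)
        return -1;
    w->fn = fn;
    w->arg = arg;
    w->next = NULL;
    pthread_mutex_lock(&wq->lock);
    if (wq->tail)
        wq->tail->next = w;
    else
        wq->head = w;
    wq->tail = w;
    wq->pending++;
    pthread_cond_signal(&wq->more);
    pthread_mutex_unlock(&wq->lock);
    return 0;
}

/* Wait for all queued jobs. Must NOT be called from inside a job:
 * the running job counts as pending, so that would deadlock. */
void wq_flush(struct workqueue *wq)
{
    pthread_mutex_lock(&wq->lock);
    while (wq->pending)
        pthread_cond_wait(&wq->idle, &wq->lock);
    pthread_mutex_unlock(&wq->lock);
}

void wq_destroy(struct workqueue *wq)
{
    pthread_mutex_lock(&wq->lock);
    wq->stop = 1;
    pthread_cond_signal(&wq->more);
    pthread_mutex_unlock(&wq->lock);
    pthread_join(wq->thread, NULL);     /* worker drains, then exits */
    pthread_cond_destroy(&wq->idle);
    pthread_cond_destroy(&wq->more);
    pthread_mutex_destroy(&wq->lock);
    free(wq);
}

/* Hypothetical stand-in for an irqfd and its teardown job. */
struct fake_irqfd { int released; };

void fake_irqfd_shutdown(void *arg)
{
    ((struct fake_irqfd *)arg)->released = 1;
}
```

In this model, the "close" path would call wq_queue() with
fake_irqfd_shutdown for each object and then wq_flush() once. Because
there is a single worker, teardown jobs run one at a time in queue
order, and the flush is only ever issued from outside the queue, which
is the condition under which the deadlock concern raised earlier in the
thread does not arise.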