From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH 0/2] eventfd: new EFD_STATE flag
Date: Sun, 10 Jan 2010 18:22:02 +0200
Message-ID: <20100110162202.GA29263@redhat.com>
References: <20100106222920.GB29200@redhat.com> <alpine.DEB.2.00.1001061438570.17695@makko.or.mcafeemobile.com> <20100106234521.GA30515@redhat.com> <alpine.DEB.2.00.1001061555390.17695@makko.or.mcafeemobile.com> <20100107064503.GA10379@redhat.com> <alpine.DEB.2.00.1001062315200.6811@makko.or.mcafeemobile.com> <20100107103600.GD599@redhat.com> <alpine.DEB.2.00.1001071622320.8698@makko.or.mcafeemobile.com> <20100110103007.GA20538@redhat.com> <alpine.DEB.2.00.1001100725040.20000@makko.or.mcafeemobile.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Avi Kivity <avi@redhat.com>, kvm@vger.kernel.org
To: Davide Libenzi <davidel@xmailserver.org>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:36825 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752989Ab0AJQY5 (ORCPT <rfc822;kvm@vger.kernel.org>);
	Sun, 10 Jan 2010 11:24:57 -0500
Content-Disposition: inline
In-Reply-To: <alpine.DEB.2.00.1001100725040.20000@makko.or.mcafeemobile.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Sun, Jan 10, 2010 at 07:26:12AM -0800, Davide Libenzi wrote:
> On Sun, 10 Jan 2010, Michael S. Tsirkin wrote:
> 
> > > What is you do (under proper irqfd locking) something like:
> > > 
> > > 	eventfd_ctx_read(ctx, 1, &cnt);
> > > 	if (irqfd->cnt != cnt) {
> > > 		irqfd->cnt = cnt;
> > > 		schedule_work(&irqfd->inject);
> > > 	}
> > > 
> > > 
> > > 
> > > 
> > > > And deactivation deep down does this (from irqfd_cleanup_wq workqueue,
> > > > so this is not under the spinlock):
> > > > 
> > > >         /*
> > > >          * Synchronize with the wait-queue and unhook ourselves to
> > > >          * prevent
> > > >          * further events.
> > > >          */
> > > > B.
> > > >         remove_wait_queue(irqfd->wqh, &irqfd->wait);
> > > > 
> > > > 	....
> > > > 
> > > >         /*
> > > >          * It is now safe to release the object's resources
> > > >          */
> > > >         eventfd_ctx_put(irqfd->eventfd);
> > > >         kfree(irqfd);
> > > 
> > > And:
> > > 
> > > 	eventfd_ctx_read(ctx, 1, &irqfd->cnt);
> > 
> > 
> > ->
> > 
> > > 	remove_wait_queue(irqfd->wqh, &irqfd->wait);
> > > 
> > > 
> > > 
> > > 
> > > - Davide
> > 
> > Yes, this is exactly what I wanted to do.  So, here's the issue: if an
> > event is signalled at point ->: after eventfd_ctx_read but before
> > remove_wait_queue, then we inject interrupt but counter will be left
> > non-zero and then when we unmask, we inject antoher, spurious interrupt.
> > 
> > This is why I wanted to have eventfd_ctx_read not take wait queue head
> > lock: then I could do:
> > 
> > 	spin_lock_irqsave(&ctx->wqh.lock, flags);
> >  	eventfd_ctx_read(ctx, 1, &irqfd->cnt);
> >  	__remove_wait_queue(irqfd->wqh, &irqfd->wait);
> > 	spin_lock_irqrestore(&ctx->wqh.lock, flags);
> 
> This is why I said "under proper irqfd locking".
> 
> 
> - Davide

Not sure what you mean by irqfd locking.  IMO we must take waitqueue
lock, otherwise we can not prevent wait queue callback from being
invoked, right? Note that if we take our own lock under wait queue
callback, we won't be able to use this lock around remove_wait_queue.
Also note how fs/evetfd.c uses __remove_wait_queue after taking wait
queue lock - we'd like to do the same with our wait queue.

Could you clarify pls?

-- 
MST