All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gregory Haskins <ghaskins@novell.com>
To: Davide Libenzi <davidel@xmailserver.org>
Cc: mst@redhat.com, kvm@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	avi@redhat.com, paulmck@linux.vnet.ibm.com,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions
Date: Sat, 20 Jun 2009 21:05:30 -0400	[thread overview]
Message-ID: <4A3D875A.9040101@novell.com> (raw)
In-Reply-To: <alpine.DEB.1.10.0906201202170.25534@makko.or.mcafeemobile.com>

[-- Attachment #1: Type: text/plain, Size: 6468 bytes --]

Davide Libenzi wrote:
> On Fri, 19 Jun 2009, Gregory Haskins wrote:
>
>   
>>> In the POLLIN event, you schedule_work(&irqfd->inject) and there are no 
>>> races there AFAICS (you basically do not care of anything eventfd memory 
>>> related at all).
>>> For POLLHUP, you do:
>>>
>>> 	spin_lock(irqfd->slock);
>>> 	if (irqfd->wqh)
>>> 		schedule_work(&irqfd->inject);
>>> 	irqfd->wqh = NULL;
>>> 	spin_unlock(irqfd->slock);
>>>
>>> In your work function you notice the POLLHUP condition and take proper 
>>> action (dunno what it is in your case).
>>> In your kvm_irqfd_release() function:
>>>
>>> 	spin_lock(irqfd->slock);
>>> 	if (irqfd->wqh)
>>> 		remove_wait_queue(irqfd->wqh, &irqfd->wait);
>>> 	irqfd->wqh = NULL;
>>> 	spin_unlock(irqfd->slock);
>>>
>>> Any races in there?
>>>   
>>>       
>> Yes, for one you have an ABBA deadlock on the irqfd->slock and wqh->lock
>> (recall wqh has to be locked to fix that other race I mentioned).
>>     
>
> Yep, true. How about we go in little steps?
> How about the one below?
> When you register your poll callback, you get a reference with 
> eventfd_refget().
>   

I was thinking about this last night/this morning and that was
essentially what I was going to propose next as a more palatable+simpler
solution.

> At that point we have no more worries about the eventfd memory (namely 
> "wqh") going away.
> Symmetrically, you use eventfd_refput() once you done with it.
> So we basically de-couple the internal VFS refcount, from the eventfd 
> context memory refcount.
> Where are the non-solveable races on the IRQfd side, of an approach like 
> this one?
>   
None... I think that can work.  The core requirements boil down to: 1)
locking the POLLHUP (as we already discussed), and 2) holding a kref (or
equivelent) to the eventfd_ctx that the client can release once they are
fully out.  After that, it looks like any vanilla wait-queue in the kernel.


>
>
>   
>> Yes, understood.  What I was trying to gently say is that the one-liner
>> proposal alone is still broken afaict.  However, if there is another
>> solution that works that you like better than 133-liner I posted, I am
>> more than willing to help analyze it.  In the end, I only care that this
>> is fixed.
>>     
>
> Is it so?
> What I noticed here, is that you went from "Detailed Mode" (in the emails 
> where you were pushing the new bits), to "Hint Mode" (as below) when it 
> was time to see if there were simpler solutions to the problem.
>   

I wasn't being purposely coy.  It was late friday and I was getting
pressure from the wife to get off the computer ;)  Plus, all the details
were in my patch, so I figured there wasn't a giant need to rehash since
it could just be gleened.  In any case, based on this current reply I
think we are on the same page now.

>   
>> (As a hint, I think I fixed 4-5 races with these patches, so there are a
>> few others still lurking as well)
>>     
>
> Not knowing the details of IRQfd, hinting only is not going to help 
> anything in this case.
>
>
>
> - Davide
>
>
>
> ---
>  fs/eventfd.c            |   37 ++++++++++++++++++++++++++++++++++++-
>  include/linux/eventfd.h |    6 ++++++
>  2 files changed, 42 insertions(+), 1 deletion(-)
>   

I haven't had a chance to go over the patch below in detail, but the
30,000 foot view is that it is in line with what I was thinking this
morning.  So, its a preliminary "looks good".  I'll take a closer look ASAP

Thanks Davide,
-Greg

> Index: linux-2.6.mod/fs/eventfd.c
> ===================================================================
> --- linux-2.6.mod.orig/fs/eventfd.c	2009-06-20 13:08:22.000000000 -0700
> +++ linux-2.6.mod/fs/eventfd.c	2009-06-20 14:00:23.000000000 -0700
> @@ -17,8 +17,10 @@
>  #include <linux/eventfd.h>
>  #include <linux/syscalls.h>
>  #include <linux/module.h>
> +#include <linux/kref.h>
>  
>  struct eventfd_ctx {
> +	struct kref kref;
>  	wait_queue_head_t wqh;
>  	/*
>  	 * Every time that a write(2) is performed on an eventfd, the
> @@ -59,9 +61,19 @@ int eventfd_signal(struct file *file, in
>  }
>  EXPORT_SYMBOL_GPL(eventfd_signal);
>  
> +static void eventfd_free(struct kref *kref)
> +{
> +	struct eventfd_ctx *ctx = container_of(kref, struct eventfd_ctx, kref);
> +
> +	kfree(ctx);
> +}
> +
>  static int eventfd_release(struct inode *inode, struct file *file)
>  {
> -	kfree(file->private_data);
> +	struct eventfd_ctx *ctx = file->private_data;
> +
> +	wake_up_poll(&ctx->wqh, POLLHUP);
> +	kref_put(&ctx->kref, eventfd_free);
>  	return 0;
>  }
>  
> @@ -201,6 +213,28 @@ struct file *eventfd_fget(int fd)
>  }
>  EXPORT_SYMBOL_GPL(eventfd_fget);
>  
> +int eventfd_refget(struct file *file)
> +{
> +	struct eventfd_ctx *ctx = file->private_data;
> +
> +	if (file->f_op != &eventfd_fops)
> +		return -EINVAL;
> +	kref_get(&ctx->kref);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(eventfd_refget);
> +
> +int eventfd_refput(struct file *file)
> +{
> +	struct eventfd_ctx *ctx = file->private_data;
> +
> +	if (file->f_op != &eventfd_fops)
> +		return -EINVAL;
> +	kref_put(&ctx->kref, eventfd_free);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(eventfd_refput);
> +
>  SYSCALL_DEFINE2(eventfd2, unsigned int, count, int, flags)
>  {
>  	int fd;
> @@ -217,6 +251,7 @@ SYSCALL_DEFINE2(eventfd2, unsigned int, 
>  	if (!ctx)
>  		return -ENOMEM;
>  
> +	kref_init(&ctx->kref);
>  	init_waitqueue_head(&ctx->wqh);
>  	ctx->count = count;
>  	ctx->flags = flags;
> Index: linux-2.6.mod/include/linux/eventfd.h
> ===================================================================
> --- linux-2.6.mod.orig/include/linux/eventfd.h	2009-06-20 13:08:22.000000000 -0700
> +++ linux-2.6.mod/include/linux/eventfd.h	2009-06-20 13:59:09.000000000 -0700
> @@ -29,12 +29,18 @@
>  
>  struct file *eventfd_fget(int fd);
>  int eventfd_signal(struct file *file, int n);
> +int eventfd_refget(struct file *file);
> +int eventfd_refput(struct file *file);
>  
>  #else /* CONFIG_EVENTFD */
>  
>  #define eventfd_fget(fd) ERR_PTR(-ENOSYS)
>  static inline int eventfd_signal(struct file *file, int n)
>  { return 0; }
> +static inline int eventfd_refget(struct file *file)
> +{ return -ENOSYS; }
> +static inline int eventfd_refput(struct file *file)
> +{ return -ENOSYS; }
>  
>  #endif /* CONFIG_EVENTFD */
>  
>   



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

  parent reply	other threads:[~2009-06-21  1:05 UTC|newest]

Thread overview: 104+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-06-16  2:29 [KVM-RFC PATCH 0/2] eventfd enhancements for irqfd/iosignalfd Gregory Haskins
2009-06-16  2:29 ` [KVM-RFC PATCH 1/2] eventfd: add an explicit srcu based notifier interface Gregory Haskins
2009-06-16 14:02   ` Michael S. Tsirkin
2009-06-16 14:11     ` Gregory Haskins
2009-06-16 14:38       ` Michael S. Tsirkin
2009-06-16 14:48         ` Gregory Haskins
2009-06-16 14:54           ` Gregory Haskins
2009-06-16 15:16             ` Michael S. Tsirkin
2009-06-16 14:55           ` Michael S. Tsirkin
2009-06-16 15:20             ` Gregory Haskins
2009-06-16 15:41               ` Michael S. Tsirkin
2009-06-16 16:17                 ` Gregory Haskins
2009-06-16 16:19                   ` Davide Libenzi
2009-06-16 17:01                     ` Gregory Haskins
2009-06-17 16:38                       ` Davide Libenzi
2009-06-17 17:28                         ` Gregory Haskins
2009-06-17 17:44                           ` Davide Libenzi
2009-06-17 19:17                             ` Gregory Haskins
2009-06-17 19:50                               ` Davide Libenzi
2009-06-17 21:48                                 ` Gregory Haskins
2009-06-17 23:21                                   ` Davide Libenzi
2009-06-18  6:23                                     ` Michael S. Tsirkin
2009-06-18 17:52                                       ` Davide Libenzi
2009-06-18 14:01                                     ` Gregory Haskins
2009-06-18 14:01                                       ` Gregory Haskins
2009-06-18 17:44                                       ` Davide Libenzi
2009-06-18 19:04                                         ` Gregory Haskins
2009-06-18 19:04                                           ` Gregory Haskins
2009-06-18 22:03                                           ` Davide Libenzi
2009-06-18 22:47                                             ` Gregory Haskins
2009-06-18 22:47                                               ` Gregory Haskins
2009-06-19 18:51                                             ` Gregory Haskins
2009-06-19 18:51                                               ` [PATCH 1/3] eventfd: Allow waiters to be notified about the eventfd file* going away Gregory Haskins
2009-06-19 18:51                                               ` [PATCH 2/3] eventfd: add generalized notifier interface Gregory Haskins
2009-06-19 18:51                                               ` [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions Gregory Haskins
2009-06-19 19:10                                                 ` Davide Libenzi
2009-06-19 21:16                                                   ` Gregory Haskins
2009-06-19 21:26                                                     ` Davide Libenzi
2009-06-19 21:49                                                       ` Gregory Haskins
2009-06-19 21:54                                                         ` Davide Libenzi
2009-06-19 22:47                                                           ` Davide Libenzi
2009-06-20  2:09                                                             ` Gregory Haskins
2009-06-20 21:17                                                               ` Davide Libenzi
2009-06-20 22:11                                                                 ` Davide Libenzi
2009-06-20 23:48                                                                   ` Davide Libenzi
2009-06-21  1:14                                                                     ` Gregory Haskins
2009-06-21 16:51                                                                       ` Davide Libenzi
2009-06-21 18:39                                                                         ` Gregory Haskins
2009-06-21 23:54                                                                           ` Davide Libenzi
2009-06-22 16:05                                                                             ` Gregory Haskins
2009-06-22 16:05                                                                               ` Gregory Haskins
2009-06-22 17:01                                                                               ` Davide Libenzi
2009-06-22 17:43                                                                                 ` Gregory Haskins
2009-06-22 17:43                                                                                   ` Gregory Haskins
2009-06-22 18:03                                                                                   ` Davide Libenzi
2009-06-22 18:31                                                                                     ` Gregory Haskins
2009-06-22 18:31                                                                                       ` Gregory Haskins
2009-06-22 18:40                                                                                       ` Davide Libenzi
2009-06-22 18:41                                                                                     ` Michael S. Tsirkin
2009-06-22 18:51                                                                                       ` Davide Libenzi
2009-06-22 19:05                                                                                         ` Michael S. Tsirkin
2009-06-22 19:26                                                                                           ` Gregory Haskins
2009-06-22 19:29                                                                                             ` Davide Libenzi
2009-06-22 20:06                                                                                               ` Gregory Haskins
2009-06-22 22:53                                                                                                 ` Davide Libenzi
2009-06-23  1:03                                                                                                   ` Gregory Haskins
2009-06-23  1:17                                                                                                     ` Davide Libenzi
2009-06-23  1:26                                                                                                       ` Gregory Haskins
2009-06-23  1:26                                                                                                         ` Gregory Haskins
2009-06-23 14:29                                                                                                         ` Davide Libenzi
2009-06-23 14:37                                                                                                           ` Gregory Haskins
2009-06-23 14:37                                                                                                             ` Gregory Haskins
2009-06-23 14:35                                                                                                             ` Davide Libenzi
2009-06-23 14:42                                                                                                               ` Gregory Haskins
2009-06-23 14:42                                                                                                                 ` Gregory Haskins
2009-06-23 15:04                                                                                                               ` Michael S. Tsirkin
2009-06-22 20:28                                                                                             ` Michael S. Tsirkin
2009-06-22 19:16                                                                                         ` Gregory Haskins
2009-06-22 19:54                                                                                           ` Davide Libenzi
2009-06-24  3:25                                                                                     ` Rusty Russell
2009-06-24 22:45                                                                                       ` Davide Libenzi
2009-06-25 11:42                                                                                         ` Rusty Russell
2009-06-25 16:34                                                                                           ` Davide Libenzi
2009-06-25 17:32                                                                                             ` Gregory Haskins
2009-06-25 18:26                                                                                               ` Michael S. Tsirkin
2009-06-25 18:41                                                                                                 ` Gregory Haskins
2009-06-26 11:23                                                                                                   ` Michael S. Tsirkin
2009-06-23  3:25                                                                             ` Rusty Russell
2009-06-23 14:31                                                                               ` Davide Libenzi
2009-06-25  0:19                                                                                 ` Davide Libenzi
2009-06-21  1:05                                                                 ` Gregory Haskins [this message]
2009-06-16 17:54                   ` [KVM-RFC PATCH 1/2] eventfd: add an explicit srcu based notifier interface Michael S. Tsirkin
2009-06-16 18:09                     ` Gregory Haskins
2009-06-17 14:45                       ` Michael S. Tsirkin
2009-06-17 15:02                         ` Gregory Haskins
2009-06-17 16:25                           ` Michael S. Tsirkin
2009-06-17 16:41                             ` Gregory Haskins
2009-06-16 14:17     ` Gregory Haskins
2009-06-16 14:22       ` Gregory Haskins
2009-06-16 14:40     ` Gregory Haskins
2009-06-16 14:46       ` Michael S. Tsirkin
2009-06-18  9:03       ` Avi Kivity
2009-06-18 11:43         ` Gregory Haskins
2009-06-16  2:30 ` [KVM-RFC PATCH 2/2] eventfd: add module reference counting support for registered notifiers Gregory Haskins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A3D875A.9040101@novell.com \
    --to=ghaskins@novell.com \
    --cc=avi@redhat.com \
    --cc=davidel@xmailserver.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=mst@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.