From mboxrd@z Thu Jan 1 00:00:00 1970
From: Rusty Russell
Subject: Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions
Date: Thu, 25 Jun 2009 21:12:24 +0930
Message-ID: <200906252112.24730.rusty@rustcorp.com.au>
References: <20090619183534.31118.30934.stgit@dev.haskins.net> <200906241255.54709.rusty@rustcorp.com.au>
Mime-Version: 1.0
Content-Type: Text/Plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Cc: Gregory Haskins, mst@redhat.com, kvm@vger.kernel.org, Linux Kernel Mailing List, avi@redhat.com, paulmck@linux.vnet.ibm.com, Ingo Molnar
To: Davide Libenzi
Return-path:
In-Reply-To:
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On Thu, 25 Jun 2009 08:15:11 am Davide Libenzi wrote:
> On Wed, 24 Jun 2009, Rusty Russell wrote:
> > On Tue, 23 Jun 2009 03:33:22 am Davide Libenzi wrote:
> > > What you're doing there, is setting up a kernel-to-kernel (since
> > > userspace only role is to create the eventfd) communication, using a
> > > file* as accessory. That IMO is plain wrong.
> >
> > The most sensible is that userspace can use these fds; an in-kernel
> > variant is possible too, but not primary IMHO.
> >
> > It's nice that userspace create the fds; it can then use the same fd for
> > multiple event sources.
> >
> > But I didn't see anything wrong with the way eventfd used to work: you
> > have a kvm ioctl to say "attach this eventfd to this guest notification"
> > and that does the eventfd_fget. A detach ioctl does the fput (as does
> > release of the kvm fd).
> >
> > If they close the eventfd and don't do the detach ioctl, it's their
> > problem.
>
> Some components would like to know if userspace dropped the fd, and take
> proper action accordingly (release resources, drop module instances,
> etc...).

Like to know? Possibly. Need to know? Not anything I've seen so far.

If userspace creates the fd, the component grabs a ref, and if userspace wants
that fd completely freed it must close the fd *and* tell the component. Simple,
race free and explicit. All wins.

As this discussion shows, doing some kind of implied non-reference is hard,
complex and racy.

> Another thing that comes in my mind (that for some components might not
> matter) is considering the effect of userspace doing things like:
>
> 	for (;;) {
> 		fd = eventfd(...);
> 		ioctl(xfd, XXX_ADD, fd);
> 		close(fd);
> 	}
>
> That might lead to unprivileged users drawing kernel memory w/out any
> userspace accountability, if not properly handled.

No, eventfd_fget covers this exactly as expected.

Don't doubt your ability to design sane kernel interfaces; eventfd is nice!
All lguest needed was a couple of EXPORT_SYMBOLS and it fitted in beautifully.

Thanks,
Rusty.
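
The reference rule argued for above (userspace holds one reference through its fd, the component takes its own on attach, and the object goes away only after both a close and a detach) can be sketched as a toy userspace model. This is not kernel code: every `toy_*` name is hypothetical, and a plain integer stands in for the real file reference count that eventfd_fget and fput operate on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a refcounted object such as the file behind an eventfd.
 * Hypothetical model only; not the kernel's data structure. */
struct toy_event {
    int refs;     /* outstanding references */
    bool *freed;  /* set to true when the last reference is dropped */
};

/* Models userspace creating the fd: the fd itself holds one reference. */
static struct toy_event *toy_event_create(bool *freed)
{
    struct toy_event *ev = malloc(sizeof(*ev));
    if (!ev)
        return NULL;
    ev->refs = 1;
    ev->freed = freed;
    *freed = false;
    return ev;
}

/* Models taking a reference (the role eventfd_fget plays in the text). */
static void toy_event_get(struct toy_event *ev)
{
    ev->refs++;
}

/* Models dropping a reference (the role fput plays in the text). */
static void toy_event_put(struct toy_event *ev)
{
    if (--ev->refs == 0) {
        *ev->freed = true;
        free(ev);
    }
}

/* Hypothetical in-kernel component that an event can be attached to. */
struct toy_component {
    struct toy_event *attached;
};

/* Models the "attach" ioctl: the component grabs its own reference. */
static void toy_attach(struct toy_component *c, struct toy_event *ev)
{
    toy_event_get(ev);
    c->attached = ev;
}

/* Models the "detach" ioctl: the component drops its reference. */
static void toy_detach(struct toy_component *c)
{
    if (c->attached) {
        toy_event_put(c->attached);
        c->attached = NULL;
    }
}

/* Returns true if the object survives close() while attached and is
 * freed only once the component has also detached. */
bool toy_scenario_close_then_detach(void)
{
    bool freed;
    struct toy_component comp = { NULL };
    struct toy_event *ev = toy_event_create(&freed);

    if (!ev)
        return false;
    toy_attach(&comp, ev);
    toy_event_put(ev);              /* userspace close(fd) */
    bool alive_after_close = !freed;
    toy_detach(&comp);              /* explicit detach */
    return alive_after_close && freed;
}
```

Calling `toy_scenario_close_then_detach()` from a `main` exercises the close-then-detach order. In this model, closing without detaching simply leaves the component's reference in place until the component itself is torn down.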