linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: avi@redhat.com, gleb@redhat.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, jan.kiszka@siemens.com
Subject: Re: [PATCH v5 2/4] kvm: KVM_EOIFD, an eventfd for EOIs
Date: Wed, 18 Jul 2012 01:24:55 +0300	[thread overview]
Message-ID: <20120717222455.GI1868@redhat.com> (raw)
In-Reply-To: <1342562965.2229.162.camel@bling.home>

On Tue, Jul 17, 2012 at 04:09:25PM -0600, Alex Williamson wrote:
> On Wed, 2012-07-18 at 00:23 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 02:03:05PM -0600, Alex Williamson wrote:
> > > On Tue, 2012-07-17 at 21:58 +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Jul 17, 2012 at 10:52:16AM -0600, Alex Williamson wrote:
> > > > > On Tue, 2012-07-17 at 19:19 +0300, Michael S. Tsirkin wrote:
> > > > > > On Tue, Jul 17, 2012 at 10:06:01AM -0600, Alex Williamson wrote:
> > > > > > > On Tue, 2012-07-17 at 18:53 +0300, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Jul 17, 2012 at 09:41:09AM -0600, Alex Williamson wrote:
> > > > > > > > > On Tue, 2012-07-17 at 18:13 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > > On Tue, Jul 17, 2012 at 08:57:04AM -0600, Alex Williamson wrote:
> > > > > > > > > > > On Tue, 2012-07-17 at 17:42 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > > > > On Tue, Jul 17, 2012 at 08:29:43AM -0600, Alex Williamson wrote:
> > > > > > > > > > > > > On Tue, 2012-07-17 at 17:10 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > > > > > > On Tue, Jul 17, 2012 at 07:59:16AM -0600, Alex Williamson wrote:
> > > > > > > > > > > > > > > On Tue, 2012-07-17 at 13:21 +0300, Michael S. Tsirkin wrote:
> > > > > > > > > > > > > > > > On Mon, Jul 16, 2012 at 02:33:55PM -0600, Alex Williamson wrote:
> > > > > > > > > > > > > > > > > +	if (args->flags & KVM_EOIFD_FLAG_LEVEL_IRQFD) {
> > > > > > > > > > > > > > > > > +		struct _irqfd *irqfd = _irqfd_fdget_lock(kvm, args->irqfd);
> > > > > > > > > > > > > > > > > +		if (IS_ERR(irqfd)) {
> > > > > > > > > > > > > > > > > +			ret = PTR_ERR(irqfd);
> > > > > > > > > > > > > > > > > +			goto fail;
> > > > > > > > > > > > > > > > > +		}
> > > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > > +		gsi = irqfd->gsi;
> > > > > > > > > > > > > > > > > +		level_irqfd = eventfd_ctx_get(irqfd->eventfd);
> > > > > > > > > > > > > > > > > +		source = _irq_source_get(irqfd->source);
> > > > > > > > > > > > > > > > > +		_irqfd_put_unlock(irqfd);
> > > > > > > > > > > > > > > > > +		if (!source) {
> > > > > > > > > > > > > > > > > +			ret = -EINVAL;
> > > > > > > > > > > > > > > > > +			goto fail;
> > > > > > > > > > > > > > > > > +		}
> > > > > > > > > > > > > > > > > +	} else {
> > > > > > > > > > > > > > > > > +		ret = -EINVAL;
> > > > > > > > > > > > > > > > > +		goto fail;
> > > > > > > > > > > > > > > > > +	}
> > > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > > +	INIT_LIST_HEAD(&eoifd->list);
> > > > > > > > > > > > > > > > > +	eoifd->kvm = kvm;
> > > > > > > > > > > > > > > > > +	eoifd->eventfd = eventfd;
> > > > > > > > > > > > > > > > > +	eoifd->source = source;
> > > > > > > > > > > > > > > > > +	eoifd->level_irqfd = level_irqfd;
> > > > > > > > > > > > > > > > > +	eoifd->notifier.gsi = gsi;
> > > > > > > > > > > > > > > > > +	eoifd->notifier.irq_acked = eoifd_event;
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > OK so this means eoifd keeps a reference to the irqfd.
> > > > > > > > > > > > > > > > And since this is the case, can't we drop the reference counting
> > > > > > > > > > > > > > > > around source ids now? Everything is referenced through irqfd.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Holding a reference and using it as a reference count are not the same
> > > > > > > > > > > > > > > thing.  What if another module holds a reference to this eventfd?  How
> > > > > > > > > > > > > > > do we do anything on release?
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > We don't as there is no release, and using kref on source id does not
> > > > > > > > > > > > > > help: it just never gets invoked.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Please work out how you think it should work and let me know, I don't
> > > > > > > > > > > > > see it.  We have an irq source id that needs to be allocated by irqfd
> > > > > > > > > > > > > and returned when it's unused.  It becomes unused when neither irqfd nor
> > > > > > > > > > > > > eoifd are making use of it.  irqfd and eoifd may be closed in any order.
> > > > > > > > > > > > > Use of the source id is what we're reference counting, which is why it's
> > > > > > > > > > > > > in struct _irq_source.  How can I use an eventfd reference for the same?
> > > > > > > > > > > > > I don't know when it's unused.  I don't know who else holds a reference
> > > > > > > > > > > > > to it...  Doesn't make sense to me.  Feels like attempting to squat on
> > > > > > > > > > > > > someone else's object.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > eoifd should prevent irqfd from being released.
> > > > > > > > > > > 
> > > > > > > > > > > Why?  Note that this is actually quite difficult too.  We can't fail a
> > > > > > > > > > > release, nobody checks close(3p) return.  Blocking a release is likely
> > > > > > > > > > > to cause all sorts of problems, so what you mean is that irqfd should
> > > > > > > > > > > linger around until there are no references to it... but that's exactly
> > > > > > > > > > > what struct _irq_source is for, is to hold the bits that we care about
> > > > > > > > > > > references to and automatically release it when there are none.
> > > > > > > > > > 
> > > > > > > > > > No no. You *already* prevent it. You take a reference to the eventfd
> > > > > > > > > > context.
> > > > > > > > > 
> > > > > > > > > Right, which keeps the fd from going away, not the struct _irqfd.
> > > > > > > > 
> > > > > > > > _irqfd too.
> > > > > > > 
> > > > > > > 
> > > > > > > How so?
> > > > > > 
> > > > > > Normally irqfd_wakeup is called with POLLHUP and calls irqfd_deactivate.
> > > > > > If you get a ctx reference this does not happen.
> > > > > 
> > > > > I think you're mistaken.  wake_up_poll(,POLLHUP) is called from
> > > > > eventfd_release (file_operations.release), not from ctx reference
> > > > > release.
> > > > 
> > > > True. I was wrong. so close has the same bug as deassign. To fix,
> > > > how about eoifd will hold a reference to the irqfd instead of the
> > > > eventfd context?
> > > 
> > > What does it mean to hold a reference to the irqfd?
> > 
> > I meant file *reference: eventfd_fget. But there are other options see
> > below.
> 
> That's no better than the eventfd context we already hold.

It means POLLHUP is not invoked until eoifd is closed.

> > > What state of functionality is an irqfd that has been
> > > closed/de-assigned but is still attached to an eoifd?  It can't
> > > continue to fire interrupts into the guest.
> > >
> > > I don't think close or de-assign have a bug, assign has a bug that it
> > > can allow re-assignment using an in-use eventfd.  I think I'd rather
> > > fix that.
> > 
> > Let me show you that the bug is in deassign:
> > 	assign irqfd for fd=1
> > 	assign for eoifd fd=2, irqfd=1
> > 	deassign irqfd 1
> > 
> > At this point eoifd has no meaning and there is also no way to deassign
> > it,
> 
> Yes, there is.  This is exactly why I hold a reference to the eventfd
> ctx.  It can still be deassigned by passing irqfd=1, we'll do an
> eventfd_ctx_get and match it to that stored.

OK.
What if instead we close irqfd 1?

> >  so the bug already triggered.
> >
> > I can see two ways out:
> > 1. easy way - fail deassign
> 
> Then close() and deassign are not the same.
> 
> > 2. elegant way - shut down eoifd on irqfd deassign too
> 
> Sorry, I've always been told it's a bad idea to have one interface kill
> another from inside the kernel.

Not kill merely deassign.

> Given that your assertion above is incorrect, I still stand by fixing
> assign.

OK, but then you also would need to protect against someone binding
an irqfd that is not level to same GSI.

Also if we go ahead with fixing assign - I do not think we need
to rebind to the same source id - just failing assign
of this irqfd with EBUSY should be enough.

-- 
MST

  reply	other threads:[~2012-07-17 22:24 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-16 20:33 [PATCH v5 0/4] kvm: level irqfd and new eoifd Alex Williamson
2012-07-16 20:33 ` [PATCH v5 1/4] kvm: Extend irqfd to support level interrupts Alex Williamson
2012-07-17 21:26   ` Michael S. Tsirkin
2012-07-17 21:57     ` Alex Williamson
2012-07-17 22:00       ` Michael S. Tsirkin
2012-07-17 22:16         ` Alex Williamson
2012-07-17 22:28           ` Michael S. Tsirkin
2012-07-18 10:41   ` Michael S. Tsirkin
2012-07-18 10:44     ` Gleb Natapov
2012-07-18 10:48       ` Michael S. Tsirkin
2012-07-18 10:49         ` Gleb Natapov
2012-07-18 10:53           ` Michael S. Tsirkin
2012-07-18 10:55             ` Gleb Natapov
2012-07-18 11:22               ` Michael S. Tsirkin
2012-07-18 11:39                 ` Michael S. Tsirkin
2012-07-18 11:48                   ` Gleb Natapov
2012-07-18 12:07                     ` Michael S. Tsirkin
2012-07-18 14:47                       ` Alex Williamson
2012-07-18 15:38                         ` Michael S. Tsirkin
2012-07-18 15:48                           ` Alex Williamson
2012-07-18 15:58                             ` Michael S. Tsirkin
2012-07-18 18:42                               ` Marcelo Tosatti
2012-07-18 19:00                                 ` Gleb Natapov
2012-07-18 19:07                                 ` Alex Williamson
2012-07-18 19:13                                   ` Alex Williamson
2012-07-18 19:16                                     ` Michael S. Tsirkin
2012-07-18 20:28                                     ` Alex Williamson
2012-07-18 21:23                                       ` Marcelo Tosatti
2012-07-18 21:30                                         ` Michael S. Tsirkin
2012-07-16 20:33 ` [PATCH v5 2/4] kvm: KVM_EOIFD, an eventfd for EOIs Alex Williamson
2012-07-17 10:21   ` Michael S. Tsirkin
2012-07-17 13:59     ` Alex Williamson
2012-07-17 14:10       ` Michael S. Tsirkin
2012-07-17 14:29         ` Alex Williamson
2012-07-17 14:42           ` Michael S. Tsirkin
2012-07-17 14:57             ` Alex Williamson
2012-07-17 15:13               ` Michael S. Tsirkin
2012-07-17 15:41                 ` Alex Williamson
2012-07-17 15:53                   ` Michael S. Tsirkin
2012-07-17 16:06                     ` Alex Williamson
2012-07-17 16:19                       ` Michael S. Tsirkin
2012-07-17 16:52                         ` Alex Williamson
2012-07-17 18:58                           ` Michael S. Tsirkin
2012-07-17 20:03                             ` Alex Williamson
2012-07-17 21:23                               ` Michael S. Tsirkin
2012-07-17 22:09                                 ` Alex Williamson
2012-07-17 22:24                                   ` Michael S. Tsirkin [this message]
2012-07-18  2:44                                     ` Alex Williamson
2012-07-18 10:31                                       ` Michael S. Tsirkin
2012-07-16 20:34 ` [PATCH v5 3/4] kvm: Create kvm_clear_irq() Alex Williamson
2012-07-17  0:51   ` Michael S. Tsirkin
2012-07-17  2:42     ` Alex Williamson
2012-07-17  0:55   ` Michael S. Tsirkin
2012-07-17 10:14   ` Michael S. Tsirkin
2012-07-17 13:56     ` Alex Williamson
2012-07-17 14:08       ` Michael S. Tsirkin
2012-07-17 14:21         ` Alex Williamson
2012-07-17 14:53           ` Michael S. Tsirkin
2012-07-17 15:20             ` Alex Williamson
2012-07-17 15:36               ` Michael S. Tsirkin
2012-07-17 15:51                 ` Alex Williamson
2012-07-17 15:57                   ` Michael S. Tsirkin
2012-07-17 16:01                     ` Gleb Natapov
2012-07-17 16:08                     ` Alex Williamson
2012-07-17 16:14                       ` Michael S. Tsirkin
2012-07-17 16:17                         ` Alex Williamson
2012-07-17 16:21                           ` Michael S. Tsirkin
2012-07-17 16:45                             ` Alex Williamson
2012-07-17 18:55                               ` Michael S. Tsirkin
2012-07-17 19:51                                 ` Alex Williamson
2012-07-17 21:05                                   ` Michael S. Tsirkin
2012-07-17 22:01                                     ` Alex Williamson
2012-07-17 22:05                                       ` Michael S. Tsirkin
2012-07-17 22:22                                         ` Alex Williamson
2012-07-17 22:31                                           ` Michael S. Tsirkin
2012-07-18  6:27                         ` Gleb Natapov
2012-07-18 10:20                           ` Michael S. Tsirkin
2012-07-18 10:27                             ` Gleb Natapov
2012-07-18 10:33                               ` Michael S. Tsirkin
2012-07-18 10:36                                 ` Gleb Natapov
2012-07-18 10:51                                   ` Michael S. Tsirkin
2012-07-18 10:53                                     ` Gleb Natapov
2012-07-18 11:08                                       ` Michael S. Tsirkin
2012-07-18 11:50                                         ` Gleb Natapov
2012-07-18 21:55                           ` Michael S. Tsirkin
2012-07-17 16:36                       ` Michael S. Tsirkin
2012-07-17 17:09                         ` Gleb Natapov
2012-07-17 10:18   ` Michael S. Tsirkin
2012-07-16 20:34 ` [PATCH v5 4/4] kvm: Convert eoifd to use kvm_clear_irq Alex Williamson
2012-07-18 10:43 ` [PATCH v5 0/4] kvm: level irqfd and new eoifd Michael S. Tsirkin
2012-07-19 16:59 ` Michael S. Tsirkin
2012-07-19 17:29   ` Alex Williamson
2012-07-19 17:45     ` Michael S. Tsirkin
2012-07-19 18:48       ` Alex Williamson
2012-07-20 10:07         ` Michael S. Tsirkin
2012-07-22 15:09           ` Gleb Natapov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120717222455.GI1868@redhat.com \
    --to=mst@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=avi@redhat.com \
    --cc=gleb@redhat.com \
    --cc=jan.kiszka@siemens.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).