* [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
@ 2010-01-21 16:26 Michael S. Tsirkin
2010-01-21 16:58 ` Davide Libenzi
0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2010-01-21 16:26 UTC (permalink / raw)
To: Avi Kivity, davidel
Cc: Michael S. Tsirkin, mtosatti, kvm, Linux Kernel Mailing List
This is a backport of commit: 03db343a6320f780937078433fa7d8da955e6fce
modified in a way that introduces some code duplication on the one hand,
but reduces the risk of regressing existing eventfd users on the other
hand.
KVM needs a wait to atomically remove themselves from the eventfd
->poll() wait queue head, in order to handle correctly their IRQfd
deassign operation.
This patch introduces such API, plus a way to read an eventfd from its
context.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
Avi, Davidel, how about only including the following part for -stable
then? Reason is, I still would like to be able to use irqfd there, and
getting spurious interrupts 100% of times unmask is done isn't a very
good idea IMO ...
fs/eventfd.c | 35 +++++++++++++++++++++++++++++++++++
include/linux/eventfd.h | 9 +++++++++
2 files changed, 44 insertions(+), 0 deletions(-)
diff --git a/fs/eventfd.c b/fs/eventfd.c
index 8b47e42..ea9c18a 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -135,6 +135,41 @@ static unsigned int eventfd_poll(struct file *file, poll_table *wait)
return events;
}
+static void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
+{
+ *cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
+ ctx->count -= *cnt;
+}
+
+/**
+ * eventfd_ctx_remove_wait_queue - Read the current counter and removes wait queue.
+ * @ctx: [in] Pointer to eventfd context.
+ * @wait: [in] Wait queue to be removed.
+ * @cnt: [out] Pointer to the 64bit conter value.
+ *
+ * Returns zero if successful, or the following error codes:
+ *
+ * -EAGAIN : The operation would have blocked.
+ *
+ * This is used to atomically remove a wait queue entry from the eventfd wait
+ * queue head, and read/reset the counter value.
+ */
+int eventfd_ctx_remove_wait_queue(struct eventfd_ctx *ctx, wait_queue_t *wait,
+ __u64 *cnt)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ctx->wqh.lock, flags);
+ eventfd_ctx_do_read(ctx, cnt);
+ __remove_wait_queue(&ctx->wqh, wait);
+ if (*cnt != 0 && waitqueue_active(&ctx->wqh))
+ wake_up_locked_poll(&ctx->wqh, POLLOUT);
+ spin_unlock_irqrestore(&ctx->wqh.lock, flags);
+
+ return *cnt != 0 ? 0 : -EAGAIN;
+}
+EXPORT_SYMBOL_GPL(eventfd_ctx_remove_wait_queue);
+
static ssize_t eventfd_read(struct file *file, char __user *buf, size_t count,
loff_t *ppos)
{
diff --git a/include/linux/eventfd.h b/include/linux/eventfd.h
index 94dd103..85eac48 100644
--- a/include/linux/eventfd.h
+++ b/include/linux/eventfd.h
@@ -10,6 +10,7 @@
#include <linux/fcntl.h>
#include <linux/file.h>
+#include <linux/wait.h>
/*
* CAREFUL: Check include/asm-generic/fcntl.h when defining
@@ -34,6 +35,8 @@ struct file *eventfd_fget(int fd);
struct eventfd_ctx *eventfd_ctx_fdget(int fd);
struct eventfd_ctx *eventfd_ctx_fileget(struct file *file);
int eventfd_signal(struct eventfd_ctx *ctx, int n);
+int eventfd_ctx_remove_wait_queue(struct eventfd_ctx *ctx, wait_queue_t *wait,
+ __u64 *cnt);
#else /* CONFIG_EVENTFD */
@@ -61,6 +64,12 @@ static inline void eventfd_ctx_put(struct eventfd_ctx *ctx)
}
+static inline int eventfd_ctx_remove_wait_queue(struct eventfd_ctx *ctx,
+ wait_queue_t *wait, __u64 *cnt)
+{
+ return -ENOSYS;
+}
+
#endif
#endif /* _LINUX_EVENTFD_H */
--
1.6.6.144.g5c3af
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 16:26 [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove Michael S. Tsirkin
@ 2010-01-21 16:58 ` Davide Libenzi
2010-01-21 17:13 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Davide Libenzi @ 2010-01-21 16:58 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Avi Kivity, mtosatti, kvm, Linux Kernel Mailing List
On Thu, 21 Jan 2010, Michael S. Tsirkin wrote:
> This is a backport of commit: 03db343a6320f780937078433fa7d8da955e6fce
> modified in a way that introduces some code duplication on the one hand,
> but reduces the risk of regressing existing eventfd users on the other
> hand.
>
> KVM needs a wait to atomically remove themselves from the eventfd
> ->poll() wait queue head, in order to handle correctly their IRQfd
> deassign operation.
>
> This patch introduces such API, plus a way to read an eventfd from its
> context.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>
> Avi, Davidel, how about only including the following part for -stable
> then? Reason is, I still would like to be able to use irqfd there, and
> getting spurious interrupts 100% of times unmask is done isn't a very
> good idea IMO ...
It's the same thing. Unless there are *real* problems in KVM due to the
spurious ints, I still think this is .33 material.
- Davide
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 16:58 ` Davide Libenzi
@ 2010-01-21 17:13 ` Avi Kivity
2010-01-21 17:14 ` Avi Kivity
2010-01-21 17:23 ` Michael S. Tsirkin
0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2010-01-21 17:13 UTC (permalink / raw)
To: Davide Libenzi
Cc: Michael S. Tsirkin, mtosatti, kvm, Linux Kernel Mailing List
On 01/21/2010 06:58 PM, Davide Libenzi wrote:
> On Thu, 21 Jan 2010, Michael S. Tsirkin wrote:
>
>
>> This is a backport of commit: 03db343a6320f780937078433fa7d8da955e6fce
>> modified in a way that introduces some code duplication on the one hand,
>> but reduces the risk of regressing existing eventfd users on the other
>> hand.
>>
>> KVM needs a wait to atomically remove themselves from the eventfd
>> ->poll() wait queue head, in order to handle correctly their IRQfd
>> deassign operation.
>>
>> This patch introduces such API, plus a way to read an eventfd from its
>> context.
>>
>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>> ---
>>
>> Avi, Davidel, how about only including the following part for -stable
>> then? Reason is, I still would like to be able to use irqfd there, and
>> getting spurious interrupts 100% of times unmask is done isn't a very
>> good idea IMO ...
>>
> It's the same thing. Unless there are *real* problems in KVM due to the
> spurious ints, I still think this is .33 material.
>
I agree.
But I think we can solve this in another way in .32: we can clear the
eventfd from irqfd->inject work, which is in process context. The new
stuff is only needed for lockless clearing, no?
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 17:13 ` Avi Kivity
@ 2010-01-21 17:14 ` Avi Kivity
2010-01-21 17:23 ` Michael S. Tsirkin
1 sibling, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2010-01-21 17:14 UTC (permalink / raw)
To: Davide Libenzi
Cc: Michael S. Tsirkin, mtosatti, kvm, Linux Kernel Mailing List
On 01/21/2010 07:13 PM, Avi Kivity wrote:
>
> But I think we can solve this in another way in .32: we can clear the
> eventfd from irqfd->inject work, which is in process context. The new
> stuff is only needed for lockless clearing, no?
>
I meant atomic clearing, when we inject interrupts from the irqfd atomic
context.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 17:13 ` Avi Kivity
2010-01-21 17:14 ` Avi Kivity
@ 2010-01-21 17:23 ` Michael S. Tsirkin
2010-01-21 17:33 ` Avi Kivity
1 sibling, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2010-01-21 17:23 UTC (permalink / raw)
To: Avi Kivity; +Cc: Davide Libenzi, mtosatti, kvm, Linux Kernel Mailing List
On Thu, Jan 21, 2010 at 07:13:13PM +0200, Avi Kivity wrote:
> On 01/21/2010 06:58 PM, Davide Libenzi wrote:
>> On Thu, 21 Jan 2010, Michael S. Tsirkin wrote:
>>
>>
>>> This is a backport of commit: 03db343a6320f780937078433fa7d8da955e6fce
>>> modified in a way that introduces some code duplication on the one hand,
>>> but reduces the risk of regressing existing eventfd users on the other
>>> hand.
>>>
>>> KVM needs a wait to atomically remove themselves from the eventfd
>>> ->poll() wait queue head, in order to handle correctly their IRQfd
>>> deassign operation.
>>>
>>> This patch introduces such API, plus a way to read an eventfd from its
>>> context.
>>>
>>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>>> ---
>>>
>>> Avi, Davidel, how about only including the following part for -stable
>>> then? Reason is, I still would like to be able to use irqfd there, and
>>> getting spurious interrupts 100% of times unmask is done isn't a very
>>> good idea IMO ...
>>>
>> It's the same thing. Unless there are *real* problems in KVM due to the
>> spurious ints, I still think this is .33 material.
>>
>
> I agree.
>
> But I think we can solve this in another way in .32: we can clear the
> eventfd from irqfd->inject work, which is in process context. The new
> stuff is only needed for lockless clearing, no?
No, AFAIK there's no way to clear the counter from kernel without
this patch.
> --
> Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 17:33 ` Avi Kivity
@ 2010-01-21 17:32 ` Michael S. Tsirkin
2010-01-21 17:47 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2010-01-21 17:32 UTC (permalink / raw)
To: Avi Kivity; +Cc: Davide Libenzi, mtosatti, kvm, Linux Kernel Mailing List
On Thu, Jan 21, 2010 at 07:33:02PM +0200, Avi Kivity wrote:
> On 01/21/2010 07:23 PM, Michael S. Tsirkin wrote:
>>
>>> I agree.
>>>
>>> But I think we can solve this in another way in .32: we can clear the
>>> eventfd from irqfd->inject work, which is in process context. The new
>>> stuff is only needed for lockless clearing, no?
>>>
>> No, AFAIK there's no way to clear the counter from kernel without
>> this patch.
>>
>
> Can't you read from the file?
IMO no, the read could block.
> --
> Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 17:23 ` Michael S. Tsirkin
@ 2010-01-21 17:33 ` Avi Kivity
2010-01-21 17:32 ` Michael S. Tsirkin
0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-01-21 17:33 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Davide Libenzi, mtosatti, kvm, Linux Kernel Mailing List
On 01/21/2010 07:23 PM, Michael S. Tsirkin wrote:
>
>> I agree.
>>
>> But I think we can solve this in another way in .32: we can clear the
>> eventfd from irqfd->inject work, which is in process context. The new
>> stuff is only needed for lockless clearing, no?
>>
> No, AFAIK there's no way to clear the counter from kernel without
> this patch.
>
Can't you read from the file?
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 17:47 ` Avi Kivity
@ 2010-01-21 17:45 ` Michael S. Tsirkin
2010-01-21 17:56 ` Avi Kivity
2010-01-21 17:50 ` Davide Libenzi
1 sibling, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2010-01-21 17:45 UTC (permalink / raw)
To: Avi Kivity; +Cc: Davide Libenzi, mtosatti, kvm, Linux Kernel Mailing List
On Thu, Jan 21, 2010 at 07:47:40PM +0200, Avi Kivity wrote:
> On 01/21/2010 07:32 PM, Michael S. Tsirkin wrote:
>>
>>> Can't you read from the file?
>>>
>> IMO no, the read could block.
>>
>
> But you're in process context. An eventfd never blocks.
Yes it blocks if counter is 0. And we don't know
it's not 0 unless we read :) catch-22.
> --
> Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 17:32 ` Michael S. Tsirkin
@ 2010-01-21 17:47 ` Avi Kivity
2010-01-21 17:45 ` Michael S. Tsirkin
2010-01-21 17:50 ` Davide Libenzi
0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2010-01-21 17:47 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Davide Libenzi, mtosatti, kvm, Linux Kernel Mailing List
On 01/21/2010 07:32 PM, Michael S. Tsirkin wrote:
>
>> Can't you read from the file?
>>
> IMO no, the read could block.
>
But you're in process context. An eventfd never blocks.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 17:50 ` Davide Libenzi
@ 2010-01-21 17:50 ` Michael S. Tsirkin
0 siblings, 0 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2010-01-21 17:50 UTC (permalink / raw)
To: Davide Libenzi; +Cc: Avi Kivity, mtosatti, kvm, Linux Kernel Mailing List
On Thu, Jan 21, 2010 at 09:50:34AM -0800, Davide Libenzi wrote:
> On Thu, 21 Jan 2010, Avi Kivity wrote:
>
> > On 01/21/2010 07:32 PM, Michael S. Tsirkin wrote:
> > >
> > > > Can't you read from the file?
> > > >
> > > IMO no, the read could block.
> > >
> >
> > But you're in process context. An eventfd never blocks.
>
> Can you control the eventfd flags? Because if yes, O_NONBLOCK will never
> block.
>
Userspace can but kvm can't.
>
> - Davide
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 17:47 ` Avi Kivity
2010-01-21 17:45 ` Michael S. Tsirkin
@ 2010-01-21 17:50 ` Davide Libenzi
2010-01-21 17:50 ` Michael S. Tsirkin
1 sibling, 1 reply; 15+ messages in thread
From: Davide Libenzi @ 2010-01-21 17:50 UTC (permalink / raw)
To: Avi Kivity; +Cc: Michael S. Tsirkin, mtosatti, kvm, Linux Kernel Mailing List
On Thu, 21 Jan 2010, Avi Kivity wrote:
> On 01/21/2010 07:32 PM, Michael S. Tsirkin wrote:
> >
> > > Can't you read from the file?
> > >
> > IMO no, the read could block.
> >
>
> But you're in process context. An eventfd never blocks.
Can you control the eventfd flags? Because if yes, O_NONBLOCK will never
block.
- Davide
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 17:45 ` Michael S. Tsirkin
@ 2010-01-21 17:56 ` Avi Kivity
2010-01-21 17:57 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-01-21 17:56 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Davide Libenzi, mtosatti, kvm, Linux Kernel Mailing List
On 01/21/2010 07:45 PM, Michael S. Tsirkin wrote:
>
>> But you're in process context. An eventfd never blocks.
>>
> Yes it blocks if counter is 0. And we don't know
> it's not 0 unless we read :) catch-22.
>
Ah yes, I forgot.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 17:56 ` Avi Kivity
@ 2010-01-21 17:57 ` Avi Kivity
2010-01-21 18:02 ` Michael S. Tsirkin
0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-01-21 17:57 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Davide Libenzi, mtosatti, kvm, Linux Kernel Mailing List
On 01/21/2010 07:56 PM, Avi Kivity wrote:
> On 01/21/2010 07:45 PM, Michael S. Tsirkin wrote:
>>
>>> But you're in process context. An eventfd never blocks.
>> Yes it blocks if counter is 0. And we don't know
>> it's not 0 unless we read :) catch-22.
>
> Ah yes, I forgot.
>
Well, you can poll it and then read it... this introduces a new race (if
userspace does a read in parallel) but it's limited to kvm and buggy
userspace.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 17:57 ` Avi Kivity
@ 2010-01-21 18:02 ` Michael S. Tsirkin
2010-01-24 11:06 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2010-01-21 18:02 UTC (permalink / raw)
To: Avi Kivity; +Cc: Davide Libenzi, mtosatti, kvm, Linux Kernel Mailing List
On Thu, Jan 21, 2010 at 07:57:22PM +0200, Avi Kivity wrote:
> On 01/21/2010 07:56 PM, Avi Kivity wrote:
>> On 01/21/2010 07:45 PM, Michael S. Tsirkin wrote:
>>>
>>>> But you're in process context. An eventfd never blocks.
>>> Yes it blocks if counter is 0. And we don't know
>>> it's not 0 unless we read :) catch-22.
>>
>> Ah yes, I forgot.
>>
>
> Well, you can poll it and then read it... this introduces a new race (if
> userspace does a read in parallel) but it's limited to kvm and buggy
> userspace.
I would rather not require that userspace never reads this fd.
You are right that it does not now, but adding this as requirement
looks like exporting an implementation bug to userspace.
--
MST
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove
2010-01-21 18:02 ` Michael S. Tsirkin
@ 2010-01-24 11:06 ` Avi Kivity
0 siblings, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2010-01-24 11:06 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Davide Libenzi, mtosatti, kvm, Linux Kernel Mailing List
On 01/21/2010 08:02 PM, Michael S. Tsirkin wrote:
> On Thu, Jan 21, 2010 at 07:57:22PM +0200, Avi Kivity wrote:
>
>> On 01/21/2010 07:56 PM, Avi Kivity wrote:
>>
>>> On 01/21/2010 07:45 PM, Michael S. Tsirkin wrote:
>>>
>>>>
>>>>> But you're in process context. An eventfd never blocks.
>>>>>
>>>> Yes it blocks if counter is 0. And we don't know
>>>> it's not 0 unless we read :) catch-22.
>>>>
>>> Ah yes, I forgot.
>>>
>>>
>> Well, you can poll it and then read it... this introduces a new race (if
>> userspace does a read in parallel) but it's limited to kvm and buggy
>> userspace.
>>
> I would rather not require that userspace never reads this fd.
> You are right that it does not now, but adding this as requirement
> looks like exporting an implementation bug to userspace.
>
Well, I don't like risking 2.6.32 non-kvm users for a bug that doesn't
happen in practice now.
After it gets some time in 2.6.33, we can backport it to 2.6.32 (since
that will be maintained long term).
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2010-01-24 11:06 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-21 16:26 [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove Michael S. Tsirkin
2010-01-21 16:58 ` Davide Libenzi
2010-01-21 17:13 ` Avi Kivity
2010-01-21 17:14 ` Avi Kivity
2010-01-21 17:23 ` Michael S. Tsirkin
2010-01-21 17:33 ` Avi Kivity
2010-01-21 17:32 ` Michael S. Tsirkin
2010-01-21 17:47 ` Avi Kivity
2010-01-21 17:45 ` Michael S. Tsirkin
2010-01-21 17:56 ` Avi Kivity
2010-01-21 17:57 ` Avi Kivity
2010-01-21 18:02 ` Michael S. Tsirkin
2010-01-24 11:06 ` Avi Kivity
2010-01-21 17:50 ` Davide Libenzi
2010-01-21 17:50 ` Michael S. Tsirkin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox