All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Roberson <jroberson@chesapeake.net>
To: linux-kernel@vger.kernel.org
Cc: riel@redhat.com
Subject: [PATCH] eventfd signal race in aio_complete()
Date: Fri, 7 Mar 2008 16:32:38 -1000 (HST)	[thread overview]
Message-ID: <20080307161854.E920@desktop> (raw)

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1440 bytes --]

Hello,

I have an application that makes use of eventfd to merge socket and aio 
blocking with epoll in one thread.  Under heavy loads the application 
sometimes hangs when we receive notification from epoll that the eventfd 
has an event ready but reading the aio completions produces no results. 
Further investigation revealed that the aiocb was later ready with no 
new event and completing it based on a timer resolved the application 
hang.

This pointed to the eventfd being signaled prematurely and I verified that 
this was indeed the problem.  aio_complete() calls eventfd_signal() before 
the event is actually placed on the completion ring.  On a multi-processor 
system it is possible to read the event from epoll and return to userspace 
before aio_complete() finishes.

The enclosed patch simply moves the signaling to the bottom of the 
function.  I'm not 100% familiar with this code and it looks like it may 
be possible to have spurious wakeups now but there will be no missed 
wakeups.  An application may also race the other way now and receive aio 
completion before the signal, thus still leaving it with a signal with no 
completion.  signaling while the kioctx is locked would resolve this but I 
was hesitant to introduce further nesting of spinlocks that might have 
another order elsewhere.

Please keep me in the cc line for any necessary replies.

Thanks,
Jeff

Signed-off-by:  Jeff Roberson <jeff@freebsd.org>

[-- Attachment #2: Type: TEXT/x-diff, Size: 892 bytes --]

--- aio.c.orig	2008-03-08 00:23:50.000000000 +0000
+++ aio.c	2008-03-08 00:24:32.000000000 +0000
@@ -946,14 +946,6 @@ int fastcall aio_complete(struct kiocb *
 		return 1;
 	}
 
-	/*
-	 * Check if the user asked us to deliver the result through an
-	 * eventfd. The eventfd_signal() function is safe to be called
-	 * from IRQ context.
-	 */
-	if (!IS_ERR(iocb->ki_eventfd))
-		eventfd_signal(iocb->ki_eventfd, 1);
-
 	info = &ctx->ring_info;
 
 	/* add a completion event to the ring buffer.
@@ -1010,6 +1002,15 @@ put_rq:
 		wake_up(&ctx->wait);
 
 	spin_unlock_irqrestore(&ctx->ctx_lock, flags);
+
+	/*
+	 * Check if the user asked us to deliver the result through an
+	 * eventfd. The eventfd_signal() function is safe to be called
+	 * from IRQ context.
+	 */
+	if (!IS_ERR(iocb->ki_eventfd))
+		eventfd_signal(iocb->ki_eventfd, 1);
+
 	return ret;
 }
 

             reply	other threads:[~2008-03-08  3:03 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-08  2:32 Jeff Roberson [this message]
2008-03-08  4:29 ` [PATCH] eventfd signal race in aio_complete() Davide Libenzi
2008-03-08 15:23   ` Rik van Riel
2008-03-08 20:38     ` Davide Libenzi
2008-03-08 21:38       ` Jeff Roberson
2008-03-08 21:55         ` Davide Libenzi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080307161854.E920@desktop \
    --to=jroberson@chesapeake.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.