public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] aio: fix io_setup/io_destroy race
@ 2012-03-07  5:16 Al Viro
  2012-03-07 16:50 ` Benjamin LaHaise
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Al Viro @ 2012-03-07  5:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Benjamin LaHaise, linux-kernel

Have ioctx_alloc() return an extra reference, so that caller would drop it
on success and not bother with re-grabbing it on failure exit.  The current
code is obviously broken - io_destroy() from another thread that managed
to guess the address io_setup() would've returned would free ioctx right
under us; gets especially interesting if aio_context_t * we pass to
io_setup() points to PROT_READ mapping, so put_user() fails and we end
up doing io_destroy() on kioctx another thread has just got freed...

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
[there's other pending aio stuff, but this part is really obvious and
self-contained]

diff --git a/fs/aio.c b/fs/aio.c
index 67e4b90..f6578cb 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -273,7 +273,7 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
 	mm = ctx->mm = current->mm;
 	atomic_inc(&mm->mm_count);
 
-	atomic_set(&ctx->users, 1);
+	atomic_set(&ctx->users, 2);
 	spin_lock_init(&ctx->ctx_lock);
 	spin_lock_init(&ctx->ring_info.ring_lock);
 	init_waitqueue_head(&ctx->wait);
@@ -1338,10 +1338,10 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp)
 	ret = PTR_ERR(ioctx);
 	if (!IS_ERR(ioctx)) {
 		ret = put_user(ioctx->user_id, ctxp);
-		if (!ret)
+		if (!ret) {
+			put_ioctx(ioctx);
 			return 0;
-
-		get_ioctx(ioctx); /* io_destroy() expects us to hold a ref */
+		}
 		io_destroy(ioctx);
 	}
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] aio: fix io_setup/io_destroy race
  2012-03-07  5:16 [PATCH] aio: fix io_setup/io_destroy race Al Viro
@ 2012-03-07 16:50 ` Benjamin LaHaise
  2012-03-07 18:16 ` Jeff Moyer
  2012-03-08 17:51 ` [PATCH] aio: fix the "too late munmap()" race Al Viro
  2 siblings, 0 replies; 6+ messages in thread
From: Benjamin LaHaise @ 2012-03-07 16:50 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, linux-kernel

On Wed, Mar 07, 2012 at 05:16:35AM +0000, Al Viro wrote:
> Have ioctx_alloc() return an extra reference, so that caller would drop it
> on success and not bother with re-grabbing it on failure exit.  The current
> code is obviously broken - io_destroy() from another thread that managed
> to guess the address io_setup() would've returned would free ioctx right
> under us; gets especially interesting if aio_context_t * we pass to
> io_setup() points to PROT_READ mapping, so put_user() fails and we end
> up doing io_destroy() on kioctx another thread has just got freed...
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>

The fix looks good, and works with a quick test program checking the two cases 
involved (ioctx create success + failure due to PROT_READ).

		-ben

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] aio: fix io_setup/io_destroy race
  2012-03-07  5:16 [PATCH] aio: fix io_setup/io_destroy race Al Viro
  2012-03-07 16:50 ` Benjamin LaHaise
@ 2012-03-07 18:16 ` Jeff Moyer
  2012-03-08 17:51 ` [PATCH] aio: fix the "too late munmap()" race Al Viro
  2 siblings, 0 replies; 6+ messages in thread
From: Jeff Moyer @ 2012-03-07 18:16 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, Benjamin LaHaise, linux-kernel, linux-aio

Al Viro <viro@ZenIV.linux.org.uk> writes:

> Have ioctx_alloc() return an extra reference, so that caller would drop it
> on success and not bother with re-grabbing it on failure exit.  The current
> code is obviously broken - io_destroy() from another thread that managed
> to guess the address io_setup() would've returned would free ioctx right
> under us; gets especially interesting if aio_context_t * we pass to
> io_setup() points to PROT_READ mapping, so put_user() fails and we end
> up doing io_destroy() on kioctx another thread has just got freed...

Al, you certainly are creative.  ;-) I agree with the problem and the
fix.  It would be nice, though if you had added comments.

I ran xfstests ./check -g aio, and there were no problems.

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] aio: fix the "too late munmap()" race
  2012-03-07  5:16 [PATCH] aio: fix io_setup/io_destroy race Al Viro
  2012-03-07 16:50 ` Benjamin LaHaise
  2012-03-07 18:16 ` Jeff Moyer
@ 2012-03-08 17:51 ` Al Viro
  2012-03-08 18:15   ` Jeff Moyer
  2012-03-10  2:13   ` Benjamin LaHaise
  2 siblings, 2 replies; 6+ messages in thread
From: Al Viro @ 2012-03-08 17:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Benjamin LaHaise, linux-kernel

Current code has put_ioctx() called asynchronously from aio_fput_routine();
that's done *after* we have killed the request that used to pin ioctx,
so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
from progressing.  As the result, we can end up with async call of
put_ioctx() being the last one and possibly happening during exit_mmap()
or elf_core_dump(), neither of which expects stray munmap() being done
to them...

We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
with that, but that's all we care about - neither io_destroy() nor
exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
does really_put_req(), so the ioctx teardown won't be done until then
and we don't care about the contents of ioctx past that point.

Since actual freeing of these suckers is RCU-delayed, we don't need to
bump ioctx refcount when request goes into list for async removal.
All we need is rcu_read_lock held just over the ->ctx_lock-protected
area in aio_fput_routine().

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
[on top of the previous patch, applies to anything past 2008 when freeing
via call_rcu() went in; earlier versions just need kmem_cache_free() done
via call_rcu() as commit abf137dd7712132ee56d5b3143c2ff61a72a5faa does,
except that we don't really need to delay playing with aio_nr - mainline
also will be better off by not bothering with that]
diff --git a/fs/aio.c b/fs/aio.c
index f6578cb..8cf3796 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -228,12 +228,6 @@ static void __put_ioctx(struct kioctx *ctx)
 	call_rcu(&ctx->rcu_head, ctx_rcu_free);
 }
 
-static inline void get_ioctx(struct kioctx *kioctx)
-{
-	BUG_ON(atomic_read(&kioctx->users) <= 0);
-	atomic_inc(&kioctx->users);
-}
-
 static inline int try_get_ioctx(struct kioctx *kioctx)
 {
 	return atomic_inc_not_zero(&kioctx->users);
@@ -609,11 +603,16 @@ static void aio_fput_routine(struct work_struct *data)
 			fput(req->ki_filp);
 
 		/* Link the iocb into the context's free list */
+		rcu_read_lock();
 		spin_lock_irq(&ctx->ctx_lock);
 		really_put_req(ctx, req);
+		/*
+		 * at that point ctx might've been killed, but actual
+		 * freeing is RCU'd
+		 */
 		spin_unlock_irq(&ctx->ctx_lock);
+		rcu_read_unlock();
 
-		put_ioctx(ctx);
 		spin_lock_irq(&fput_lock);
 	}
 	spin_unlock_irq(&fput_lock);
@@ -644,7 +643,6 @@ static int __aio_put_req(struct kioctx *ctx, struct kiocb *req)
 	 * this function will be executed w/out any aio kthread wakeup.
 	 */
 	if (unlikely(!fput_atomic(req->ki_filp))) {
-		get_ioctx(ctx);
 		spin_lock(&fput_lock);
 		list_add(&req->ki_list, &fput_head);
 		spin_unlock(&fput_lock);

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] aio: fix the "too late munmap()" race
  2012-03-08 17:51 ` [PATCH] aio: fix the "too late munmap()" race Al Viro
@ 2012-03-08 18:15   ` Jeff Moyer
  2012-03-10  2:13   ` Benjamin LaHaise
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff Moyer @ 2012-03-08 18:15 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, Benjamin LaHaise, linux-kernel, linux-aio

Al Viro <viro@ZenIV.linux.org.uk> writes:

> Current code has put_ioctx() called asynchronously from aio_fput_routine();
> that's done *after* we have killed the request that used to pin ioctx,
> so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
> from progressing.  As the result, we can end up with async call of
> put_ioctx() being the last one and possibly happening during exit_mmap()
> or elf_core_dump(), neither of which expects stray munmap() being done
> to them...
>
> We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
> with that, but that's all we care about - neither io_destroy() nor
> exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
> does really_put_req(), so the ioctx teardown won't be done until then
> and we don't care about the contents of ioctx past that point.
>
> Since actual freeing of these suckers is RCU-delayed, we don't need to
> bump ioctx refcount when request goes into list for async removal.
> All we need is rcu_read_lock held just over the ->ctx_lock-protected
> area in aio_fput_routine().

Looks good to me.

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>

>
> Cc: stable@vger.kernel.org
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
> [on top of the previous patch, applies to anything past 2008 when freeing
> via call_rcu() went in; earlier versions just need kmem_cache_free() done
> via call_rcu() as commit abf137dd7712132ee56d5b3143c2ff61a72a5faa does,
> except that we don't really need to delay playing with aio_nr - mainline
> also will be better off by not bothering with that]
> diff --git a/fs/aio.c b/fs/aio.c
> index f6578cb..8cf3796 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -228,12 +228,6 @@ static void __put_ioctx(struct kioctx *ctx)
>  	call_rcu(&ctx->rcu_head, ctx_rcu_free);
>  }
>  
> -static inline void get_ioctx(struct kioctx *kioctx)
> -{
> -	BUG_ON(atomic_read(&kioctx->users) <= 0);
> -	atomic_inc(&kioctx->users);
> -}
> -
>  static inline int try_get_ioctx(struct kioctx *kioctx)
>  {
>  	return atomic_inc_not_zero(&kioctx->users);
> @@ -609,11 +603,16 @@ static void aio_fput_routine(struct work_struct *data)
>  			fput(req->ki_filp);
>  
>  		/* Link the iocb into the context's free list */
> +		rcu_read_lock();
>  		spin_lock_irq(&ctx->ctx_lock);
>  		really_put_req(ctx, req);
> +		/*
> +		 * at that point ctx might've been killed, but actual
> +		 * freeing is RCU'd
> +		 */
>  		spin_unlock_irq(&ctx->ctx_lock);
> +		rcu_read_unlock();
>  
> -		put_ioctx(ctx);
>  		spin_lock_irq(&fput_lock);
>  	}
>  	spin_unlock_irq(&fput_lock);
> @@ -644,7 +643,6 @@ static int __aio_put_req(struct kioctx *ctx, struct kiocb *req)
>  	 * this function will be executed w/out any aio kthread wakeup.
>  	 */
>  	if (unlikely(!fput_atomic(req->ki_filp))) {
> -		get_ioctx(ctx);
>  		spin_lock(&fput_lock);
>  		list_add(&req->ki_list, &fput_head);
>  		spin_unlock(&fput_lock);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] aio: fix the "too late munmap()" race
  2012-03-08 17:51 ` [PATCH] aio: fix the "too late munmap()" race Al Viro
  2012-03-08 18:15   ` Jeff Moyer
@ 2012-03-10  2:13   ` Benjamin LaHaise
  1 sibling, 0 replies; 6+ messages in thread
From: Benjamin LaHaise @ 2012-03-10  2:13 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, linux-kernel

On Thu, Mar 08, 2012 at 05:51:19PM +0000, Al Viro wrote:
> Since actual freeing of these suckers is RCU-delayed, we don't need to
> bump ioctx refcount when request goes into list for async removal.
> All we need is rcu_read_lock held just over the ->ctx_lock-protected
> area in aio_fput_routine().
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>

Looks good.  As with the previous patch, I instrumented and tested the code 
to trigger fput_work() with no issues encountered.

		-ben

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-03-10  2:13 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-07  5:16 [PATCH] aio: fix io_setup/io_destroy race Al Viro
2012-03-07 16:50 ` Benjamin LaHaise
2012-03-07 18:16 ` Jeff Moyer
2012-03-08 17:51 ` [PATCH] aio: fix the "too late munmap()" race Al Viro
2012-03-08 18:15   ` Jeff Moyer
2012-03-10  2:13   ` Benjamin LaHaise

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox