* [PATCH] aio: fix io_setup/io_destroy race
@ 2012-03-07 5:16 Al Viro
2012-03-07 16:50 ` Benjamin LaHaise
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Al Viro @ 2012-03-07 5:16 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Benjamin LaHaise, linux-kernel
Have ioctx_alloc() return an extra reference, so that caller would drop it
on success and not bother with re-grabbing it on failure exit. The current
code is obviously broken - io_destroy() from another thread that managed
to guess the address io_setup() would've returned would free ioctx right
under us; gets especially interesting if aio_context_t * we pass to
io_setup() points to PROT_READ mapping, so put_user() fails and we end
up doing io_destroy() on kioctx another thread has just got freed...
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
[there's other pending aio stuff, but this part is really obvious and
self-contained]
diff --git a/fs/aio.c b/fs/aio.c
index 67e4b90..f6578cb 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -273,7 +273,7 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
mm = ctx->mm = current->mm;
atomic_inc(&mm->mm_count);
- atomic_set(&ctx->users, 1);
+ atomic_set(&ctx->users, 2);
spin_lock_init(&ctx->ctx_lock);
spin_lock_init(&ctx->ring_info.ring_lock);
init_waitqueue_head(&ctx->wait);
@@ -1338,10 +1338,10 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp)
ret = PTR_ERR(ioctx);
if (!IS_ERR(ioctx)) {
ret = put_user(ioctx->user_id, ctxp);
- if (!ret)
+ if (!ret) {
+ put_ioctx(ioctx);
return 0;
-
- get_ioctx(ioctx); /* io_destroy() expects us to hold a ref */
+ }
io_destroy(ioctx);
}
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] aio: fix io_setup/io_destroy race
2012-03-07 5:16 [PATCH] aio: fix io_setup/io_destroy race Al Viro
@ 2012-03-07 16:50 ` Benjamin LaHaise
2012-03-07 18:16 ` Jeff Moyer
2012-03-08 17:51 ` [PATCH] aio: fix the "too late munmap()" race Al Viro
2 siblings, 0 replies; 6+ messages in thread
From: Benjamin LaHaise @ 2012-03-07 16:50 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, linux-kernel
On Wed, Mar 07, 2012 at 05:16:35AM +0000, Al Viro wrote:
> Have ioctx_alloc() return an extra reference, so that caller would drop it
> on success and not bother with re-grabbing it on failure exit. The current
> code is obviously broken - io_destroy() from another thread that managed
> to guess the address io_setup() would've returned would free ioctx right
> under us; gets especially interesting if aio_context_t * we pass to
> io_setup() points to PROT_READ mapping, so put_user() fails and we end
> up doing io_destroy() on kioctx another thread has just got freed...
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
The fix looks good, and works with a quick test program checking the two cases
involved (ioctx create success + failure due to PROT_READ).
-ben
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] aio: fix io_setup/io_destroy race
2012-03-07 5:16 [PATCH] aio: fix io_setup/io_destroy race Al Viro
2012-03-07 16:50 ` Benjamin LaHaise
@ 2012-03-07 18:16 ` Jeff Moyer
2012-03-08 17:51 ` [PATCH] aio: fix the "too late munmap()" race Al Viro
2 siblings, 0 replies; 6+ messages in thread
From: Jeff Moyer @ 2012-03-07 18:16 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, Benjamin LaHaise, linux-kernel, linux-aio
Al Viro <viro@ZenIV.linux.org.uk> writes:
> Have ioctx_alloc() return an extra reference, so that caller would drop it
> on success and not bother with re-grabbing it on failure exit. The current
> code is obviously broken - io_destroy() from another thread that managed
> to guess the address io_setup() would've returned would free ioctx right
> under us; gets especially interesting if aio_context_t * we pass to
> io_setup() points to PROT_READ mapping, so put_user() fails and we end
> up doing io_destroy() on kioctx another thread has just got freed...
Al, you certainly are creative. ;-) I agree with the problem and the
fix. It would be nice, though if you had added comments.
I ran xfstests ./check -g aio, and there were no problems.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] aio: fix the "too late munmap()" race
2012-03-07 5:16 [PATCH] aio: fix io_setup/io_destroy race Al Viro
2012-03-07 16:50 ` Benjamin LaHaise
2012-03-07 18:16 ` Jeff Moyer
@ 2012-03-08 17:51 ` Al Viro
2012-03-08 18:15 ` Jeff Moyer
2012-03-10 2:13 ` Benjamin LaHaise
2 siblings, 2 replies; 6+ messages in thread
From: Al Viro @ 2012-03-08 17:51 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Benjamin LaHaise, linux-kernel
Current code has put_ioctx() called asynchronously from aio_fput_routine();
that's done *after* we have killed the request that used to pin ioctx,
so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
from progressing. As the result, we can end up with async call of
put_ioctx() being the last one and possibly happening during exit_mmap()
or elf_core_dump(), neither of which expects stray munmap() being done
to them...
We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
with that, but that's all we care about - neither io_destroy() nor
exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
does really_put_req(), so the ioctx teardown won't be done until then
and we don't care about the contents of ioctx past that point.
Since actual freeing of these suckers is RCU-delayed, we don't need to
bump ioctx refcount when request goes into list for async removal.
All we need is rcu_read_lock held just over the ->ctx_lock-protected
area in aio_fput_routine().
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
[on top of the previous patch, applies to anything past 2008 when freeing
via call_rcu() went in; earlier versions just need kmem_cache_free() done
via call_rcu() as commit abf137dd7712132ee56d5b3143c2ff61a72a5faa does,
except that we don't really need to delay playing with aio_nr - mainline
also will be better off by not bothering with that]
diff --git a/fs/aio.c b/fs/aio.c
index f6578cb..8cf3796 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -228,12 +228,6 @@ static void __put_ioctx(struct kioctx *ctx)
call_rcu(&ctx->rcu_head, ctx_rcu_free);
}
-static inline void get_ioctx(struct kioctx *kioctx)
-{
- BUG_ON(atomic_read(&kioctx->users) <= 0);
- atomic_inc(&kioctx->users);
-}
-
static inline int try_get_ioctx(struct kioctx *kioctx)
{
return atomic_inc_not_zero(&kioctx->users);
@@ -609,11 +603,16 @@ static void aio_fput_routine(struct work_struct *data)
fput(req->ki_filp);
/* Link the iocb into the context's free list */
+ rcu_read_lock();
spin_lock_irq(&ctx->ctx_lock);
really_put_req(ctx, req);
+ /*
+ * at that point ctx might've been killed, but actual
+ * freeing is RCU'd
+ */
spin_unlock_irq(&ctx->ctx_lock);
+ rcu_read_unlock();
- put_ioctx(ctx);
spin_lock_irq(&fput_lock);
}
spin_unlock_irq(&fput_lock);
@@ -644,7 +643,6 @@ static int __aio_put_req(struct kioctx *ctx, struct kiocb *req)
* this function will be executed w/out any aio kthread wakeup.
*/
if (unlikely(!fput_atomic(req->ki_filp))) {
- get_ioctx(ctx);
spin_lock(&fput_lock);
list_add(&req->ki_list, &fput_head);
spin_unlock(&fput_lock);
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] aio: fix the "too late munmap()" race
2012-03-08 17:51 ` [PATCH] aio: fix the "too late munmap()" race Al Viro
@ 2012-03-08 18:15 ` Jeff Moyer
2012-03-10 2:13 ` Benjamin LaHaise
1 sibling, 0 replies; 6+ messages in thread
From: Jeff Moyer @ 2012-03-08 18:15 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, Benjamin LaHaise, linux-kernel, linux-aio
Al Viro <viro@ZenIV.linux.org.uk> writes:
> Current code has put_ioctx() called asynchronously from aio_fput_routine();
> that's done *after* we have killed the request that used to pin ioctx,
> so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
> from progressing. As the result, we can end up with async call of
> put_ioctx() being the last one and possibly happening during exit_mmap()
> or elf_core_dump(), neither of which expects stray munmap() being done
> to them...
>
> We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
> with that, but that's all we care about - neither io_destroy() nor
> exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
> does really_put_req(), so the ioctx teardown won't be done until then
> and we don't care about the contents of ioctx past that point.
>
> Since actual freeing of these suckers is RCU-delayed, we don't need to
> bump ioctx refcount when request goes into list for async removal.
> All we need is rcu_read_lock held just over the ->ctx_lock-protected
> area in aio_fput_routine().
Looks good to me.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
> [on top of the previous patch, applies to anything past 2008 when freeing
> via call_rcu() went in; earlier versions just need kmem_cache_free() done
> via call_rcu() as commit abf137dd7712132ee56d5b3143c2ff61a72a5faa does,
> except that we don't really need to delay playing with aio_nr - mainline
> also will be better off by not bothering with that]
> diff --git a/fs/aio.c b/fs/aio.c
> index f6578cb..8cf3796 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -228,12 +228,6 @@ static void __put_ioctx(struct kioctx *ctx)
> call_rcu(&ctx->rcu_head, ctx_rcu_free);
> }
>
> -static inline void get_ioctx(struct kioctx *kioctx)
> -{
> - BUG_ON(atomic_read(&kioctx->users) <= 0);
> - atomic_inc(&kioctx->users);
> -}
> -
> static inline int try_get_ioctx(struct kioctx *kioctx)
> {
> return atomic_inc_not_zero(&kioctx->users);
> @@ -609,11 +603,16 @@ static void aio_fput_routine(struct work_struct *data)
> fput(req->ki_filp);
>
> /* Link the iocb into the context's free list */
> + rcu_read_lock();
> spin_lock_irq(&ctx->ctx_lock);
> really_put_req(ctx, req);
> + /*
> + * at that point ctx might've been killed, but actual
> + * freeing is RCU'd
> + */
> spin_unlock_irq(&ctx->ctx_lock);
> + rcu_read_unlock();
>
> - put_ioctx(ctx);
> spin_lock_irq(&fput_lock);
> }
> spin_unlock_irq(&fput_lock);
> @@ -644,7 +643,6 @@ static int __aio_put_req(struct kioctx *ctx, struct kiocb *req)
> * this function will be executed w/out any aio kthread wakeup.
> */
> if (unlikely(!fput_atomic(req->ki_filp))) {
> - get_ioctx(ctx);
> spin_lock(&fput_lock);
> list_add(&req->ki_list, &fput_head);
> spin_unlock(&fput_lock);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] aio: fix the "too late munmap()" race
2012-03-08 17:51 ` [PATCH] aio: fix the "too late munmap()" race Al Viro
2012-03-08 18:15 ` Jeff Moyer
@ 2012-03-10 2:13 ` Benjamin LaHaise
1 sibling, 0 replies; 6+ messages in thread
From: Benjamin LaHaise @ 2012-03-10 2:13 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, linux-kernel
On Thu, Mar 08, 2012 at 05:51:19PM +0000, Al Viro wrote:
> Since actual freeing of these suckers is RCU-delayed, we don't need to
> bump ioctx refcount when request goes into list for async removal.
> All we need is rcu_read_lock held just over the ->ctx_lock-protected
> area in aio_fput_routine().
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Looks good. As with the previous patch, I instrumented and tested the code
to trigger fput_work() with no issues encountered.
-ben
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2012-03-10 2:13 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-07 5:16 [PATCH] aio: fix io_setup/io_destroy race Al Viro
2012-03-07 16:50 ` Benjamin LaHaise
2012-03-07 18:16 ` Jeff Moyer
2012-03-08 17:51 ` [PATCH] aio: fix the "too late munmap()" race Al Viro
2012-03-08 18:15 ` Jeff Moyer
2012-03-10 2:13 ` Benjamin LaHaise
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox