* Re: next-20130117 - kernel BUG with aio
[not found] ` <20130124211850.GH26407@google.com>
@ 2013-01-24 21:27 ` Andrew Morton
2013-01-24 21:39 ` Kent Overstreet
2013-01-24 22:13 ` Kent Overstreet
2013-01-24 21:43 ` [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio Kent Overstreet
` (2 subsequent siblings)
3 siblings, 2 replies; 14+ messages in thread
From: Andrew Morton @ 2013-01-24 21:27 UTC (permalink / raw)
To: Kent Overstreet
Cc: Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise, linux-kernel,
linux-aio, zab, linux-fsdevel
On Thu, 24 Jan 2013 13:18:50 -0800
Kent Overstreet <koverstreet@google.com> wrote:
> So, Andrew - that "smoosh struct kiocb" patch should just be dropped,
> even if I fixed that issue clearly the idea is a lot less safe than I
> thought. I've got patches for the other stuff I'm going to mail out
> momentarily.
Dropped. Do you expect that this will fix everything?
Please cc linux-fsdevel on the aio stuff - I'm seeing some AIO things
over there which aren't cc'ed to lkml or linux-aio.
Please also take a look at Jan's recent
http://www.spinics.net/lists/linux-fsdevel/msg61738.html and have a
think about how this plays with your patchset.
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: next-20130117 - kernel BUG with aio
2013-01-24 21:27 ` next-20130117 - kernel BUG with aio Andrew Morton
@ 2013-01-24 21:39 ` Kent Overstreet
2013-01-24 22:25 ` Zach Brown
2013-01-24 22:13 ` Kent Overstreet
1 sibling, 1 reply; 14+ messages in thread
From: Kent Overstreet @ 2013-01-24 21:39 UTC (permalink / raw)
To: Andrew Morton
Cc: Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise, linux-kernel,
linux-aio, zab, linux-fsdevel
On Thu, Jan 24, 2013 at 01:27:59PM -0800, Andrew Morton wrote:
> On Thu, 24 Jan 2013 13:18:50 -0800
> Kent Overstreet <koverstreet@google.com> wrote:
>
> > So, Andrew - that "smoosh struct kiocb" patch should just be dropped,
> > even if I fixed that issue clearly the idea is a lot less safe than I
> > thought. I've got patches for the other stuff I'm going to mail out
> > momentarily.
>
> Dropped. Do you expect that this will fix everything?
No, I didn't see that bug until after I'd fixed the other three, but as
far as I can tell everything's fixed with the patches I'm about to mail
out - my test VM has been running for the past two days without errors,
it's kill -9'ing a process that's got iocbs in flight to a loopback
device every two seconds.
> Please cc linux-fsdevel on the aio stuff - I'm seeing some AIO things
> over there which aren't cc'ed to lkml or linux-aio.
Will do
> Please also take a look at Jan's recent
> http://www.spinics.net/lists/linux-fsdevel/msg61738.html and have a
> think about how this plays with your patchset.
Oh fun, I've chased a bug or two in that ext4 code, that stuff's
scary... I'm sure it'll take me a bit to wrap my head around what's
going on there but I'm looking at it now.
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: next-20130117 - kernel BUG with aio
2013-01-24 21:39 ` Kent Overstreet
@ 2013-01-24 22:25 ` Zach Brown
2013-01-24 22:47 ` Jeff Moyer
2013-01-24 23:03 ` Kent Overstreet
0 siblings, 2 replies; 14+ messages in thread
From: Zach Brown @ 2013-01-24 22:25 UTC (permalink / raw)
To: Kent Overstreet
Cc: Andrew Morton, Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise,
linux-kernel, linux-aio, linux-fsdevel, Jeff Moyer
> No, I didn't see that bug until after I'd fixed the other three, but as
> far as I can tell everything's fixed with the patches I'm about to mail
> out - my test VM has been running for the past two days without errors,
> it's kill -9'ing a process that's got iocbs in flight to a loopback
> device every two seconds.
I'm really worried that this patch series hasn't seen significant enough
testing to justify being queued.
I'll be first in line for blame for not finding the time to finish my
review of the series.
What specific tests has this gone through? The aio tests in xfstests /
ltp? (And as you discovered while chasing this bug, whatever platform
you were running on doesn't seem slow enough to catch some paths.. run
all the tests over loop?)
Jeff, can you suggest a more modern testing regime for the aio core?
It's been so long since I had to hammer on this stuff..
- z
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: next-20130117 - kernel BUG with aio
2013-01-24 22:25 ` Zach Brown
@ 2013-01-24 22:47 ` Jeff Moyer
2013-01-24 23:03 ` Kent Overstreet
1 sibling, 0 replies; 14+ messages in thread
From: Jeff Moyer @ 2013-01-24 22:47 UTC (permalink / raw)
To: Zach Brown
Cc: Kent Overstreet, Andrew Morton, Valdis.Kletnieks, Hillf Danton,
Benjamin LaHaise, linux-kernel, linux-aio, linux-fsdevel
Zach Brown <zab@redhat.com> writes:
>> No, I didn't see that bug until after I'd fixed the other three, but as
>> far as I can tell everything's fixed with the patches I'm about to mail
>> out - my test VM has been running for the past two days without errors,
>> it's kill -9'ing a process that's got iocbs in flight to a loopback
>> device every two seconds.
>
> I'm really worried that this patch series hasn't seen significant enough
> testing to justify being queued.
>
> I'll be first in line for blame for not finding the time to finish my
> review of the series.
>
> What specific tests has this gone through? The aio tests in xfstests /
> ltp? (And as you discovered while chasing this bug, whatever platform
> you were running on doesn't seem slow enough to catch some paths.. run
> all the tests over loop?)
>
> Jeff, can you suggest a more modern testing regime for the aio core?
> It's been so long since I had to hammer on this stuff..
Modern? No. ;-) I usually use xfstests (all of them, not just the aio
group), the libaio test harness, and then hand it off to our performance
team to stress the code under benchmarking workloads. Oh, and usually
targeted testing for the thing that I'm working on.
I'll put a couple of kernels together to hand off to our performance
team, though I don't know how much time they have at present.
Cheers,
Jeff
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: next-20130117 - kernel BUG with aio
2013-01-24 22:25 ` Zach Brown
2013-01-24 22:47 ` Jeff Moyer
@ 2013-01-24 23:03 ` Kent Overstreet
1 sibling, 0 replies; 14+ messages in thread
From: Kent Overstreet @ 2013-01-24 23:03 UTC (permalink / raw)
To: Zach Brown
Cc: Andrew Morton, Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise,
linux-kernel, linux-aio, linux-fsdevel, Jeff Moyer
On Thu, Jan 24, 2013 at 02:25:37PM -0800, Zach Brown wrote:
> > No, I didn't see that bug until after I'd fixed the other three, but as
> > far as I can tell everything's fixed with the patches I'm about to mail
> > out - my test VM has been running for the past two days without errors,
> > it's kill -9'ing a process that's got iocbs in flight to a loopback
> > device every two seconds.
>
> I'm really worried that this patch series hasn't seen significant enough
> testing to justify being queued.
>
> I'll be first in line for blame for not finding the time to finish my
> review of the series.
>
> What specific tests has this gone through? The aio tests in xfstests /
> ltp? (And as you discovered while chasing this bug, whatever platform
> you were running on doesn't seem slow enough to catch some paths.. run
> all the tests over loop?)
I have run xfstests on real hardware - though that didn't catch this
either. These patches are all queued up for our internal tree (I need to
bug Ted to review them... and add these latest fixes). And aio is used
heavily enough internally that if I screwed anything up, I'll be on the
hook...
> Jeff, can you suggest a more modern testing regime for the aio core?
> It's been so long since I had to hammer on this stuff..
I'm wondering how much it'd buy us to rig up fault injection (I've got
some awesome dynamic fault injection code I need to push upstream...).
I'll try and test with that in the next day or so, though.
There's some people here that'd been working on code coverage tooling
too, I should probably learn how to work that stuff.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: next-20130117 - kernel BUG with aio
2013-01-24 21:27 ` next-20130117 - kernel BUG with aio Andrew Morton
2013-01-24 21:39 ` Kent Overstreet
@ 2013-01-24 22:13 ` Kent Overstreet
2013-01-29 13:41 ` Jan Kara
1 sibling, 1 reply; 14+ messages in thread
From: Kent Overstreet @ 2013-01-24 22:13 UTC (permalink / raw)
To: Andrew Morton, jack, tytso
Cc: Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise, linux-kernel,
linux-aio, zab, linux-fsdevel
On Thu, Jan 24, 2013 at 01:27:59PM -0800, Andrew Morton wrote:
> Please also take a look at Jan's recent
> http://www.spinics.net/lists/linux-fsdevel/msg61738.html and have a
> think about how this plays with your patchset.
I can't think of any possible interactions - none of my aio stuff messes
with the way the fput() happens; the aio code does call fput() when the
kiocb is freed and my patches do touch _that_ code but none of the
behaviour there changes.
Might be worth documenting this though, I can't think of any reason it'd
be obvious looking at the aio code that the fput() has to happen after
aio_complete(). As with the bugs I just sent you patches for it's not
terribly clear who owns what in the kiocb when.
Reading those patches though - the main change is to call
inode_dio_done() before calling aio_complete(). All inode_dio_done()
does though is issue a wakeup - to whatever called inode_dio_wait().
That means whatever called inode_dio_wait() needs its own ref on the
inode, and from a cursory glance at the code it is _not_ at all clear to
me that's the case - if inode_dio_wait() is merely finishing things for
that specific IO that need to be done in process context, I can easily
imagine it not being the case.
Assuming whatever does call inode_dio_wait() does have its own ref,
there was only a real use after free when nothing was waiting on the
inode.
Similarly for the ext4 code to write unwritten extents - and having seen
and helped chase a bug in that code before, that code _definitely_ needs
auditing.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: next-20130117 - kernel BUG with aio
2013-01-24 22:13 ` Kent Overstreet
@ 2013-01-29 13:41 ` Jan Kara
0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2013-01-29 13:41 UTC (permalink / raw)
To: Kent Overstreet
Cc: Andrew Morton, jack, tytso, Valdis.Kletnieks, Hillf Danton,
Benjamin LaHaise, linux-kernel, linux-aio, zab, linux-fsdevel
On Thu 24-01-13 14:13:52, Kent Overstreet wrote:
> On Thu, Jan 24, 2013 at 01:27:59PM -0800, Andrew Morton wrote:
> > Please also take a look at Jan's recent
> > http://www.spinics.net/lists/linux-fsdevel/msg61738.html and have a
> > think about how this plays with your patchset.
>
> I can't think of any possible interactions - none of my aio stuff messes
> with the way the fput() happens; the aio code does call fput() when the
> kiocb is freed and my patches do touch _that_ code but none of the
> behaviour there changes.
>
> Might be worth documenting this though, I can't think of any reason it'd
> be obvious looking at the aio code that the fput() has to happen after
> aio_complete(). As with the bugs I just sent you patches for it's not
> terribly clear who owns what in the kiocb when.
>
> Reading those patches though - the main change is to call
> inode_dio_done() before calling aio_complete(). All inode_dio_done()
> does though is issue a wakeup - to whatever called inode_dio_wait().
inode_dio_done() does a decrement and wakeup.
> That means whatever called inode_dio_wait() needs its own ref on the
> inode, and from a cursory glance at the code it is _not_ at all clear to
> me that's the case - if inode_dio_wait() is merely finishing things for
> that specific IO that need to be done in process context, I can easily
> imagine it not being the case.
>
> Assuming whatever does call inode_dio_wait() does have its own ref,
> there was only a real use after free when nothing was waiting on the
> inode.
Well, but there doesn't have to be any waiter... If there is, it had
better have it's own ref, that's for sure.
> Similarly for the ext4 code to write unwritten extents - and having seen
> and helped chase a bug in that code before, that code _definitely_ needs
> auditing.
Agreed. That code is a mess. I'm cleaning up some of it but it's not
easy.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio
[not found] ` <20130124211850.GH26407@google.com>
2013-01-24 21:27 ` next-20130117 - kernel BUG with aio Andrew Morton
@ 2013-01-24 21:43 ` Kent Overstreet
2013-01-25 13:15 ` Hillf Danton
2013-01-24 21:43 ` [PATCH 2/3] aio-kill-ki_retry-fix-fix Kent Overstreet
2013-01-24 21:43 ` [PATCH 3/3] aio-use-cancellation-list-lazily-fix Kent Overstreet
3 siblings, 1 reply; 14+ messages in thread
From: Kent Overstreet @ 2013-01-24 21:43 UTC (permalink / raw)
To: akpm
Cc: Kent Overstreet, Valdis.Kletnieks, dhillf, bcrl, zab,
linux-kernel, linux-aio, linux-fsdevel
The batch completion code was trying to be a bit too clever, and skip
checking ctx where it couldn't be NULL - but that broke if a kiocb had
been cancelled. Move the check to kioctx_ring_unlock().
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
---
fs/aio.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 62573d3..0d2f39d 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -683,6 +683,9 @@ static inline void kioctx_ring_unlock(struct kioctx *ctx, unsigned tail)
{
struct aio_ring *ring;
+ if (!ctx)
+ return;
+
smp_wmb();
/* make event visible before updating tail */
@@ -760,8 +763,7 @@ void batch_complete_aio(struct batch_complete *batch)
}
if (unlikely(req->ki_ctx != ctx)) {
- if (ctx)
- kioctx_ring_unlock(ctx, tail);
+ kioctx_ring_unlock(ctx, tail);
ctx = req->ki_ctx;
tail = kioctx_ring_lock(ctx);
--
1.7.12
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio
2013-01-24 21:43 ` [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio Kent Overstreet
@ 2013-01-25 13:15 ` Hillf Danton
0 siblings, 0 replies; 14+ messages in thread
From: Hillf Danton @ 2013-01-25 13:15 UTC (permalink / raw)
To: Kent Overstreet
Cc: akpm, Valdis.Kletnieks, bcrl, zab, linux-kernel, linux-aio,
linux-fsdevel
On Fri, Jan 25, 2013 at 5:43 AM, Kent Overstreet <koverstreet@google.com> wrote:
> The batch completion code was trying to be a bit too clever, and skip
> checking ctx where it couldn't be NULL - but that broke if a kiocb had
> been cancelled. Move the check to kioctx_ring_unlock().
>
> Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
> Signed-off-by: Kent Overstreet <koverstreet@google.com>
> ---
Acked-by: Hillf Danton <dhillf@gmail.com>
> fs/aio.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 62573d3..0d2f39d 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -683,6 +683,9 @@ static inline void kioctx_ring_unlock(struct kioctx *ctx, unsigned tail)
> {
> struct aio_ring *ring;
>
> + if (!ctx)
> + return;
> +
> smp_wmb();
> /* make event visible before updating tail */
>
> @@ -760,8 +763,7 @@ void batch_complete_aio(struct batch_complete *batch)
> }
>
> if (unlikely(req->ki_ctx != ctx)) {
> - if (ctx)
> - kioctx_ring_unlock(ctx, tail);
> + kioctx_ring_unlock(ctx, tail);
>
> ctx = req->ki_ctx;
> tail = kioctx_ring_lock(ctx);
> --
> 1.7.12
>
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/3] aio-kill-ki_retry-fix-fix
[not found] ` <20130124211850.GH26407@google.com>
2013-01-24 21:27 ` next-20130117 - kernel BUG with aio Andrew Morton
2013-01-24 21:43 ` [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio Kent Overstreet
@ 2013-01-24 21:43 ` Kent Overstreet
2013-01-24 21:43 ` [PATCH 3/3] aio-use-cancellation-list-lazily-fix Kent Overstreet
3 siblings, 0 replies; 14+ messages in thread
From: Kent Overstreet @ 2013-01-24 21:43 UTC (permalink / raw)
To: akpm
Cc: Kent Overstreet, Valdis.Kletnieks, dhillf, bcrl, zab,
linux-kernel, linux-aio, linux-fsdevel
The "aio: kill ki-retry" patch was assuming that we didn't touch struct
kiocb after passing it off to something that would call aio_complete() -
which was wrong. So, revert the refcounting changes.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
---
fs/aio.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 0d2f39d..85d1b38 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -592,7 +592,7 @@ static inline struct kiocb *aio_get_req(struct kioctx *ctx)
memset(req, 0, offsetof(struct kiocb, ki_ctx));
req->ki_ctx = ctx;
- atomic_set(&req->ki_users, 1);
+ atomic_set(&req->ki_users, 2);
return req;
out_put:
put_reqs_available(ctx, 1);
@@ -1291,10 +1291,12 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
if (ret)
goto out_put_req;
+ aio_put_req(req); /* drop extra ref to req */
return 0;
out_put_req:
put_reqs_available(ctx, 1);
- aio_put_req(req);
+ aio_put_req(req); /* drop extra ref to req */
+ aio_put_req(req); /* drop i/o ref to req */
return ret;
}
--
1.7.12
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/3] aio-use-cancellation-list-lazily-fix
[not found] ` <20130124211850.GH26407@google.com>
` (2 preceding siblings ...)
2013-01-24 21:43 ` [PATCH 2/3] aio-kill-ki_retry-fix-fix Kent Overstreet
@ 2013-01-24 21:43 ` Kent Overstreet
2013-01-25 13:30 ` Hillf Danton
3 siblings, 1 reply; 14+ messages in thread
From: Kent Overstreet @ 2013-01-24 21:43 UTC (permalink / raw)
To: akpm
Cc: Kent Overstreet, Valdis.Kletnieks, dhillf, bcrl, zab,
linux-kernel, linux-aio, linux-fsdevel
The cancellation changes were fubar - we can't cancel a kiocb if it
doesn't actually have a cancellation callback.
The use of xchg() in aio_complete() was right - there we're marking the
kiocb as completed - but we need to use cmpxchg() in kiocb_cancel() - a
lock isn't sufficient since we're synchronizing with aio_complete()
which isn't taking any locks.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
---
fs/aio.c | 32 ++++++++++++++++++++++----------
include/linux/aio.h | 11 +++++++++++
2 files changed, 33 insertions(+), 10 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 85d1b38..2760180 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -244,28 +244,40 @@ static int aio_setup_ring(struct kioctx *ctx)
void kiocb_set_cancel_fn(struct kiocb *req, kiocb_cancel_fn *cancel)
{
- if (!req->ki_list.next) {
- struct kioctx *ctx = req->ki_ctx;
- unsigned long flags;
+ struct kioctx *ctx = req->ki_ctx;
+ unsigned long flags;
- spin_lock_irqsave(&ctx->ctx_lock, flags);
+ spin_lock_irqsave(&ctx->ctx_lock, flags);
+
+ if (!req->ki_list.next)
list_add(&req->ki_list, &ctx->active_reqs);
- spin_unlock_irqrestore(&ctx->ctx_lock, flags);
- }
req->ki_cancel = cancel;
+
+ spin_unlock_irqrestore(&ctx->ctx_lock, flags);
}
EXPORT_SYMBOL(kiocb_set_cancel_fn);
static int kiocb_cancel(struct kioctx *ctx, struct kiocb *kiocb,
struct io_event *res)
{
- kiocb_cancel_fn *cancel;
+ kiocb_cancel_fn *old, *cancel;
int ret = -EINVAL;
- cancel = xchg(&kiocb->ki_cancel, KIOCB_CANCELLED);
- if (!cancel || cancel == KIOCB_CANCELLED)
- return ret;
+ /*
+ * Don't want to set kiocb->ki_cancel = KIOCB_CANCELLED unless it
+ * actually has a cancel function, hence the cmpxchg()
+ */
+
+ cancel = ACCESS_ONCE(kiocb->ki_cancel);
+ do {
+ if (!cancel || cancel == KIOCB_CANCELLED)
+ return ret;
+
+ BUG();
+ old = cancel;
+ cancel = cmpxchg(&kiocb->ki_cancel, old, KIOCB_CANCELLED);
+ } while (cancel != old);
atomic_inc(&kiocb->ki_users);
spin_unlock_irq(&ctx->ctx_lock);
diff --git a/include/linux/aio.h b/include/linux/aio.h
index 8eaa490..cd4aea0 100644
--- a/include/linux/aio.h
+++ b/include/linux/aio.h
@@ -15,6 +15,17 @@ struct batch_complete;
#define KIOCB_KEY 0
+/*
+ * We use ki_cancel == KIOCB_CANCELLED to indicate that a kiocb has been either
+ * cancelled or completed (this makes a certain amount of sense because
+ * successful cancellation - io_cancel() - does deliver the completion to
+ * userspace).
+ *
+ * And since most things don't implement kiocb cancellation and we'd really like
+ * kiocb completion to be lockless when possible, we use ki_cancel to
+ * synchronize cancellation and completion - we only set it to KIOCB_CANCELLED
+ * with xchg() or cmpxchg(), see batch_complete_aio() and kiocb_cancel().
+ */
#define KIOCB_CANCELLED ((void *) (~0ULL))
typedef int (kiocb_cancel_fn)(struct kiocb *, struct io_event *);
--
1.7.12
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 3/3] aio-use-cancellation-list-lazily-fix
2013-01-24 21:43 ` [PATCH 3/3] aio-use-cancellation-list-lazily-fix Kent Overstreet
@ 2013-01-25 13:30 ` Hillf Danton
2013-01-25 23:12 ` Andrew Morton
0 siblings, 1 reply; 14+ messages in thread
From: Hillf Danton @ 2013-01-25 13:30 UTC (permalink / raw)
To: Kent Overstreet
Cc: akpm, Valdis.Kletnieks, bcrl, zab, linux-kernel, linux-aio,
linux-fsdevel
On Fri, Jan 25, 2013 at 5:43 AM, Kent Overstreet <koverstreet@google.com> wrote:
> The cancellation changes were fubar - we can't cancel a kiocb if it
> doesn't actually have a cancellation callback.
>
> The use of xchg() in aio_complete() was right - there we're marking the
> kiocb as completed - but we need to use cmpxchg() in kiocb_cancel() - a
> lock isn't sufficient since we're synchronizing with aio_complete()
> which isn't taking any locks.
>
What is observed without this fix?
> Signed-off-by: Kent Overstreet <koverstreet@google.com>
> ---
> fs/aio.c | 32 ++++++++++++++++++++++----------
> include/linux/aio.h | 11 +++++++++++
> 2 files changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 85d1b38..2760180 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -244,28 +244,40 @@ static int aio_setup_ring(struct kioctx *ctx)
>
> void kiocb_set_cancel_fn(struct kiocb *req, kiocb_cancel_fn *cancel)
> {
> - if (!req->ki_list.next) {
> - struct kioctx *ctx = req->ki_ctx;
> - unsigned long flags;
> + struct kioctx *ctx = req->ki_ctx;
> + unsigned long flags;
>
> - spin_lock_irqsave(&ctx->ctx_lock, flags);
> + spin_lock_irqsave(&ctx->ctx_lock, flags);
> +
> + if (!req->ki_list.next)
> list_add(&req->ki_list, &ctx->active_reqs);
> - spin_unlock_irqrestore(&ctx->ctx_lock, flags);
> - }
>
> req->ki_cancel = cancel;
> +
> + spin_unlock_irqrestore(&ctx->ctx_lock, flags);
> }
> EXPORT_SYMBOL(kiocb_set_cancel_fn);
>
> static int kiocb_cancel(struct kioctx *ctx, struct kiocb *kiocb,
> struct io_event *res)
> {
> - kiocb_cancel_fn *cancel;
> + kiocb_cancel_fn *old, *cancel;
> int ret = -EINVAL;
>
> - cancel = xchg(&kiocb->ki_cancel, KIOCB_CANCELLED);
> - if (!cancel || cancel == KIOCB_CANCELLED)
> - return ret;
> + /*
> + * Don't want to set kiocb->ki_cancel = KIOCB_CANCELLED unless it
> + * actually has a cancel function, hence the cmpxchg()
> + */
> +
> + cancel = ACCESS_ONCE(kiocb->ki_cancel);
> + do {
> + if (!cancel || cancel == KIOCB_CANCELLED)
> + return ret;
> +
> + BUG();
Hmm, what is trapped?
> + old = cancel;
> + cancel = cmpxchg(&kiocb->ki_cancel, old, KIOCB_CANCELLED);
> + } while (cancel != old);
>
> atomic_inc(&kiocb->ki_users);
> spin_unlock_irq(&ctx->ctx_lock);
> diff --git a/include/linux/aio.h b/include/linux/aio.h
> index 8eaa490..cd4aea0 100644
> --- a/include/linux/aio.h
> +++ b/include/linux/aio.h
> @@ -15,6 +15,17 @@ struct batch_complete;
>
> #define KIOCB_KEY 0
>
> +/*
> + * We use ki_cancel == KIOCB_CANCELLED to indicate that a kiocb has been either
> + * cancelled or completed (this makes a certain amount of sense because
> + * successful cancellation - io_cancel() - does deliver the completion to
> + * userspace).
> + *
> + * And since most things don't implement kiocb cancellation and we'd really like
> + * kiocb completion to be lockless when possible, we use ki_cancel to
> + * synchronize cancellation and completion - we only set it to KIOCB_CANCELLED
> + * with xchg() or cmpxchg(), see batch_complete_aio() and kiocb_cancel().
> + */
> #define KIOCB_CANCELLED ((void *) (~0ULL))
>
> typedef int (kiocb_cancel_fn)(struct kiocb *, struct io_event *);
> --
> 1.7.12
>
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/3] aio-use-cancellation-list-lazily-fix
2013-01-25 13:30 ` Hillf Danton
@ 2013-01-25 23:12 ` Andrew Morton
2013-01-28 17:37 ` Kent Overstreet
0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2013-01-25 23:12 UTC (permalink / raw)
To: Hillf Danton
Cc: Kent Overstreet, Valdis.Kletnieks, bcrl, zab, linux-kernel,
linux-aio, linux-fsdevel
On Fri, 25 Jan 2013 21:30:32 +0800
Hillf Danton <dhillf@gmail.com> wrote:
> On Fri, Jan 25, 2013 at 5:43 AM, Kent Overstreet <koverstreet@google.com> wrote:
> > The cancellation changes were fubar - we can't cancel a kiocb if it
> > doesn't actually have a cancellation callback.
> >
> > The use of xchg() in aio_complete() was right - there we're marking the
> > kiocb as completed - but we need to use cmpxchg() in kiocb_cancel() - a
> > lock isn't sufficient since we're synchronizing with aio_complete()
> > which isn't taking any locks.
> >
> > static int kiocb_cancel(struct kioctx *ctx, struct kiocb *kiocb,
> > struct io_event *res)
> > {
> > - kiocb_cancel_fn *cancel;
> > + kiocb_cancel_fn *old, *cancel;
> > int ret = -EINVAL;
> >
> > - cancel = xchg(&kiocb->ki_cancel, KIOCB_CANCELLED);
> > - if (!cancel || cancel == KIOCB_CANCELLED)
> > - return ret;
> > + /*
> > + * Don't want to set kiocb->ki_cancel = KIOCB_CANCELLED unless it
> > + * actually has a cancel function, hence the cmpxchg()
> > + */
> > +
> > + cancel = ACCESS_ONCE(kiocb->ki_cancel);
> > + do {
> > + if (!cancel || cancel == KIOCB_CANCELLED)
> > + return ret;
> > +
> > + BUG();
>
> Hmm, what is trapped?
>
> > + old = cancel;
> > + cancel = cmpxchg(&kiocb->ki_cancel, old, KIOCB_CANCELLED);
> > + } while (cancel != old);
erk, I missed that. What earthly sense is there in putting a BUG() in
that place.
I think I'll delete it and pretend I never saw it :(
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/3] aio-use-cancellation-list-lazily-fix
2013-01-25 23:12 ` Andrew Morton
@ 2013-01-28 17:37 ` Kent Overstreet
0 siblings, 0 replies; 14+ messages in thread
From: Kent Overstreet @ 2013-01-28 17:37 UTC (permalink / raw)
To: Andrew Morton
Cc: Hillf Danton, Valdis.Kletnieks, bcrl, zab, linux-kernel,
linux-aio, linux-fsdevel
On Fri, Jan 25, 2013 at 03:12:51PM -0800, Andrew Morton wrote:
> On Fri, 25 Jan 2013 21:30:32 +0800
> Hillf Danton <dhillf@gmail.com> wrote:
>
> > On Fri, Jan 25, 2013 at 5:43 AM, Kent Overstreet <koverstreet@google.com> wrote:
> > > The cancellation changes were fubar - we can't cancel a kiocb if it
> > > doesn't actually have a cancellation callback.
> > >
> > > The use of xchg() in aio_complete() was right - there we're marking the
> > > kiocb as completed - but we need to use cmpxchg() in kiocb_cancel() - a
> > > lock isn't sufficient since we're synchronizing with aio_complete()
> > > which isn't taking any locks.
> > >
> > > static int kiocb_cancel(struct kioctx *ctx, struct kiocb *kiocb,
> > > struct io_event *res)
> > > {
> > > - kiocb_cancel_fn *cancel;
> > > + kiocb_cancel_fn *old, *cancel;
> > > int ret = -EINVAL;
> > >
> > > - cancel = xchg(&kiocb->ki_cancel, KIOCB_CANCELLED);
> > > - if (!cancel || cancel == KIOCB_CANCELLED)
> > > - return ret;
> > > + /*
> > > + * Don't want to set kiocb->ki_cancel = KIOCB_CANCELLED unless it
> > > + * actually has a cancel function, hence the cmpxchg()
> > > + */
> > > +
> > > + cancel = ACCESS_ONCE(kiocb->ki_cancel);
> > > + do {
> > > + if (!cancel || cancel == KIOCB_CANCELLED)
> > > + return ret;
> > > +
> > > + BUG();
> >
> > Hmm, what is trapped?
> >
> > > + old = cancel;
> > > + cancel = cmpxchg(&kiocb->ki_cancel, old, KIOCB_CANCELLED);
> > > + } while (cancel != old);
>
> erk, I missed that. What earthly sense is there in putting a BUG() in
> that place.
>
> I think I'll delete it and pretend I never saw it :(
Argh. Yeah, sorry about that. Put that in there when I was trying to
track down the other bugs :(
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 14+ messages in thread