* next-20130117 - kernel BUG with aio @ 2013-01-21 13:24 Valdis Kletnieks 2013-01-22 13:43 ` Hillf Danton 0 siblings, 1 reply; 26+ messages in thread From: Valdis Kletnieks @ 2013-01-21 13:24 UTC (permalink / raw) To: Benjamin LaHaise, Kent Overstreet; +Cc: linux-kernel, linux-aio [-- Attachment #1: Type: text/plain, Size: 3777 bytes --] Am seeing a reproducible BUG in the kernel with next-20130117 whenever I fire up VirtualBox. Unfortunately, I hadn't done that in a while, so the last 'known good' kernel was next-20121203. I'm strongly suspecting one of Kent Overstreet's 32 patches against aio, because 'git blame' shows those landing on Jan 12, and not much else happening to fs/aio.c in ages. The stack traceback ring any bells before I go to bisect this? [ 327.375581] BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1138 [ 327.375588] in_atomic(): 0, irqs_disabled(): 1, pid: 2096, name: AioMgr0-N [ 327.375590] INFO: lockdep is turned off. [ 327.375593] irq event stamp: 0 [ 327.375595] hardirqs last enabled at (0): [< (null)>] (null) [ 327.375599] hardirqs last disabled at (0): [<ffffffff8102d9eb>] copy_process.part.40+0x565/0x14be [ 327.375607] softirqs last enabled at (0): [<ffffffff8102d9eb>] copy_process.part.40+0x565/0x14be [ 327.375611] softirqs last disabled at (0): [< (null)>] (null) [ 327.375616] Pid: 2096, comm: AioMgr0-N Tainted: P O 3.8.0-rc3-next-20130117-dirty #49 [ 327.375618] Call Trace: [ 327.375624] [<ffffffff810770ba>] ? print_irqtrace_events+0x9d/0xa1 [ 327.375630] [<ffffffff8105a576>] __might_sleep+0x19f/0x1a7 [ 327.375635] [<ffffffff81617ab4>] __do_page_fault+0x2a4/0x57c [ 327.375641] [<ffffffff810dbb55>] ? invalidate_inode_pages2_range+0x2e0/0x2f8 [ 327.375645] [<ffffffff811843f4>] ? ext4_direct_IO+0x224/0x3c2 [ 327.375650] [<ffffffff81186438>] ? noalloc_get_block_write+0x57/0x57 [ 327.375654] [<ffffffff81182c4d>] ? ext4_readpages+0x41/0x41 [ 327.375659] [<ffffffff810b7caf>] ? time_hardirqs_off+0x1b/0x2f [ 327.375663] [<ffffffff81615373>] ? error_sti+0x5/0x6 [ 327.375667] [<ffffffff8107522f>] ? trace_hardirqs_off_caller+0x1f/0x9e [ 327.375672] [<ffffffff8124ad2d>] ? trace_hardirqs_off_thunk+0x3a/0x3c [ 327.375676] [<ffffffff81617d95>] do_page_fault+0x9/0xb [ 327.375680] [<ffffffff81615182>] page_fault+0x22/0x30 [ 327.375685] [<ffffffff811522af>] ? kioctx_ring_unlock+0xd/0x5f [ 327.375689] [<ffffffff811524c7>] batch_complete_aio+0x1c6/0x212 [ 327.375694] [<ffffffff8117fc63>] ? ext4_unwritten_wait+0x98/0x98 [ 327.375697] [<ffffffff81152b3a>] aio_complete_batch+0x125/0x132 [ 327.375702] [<ffffffff8117fc63>] ? ext4_unwritten_wait+0x98/0x98 [ 327.375705] [<ffffffff81153931>] do_io_submit+0x781/0x84b [ 327.375710] [<ffffffff81153a06>] sys_io_submit+0xb/0xd [ 327.375715] [<ffffffff8161b0d2>] system_call_fastpath+0x16/0x1b (and that BUG cascades into a second one: [ 327.375724] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250 [ 327.375729] IP: [<ffffffff811522af>] kioctx_ring_unlock+0xd/0x5f [ 327.375733] PGD d0d36067 PUD da749067 PMD 0 [ 327.375740] Oops: 0002 [#1] PREEMPT SMP ... [ 327.375829] Call Trace: [ 327.375833] [<ffffffff811524c7>] batch_complete_aio+0x1c6/0x212 [ 327.375838] [<ffffffff8117fc63>] ? ext4_unwritten_wait+0x98/0x98 [ 327.375842] [<ffffffff81152b3a>] aio_complete_batch+0x125/0x132 [ 327.375846] [<ffffffff8117fc63>] ? ext4_unwritten_wait+0x98/0x98 [ 327.375850] [<ffffffff81153931>] do_io_submit+0x781/0x84b [ 327.375855] [<ffffffff81153a06>] sys_io_submit+0xb/0xd [ 327.375859] [<ffffffff8161b0d2>] system_call_fastpath+0x16/0x1b [ 327.375861] Code: 00 50 48 8d 5f 90 48 81 c7 98 01 00 00 e8 0e 9c f0 ff 48 89 df e8 87 fd ff ff 58 5b 5d c3 55 48 89 e5 41 54 41 89 f4 53 48 89 fb <89> b3 50 02 00 00 48 8b 47 50 48 8b 38 e8 37 f9 ff ff 44 89 60 [ 327.375937] RIP [<ffffffff811522af>] kioctx_ring_unlock+0xd/0x5f [ 327.375942] RSP <ffff8800b8965db8> [ 327.375944] CR2: 0000000000000250 [ 327.375949] ---[ end trace b119850056dcfba4 ]--- [-- Attachment #2: Type: application/pgp-signature, Size: 865 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-21 13:24 next-20130117 - kernel BUG with aio Valdis Kletnieks @ 2013-01-22 13:43 ` Hillf Danton 2013-01-22 21:28 ` Valdis.Kletnieks 0 siblings, 1 reply; 26+ messages in thread From: Hillf Danton @ 2013-01-22 13:43 UTC (permalink / raw) To: Valdis Kletnieks Cc: Benjamin LaHaise, Kent Overstreet, linux-kernel, linux-aio On Mon, Jan 21, 2013 at 9:24 PM, Valdis Kletnieks <Valdis.Kletnieks@vt.edu> wrote: > Am seeing a reproducible BUG in the kernel with next-20130117 > whenever I fire up VirtualBox. Unfortunately, I hadn't done that > in a while, so the last 'known good' kernel was next-20121203. > > I'm strongly suspecting one of Kent Overstreet's 32 patches against aio, > because 'git blame' shows those landing on Jan 12, and not much else > happening to fs/aio.c in ages. > Take a try? --- --- a/fs/aio.c Tue Jan 22 21:37:54 2013 +++ b/fs/aio.c Tue Jan 22 21:43:58 2013 @@ -683,6 +683,9 @@ static inline void kioctx_ring_unlock(st { struct aio_ring *ring; + if (!ctx) + return; + smp_wmb(); /* make event visible before updating tail */ -- ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-22 13:43 ` Hillf Danton @ 2013-01-22 21:28 ` Valdis.Kletnieks 2013-01-23 12:10 ` Hillf Danton 2013-01-31 21:59 ` next-20130117 - kernel BUG with aio Andrew Morton 0 siblings, 2 replies; 26+ messages in thread From: Valdis.Kletnieks @ 2013-01-22 21:28 UTC (permalink / raw) To: Hillf Danton; +Cc: Benjamin LaHaise, Kent Overstreet, linux-kernel, linux-aio [-- Attachment #1: Type: text/plain, Size: 1587 bytes --] On Tue, 22 Jan 2013 21:43:27 +0800, Hillf Danton said: > On Mon, Jan 21, 2013 at 9:24 PM, Valdis Kletnieks > <Valdis.Kletnieks@vt.edu> wrote: > > Am seeing a reproducible BUG in the kernel with next-20130117 > > whenever I fire up VirtualBox. Unfortunately, I hadn't done that > > in a while, so the last 'known good' kernel was next-20121203. > > > > I'm strongly suspecting one of Kent Overstreet's 32 patches against aio, > > because 'git blame' shows those landing on Jan 12, and not much else > > happening to fs/aio.c in ages. > > > Take a try? > --- > --- a/fs/aio.c Tue Jan 22 21:37:54 2013 > +++ b/fs/aio.c Tue Jan 22 21:43:58 2013 > @@ -683,6 +683,9 @@ static inline void kioctx_ring_unlock(st > { > struct aio_ring *ring; > > + if (!ctx) > + return; > + > smp_wmb(); > /* make event visible before updating tail */ Well, things are improved - at least now it doesn't BUG :) [ 534.879083] ------------[ cut here ]------------ [ 534.879094] WARNING: at fs/aio.c:336 put_ioctx+0x1cb/0x252() [ 534.879121] Call Trace: [ 534.879129] [<ffffffff8102f5ad>] warn_slowpath_common+0x7e/0x97 [ 534.879133] [<ffffffff8102f5db>] warn_slowpath_null+0x15/0x17 [ 534.879137] [<ffffffff811521f0>] put_ioctx+0x1cb/0x252 [ 534.879142] [<ffffffff8105bee3>] ? __wake_up+0x3f/0x48 [ 534.879146] [<ffffffff8115229e>] ? kill_ioctx_work+0x27/0x2b [ 534.879150] [<ffffffff811531a5>] sys_io_destroy+0x40/0x50 [ 534.879156] [<ffffffff8161b112>] system_call_fastpath+0x16/0x1b [ 534.879159] ---[ end trace a2c46a8bc9058404 ]--- Hopefully that tells you and Kent something. :) [-- Attachment #2: Type: application/pgp-signature, Size: 865 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-22 21:28 ` Valdis.Kletnieks @ 2013-01-23 12:10 ` Hillf Danton 2013-01-24 17:22 ` Valdis.Kletnieks 2013-01-31 21:59 ` next-20130117 - kernel BUG with aio Andrew Morton 1 sibling, 1 reply; 26+ messages in thread From: Hillf Danton @ 2013-01-23 12:10 UTC (permalink / raw) To: Valdis.Kletnieks Cc: Benjamin LaHaise, Kent Overstreet, linux-kernel, linux-aio On Wed, Jan 23, 2013 at 5:28 AM, <Valdis.Kletnieks@vt.edu> wrote: > On Tue, 22 Jan 2013 21:43:27 +0800, Hillf Danton said: >> On Mon, Jan 21, 2013 at 9:24 PM, Valdis Kletnieks >> <Valdis.Kletnieks@vt.edu> wrote: >> > Am seeing a reproducible BUG in the kernel with next-20130117 >> > whenever I fire up VirtualBox. Unfortunately, I hadn't done that >> > in a while, so the last 'known good' kernel was next-20121203. >> > >> > I'm strongly suspecting one of Kent Overstreet's 32 patches against aio, >> > because 'git blame' shows those landing on Jan 12, and not much else >> > happening to fs/aio.c in ages. >> > >> Take a try? >> --- >> --- a/fs/aio.c Tue Jan 22 21:37:54 2013 >> +++ b/fs/aio.c Tue Jan 22 21:43:58 2013 >> @@ -683,6 +683,9 @@ static inline void kioctx_ring_unlock(st >> { >> struct aio_ring *ring; >> >> + if (!ctx) >> + return; >> + >> smp_wmb(); >> /* make event visible before updating tail */ > > Well, things are improved - at least now it doesn't BUG :) Good news ;) > > [ 534.879083] ------------[ cut here ]------------ > [ 534.879094] WARNING: at fs/aio.c:336 put_ioctx+0x1cb/0x252() > [ 534.879121] Call Trace: > [ 534.879129] [<ffffffff8102f5ad>] warn_slowpath_common+0x7e/0x97 > [ 534.879133] [<ffffffff8102f5db>] warn_slowpath_null+0x15/0x17 > [ 534.879137] [<ffffffff811521f0>] put_ioctx+0x1cb/0x252 > [ 534.879142] [<ffffffff8105bee3>] ? __wake_up+0x3f/0x48 > [ 534.879146] [<ffffffff8115229e>] ? kill_ioctx_work+0x27/0x2b > [ 534.879150] [<ffffffff811531a5>] sys_io_destroy+0x40/0x50 > [ 534.879156] [<ffffffff8161b112>] system_call_fastpath+0x16/0x1b > [ 534.879159] ---[ end trace a2c46a8bc9058404 ]--- > > Hopefully that tells you and Kent something. :) Try again? --- --- a/fs/aio.c Tue Jan 22 21:37:54 2013 +++ b/fs/aio.c Wed Jan 23 20:06:14 2013 @@ -683,6 +683,9 @@ static inline void kioctx_ring_unlock(st { struct aio_ring *ring; + if (!ctx) + return; + smp_wmb(); /* make event visible before updating tail */ @@ -723,6 +726,7 @@ void batch_complete_aio(struct batch_com n = rb_first(&batch->kiocb); while (n) { struct kiocb *req = container_of(n, struct kiocb, ki_node); + int cancelled = 0; if (n->rb_right) { n->rb_right->__rb_parent_color = n->__rb_parent_color; @@ -736,13 +740,8 @@ void batch_complete_aio(struct batch_com if (unlikely(xchg(&req->ki_cancel, KIOCB_CANCELLED) == KIOCB_CANCELLED)) { - /* - * Can't use the percpu reqs_available here - could race - * with free_ioctx() - */ - atomic_inc(&req->ki_ctx->reqs_available); - aio_put_req(req); - continue; + cancelled = 1; + goto lock; } if (unlikely(req->ki_eventfd != eventfd)) { @@ -759,6 +758,7 @@ void batch_complete_aio(struct batch_com req->ki_eventfd = NULL; } + lock: if (unlikely(req->ki_ctx != ctx)) { if (ctx) kioctx_ring_unlock(ctx, tail); @@ -767,7 +767,12 @@ void batch_complete_aio(struct batch_com tail = kioctx_ring_lock(ctx); } - tail = kioctx_ring_put(ctx, req, tail); + if (cancelled) { + if (++tail >= ctx->nr) + tail = 0; + } else + tail = kioctx_ring_put(ctx, req, tail); + aio_put_req(req); } -- ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-23 12:10 ` Hillf Danton @ 2013-01-24 17:22 ` Valdis.Kletnieks 2013-01-24 21:18 ` Kent Overstreet 0 siblings, 1 reply; 26+ messages in thread From: Valdis.Kletnieks @ 2013-01-24 17:22 UTC (permalink / raw) To: Hillf Danton; +Cc: Benjamin LaHaise, Kent Overstreet, linux-kernel, linux-aio [-- Attachment #1: Type: text/plain, Size: 1497 bytes --] On Wed, 23 Jan 2013 20:10:03 +0800, Hillf Danton said: > Try again? > --- > > --- a/fs/aio.c Tue Jan 22 21:37:54 2013 > +++ b/fs/aio.c Wed Jan 23 20:06:14 2013 Now seeing this: [ 2941.495370] ------------[ cut here ]------------ [ 2941.495379] WARNING: at fs/aio.c:336 put_ioctx+0x1cb/0x252() Same warn location, but different traceback? [ 2941.495407] Call Trace: [ 2941.495415] [<ffffffff8102f5ad>] warn_slowpath_common+0x7e/0x97 [ 2941.495419] [<ffffffff8102f5db>] warn_slowpath_null+0x15/0x17 [ 2941.495423] [<ffffffff8115247b>] put_ioctx+0x1cb/0x252 [ 2941.495428] [<ffffffff81617fc3>] ? sub_preempt_count+0x33/0x46 [ 2941.495433] [<ffffffff81050448>] ? abort_exclusive_wait+0x89/0x89 [ 2941.495438] [<ffffffff81152529>] kill_ioctx_work+0x27/0x2b [ 2941.495443] [<ffffffff81603ded>] process_one_work+0x26f/0x4be [ 2941.495447] [<ffffffff81603d27>] ? process_one_work+0x1a9/0x4be [ 2941.495453] [<ffffffff810483c5>] ? spin_lock_irq+0x9/0xb [ 2941.495457] [<ffffffff8104b0f9>] worker_thread+0x1b6/0x28e [ 2941.495461] [<ffffffff8104af43>] ? rescuer_thread+0x1a5/0x1a5 [ 2941.495466] [<ffffffff8104f7cc>] kthread+0x9d/0xa5 [ 2941.495471] [<ffffffff8104f72f>] ? __kthread_parkme+0x60/0x60 [ 2941.495475] [<ffffffff8161b06c>] ret_from_fork+0x7c/0xb0 [ 2941.495479] [<ffffffff8104f72f>] ? __kthread_parkme+0x60/0x60 [ 2941.495483] ---[ end trace 8e545cd216c853ec ]--- Obviously VirtualBox 4.2.6 on my laptop is going out of its way to get indigestion at this patch series. :) [-- Attachment #2: Type: application/pgp-signature, Size: 865 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-24 17:22 ` Valdis.Kletnieks @ 2013-01-24 21:18 ` Kent Overstreet 2013-01-24 21:27 ` Andrew Morton ` (3 more replies) 0 siblings, 4 replies; 26+ messages in thread From: Kent Overstreet @ 2013-01-24 21:18 UTC (permalink / raw) To: Valdis.Kletnieks Cc: Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio, zab, akpm On Thu, Jan 24, 2013 at 12:22:21PM -0500, Valdis.Kletnieks@vt.edu wrote: > On Wed, 23 Jan 2013 20:10:03 +0800, Hillf Danton said: > > > Try again? > > --- > > > > --- a/fs/aio.c Tue Jan 22 21:37:54 2013 > > +++ b/fs/aio.c Wed Jan 23 20:06:14 2013 > > Now seeing this: > > [ 2941.495370] ------------[ cut here ]------------ > [ 2941.495379] WARNING: at fs/aio.c:336 put_ioctx+0x1cb/0x252() > > Same warn location, but different traceback? Finally reproduced it (thanks to Zach Brown for the idea - using a loopback device) - it's triggered when there's outstanding kiocbs when we're trying to tear down the kioctx. Found a couple bugs once I was able to reproduce it. Besides the null pointer deref that Hillf posted a patch for, cancellation was fubar - kiocb_cancel() shouldn't be marking the kiocb as cancelled if it didn't have a cancel callback. The other two bugs I found were both a result of the fact that aio_run_iocb() touches the kiocb after passing it off to a method that may call aio_complete() on it - which is something I originally missed. Digression: When I was refactoring, I was hoping to be able to change the kiocb refcounting so that the refcount's just initialized to one, and the initial refcount is dropped when aio_complete() is called - and since nothing outside of the aio code messes with the kiocb refcount, it might be possible to get rid of the kiocb refcount entirely if I can figure out how to deal with cancellation. Anyways - that didn't work out, or at least it's going to take more work. The two bugs that resulted from that were: - The "aio: kill ki_retry" patch dropped the second kiocb ref that io_submit_one drops when it returns. But aio_rw_vect_retry() touches the kiocb after passing it off to f_op->aio_(read|write) which will call aio_complete(), so this is a use after free. - The "aio: smoosh struct kiocb" patch assumed that some of the fields in struct kiocb aren't needed after aio_complete() has been called. This is _almost_ true, but again aio_rw_vect_retry() looks at those fields which ends up determining whether aio_run_iocb() calls aio_complete() itself. This seems _ridiculously_ sketchy to me, or at least convoluted... but it'll take awhile to figure out how to clean that up and I'm not in a great hurry to do so. So, Andrew - that "smoosh struct kiocb" patch should just be dropped, even if I fixed that issue clearly the idea is a lot less safe than I thought. I've got patches for the other stuff I'm going to mail out momentarily. Ben, Zach - the cancellation stuff (both the fix and the other changes) need more review, and we need a saner way of testing cancellation. Maybe it'd be worth implementing cancellation for regular block devices just so that we have a way of testing it :/ ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-24 21:18 ` Kent Overstreet @ 2013-01-24 21:27 ` Andrew Morton 2013-01-24 21:39 ` Kent Overstreet 2013-01-24 22:13 ` Kent Overstreet 2013-01-24 21:43 ` [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio Kent Overstreet ` (2 subsequent siblings) 3 siblings, 2 replies; 26+ messages in thread From: Andrew Morton @ 2013-01-24 21:27 UTC (permalink / raw) To: Kent Overstreet Cc: Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio, zab, linux-fsdevel On Thu, 24 Jan 2013 13:18:50 -0800 Kent Overstreet <koverstreet@google.com> wrote: > So, Andrew - that "smoosh struct kiocb" patch should just be dropped, > even if I fixed that issue clearly the idea is a lot less safe than I > thought. I've got patches for the other stuff I'm going to mail out > momentarily. Dropped. Do you expect that this will fix everything? Please cc linux-fsdevel on the aio stuff - I'm seeing some AIO things over there which aren't cc'ed to lkml or linux-aio. Please also take a look at Jan's recent http://www.spinics.net/lists/linux-fsdevel/msg61738.html and have a think about how this plays with your patchset. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-24 21:27 ` Andrew Morton @ 2013-01-24 21:39 ` Kent Overstreet 2013-01-24 22:25 ` Zach Brown 2013-01-24 22:13 ` Kent Overstreet 1 sibling, 1 reply; 26+ messages in thread From: Kent Overstreet @ 2013-01-24 21:39 UTC (permalink / raw) To: Andrew Morton Cc: Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio, zab, linux-fsdevel On Thu, Jan 24, 2013 at 01:27:59PM -0800, Andrew Morton wrote: > On Thu, 24 Jan 2013 13:18:50 -0800 > Kent Overstreet <koverstreet@google.com> wrote: > > > So, Andrew - that "smoosh struct kiocb" patch should just be dropped, > > even if I fixed that issue clearly the idea is a lot less safe than I > > thought. I've got patches for the other stuff I'm going to mail out > > momentarily. > > Dropped. Do you expect that this will fix everything? No, I didn't see that bug until after I'd fixed the other three, but as far as I can tell everything's fixed with the patches I'm about to mail out - my test VM has been running for the past two days without errors, it's kill -9'ing a process that's got iocbs in flight to a loopback device every two seconds. > Please cc linux-fsdevel on the aio stuff - I'm seeing some AIO things > over there which aren't cc'ed to lkml or linux-aio. Will do > Please also take a look at Jan's recent > http://www.spinics.net/lists/linux-fsdevel/msg61738.html and have a > think about how this plays with your patchset. Oh fun, I've chased a bug or two in that ext4 code, that stuff's scary... I'm sure it'll take me a bit to wrap my head around what's going on there but I'm looking at it now. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-24 21:39 ` Kent Overstreet @ 2013-01-24 22:25 ` Zach Brown 2013-01-24 22:47 ` Jeff Moyer 2013-01-24 23:03 ` Kent Overstreet 0 siblings, 2 replies; 26+ messages in thread From: Zach Brown @ 2013-01-24 22:25 UTC (permalink / raw) To: Kent Overstreet Cc: Andrew Morton, Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio, linux-fsdevel, Jeff Moyer > No, I didn't see that bug until after I'd fixed the other three, but as > far as I can tell everything's fixed with the patches I'm about to mail > out - my test VM has been running for the past two days without errors, > it's kill -9'ing a process that's got iocbs in flight to a loopback > device every two seconds. I'm really worried that this patch series hasn't seen significant enough testing to justify being queued. I'll be first in line for blame for not finding the time to finish my review of the series. What specific tests has this gone through? The aio tests in xfstests / ltp? (And as you discovered while chasing this bug, whatever platform you were running on doesn't seem slow enough to catch some paths.. run all the tests over loop?) Jeff, can you suggest a more modern testing regime for the aio core? It's been so long since I had to hammer on this stuff.. - z ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-24 22:25 ` Zach Brown @ 2013-01-24 22:47 ` Jeff Moyer 2013-01-24 23:03 ` Kent Overstreet 1 sibling, 0 replies; 26+ messages in thread From: Jeff Moyer @ 2013-01-24 22:47 UTC (permalink / raw) To: Zach Brown Cc: Kent Overstreet, Andrew Morton, Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio, linux-fsdevel Zach Brown <zab@redhat.com> writes: >> No, I didn't see that bug until after I'd fixed the other three, but as >> far as I can tell everything's fixed with the patches I'm about to mail >> out - my test VM has been running for the past two days without errors, >> it's kill -9'ing a process that's got iocbs in flight to a loopback >> device every two seconds. > > I'm really worried that this patch series hasn't seen significant enough > testing to justify being queued. > > I'll be first in line for blame for not finding the time to finish my > review of the series. > > What specific tests has this gone through? The aio tests in xfstests / > ltp? (And as you discovered while chasing this bug, whatever platform > you were running on doesn't seem slow enough to catch some paths.. run > all the tests over loop?) > > Jeff, can you suggest a more modern testing regime for the aio core? > It's been so long since I had to hammer on this stuff.. Modern? No. ;-) I usually use xfstests (all of them, not just the aio group), the libaio test harness, and then hand it off to our performance team to stress the code under benchmarking workloads. Oh, and usually targeted testing for the thing that I'm working on. I'll put a couple of kernels together to hand off to our performance team, though I don't know how much time they have at present. Cheers, Jeff ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-24 22:25 ` Zach Brown 2013-01-24 22:47 ` Jeff Moyer @ 2013-01-24 23:03 ` Kent Overstreet 1 sibling, 0 replies; 26+ messages in thread From: Kent Overstreet @ 2013-01-24 23:03 UTC (permalink / raw) To: Zach Brown Cc: Andrew Morton, Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio, linux-fsdevel, Jeff Moyer On Thu, Jan 24, 2013 at 02:25:37PM -0800, Zach Brown wrote: > > No, I didn't see that bug until after I'd fixed the other three, but as > > far as I can tell everything's fixed with the patches I'm about to mail > > out - my test VM has been running for the past two days without errors, > > it's kill -9'ing a process that's got iocbs in flight to a loopback > > device every two seconds. > > I'm really worried that this patch series hasn't seen significant enough > testing to justify being queued. > > I'll be first in line for blame for not finding the time to finish my > review of the series. > > What specific tests has this gone through? The aio tests in xfstests / > ltp? (And as you discovered while chasing this bug, whatever platform > you were running on doesn't seem slow enough to catch some paths.. run > all the tests over loop?) I have run xfstests on real hardware - though that didn't catch this either. These patches are all queued up for our internal tree (I need to bug Ted to review them... and add these latest fixes). And aio is used heavily enough internally that if I screwed anything up, I'll be on the hook... > Jeff, can you suggest a more modern testing regime for the aio core? > It's been so long since I had to hammer on this stuff.. I'm wondering how much it'd buy us to rig up fault injection (I've got some awesome dynamic fault injection code I need to push upstream...). I'll try and test with that in the next day or so, though. There's some people here that'd been working on code coverage tooling too, I should probably learn how to work that stuff. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-24 21:27 ` Andrew Morton 2013-01-24 21:39 ` Kent Overstreet @ 2013-01-24 22:13 ` Kent Overstreet 2013-01-29 13:41 ` Jan Kara 1 sibling, 1 reply; 26+ messages in thread From: Kent Overstreet @ 2013-01-24 22:13 UTC (permalink / raw) To: Andrew Morton, jack, tytso Cc: Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio, zab, linux-fsdevel On Thu, Jan 24, 2013 at 01:27:59PM -0800, Andrew Morton wrote: > Please also take a look at Jan's recent > http://www.spinics.net/lists/linux-fsdevel/msg61738.html and have a > think about how this plays with your patchset. I can't think of any possible interactions - none of my aio stuff messes with the way the fput() happens; the aio code does call fput() when the kiocb is freed and my patches do touch _that_ code but none of the behaviour there changes. Might be worth documenting this though, I can't think of any reason it'd be obvious looking at the aio code that the fput() has to happen after aio_complete(). As with the bugs I just sent you patches for it's not terribly clear who owns what in the kiocb when. Reading those patches though - the main change is to call inode_dio_done() before calling aio_complete(). All inode_dio_done() does though is issue a wakeup - to whatever called inode_dio_wait(). That means whatever called inode_dio_wait() needs its own ref on the inode, and from a cursory glance at the code it is _not_ at all clear to me that's the case - if inode_dio_wait() is merely finishing things for that specific IO that need to be done in process context, I can easily imagine it not being the case. Assuming whatever does call inode_dio_wait() does have its own ref, there was only a real use after free when nothing was waiting on the inode. Similarly for the ext4 code to write unwritten extents - and having seen and helped chase a bug in that code before, that code _definitely_ needs auditing. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-24 22:13 ` Kent Overstreet @ 2013-01-29 13:41 ` Jan Kara 0 siblings, 0 replies; 26+ messages in thread From: Jan Kara @ 2013-01-29 13:41 UTC (permalink / raw) To: Kent Overstreet Cc: Andrew Morton, jack, tytso, Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio, zab, linux-fsdevel On Thu 24-01-13 14:13:52, Kent Overstreet wrote: > On Thu, Jan 24, 2013 at 01:27:59PM -0800, Andrew Morton wrote: > > Please also take a look at Jan's recent > > http://www.spinics.net/lists/linux-fsdevel/msg61738.html and have a > > think about how this plays with your patchset. > > I can't think of any possible interactions - none of my aio stuff messes > with the way the fput() happens; the aio code does call fput() when the > kiocb is freed and my patches do touch _that_ code but none of the > behaviour there changes. > > Might be worth documenting this though, I can't think of any reason it'd > be obvious looking at the aio code that the fput() has to happen after > aio_complete(). As with the bugs I just sent you patches for it's not > terribly clear who owns what in the kiocb when. > > Reading those patches though - the main change is to call > inode_dio_done() before calling aio_complete(). All inode_dio_done() > does though is issue a wakeup - to whatever called inode_dio_wait(). inode_dio_done() does a decrement and wakeup. > That means whatever called inode_dio_wait() needs its own ref on the > inode, and from a cursory glance at the code it is _not_ at all clear to > me that's the case - if inode_dio_wait() is merely finishing things for > that specific IO that need to be done in process context, I can easily > imagine it not being the case. > > Assuming whatever does call inode_dio_wait() does have its own ref, > there was only a real use after free when nothing was waiting on the > inode. Well, but there doesn't have to be any waiter... If there is, it had better have it's own ref, that's for sure. > Similarly for the ext4 code to write unwritten extents - and having seen > and helped chase a bug in that code before, that code _definitely_ needs > auditing. Agreed. That code is a mess. I'm cleaning up some of it but it's not easy. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio 2013-01-24 21:18 ` Kent Overstreet 2013-01-24 21:27 ` Andrew Morton @ 2013-01-24 21:43 ` Kent Overstreet 2013-01-25 13:15 ` Hillf Danton 2013-01-24 21:43 ` [PATCH 2/3] aio-kill-ki_retry-fix-fix Kent Overstreet 2013-01-24 21:43 ` [PATCH 3/3] aio-use-cancellation-list-lazily-fix Kent Overstreet 3 siblings, 1 reply; 26+ messages in thread From: Kent Overstreet @ 2013-01-24 21:43 UTC (permalink / raw) To: akpm Cc: Kent Overstreet, Valdis.Kletnieks, dhillf, bcrl, zab, linux-kernel, linux-aio, linux-fsdevel The batch completion code was trying to be a bit too clever, and skip checking ctx where it couldn't be NULL - but that broke if a kiocb had been cancelled. Move the check to kioctx_ring_unlock(). Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Kent Overstreet <koverstreet@google.com> --- fs/aio.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 62573d3..0d2f39d 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -683,6 +683,9 @@ static inline void kioctx_ring_unlock(struct kioctx *ctx, unsigned tail) { struct aio_ring *ring; + if (!ctx) + return; + smp_wmb(); /* make event visible before updating tail */ @@ -760,8 +763,7 @@ void batch_complete_aio(struct batch_complete *batch) } if (unlikely(req->ki_ctx != ctx)) { - if (ctx) - kioctx_ring_unlock(ctx, tail); + kioctx_ring_unlock(ctx, tail); ctx = req->ki_ctx; tail = kioctx_ring_lock(ctx); -- 1.7.12 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio 2013-01-24 21:43 ` [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio Kent Overstreet @ 2013-01-25 13:15 ` Hillf Danton 0 siblings, 0 replies; 26+ messages in thread From: Hillf Danton @ 2013-01-25 13:15 UTC (permalink / raw) To: Kent Overstreet Cc: akpm, Valdis.Kletnieks, bcrl, zab, linux-kernel, linux-aio, linux-fsdevel On Fri, Jan 25, 2013 at 5:43 AM, Kent Overstreet <koverstreet@google.com> wrote: > The batch completion code was trying to be a bit too clever, and skip > checking ctx where it couldn't be NULL - but that broke if a kiocb had > been cancelled. Move the check to kioctx_ring_unlock(). > > Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> > Signed-off-by: Kent Overstreet <koverstreet@google.com> > --- Acked-by: Hillf Danton <dhillf@gmail.com> > fs/aio.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/fs/aio.c b/fs/aio.c > index 62573d3..0d2f39d 100644 > --- a/fs/aio.c > +++ b/fs/aio.c > @@ -683,6 +683,9 @@ static inline void kioctx_ring_unlock(struct kioctx *ctx, unsigned tail) > { > struct aio_ring *ring; > > + if (!ctx) > + return; > + > smp_wmb(); > /* make event visible before updating tail */ > > @@ -760,8 +763,7 @@ void batch_complete_aio(struct batch_complete *batch) > } > > if (unlikely(req->ki_ctx != ctx)) { > - if (ctx) > - kioctx_ring_unlock(ctx, tail); > + kioctx_ring_unlock(ctx, tail); > > ctx = req->ki_ctx; > tail = kioctx_ring_lock(ctx); > -- > 1.7.12 > ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 2/3] aio-kill-ki_retry-fix-fix 2013-01-24 21:18 ` Kent Overstreet 2013-01-24 21:27 ` Andrew Morton 2013-01-24 21:43 ` [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio Kent Overstreet @ 2013-01-24 21:43 ` Kent Overstreet 2013-01-24 21:43 ` [PATCH 3/3] aio-use-cancellation-list-lazily-fix Kent Overstreet 3 siblings, 0 replies; 26+ messages in thread From: Kent Overstreet @ 2013-01-24 21:43 UTC (permalink / raw) To: akpm Cc: Kent Overstreet, Valdis.Kletnieks, dhillf, bcrl, zab, linux-kernel, linux-aio, linux-fsdevel The "aio: kill ki-retry" patch was assuming that we didn't touch struct kiocb after passing it off to something that would call aio_complete() - which was wrong. So, revert the refcounting changes. Signed-off-by: Kent Overstreet <koverstreet@google.com> --- fs/aio.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 0d2f39d..85d1b38 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -592,7 +592,7 @@ static inline struct kiocb *aio_get_req(struct kioctx *ctx) memset(req, 0, offsetof(struct kiocb, ki_ctx)); req->ki_ctx = ctx; - atomic_set(&req->ki_users, 1); + atomic_set(&req->ki_users, 2); return req; out_put: put_reqs_available(ctx, 1); @@ -1291,10 +1291,12 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, if (ret) goto out_put_req; + aio_put_req(req); /* drop extra ref to req */ return 0; out_put_req: put_reqs_available(ctx, 1); - aio_put_req(req); + aio_put_req(req); /* drop extra ref to req */ + aio_put_req(req); /* drop i/o ref to req */ return ret; } -- 1.7.12 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 3/3] aio-use-cancellation-list-lazily-fix 2013-01-24 21:18 ` Kent Overstreet ` (2 preceding siblings ...) 2013-01-24 21:43 ` [PATCH 2/3] aio-kill-ki_retry-fix-fix Kent Overstreet @ 2013-01-24 21:43 ` Kent Overstreet 2013-01-25 13:30 ` Hillf Danton 3 siblings, 1 reply; 26+ messages in thread From: Kent Overstreet @ 2013-01-24 21:43 UTC (permalink / raw) To: akpm Cc: Kent Overstreet, Valdis.Kletnieks, dhillf, bcrl, zab, linux-kernel, linux-aio, linux-fsdevel The cancellation changes were fubar - we can't cancel a kiocb if it doesn't actually have a cancellation callback. The use of xchg() in aio_complete() was right - there we're marking the kiocb as completed - but we need to use cmpxchg() in kiocb_cancel() - a lock isn't sufficient since we're synchronizing with aio_complete() which isn't taking any locks. Signed-off-by: Kent Overstreet <koverstreet@google.com> --- fs/aio.c | 32 ++++++++++++++++++++++---------- include/linux/aio.h | 11 +++++++++++ 2 files changed, 33 insertions(+), 10 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 85d1b38..2760180 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -244,28 +244,40 @@ static int aio_setup_ring(struct kioctx *ctx) void kiocb_set_cancel_fn(struct kiocb *req, kiocb_cancel_fn *cancel) { - if (!req->ki_list.next) { - struct kioctx *ctx = req->ki_ctx; - unsigned long flags; + struct kioctx *ctx = req->ki_ctx; + unsigned long flags; - spin_lock_irqsave(&ctx->ctx_lock, flags); + spin_lock_irqsave(&ctx->ctx_lock, flags); + + if (!req->ki_list.next) list_add(&req->ki_list, &ctx->active_reqs); - spin_unlock_irqrestore(&ctx->ctx_lock, flags); - } req->ki_cancel = cancel; + + spin_unlock_irqrestore(&ctx->ctx_lock, flags); } EXPORT_SYMBOL(kiocb_set_cancel_fn); static int kiocb_cancel(struct kioctx *ctx, struct kiocb *kiocb, struct io_event *res) { - kiocb_cancel_fn *cancel; + kiocb_cancel_fn *old, *cancel; int ret = -EINVAL; - cancel = xchg(&kiocb->ki_cancel, KIOCB_CANCELLED); - if (!cancel || cancel == KIOCB_CANCELLED) - return ret; + /* + * Don't want to set kiocb->ki_cancel = KIOCB_CANCELLED unless it + * actually has a cancel function, hence the cmpxchg() + */ + + cancel = ACCESS_ONCE(kiocb->ki_cancel); + do { + if (!cancel || cancel == KIOCB_CANCELLED) + return ret; + + BUG(); + old = cancel; + cancel = cmpxchg(&kiocb->ki_cancel, old, KIOCB_CANCELLED); + } while (cancel != old); atomic_inc(&kiocb->ki_users); spin_unlock_irq(&ctx->ctx_lock); diff --git a/include/linux/aio.h b/include/linux/aio.h index 8eaa490..cd4aea0 100644 --- a/include/linux/aio.h +++ b/include/linux/aio.h @@ -15,6 +15,17 @@ struct batch_complete; #define KIOCB_KEY 0 +/* + * We use ki_cancel == KIOCB_CANCELLED to indicate that a kiocb has been either + * cancelled or completed (this makes a certain amount of sense because + * successful cancellation - io_cancel() - does deliver the completion to + * userspace). + * + * And since most things don't implement kiocb cancellation and we'd really like + * kiocb completion to be lockless when possible, we use ki_cancel to + * synchronize cancellation and completion - we only set it to KIOCB_CANCELLED + * with xchg() or cmpxchg(), see batch_complete_aio() and kiocb_cancel(). + */ #define KIOCB_CANCELLED ((void *) (~0ULL)) typedef int (kiocb_cancel_fn)(struct kiocb *, struct io_event *); -- 1.7.12 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 3/3] aio-use-cancellation-list-lazily-fix 2013-01-24 21:43 ` [PATCH 3/3] aio-use-cancellation-list-lazily-fix Kent Overstreet @ 2013-01-25 13:30 ` Hillf Danton 2013-01-25 23:12 ` Andrew Morton 0 siblings, 1 reply; 26+ messages in thread From: Hillf Danton @ 2013-01-25 13:30 UTC (permalink / raw) To: Kent Overstreet Cc: akpm, Valdis.Kletnieks, bcrl, zab, linux-kernel, linux-aio, linux-fsdevel On Fri, Jan 25, 2013 at 5:43 AM, Kent Overstreet <koverstreet@google.com> wrote: > The cancellation changes were fubar - we can't cancel a kiocb if it > doesn't actually have a cancellation callback. > > The use of xchg() in aio_complete() was right - there we're marking the > kiocb as completed - but we need to use cmpxchg() in kiocb_cancel() - a > lock isn't sufficient since we're synchronizing with aio_complete() > which isn't taking any locks. > What is observed without this fix? > Signed-off-by: Kent Overstreet <koverstreet@google.com> > --- > fs/aio.c | 32 ++++++++++++++++++++++---------- > include/linux/aio.h | 11 +++++++++++ > 2 files changed, 33 insertions(+), 10 deletions(-) > > diff --git a/fs/aio.c b/fs/aio.c > index 85d1b38..2760180 100644 > --- a/fs/aio.c > +++ b/fs/aio.c > @@ -244,28 +244,40 @@ static int aio_setup_ring(struct kioctx *ctx) > > void kiocb_set_cancel_fn(struct kiocb *req, kiocb_cancel_fn *cancel) > { > - if (!req->ki_list.next) { > - struct kioctx *ctx = req->ki_ctx; > - unsigned long flags; > + struct kioctx *ctx = req->ki_ctx; > + unsigned long flags; > > - spin_lock_irqsave(&ctx->ctx_lock, flags); > + spin_lock_irqsave(&ctx->ctx_lock, flags); > + > + if (!req->ki_list.next) > list_add(&req->ki_list, &ctx->active_reqs); > - spin_unlock_irqrestore(&ctx->ctx_lock, flags); > - } > > req->ki_cancel = cancel; > + > + spin_unlock_irqrestore(&ctx->ctx_lock, flags); > } > EXPORT_SYMBOL(kiocb_set_cancel_fn); > > static int kiocb_cancel(struct kioctx *ctx, struct kiocb *kiocb, > struct io_event *res) > { > - kiocb_cancel_fn *cancel; > + kiocb_cancel_fn *old, *cancel; > int ret = -EINVAL; > > - cancel = xchg(&kiocb->ki_cancel, KIOCB_CANCELLED); > - if (!cancel || cancel == KIOCB_CANCELLED) > - return ret; > + /* > + * Don't want to set kiocb->ki_cancel = KIOCB_CANCELLED unless it > + * actually has a cancel function, hence the cmpxchg() > + */ > + > + cancel = ACCESS_ONCE(kiocb->ki_cancel); > + do { > + if (!cancel || cancel == KIOCB_CANCELLED) > + return ret; > + > + BUG(); Hmm, what is trapped? > + old = cancel; > + cancel = cmpxchg(&kiocb->ki_cancel, old, KIOCB_CANCELLED); > + } while (cancel != old); > > atomic_inc(&kiocb->ki_users); > spin_unlock_irq(&ctx->ctx_lock); > diff --git a/include/linux/aio.h b/include/linux/aio.h > index 8eaa490..cd4aea0 100644 > --- a/include/linux/aio.h > +++ b/include/linux/aio.h > @@ -15,6 +15,17 @@ struct batch_complete; > > #define KIOCB_KEY 0 > > +/* > + * We use ki_cancel == KIOCB_CANCELLED to indicate that a kiocb has been either > + * cancelled or completed (this makes a certain amount of sense because > + * successful cancellation - io_cancel() - does deliver the completion to > + * userspace). > + * > + * And since most things don't implement kiocb cancellation and we'd really like > + * kiocb completion to be lockless when possible, we use ki_cancel to > + * synchronize cancellation and completion - we only set it to KIOCB_CANCELLED > + * with xchg() or cmpxchg(), see batch_complete_aio() and kiocb_cancel(). > + */ > #define KIOCB_CANCELLED ((void *) (~0ULL)) > > typedef int (kiocb_cancel_fn)(struct kiocb *, struct io_event *); > -- > 1.7.12 > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/3] aio-use-cancellation-list-lazily-fix 2013-01-25 13:30 ` Hillf Danton @ 2013-01-25 23:12 ` Andrew Morton 2013-01-28 17:37 ` Kent Overstreet 0 siblings, 1 reply; 26+ messages in thread From: Andrew Morton @ 2013-01-25 23:12 UTC (permalink / raw) To: Hillf Danton Cc: Kent Overstreet, Valdis.Kletnieks, bcrl, zab, linux-kernel, linux-aio, linux-fsdevel On Fri, 25 Jan 2013 21:30:32 +0800 Hillf Danton <dhillf@gmail.com> wrote: > On Fri, Jan 25, 2013 at 5:43 AM, Kent Overstreet <koverstreet@google.com> wrote: > > The cancellation changes were fubar - we can't cancel a kiocb if it > > doesn't actually have a cancellation callback. > > > > The use of xchg() in aio_complete() was right - there we're marking the > > kiocb as completed - but we need to use cmpxchg() in kiocb_cancel() - a > > lock isn't sufficient since we're synchronizing with aio_complete() > > which isn't taking any locks. > > > > static int kiocb_cancel(struct kioctx *ctx, struct kiocb *kiocb, > > struct io_event *res) > > { > > - kiocb_cancel_fn *cancel; > > + kiocb_cancel_fn *old, *cancel; > > int ret = -EINVAL; > > > > - cancel = xchg(&kiocb->ki_cancel, KIOCB_CANCELLED); > > - if (!cancel || cancel == KIOCB_CANCELLED) > > - return ret; > > + /* > > + * Don't want to set kiocb->ki_cancel = KIOCB_CANCELLED unless it > > + * actually has a cancel function, hence the cmpxchg() > > + */ > > + > > + cancel = ACCESS_ONCE(kiocb->ki_cancel); > > + do { > > + if (!cancel || cancel == KIOCB_CANCELLED) > > + return ret; > > + > > + BUG(); > > Hmm, what is trapped? > > > + old = cancel; > > + cancel = cmpxchg(&kiocb->ki_cancel, old, KIOCB_CANCELLED); > > + } while (cancel != old); erk, I missed that. What earthly sense is there in putting a BUG() in that place. I think I'll delete it and pretend I never saw it :( ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/3] aio-use-cancellation-list-lazily-fix 2013-01-25 23:12 ` Andrew Morton @ 2013-01-28 17:37 ` Kent Overstreet 0 siblings, 0 replies; 26+ messages in thread From: Kent Overstreet @ 2013-01-28 17:37 UTC (permalink / raw) To: Andrew Morton Cc: Hillf Danton, Valdis.Kletnieks, bcrl, zab, linux-kernel, linux-aio, linux-fsdevel On Fri, Jan 25, 2013 at 03:12:51PM -0800, Andrew Morton wrote: > On Fri, 25 Jan 2013 21:30:32 +0800 > Hillf Danton <dhillf@gmail.com> wrote: > > > On Fri, Jan 25, 2013 at 5:43 AM, Kent Overstreet <koverstreet@google.com> wrote: > > > The cancellation changes were fubar - we can't cancel a kiocb if it > > > doesn't actually have a cancellation callback. > > > > > > The use of xchg() in aio_complete() was right - there we're marking the > > > kiocb as completed - but we need to use cmpxchg() in kiocb_cancel() - a > > > lock isn't sufficient since we're synchronizing with aio_complete() > > > which isn't taking any locks. > > > > > > static int kiocb_cancel(struct kioctx *ctx, struct kiocb *kiocb, > > > struct io_event *res) > > > { > > > - kiocb_cancel_fn *cancel; > > > + kiocb_cancel_fn *old, *cancel; > > > int ret = -EINVAL; > > > > > > - cancel = xchg(&kiocb->ki_cancel, KIOCB_CANCELLED); > > > - if (!cancel || cancel == KIOCB_CANCELLED) > > > - return ret; > > > + /* > > > + * Don't want to set kiocb->ki_cancel = KIOCB_CANCELLED unless it > > > + * actually has a cancel function, hence the cmpxchg() > > > + */ > > > + > > > + cancel = ACCESS_ONCE(kiocb->ki_cancel); > > > + do { > > > + if (!cancel || cancel == KIOCB_CANCELLED) > > > + return ret; > > > + > > > + BUG(); > > > > Hmm, what is trapped? > > > > > + old = cancel; > > > + cancel = cmpxchg(&kiocb->ki_cancel, old, KIOCB_CANCELLED); > > > + } while (cancel != old); > > erk, I missed that. What earthly sense is there in putting a BUG() in > that place. > > I think I'll delete it and pretend I never saw it :( Argh. Yeah, sorry about that. Put that in there when I was trying to track down the other bugs :( ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-22 21:28 ` Valdis.Kletnieks 2013-01-23 12:10 ` Hillf Danton @ 2013-01-31 21:59 ` Andrew Morton 2013-02-01 0:37 ` Kent Overstreet 1 sibling, 1 reply; 26+ messages in thread From: Andrew Morton @ 2013-01-31 21:59 UTC (permalink / raw) To: Valdis.Kletnieks Cc: Hillf Danton, Benjamin LaHaise, Kent Overstreet, linux-kernel, linux-aio On Tue, 22 Jan 2013 16:28:18 -0500 Valdis.Kletnieks@vt.edu wrote: > On Tue, 22 Jan 2013 21:43:27 +0800, Hillf Danton said: > > On Mon, Jan 21, 2013 at 9:24 PM, Valdis Kletnieks > > <Valdis.Kletnieks@vt.edu> wrote: > > > Am seeing a reproducible BUG in the kernel with next-20130117 > > > whenever I fire up VirtualBox. Unfortunately, I hadn't done that > > > in a while, so the last 'known good' kernel was next-20121203. > > > > > > I'm strongly suspecting one of Kent Overstreet's 32 patches against aio, > > > because 'git blame' shows those landing on Jan 12, and not much else > > > happening to fs/aio.c in ages. > > > > > Take a try? > > --- > > --- a/fs/aio.c Tue Jan 22 21:37:54 2013 > > +++ b/fs/aio.c Tue Jan 22 21:43:58 2013 > > @@ -683,6 +683,9 @@ static inline void kioctx_ring_unlock(st > > { > > struct aio_ring *ring; > > > > + if (!ctx) > > + return; > > + > > smp_wmb(); > > /* make event visible before updating tail */ > > Well, things are improved - at least now it doesn't BUG :) > > [ 534.879083] ------------[ cut here ]------------ > [ 534.879094] WARNING: at fs/aio.c:336 put_ioctx+0x1cb/0x252() > [ 534.879121] Call Trace: > [ 534.879129] [<ffffffff8102f5ad>] warn_slowpath_common+0x7e/0x97 > [ 534.879133] [<ffffffff8102f5db>] warn_slowpath_null+0x15/0x17 > [ 534.879137] [<ffffffff811521f0>] put_ioctx+0x1cb/0x252 > [ 534.879142] [<ffffffff8105bee3>] ? __wake_up+0x3f/0x48 > [ 534.879146] [<ffffffff8115229e>] ? kill_ioctx_work+0x27/0x2b > [ 534.879150] [<ffffffff811531a5>] sys_io_destroy+0x40/0x50 > [ 534.879156] [<ffffffff8161b112>] system_call_fastpath+0x16/0x1b > [ 534.879159] ---[ end trace a2c46a8bc9058404 ]--- > > Hopefully that tells you and Kent something. :) Did this get fixed? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-01-31 21:59 ` next-20130117 - kernel BUG with aio Andrew Morton @ 2013-02-01 0:37 ` Kent Overstreet 2013-02-05 15:53 ` Valdis.Kletnieks 0 siblings, 1 reply; 26+ messages in thread From: Kent Overstreet @ 2013-02-01 0:37 UTC (permalink / raw) To: Andrew Morton Cc: Valdis.Kletnieks, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio On Thu, Jan 31, 2013 at 01:59:52PM -0800, Andrew Morton wrote: > On Tue, 22 Jan 2013 16:28:18 -0500 > Valdis.Kletnieks@vt.edu wrote: > > > On Tue, 22 Jan 2013 21:43:27 +0800, Hillf Danton said: > > > On Mon, Jan 21, 2013 at 9:24 PM, Valdis Kletnieks > > > <Valdis.Kletnieks@vt.edu> wrote: > > > > Am seeing a reproducible BUG in the kernel with next-20130117 > > > > whenever I fire up VirtualBox. Unfortunately, I hadn't done that > > > > in a while, so the last 'known good' kernel was next-20121203. > > > > > > > > I'm strongly suspecting one of Kent Overstreet's 32 patches against aio, > > > > because 'git blame' shows those landing on Jan 12, and not much else > > > > happening to fs/aio.c in ages. > > > > > > > Take a try? > > > --- > > > --- a/fs/aio.c Tue Jan 22 21:37:54 2013 > > > +++ b/fs/aio.c Tue Jan 22 21:43:58 2013 > > > @@ -683,6 +683,9 @@ static inline void kioctx_ring_unlock(st > > > { > > > struct aio_ring *ring; > > > > > > + if (!ctx) > > > + return; > > > + > > > smp_wmb(); > > > /* make event visible before updating tail */ > > > > Well, things are improved - at least now it doesn't BUG :) > > > > [ 534.879083] ------------[ cut here ]------------ > > [ 534.879094] WARNING: at fs/aio.c:336 put_ioctx+0x1cb/0x252() > > [ 534.879121] Call Trace: > > [ 534.879129] [<ffffffff8102f5ad>] warn_slowpath_common+0x7e/0x97 > > [ 534.879133] [<ffffffff8102f5db>] warn_slowpath_null+0x15/0x17 > > [ 534.879137] [<ffffffff811521f0>] put_ioctx+0x1cb/0x252 > > [ 534.879142] [<ffffffff8105bee3>] ? __wake_up+0x3f/0x48 > > [ 534.879146] [<ffffffff8115229e>] ? kill_ioctx_work+0x27/0x2b > > [ 534.879150] [<ffffffff811531a5>] sys_io_destroy+0x40/0x50 > > [ 534.879156] [<ffffffff8161b112>] system_call_fastpath+0x16/0x1b > > [ 534.879159] ---[ end trace a2c46a8bc9058404 ]--- > > > > Hopefully that tells you and Kent something. :) > > Did this get fixed? With the patches I sent you, yes - not seeing a new linux-next tree yet? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-02-01 0:37 ` Kent Overstreet @ 2013-02-05 15:53 ` Valdis.Kletnieks 2013-02-05 17:20 ` Kent Overstreet 0 siblings, 1 reply; 26+ messages in thread From: Valdis.Kletnieks @ 2013-02-05 15:53 UTC (permalink / raw) To: Kent Overstreet Cc: Andrew Morton, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio [-- Attachment #1: Type: text/plain, Size: 2021 bytes --] On Thu, 31 Jan 2013 16:37:27 -0800, Kent Overstreet said: > On Thu, Jan 31, 2013 at 01:59:52PM -0800, Andrew Morton wrote: > > Did this get fixed? > With the patches I sent you, yes - not seeing a new linux-next tree yet? Well, it's a mixed bag at my end. Finally got a chance to do some more testing, and: 1) next-20130128 didn't show anything in dmesg, but my VirtualBox Windows 7 images appear to livelock on the way up - the Windows throbber would keep going, but it never made any actual progress towards booting. (Part of the delay was fixing a next-20121224 environment, and then discovering it took Windows *two* reboot cycles to get its act back together after getting into that hung state). 2_ next-20130128 plus the following 3 patches: Subject: [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio Subject: [PATCH 3/3] aio-use-cancellation-list-lazily-fix Subject: [PATCH 2/3] aio-kill-ki_retry-fix-fix VirtualBox appears to be functional (I did 2 complete boot/shutdown sequences of both a 32-bit and 64-bit Win7 Enterprise image). *HOWEVER*, I saw 3 of these in dmesg: [ 668.278624] WARNING: at fs/aio.c:348 put_ioctx+0x1c0/0x241() [ 668.278652] Call Trace: [ 668.278660] [<ffffffff8102ed10>] warn_slowpath_common+0x7c/0x96 [ 668.278665] [<ffffffff8102edc9>] warn_slowpath_null+0x15/0x17 [ 668.278669] [<ffffffff8114c562>] put_ioctx+0x1c0/0x241 [ 668.278673] [<ffffffff8114d42a>] sys_io_destroy+0x4c/0x5c [ 668.278679] [<ffffffff8160c112>] system_call_fastpath+0x16/0x1b and the code there says: WARN_ON(atomic_read(&ctx->reqs_available) > ctx->nr); which leaves me wondering exactly how we exited the while loop just above - is the intention that it loop until reqs_available == ctx->nr exactly? Looks like if 'avail' is anything other than exactly 1 in that while loop, we can be at a state where reqs_avail == (ctx->nr -1), get 'avail=2', do the atomic_add, fall out of the loop, and trigger the WARN_ON. Damned if I see how that can happen though.... [-- Attachment #2: Type: application/pgp-signature, Size: 865 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-02-05 15:53 ` Valdis.Kletnieks @ 2013-02-05 17:20 ` Kent Overstreet 2013-02-05 17:48 ` Valdis.Kletnieks 2013-02-06 17:15 ` Valdis.Kletnieks 0 siblings, 2 replies; 26+ messages in thread From: Kent Overstreet @ 2013-02-05 17:20 UTC (permalink / raw) To: Valdis.Kletnieks Cc: Andrew Morton, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio On Tue, Feb 05, 2013 at 10:53:00AM -0500, Valdis.Kletnieks@vt.edu wrote: > On Thu, 31 Jan 2013 16:37:27 -0800, Kent Overstreet said: > > On Thu, Jan 31, 2013 at 01:59:52PM -0800, Andrew Morton wrote: > > > Did this get fixed? > > > With the patches I sent you, yes - not seeing a new linux-next tree yet? > > Well, it's a mixed bag at my end. Finally got a chance to do some more > testing, and: > > 1) next-20130128 didn't show anything in dmesg, but my VirtualBox Windows 7 > images appear to livelock on the way up - the Windows throbber would keep > going, but it never made any actual progress towards booting. (Part of the > delay was fixing a next-20121224 environment, and then discovering it > took Windows *two* reboot cycles to get its act back together after getting > into that hung state). > > 2_ next-20130128 plus the following 3 patches: > > Subject: [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio > Subject: [PATCH 3/3] aio-use-cancellation-list-lazily-fix > Subject: [PATCH 2/3] aio-kill-ki_retry-fix-fix The "smoosh struct kiocb" patch also needs to be dropped. That causes aio_rw_vect_retry() to check ki_nbytes/ki_left after they've been overwritten by aio_complete(), which causes it to return an error when it shouldn't have, which causes aio_run_iocb() to double complete the iocb causing put_reqs_available() to be called twice and the count screwed up. > VirtualBox appears to be functional (I did 2 complete boot/shutdown > sequences of both a 32-bit and 64-bit Win7 Enterprise image). *HOWEVER*, > I saw 3 of these in dmesg: > > [ 668.278624] WARNING: at fs/aio.c:348 put_ioctx+0x1c0/0x241() > > [ 668.278652] Call Trace: > [ 668.278660] [<ffffffff8102ed10>] warn_slowpath_common+0x7c/0x96 > [ 668.278665] [<ffffffff8102edc9>] warn_slowpath_null+0x15/0x17 > [ 668.278669] [<ffffffff8114c562>] put_ioctx+0x1c0/0x241 > [ 668.278673] [<ffffffff8114d42a>] sys_io_destroy+0x4c/0x5c > [ 668.278679] [<ffffffff8160c112>] system_call_fastpath+0x16/0x1b > > and the code there says: > > WARN_ON(atomic_read(&ctx->reqs_available) > ctx->nr); > > which leaves me wondering exactly how we exited the while loop > just above - is the intention that it loop until reqs_available == ctx->nr > exactly? Looks like if 'avail' is anything other than exactly 1 in > that while loop, we can be at a state where reqs_avail == (ctx->nr -1), > get 'avail=2', do the atomic_add, fall out of the loop, and trigger > the WARN_ON. > > Damned if I see how that can happen though.... > > > > > > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-02-05 17:20 ` Kent Overstreet @ 2013-02-05 17:48 ` Valdis.Kletnieks 2013-02-06 17:15 ` Valdis.Kletnieks 1 sibling, 0 replies; 26+ messages in thread From: Valdis.Kletnieks @ 2013-02-05 17:48 UTC (permalink / raw) To: Kent Overstreet Cc: Andrew Morton, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio [-- Attachment #1: Type: text/plain, Size: 2519 bytes --] On Tue, 05 Feb 2013 09:20:15 -0800, Kent Overstreet said: > On Tue, Feb 05, 2013 at 10:53:00AM -0500, Valdis.Kletnieks@vt.edu wrote: > > On Thu, 31 Jan 2013 16:37:27 -0800, Kent Overstreet said: > > > On Thu, Jan 31, 2013 at 01:59:52PM -0800, Andrew Morton wrote: > > > > Did this get fixed? > > > > > With the patches I sent you, yes - not seeing a new linux-next tree yet? > > > > Well, it's a mixed bag at my end. Finally got a chance to do some more > > testing, and: > > > > 1) next-20130128 didn't show anything in dmesg, but my VirtualBox Windows 7 > > images appear to livelock on the way up - the Windows throbber would keep > > going, but it never made any actual progress towards booting. (Part of the > > delay was fixing a next-20121224 environment, and then discovering it > > took Windows *two* reboot cycles to get its act back together after getting > > into that hung state). > > > > 2_ next-20130128 plus the following 3 patches: > > > > Subject: [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio > > Subject: [PATCH 3/3] aio-use-cancellation-list-lazily-fix > > Subject: [PATCH 2/3] aio-kill-ki_retry-fix-fix > > The "smoosh struct kiocb" patch also needs to be dropped. That causes > aio_rw_vect_retry() to check ki_nbytes/ki_left after they've been > overwritten by aio_complete(), which causes it to return an error when > it shouldn't have, which causes aio_run_iocb() to double complete the > iocb causing put_reqs_available() to be called twice and the count > screwed up. Unfortunately, that's not a clean slam-dunk revert: [/usr/src/linux-next] patch -p1 -R --dry-run < ~/Downloads/32-32-aio-Smoosh-struct-kiocb.patch checking file fs/aio.c Hunk #1 FAILED at 570. Hunk #2 FAILED at 634. Hunk #3 FAILED at 1246. 3 out of 3 hunks FAILED checking file include/linux/aio.h Hunk #1 succeeded at 31 (offset 11 lines). Looks like the above 3 patches introduce conflicts. Not sure what the proper resolution is for some of it. For the first hunk, the smoosh patch has near line 590: - atomic_set(&req->ki_users, 1); + memset(req, 0, offsetof(struct kiocb, ki_ctx)); req->ki_ctx = ctx; + atomic_set(&req->ki_users, 1); return req; but after the 3 patches, I have: memset(req, 0, offsetof(struct kiocb, ki_ctx)); req->ki_ctx = ctx; atomic_set(&req->ki_users, 2); return req; Easy to fix, except that '2' is too magical for me to understand, so I'm not sure I want to smash it to a 1 to make the revert easier. :) [-- Attachment #2: Type: application/pgp-signature, Size: 865 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: next-20130117 - kernel BUG with aio 2013-02-05 17:20 ` Kent Overstreet 2013-02-05 17:48 ` Valdis.Kletnieks @ 2013-02-06 17:15 ` Valdis.Kletnieks 1 sibling, 0 replies; 26+ messages in thread From: Valdis.Kletnieks @ 2013-02-06 17:15 UTC (permalink / raw) To: Kent Overstreet Cc: Andrew Morton, Hillf Danton, Benjamin LaHaise, linux-kernel, linux-aio [-- Attachment #1: Type: text/plain, Size: 732 bytes --] On Tue, 05 Feb 2013 09:20:15 -0800, Kent Overstreet said: > The "smoosh struct kiocb" patch also needs to be dropped. That causes > aio_rw_vect_retry() to check ki_nbytes/ki_left after they've been > overwritten by aio_complete(), which causes it to return an error when > it shouldn't have, which causes aio_run_iocb() to double complete the > iocb causing put_reqs_available() to be called twice and the count > screwed up. Hooray! linux-next-20130206 builds, boots, and VirtualBox 4.2.6 runs without producing any kernel messages. Feel free to add the following as appropriate to this version of the patch series: Reported-By: Valdis Kletnieks <valdis.kletnieks@vt.edu> Tested-By: Valdis Kletnieks <valdis.kletnieks@vt.edu> [-- Attachment #2: Type: application/pgp-signature, Size: 865 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2013-02-06 17:17 UTC | newest] Thread overview: 26+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-01-21 13:24 next-20130117 - kernel BUG with aio Valdis Kletnieks 2013-01-22 13:43 ` Hillf Danton 2013-01-22 21:28 ` Valdis.Kletnieks 2013-01-23 12:10 ` Hillf Danton 2013-01-24 17:22 ` Valdis.Kletnieks 2013-01-24 21:18 ` Kent Overstreet 2013-01-24 21:27 ` Andrew Morton 2013-01-24 21:39 ` Kent Overstreet 2013-01-24 22:25 ` Zach Brown 2013-01-24 22:47 ` Jeff Moyer 2013-01-24 23:03 ` Kent Overstreet 2013-01-24 22:13 ` Kent Overstreet 2013-01-29 13:41 ` Jan Kara 2013-01-24 21:43 ` [PATCH 1/3] aio: Fix a null pointer deref in batch_complete_aio Kent Overstreet 2013-01-25 13:15 ` Hillf Danton 2013-01-24 21:43 ` [PATCH 2/3] aio-kill-ki_retry-fix-fix Kent Overstreet 2013-01-24 21:43 ` [PATCH 3/3] aio-use-cancellation-list-lazily-fix Kent Overstreet 2013-01-25 13:30 ` Hillf Danton 2013-01-25 23:12 ` Andrew Morton 2013-01-28 17:37 ` Kent Overstreet 2013-01-31 21:59 ` next-20130117 - kernel BUG with aio Andrew Morton 2013-02-01 0:37 ` Kent Overstreet 2013-02-05 15:53 ` Valdis.Kletnieks 2013-02-05 17:20 ` Kent Overstreet 2013-02-05 17:48 ` Valdis.Kletnieks 2013-02-06 17:15 ` Valdis.Kletnieks
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox