* xfs_vm_releasepage() causing BUG at free_buffer_head()
@ 2016-07-18 18:00 Alex Lyakas
  2016-07-18 20:18 ` Holger Hoffstätte
  2016-07-19 23:11 ` Dave Chinner
  0 siblings, 2 replies; 7+ messages in thread

From: Alex Lyakas @ 2016-07-18 18:00 UTC (permalink / raw)
To: xfs

Greetings XFS community,

We have hit the following BUG [1].

This is in free_buffer_head():
	BUG_ON(!list_empty(&bh->b_assoc_buffers));

This is happening in a long-term mainline kernel 3.18.19.

Some googling revealed a possibly-related discussion at:
http://comments.gmane.org/gmane.linux.file-systems/105093
https://lkml.org/lkml/2016/5/30/1007
except that in our case I don't see the "WARN_ON_ONCE(delalloc)" triggered.

I have no idea what to do with this, so I am reporting it here.

Thanks,
Alex.

[1]
[2540217.134291] ------------[ cut here ]------------
[2540217.135008] kernel BUG at fs/buffer.c:3339!
[2540217.135008] invalid opcode: 0000 [#1] PREEMPT SMP
[2540217.135008] CPU: 0 PID: 38 Comm: kswapd0 Tainted: G WC OE 3.18.19-zadara05 #1
[2540217.135008] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[2540217.135008] task: ffff8800db499440 ti: ffff880118934000 task.ti: ffff880118934000
[2540217.135008] RIP: 0010:[<ffffffff8121b117>]  [<ffffffff8121b117>] free_buffer_head+0x67/0x70
[2540217.135008] RSP: 0000:ffff880118937980  EFLAGS: 00010293
[2540217.135008] RAX: ffff8800a6b4e2b8 RBX: ffff8800a6b4e270 RCX: 0000000000000000
[2540217.135008] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff8800a6b4e270
[2540217.135008] RBP: ffff8801189379b8 R08: 0000000000000018 R09: ffff88001d9d32f8
[2540217.135008] R10: ffff880118937990 R11: ffffea00029ad380 R12: 0000000000000001
[2540217.135008] R13: ffff88001d9d3388 R14: ffffea000166c920 R15: ffff880118937ab0
[2540217.135008] FS:  0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[2540217.135008] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2540217.135008] CR2: 00007ff5ce91d77c CR3: 0000000115897000 CR4: 00000000001406f0
[2540217.135008] Stack:
[2540217.135008]  ffffffff8121b25c ffff88001f035240 ffff8800a6b4e270 0000000000000000
[2540217.135008]  ffff880118937e50 ffffea000166c900 ffff88001d9d31a8 ffff8801189379f8
[2540217.135008]  ffffffffc0a8933b 0000000000000000 0000000000000000 ffffffff811abc60
[2540217.135008] Call Trace:
[2540217.193019]  [<ffffffff8121b25c>] ? try_to_free_buffers+0x7c/0xc0
[2540217.193019]  [<ffffffffc0a8933b>] xfs_vm_releasepage+0x4b/0x120 [xfs]
[2540217.193019]  [<ffffffff811abc60>] ? page_get_anon_vma+0xb0/0xb0
[2540217.193019]  [<ffffffff811722f2>] try_to_release_page+0x32/0x50
[2540217.193019]  [<ffffffff8118596d>] shrink_page_list+0x8fd/0xad0
[2540217.193019]  [<ffffffff817173e9>] ? _raw_spin_unlock_irq+0x19/0x50
[2540217.193019]  [<ffffffff81186116>] shrink_inactive_list+0x1a6/0x550
[2540217.193019]  [<ffffffff81399119>] ? radix_tree_gang_lookup_tag+0x89/0xd0
[2540217.193019]  [<ffffffff81186e0d>] shrink_lruvec+0x58d/0x750
[2540217.193019]  [<ffffffff81187053>] shrink_zone+0x83/0x1d0
[2540217.193019]  [<ffffffff8118727b>] kswapd_shrink_zone+0xdb/0x1b0
[2540217.193019]  [<ffffffff811884fd>] kswapd+0x4ed/0x8f0
[2540217.193019]  [<ffffffff81188010>] ? mem_cgroup_shrink_node_zone+0x190/0x190
[2540217.193019]  [<ffffffff810911b9>] kthread+0xc9/0xe0
[2540217.193019]  [<ffffffff810910f0>] ? kthread_create_on_node+0x180/0x180
[2540217.193019]  [<ffffffff81717918>] ret_from_fork+0x58/0x90
[2540217.193019]  [<ffffffff810910f0>] ? kthread_create_on_node+0x180/0x180
[2540217.193019] Code: 04 fb 00 00 3d ff 0f 00 00 7f 19 65 ff 0c 25 20 b8 00 00 74 07 5d c3 0f 1f 44 00 00 e8 34 6a 18 00 5d c3 90 e8 8b fa ff ff eb e0 <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 45
[2540217.193019] RIP  [<ffffffff8121b117>] free_buffer_head+0x67/0x70
[2540217.193019]  RSP <ffff880118937980>
[2540217.218819] ---[ end trace ffb67f26b48f16a2 ]---

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-18 18:00 xfs_vm_releasepage() causing BUG at free_buffer_head() Alex Lyakas
@ 2016-07-18 20:18 ` Holger Hoffstätte
  2016-07-19  8:43   ` Alex Lyakas
  2016-07-19 23:11 ` Dave Chinner
  1 sibling, 1 reply; 7+ messages in thread

From: Holger Hoffstätte @ 2016-07-18 20:18 UTC (permalink / raw)
To: Alex Lyakas, xfs

On 07/18/16 20:00, Alex Lyakas wrote:
> Greetings XFS community,
>
> We have hit the following BUG [1].
>
> This is in free_buffer_head():
> BUG_ON(!list_empty(&bh->b_assoc_buffers));
>
> This is happening in a long-term mainline kernel 3.18.19.
>
> Some googling revealed a possibly-related discussion at:
> http://comments.gmane.org/gmane.linux.file-systems/105093
> https://lkml.org/lkml/2016/5/30/1007
> except that in our case I don't see the "WARN_ON_ONCE(delalloc)" triggered.

Since you make it past the WARN_ONs, this looks like this very recent
report from Friday:

http://oss.sgi.com/pipermail/xfs/2016-July/050199.html

Dave posted a patch in that thread which seems to work fine and so far
hasn't set anything on fire, at least for me on 4.4.x.

cheers,
Holger
* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-18 20:18 ` Holger Hoffstätte
@ 2016-07-19  8:43   ` Alex Lyakas
  2016-07-19 11:24     ` Holger Hoffstätte
  2016-07-19 23:05     ` Dave Chinner
  0 siblings, 2 replies; 7+ messages in thread

From: Alex Lyakas @ 2016-07-19  8:43 UTC (permalink / raw)
To: xfs, Holger Hoffstätte

Hello Holger,

Thank you for your response. I see that xfs_finish_page_writeback() has
been added very recently and is called from xfs_destroy_ioend(). In my
kernel (3.18.19), xfs_destroy_ioend() is [1]. I think it doesn't suffer
from the problem found in xfs_finish_page_writeback(). Looking at the
other usages of "b_this_page" in my kernel, they all seem valid, and
similar to what Linus's tree has.

Looking at the b_private usage to link buffer heads, the only suspicious
code is in xfs_submit_ioend():

	for (bh = ioend->io_buffer_head; bh; bh = bh->b_private) {

		if (!bio) {
 retry:
			bio = xfs_alloc_ioend_bio(bh);
		} else if (bh->b_blocknr != lastblock + 1) {
			xfs_submit_ioend_bio(wbc, ioend, bio);
			goto retry;
		}

		if (xfs_bio_add_buffer(bio, bh) != bh->b_size) {
			xfs_submit_ioend_bio(wbc, ioend, bio);
			goto retry;
		}

		lastblock = bh->b_blocknr;
	}

Can it happen that when the for loop does "bh = bh->b_private", the bh
has already been completed and freed? With this in mind, the "goto
retry" also seems suspicious for the same reason.

What do you think?

Thanks,
Alex.

[1]
STATIC void
xfs_destroy_ioend(
	xfs_ioend_t		*ioend)
{
	struct buffer_head	*bh, *next;

	for (bh = ioend->io_buffer_head; bh; bh = next) {
		next = bh->b_private;
		bh->b_end_io(bh, !ioend->io_error);
	}

	mempool_free(ioend, xfs_ioend_pool);
}

-----Original Message-----
From: Holger Hoffstätte
Sent: Monday, July 18, 2016 11:18 PM
To: Alex Lyakas; xfs@oss.sgi.com
Subject: Re: xfs_vm_releasepage() causing BUG at free_buffer_head()

On 07/18/16 20:00, Alex Lyakas wrote:
> Greetings XFS community,
>
> We have hit the following BUG [1].
>
> This is in free_buffer_head():
> BUG_ON(!list_empty(&bh->b_assoc_buffers));
>
> This is happening in a long-term mainline kernel 3.18.19.
>
> Some googling revealed a possibly-related discussion at:
> http://comments.gmane.org/gmane.linux.file-systems/105093
> https://lkml.org/lkml/2016/5/30/1007
> except that in our case I don't see the "WARN_ON_ONCE(delalloc)"
> triggered.

Since you make it past the WARN_ONs, this looks like this very recent
report from Friday:

http://oss.sgi.com/pipermail/xfs/2016-July/050199.html

Dave posted a patch in that thread which seems to work fine and so far
hasn't set anything on fire, at least for me on 4.4.x.

cheers,
Holger
* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-19  8:43 ` Alex Lyakas
@ 2016-07-19 11:24   ` Holger Hoffstätte
  2016-07-19 23:05 ` Dave Chinner
  1 sibling, 0 replies; 7+ messages in thread

From: Holger Hoffstätte @ 2016-07-19 11:24 UTC (permalink / raw)
To: Alex Lyakas, xfs

Hi,

first off, I didn't mean to imply that this is exactly the same problem,
merely a related symptom due to buffer shrinking crashing your party.

On 07/19/16 10:43, Alex Lyakas wrote:
> Thank you for your response. I see that xfs_finish_page_writeback()
> has been added very recently and is called from xfs_destroy_ioend().
> In my kernel (3.18.19), the xfs_destroy_ioend() is [1]. I think it
> doesn't suffer from the problem of xfs_finish_page_writeback().
> Looking at other usage of "b_this_page" in my kernel, they all seem
> valid, and similar to what Linus's tree has.

Unwinding this a bit, all I superficially understand is that e10de3723c
"don't chain ioends during writepage submission" made the window for bh
corruption smaller, and then both bb18782aa4 "build bios directly in
xfs_add_to_ioend" and 37992c18bb "don't release bios on completion
immediately" changed that to track page state instead, presumably
because the bh traversal was indeed racy. That was still incomplete, as
Calvin found.

So I don't see why your current version of xfs_submit_ioend() wouldn't
suffer from the same problem(s). You just walked into the bh BUG later,
instead of a use-after-free as it can happen now.

> Looking at the b_private usage to link buffer heads, the only
> suspicious code is in xfs_submit_ioend():
>
> 	for (bh = ioend->io_buffer_head; bh; bh = bh->b_private) {
>
> 		if (!bio) {
>  retry:
> 			bio = xfs_alloc_ioend_bio(bh);
> 		} else if (bh->b_blocknr != lastblock + 1) {
> 			xfs_submit_ioend_bio(wbc, ioend, bio);
> 			goto retry;
> 		}
>
> 		if (xfs_bio_add_buffer(bio, bh) != bh->b_size) {
> 			xfs_submit_ioend_bio(wbc, ioend, bio);
> 			goto retry;
> 		}
>
> 		lastblock = bh->b_blocknr;
> 	}
>
> Can it happen that when the for loop does "bh = bh->b_private", the
> bh has already been completed and freed? With this in mind, the "goto
> retry" also seems suspicious for the same reason.
>
> What do you think?

I think all this is dark and full of terrors. As for what you could do
- other than backport half of mainline XFS - I guess only Dave can make
a realistic suggestion.

-h
* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-19  8:43 ` Alex Lyakas
  2016-07-19 11:24 ` Holger Hoffstätte
@ 2016-07-19 23:05 ` Dave Chinner
  1 sibling, 0 replies; 7+ messages in thread

From: Dave Chinner @ 2016-07-19 23:05 UTC (permalink / raw)
To: Alex Lyakas; +Cc: Holger Hoffstätte, xfs

On Tue, Jul 19, 2016 at 11:43:52AM +0300, Alex Lyakas wrote:
> Hello Holger,
>
> Thank you for your response. I see that xfs_finish_page_writeback()
> has been added very recently and is called from xfs_destroy_ioend().
> In my kernel (3.18.19), the xfs_destroy_ioend() is [1]. I think it
> doesn't suffer from the problem of xfs_finish_page_writeback().
> Looking at other usage of "b_this_page" in my kernel, they all seem
> valid, and similar to what Linus's tree has.
>
> Looking at b_private usage to link buffer heads, the only suspicious
> code is in xfs_submit_ioend():
>
> 	for (bh = ioend->io_buffer_head; bh; bh = bh->b_private) {
>
> 		if (!bio) {
>  retry:
> 			bio = xfs_alloc_ioend_bio(bh);
> 		} else if (bh->b_blocknr != lastblock + 1) {
> 			xfs_submit_ioend_bio(wbc, ioend, bio);
> 			goto retry;
> 		}
>
> 		if (xfs_bio_add_buffer(bio, bh) != bh->b_size) {
> 			xfs_submit_ioend_bio(wbc, ioend, bio);
> 			goto retry;
> 		}
>
> 		lastblock = bh->b_blocknr;
> 	}
>
> Can it happen that when the for loop does "bh = bh->b_private", the
> bh has already been completed and freed?
> With this in mind, the "goto retry" also seem suspicious for the
> same reason.
>
> What do you think?

No, because the bh cannot run completion callbacks (via
xfs_destroy_ioend) while there is an active reference on the ioend.
The reference protecting submission is not dropped until after the
entire loop above is finished and xfs_finish_ioend() is called.

Cheers,
Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-18 18:00 xfs_vm_releasepage() causing BUG at free_buffer_head() Alex Lyakas
  2016-07-18 20:18 ` Holger Hoffstätte
@ 2016-07-19 23:11 ` Dave Chinner
  2016-07-20  9:42   ` Alex Lyakas
  1 sibling, 1 reply; 7+ messages in thread

From: Dave Chinner @ 2016-07-19 23:11 UTC (permalink / raw)
To: Alex Lyakas; +Cc: xfs

On Mon, Jul 18, 2016 at 09:00:41PM +0300, Alex Lyakas wrote:
> Greetings XFS community,
>
> We have hit the following BUG [1].
>
> This is in free_buffer_head():
> BUG_ON(!list_empty(&bh->b_assoc_buffers));

XFS doesn't use the bh->b_assoc_buffers field at all, so nothing in
XFS should ever corrupt it. Do you have any extN filesystems active,
or any other filesystems/block devices that use bufferheads that
might have a use-after-free bug?

e.g. a long time ago (circa ~2.6.16, IIRC) we had a bufferhead
corruption problem detected in XFS that was actually caused by a
reiserfs use after free.

Cheers,
Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-19 23:11 ` Dave Chinner
@ 2016-07-20  9:42 ` Alex Lyakas
  0 siblings, 0 replies; 7+ messages in thread

From: Alex Lyakas @ 2016-07-20  9:42 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs

Hello Dave,

Grepping through my kernel source code, I see the following:

- The direct users of b_assoc_buffers are nilfs2, reiserfs and jbd2. In
  my case, jbd2 is used by ext4. Looking at the jbd2 usage, however, it
  looks like it handles this list correctly.
- The only other place where somebody can use the "b_assoc_buffers"
  link is by calling mark_buffer_dirty_inode(), which puts the
  bufferhead on "mapping->private_list" using the "b_assoc_buffers"
  link. There are several users of this API, but in my case the only
  relevant one is again jbd2.

Therefore, I will ask the ext4 community.

Thanks,
Alex.

-----Original Message-----
From: Dave Chinner
Sent: Wednesday, July 20, 2016 2:11 AM
To: Alex Lyakas
Cc: xfs@oss.sgi.com
Subject: Re: xfs_vm_releasepage() causing BUG at free_buffer_head()

On Mon, Jul 18, 2016 at 09:00:41PM +0300, Alex Lyakas wrote:
> Greetings XFS community,
>
> We have hit the following BUG [1].
>
> This is in free_buffer_head():
> BUG_ON(!list_empty(&bh->b_assoc_buffers));

XFS doesn't use the bh->b_assoc_buffers field at all, so nothing in
XFS should ever corrupt it. Do you have any extN filesystems active,
or any other filesystems/block devices that use bufferheads that
might have a use-after-free bug?

e.g. a long time ago (circa ~2.6.16, IIRC) we had a bufferhead
corruption problem detected in XFS that was actually caused by a
reiserfs use after free.

Cheers,
Dave.
-- 
Dave Chinner
david@fromorbit.com
end of thread, other threads: [~2016-07-20  9:43 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-18 18:00 xfs_vm_releasepage() causing BUG at free_buffer_head() Alex Lyakas
2016-07-18 20:18 ` Holger Hoffstätte
2016-07-19  8:43   ` Alex Lyakas
2016-07-19 11:24     ` Holger Hoffstätte
2016-07-19 23:05     ` Dave Chinner
2016-07-19 23:11 ` Dave Chinner
2016-07-20  9:42   ` Alex Lyakas