From: Josef Bacik <josef@toxicpanda.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>, Qu Wenruo <wqu@suse.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v3] btrfs: Don't submit any btree write bio after transaction is aborted
Date: Thu, 6 Feb 2020 19:37:12 -0500
Message-ID: <547fd1f1-ded8-2953-d958-5741a631aaef@toxicpanda.com>
In-Reply-To: <443a3b4a-c751-741c-1f27-f39f16ad1ded@gmx.com>
On 2/6/20 7:24 PM, Qu Wenruo wrote:
>
>
> On 2020/2/7 12:00 AM, Josef Bacik wrote:
>> On 2/5/20 2:10 AM, Qu Wenruo wrote:
>>> [BUG]
>>> There is a fuzzed image which could cause KASAN report at unmount time.
>>>
>>> ==================================================================
>>> BUG: KASAN: use-after-free in btrfs_queue_work+0x2c1/0x390
>>> Read of size 8 at addr ffff888067cf6848 by task umount/1922
>>>
>>> CPU: 0 PID: 1922 Comm: umount Tainted: G W 5.0.21 #1
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>> 1.10.2-1ubuntu1 04/01/2014
>>> Call Trace:
>>> dump_stack+0x5b/0x8b
>>> print_address_description+0x70/0x280
>>> kasan_report+0x13a/0x19b
>>> btrfs_queue_work+0x2c1/0x390
>>> btrfs_wq_submit_bio+0x1cd/0x240
>>> btree_submit_bio_hook+0x18c/0x2a0
>>> submit_one_bio+0x1be/0x320
>>> flush_write_bio.isra.41+0x2c/0x70
>>> btree_write_cache_pages+0x3bb/0x7f0
>>> do_writepages+0x5c/0x130
>>> __writeback_single_inode+0xa3/0x9a0
>>> writeback_single_inode+0x23d/0x390
>>> write_inode_now+0x1b5/0x280
>>> iput+0x2ef/0x600
>>> close_ctree+0x341/0x750
>>> generic_shutdown_super+0x126/0x370
>>> kill_anon_super+0x31/0x50
>>> btrfs_kill_super+0x36/0x2b0
>>> deactivate_locked_super+0x80/0xc0
>>> deactivate_super+0x13c/0x150
>>> cleanup_mnt+0x9a/0x130
>>> task_work_run+0x11a/0x1b0
>>> exit_to_usermode_loop+0x107/0x130
>>> do_syscall_64+0x1e5/0x280
>>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>
>>> [CAUSE]
>>> The fuzzed image has a completely screwed up extent tree:
>>>   leaf 29421568 gen 8 total ptrs 6 free space 3587 owner EXTENT_TREE
>>>   refs 2 lock (w:0 r:0 bw:0 br:0 sw:0 sr:0) lock_owner 0 current 5938
>>>     item 0 key (12587008 168 4096) itemoff 3942 itemsize 53
>>>       extent refs 1 gen 9 flags 1
>>>       ref#0: extent data backref root 5 objectid 259 offset 0 count 1
>>>     item 1 key (12591104 168 8192) itemoff 3889 itemsize 53
>>>       extent refs 1 gen 9 flags 1
>>>       ref#0: extent data backref root 5 objectid 271 offset 0 count 1
>>>     item 2 key (12599296 168 4096) itemoff 3836 itemsize 53
>>>       extent refs 1 gen 9 flags 1
>>>       ref#0: extent data backref root 5 objectid 259 offset 4096 count 1
>>>     item 3 key (29360128 169 0) itemoff 3803 itemsize 33
>>>       extent refs 1 gen 9 flags 2
>>>       ref#0: tree block backref root 5
>>>     item 4 key (29368320 169 1) itemoff 3770 itemsize 33
>>>       extent refs 1 gen 9 flags 2
>>>       ref#0: tree block backref root 5
>>>     item 5 key (29372416 169 0) itemoff 3737 itemsize 33
>>>       extent refs 1 gen 9 flags 2
>>>       ref#0: tree block backref root 5
>>>
>>> Note that leaf 29421568 doesn't have its backref in the extent tree.
>>> Thus the extent allocator can re-allocate leaf 29421568 for other trees.
>>>
>>> Short version for the corruption:
>>> - Extent tree corruption
>>>   Existing tree block X can be allocated as a new tree block.
>>>
>>> - Tree block X allocated to log tree
>>>   The generation of tree block X gets bumped, and the block is now
>>>   tracked by log_root->dirty_log_pages.
>>>
>>> - Log tree writes tree blocks
>>>   log_root->dirty_log_pages is cleaned.
>>>
>>> - The original owner of tree block X wants to modify its content
>>>   Instead of COWing tree block X to a new eb, due to the bumped
>>>   generation, tree block X is reused as is.
>>>
>>>   Btrfs believes tree block X is already dirtied due to its transid,
>>>   but it is not tracked by transaction->dirty_pages.
>>>
>>
>> But at the write part we should have gotten BTRFS_HEADER_FLAG_WRITTEN,
>> so we should have cow'ed this block. So this isn't what's happening,
>> right?
>
> From my debugging, that's not the case. Somehow, after the log tree is
> written back, the tree block just got reused.
>
>> Or is something else clearing the BTRFS_HEADER_FLAG_WRITTEN in
>> between the writeout and this part? Thanks,
>
> It didn't occur to me at the time of writing: is it possible that the log
> tree gets freed, so that tree block X is considered free and gets
> re-allocated to the extent tree again?
>
Yeah, but then they'd go onto the dirty pages radix tree properly, because it
would be the correct root, and we wouldn't have this problem.

> This problem is really killing me to dig into.
> Can't we use this last-resort method anyway? The corrupted extent tree
> is really causing all kinds of issues we didn't expect...

Which is why I want the real root cause and a real fix, not something that's
papering over the problem. Thanks,
Josef
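
The disagreement in this thread turns on when a tree block is reused in
place instead of being COWed. Below is a minimal userspace sketch of that
decision as it is described above, not the kernel implementation. The names
sketch_block, sketch_needs_cow and SKETCH_FLAG_WRITTEN are hypothetical
stand-ins (the last one for BTRFS_HEADER_FLAG_WRITTEN), and the model
assumes that only the generation and the written flag matter; any other
condition the kernel may check is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for BTRFS_HEADER_FLAG_WRITTEN. */
#define SKETCH_FLAG_WRITTEN (1ULL << 0)

/* Hypothetical, heavily simplified model of a tree block header. */
struct sketch_block {
	uint64_t generation;	/* transid that last allocated/dirtied it */
	uint64_t flags;		/* header flags, only WRITTEN is modelled */
};

/*
 * Return true if the block must be COWed before it is modified in the
 * transaction identified by transid.
 *
 * A block may be reused in place only if it was dirtied in the current
 * transaction (generation == transid) and has not been written out yet
 * (WRITTEN flag clear). Everything else has to be COWed.
 */
bool sketch_needs_cow(const struct sketch_block *b, uint64_t transid)
{
	if (b->generation == transid && !(b->flags & SKETCH_FLAG_WRITTEN))
		return false;
	return true;
}

/* Model the writeout step: the block is marked as written. */
void sketch_mark_written(struct sketch_block *b)
{
	b->flags |= SKETCH_FLAG_WRITTEN;
}
```

Under this model, a block whose generation was bumped to the current
transid by the log tree and then written out has the written flag set, so
sketch_needs_cow() returns true for it. The in-place reuse described in the
[CAUSE] section would only happen if the flag were clear at that point,
which is the open question in this thread.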