public inbox for linux-btrfs@vger.kernel.org
From: Josef Bacik <josef@toxicpanda.com>
To: Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v5] btrfs: Don't submit any btree write bio if the fs has error
Date: Wed, 12 Feb 2020 07:50:38 -0500
Message-ID: <f04b12ee-4f72-6c64-db01-4f353900e2f8@toxicpanda.com>
In-Reply-To: <20200212061244.26851-1-wqu@suse.com>

On 2/12/20 1:12 AM, Qu Wenruo wrote:
> [BUG]
> There is a fuzzed image which can trigger a KASAN report at unmount time.
> 
>    ==================================================================
>    BUG: KASAN: use-after-free in btrfs_queue_work+0x2c1/0x390
>    Read of size 8 at addr ffff888067cf6848 by task umount/1922
> 
>    CPU: 0 PID: 1922 Comm: umount Tainted: G        W         5.0.21 #1
>    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
>    Call Trace:
>     dump_stack+0x5b/0x8b
>     print_address_description+0x70/0x280
>     kasan_report+0x13a/0x19b
>     btrfs_queue_work+0x2c1/0x390
>     btrfs_wq_submit_bio+0x1cd/0x240
>     btree_submit_bio_hook+0x18c/0x2a0
>     submit_one_bio+0x1be/0x320
>     flush_write_bio.isra.41+0x2c/0x70
>     btree_write_cache_pages+0x3bb/0x7f0
>     do_writepages+0x5c/0x130
>     __writeback_single_inode+0xa3/0x9a0
>     writeback_single_inode+0x23d/0x390
>     write_inode_now+0x1b5/0x280
>     iput+0x2ef/0x600
>     close_ctree+0x341/0x750
>     generic_shutdown_super+0x126/0x370
>     kill_anon_super+0x31/0x50
>     btrfs_kill_super+0x36/0x2b0
>     deactivate_locked_super+0x80/0xc0
>     deactivate_super+0x13c/0x150
>     cleanup_mnt+0x9a/0x130
>     task_work_run+0x11a/0x1b0
>     exit_to_usermode_loop+0x107/0x130
>     do_syscall_64+0x1e5/0x280
>     entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> [CAUSE]
> The fuzzed image has a completely screwed up extent tree:
>    leaf 29421568 gen 8 total ptrs 6 free space 3587 owner EXTENT_TREE
>    refs 2 lock (w:0 r:0 bw:0 br:0 sw:0 sr:0) lock_owner 0 current 5938
>            item 0 key (12587008 168 4096) itemoff 3942 itemsize 53
>                    extent refs 1 gen 9 flags 1
>                    ref#0: extent data backref root 5 objectid 259 offset 0 count 1
>            item 1 key (12591104 168 8192) itemoff 3889 itemsize 53
>                    extent refs 1 gen 9 flags 1
>                    ref#0: extent data backref root 5 objectid 271 offset 0 count 1
>            item 2 key (12599296 168 4096) itemoff 3836 itemsize 53
>                    extent refs 1 gen 9 flags 1
>                    ref#0: extent data backref root 5 objectid 259 offset 4096 count 1
>            item 3 key (29360128 169 0) itemoff 3803 itemsize 33
>                    extent refs 1 gen 9 flags 2
>                    ref#0: tree block backref root 5
>            item 4 key (29368320 169 1) itemoff 3770 itemsize 33
>                    extent refs 1 gen 9 flags 2
>                    ref#0: tree block backref root 5
>            item 5 key (29372416 169 0) itemoff 3737 itemsize 33
>                    extent refs 1 gen 9 flags 2
>                    ref#0: tree block backref root 5
> 
> Note that leaf 29421568 doesn't have its backref in the extent tree.
> Thus the extent allocator can re-allocate leaf 29421568 for other trees.
> 
> In short, the bug is caused by:
> - An existing tree block gets allocated to the log tree
>    This bumps its generation.
> 
> - Log tree balance cleans the dirty bit of the offending tree block
>    It will not be written back to disk, thus no WRITTEN flag.
> 
> - The original owner of the tree block gets COWed
>    Since the tree block has a higher transid and no WRITTEN flag, it's
>    reused, and not traced by transaction::dirty_pages.
> 
> - Transaction aborted
>    Tree blocks get cleaned according to transaction::dirty_pages. But the
>    offending tree block is not recorded at all.
> 
> - Fs unmount
>    Btrfs believes all pages are cleaned, destroys all workqueues, then
>    calls iput(btree_inode).
>    But the offending tree block is still dirty, which triggers writeback
>    and causes a use-after-free bug.
> 
> The detailed sequence looks like this:
> - Initial status
>    eb: 29421568, header=WRITTEN bflags_dirty=0, page_dirty=0, gen=8,
>        not traced by any dirty extent_io_tree.
> 
> - New tree block is allocated
>    Since there is no backref for 29421568, it's re-allocated as a new tree
>    block.
>    Keep in mind that tree block 29421568 is still referenced by the extent
>    tree.
> 
> - Tree block 29421568 is filled for log tree
>    eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9 << (gen bumped)
>        traced by btrfs_root::dirty_log_pages
> 
> - Some log tree operations
>    Since the fs is using node size 4096, the log tree can easily go a
>    level higher.
> 
> - Log tree needs balance
>    Tree block 29421568 gets all its content pushed to the right, thus now
>    it is empty, and btrfs doesn't need it.
>    btrfs_clean_tree_block() gets called from __push_leaf_right().
> 
>    eb: 29421568, header=0 bflags_dirty=0, page_dirty=0, gen=9
>        traced by btrfs_root::dirty_log_pages
> 
> - Log tree write back
>    btree_write_cache_pages() goes through the dirty page ranges, but since
>    the page of tree block 29421568 has already been cleaned, it's not
>    written back to disk. Thus it doesn't have the WRITTEN bit set.
>    But the ranges in dirty_log_pages are cleared.
> 
>    eb: 29421568, header=0 bflags_dirty=0, page_dirty=0, gen=9
>        not traced by any dirty extent_io_tree.
> 
> - Extent tree update when committing transaction
>    Since tree block 29421568 has a transid equal to the running
>    transaction and has no WRITTEN bit, should_cow_block() will use it
>    directly without adding it to btrfs_transaction::dirty_pages.
> 
>    eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9
>        not traced by any dirty extent_io_tree.
> 
>    At this stage, we're doomed. We have a dirty eb not traced by any
>    extent io tree.
> 
> - Transaction gets aborted due to corrupted extent tree
>    Btrfs cleans up dirty pages according to transaction::dirty_pages and
>    btrfs_root::dirty_log_pages.
>    But since tree block 29421568 is not traced by either of them, it's
>    still dirty.
> 
>    eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9
>        not traced by any dirty extent_io_tree.
> 
> - Fs unmount
>    Since btrfs believes all its cleanup is done, it destroys all its
>    workqueues. Then it calls iput(btree_inode), expecting no dirty pages.
>    But tree block 29421568 is still dirty, thus triggering writeback.
>    Since all workqueues are already freed, we cause a use-after-free.
> 
> This shows us that log tree blocks plus a bad extent tree can leave wild
> dirty pages behind.
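
For anyone who wants to replay the sequence above outside the kernel, here
is a rough toy model in Python. It is only a sketch: ExtentBuffer,
should_cow() and run_sequence() are hypothetical names, the two sets merely
stand in for btrfs_root::dirty_log_pages and btrfs_transaction::dirty_pages,
and the single dirty bit collapses bflags_dirty and page_dirty. The real
should_cow_block() checks more conditions than the one modelled here.

```python
from dataclasses import dataclass

TRANSID = 9  # generation of the running transaction in the example


@dataclass
class ExtentBuffer:
    start: int
    gen: int
    written: bool        # header WRITTEN flag
    dirty: bool = False  # collapses bflags_dirty and page_dirty into one bit


def should_cow(eb, transid):
    # Simplified assumption: a block that belongs to the running
    # transaction and was never written back is modified in place.
    return not (eb.gen == transid and not eb.written)


def run_sequence():
    dirty_log_pages = set()    # stand-in for btrfs_root::dirty_log_pages
    trans_dirty_pages = set()  # stand-in for btrfs_transaction::dirty_pages
    eb = ExtentBuffer(start=29421568, gen=8, written=True)
    blocks = {eb.start: eb}

    # The block is re-allocated and filled for the log tree.
    eb.gen, eb.written, eb.dirty = TRANSID, False, True
    dirty_log_pages.add(eb.start)

    # Log tree balance empties the block and cleans it.
    eb.dirty = False

    # Log tree writeback: clean blocks are skipped, ranges are cleared anyway.
    for start in dirty_log_pages:
        if blocks[start].dirty:
            blocks[start].written = True
            blocks[start].dirty = False
    dirty_log_pages.clear()

    # Extent tree update: no COW, so nothing lands in trans_dirty_pages.
    if not should_cow(eb, TRANSID):
        eb.dirty = True

    # Transaction abort: only tracked blocks get cleaned.
    for start in trans_dirty_pages | dirty_log_pages:
        blocks[start].dirty = False

    tracked = eb.start in (trans_dirty_pages | dirty_log_pages)
    return eb, tracked


eb, tracked = run_sequence()
print(eb, "tracked:", tracked)
```

In this model the block ends up dirty while neither set knows about it,
which is the state the unmount path then trips over.
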
> 
> [FIX]
> To fix the problem, don't submit any btree write bio if the fs has any
> error.
> This is the last safety net, just in case other cleanup didn't catch it.
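
The idea of the guard can be sketched with another small Python toy. This
is not the patch itself: ToyFs, submit_bio() and write_btree_pages() are
hypothetical names, the has_error flag stands in for the fs error state,
and returning -EIO is just a placeholder choice for the example.

```python
import errno


class ToyFs:
    def __init__(self):
        self.has_error = False        # set once the transaction is aborted
        self.workqueues_alive = True  # False after unmount destroyed them
        self.submitted = []


def submit_bio(fs, block):
    # Stand-in for queuing the write on a workqueue.
    if not fs.workqueues_alive:
        raise RuntimeError("queued work on a destroyed workqueue")
    fs.submitted.append(block)


def write_btree_pages(fs, dirty_blocks):
    # The guard: once the fs has flagged an error, submit nothing.
    if fs.has_error:
        return -errno.EIO
    for block in dirty_blocks:
        submit_bio(fs, block)
    return 0


fs = ToyFs()
fs.has_error = True          # transaction was aborted earlier
fs.workqueues_alive = False  # unmount already tore the workqueues down
result = write_btree_pages(fs, [29421568])
print(result, fs.submitted)
```

With the check in place the stray dirty block never reaches submit_bio(),
so the destroyed workqueues are never touched.
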
> 
> Link: https://github.com/bobfuzzer/CVE/tree/master/CVE-2019-19377
> CVE: CVE-2019-19377
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

Thread overview: 6+ messages
2020-02-12  6:12 [PATCH v5] btrfs: Don't submit any btree write bio if the fs has error Qu Wenruo
2020-02-12 12:50 ` Josef Bacik [this message]
2020-02-21 13:35 ` David Sterba
2020-02-21 13:40   ` Qu Wenruo
2020-02-24 17:06     ` David Sterba
2020-02-25  0:02       ` Qu Wenruo
