public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Josef Bacik <josef@toxicpanda.com>,
	Qu Wenruo <wqu@suse.com>, David Sterba <dsterba@suse.com>,
	Ben Hutchings <ben.hutchings@codethink.co.uk>
Subject: [PATCH 4.19 21/71] btrfs: Dont submit any btree write bio if the fs has errors
Date: Mon,  9 Nov 2020 13:55:15 +0100	[thread overview]
Message-ID: <20201109125020.896825235@linuxfoundation.org> (raw)
In-Reply-To: <20201109125019.906191744@linuxfoundation.org>

From: Qu Wenruo <wqu@suse.com>

commit b3ff8f1d380e65dddd772542aa9bff6c86bf715a upstream.

[BUG]
There is a fuzzed image which could cause KASAN report at unmount time.

  BUG: KASAN: use-after-free in btrfs_queue_work+0x2c1/0x390
  Read of size 8 at addr ffff888067cf6848 by task umount/1922

  CPU: 0 PID: 1922 Comm: umount Tainted: G        W         5.0.21 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
  Call Trace:
   dump_stack+0x5b/0x8b
   print_address_description+0x70/0x280
   kasan_report+0x13a/0x19b
   btrfs_queue_work+0x2c1/0x390
   btrfs_wq_submit_bio+0x1cd/0x240
   btree_submit_bio_hook+0x18c/0x2a0
   submit_one_bio+0x1be/0x320
   flush_write_bio.isra.41+0x2c/0x70
   btree_write_cache_pages+0x3bb/0x7f0
   do_writepages+0x5c/0x130
   __writeback_single_inode+0xa3/0x9a0
   writeback_single_inode+0x23d/0x390
   write_inode_now+0x1b5/0x280
   iput+0x2ef/0x600
   close_ctree+0x341/0x750
   generic_shutdown_super+0x126/0x370
   kill_anon_super+0x31/0x50
   btrfs_kill_super+0x36/0x2b0
   deactivate_locked_super+0x80/0xc0
   deactivate_super+0x13c/0x150
   cleanup_mnt+0x9a/0x130
   task_work_run+0x11a/0x1b0
   exit_to_usermode_loop+0x107/0x130
   do_syscall_64+0x1e5/0x280
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

[CAUSE]
The fuzzed image has a completely screwd up extent tree:

  leaf 29421568 gen 8 total ptrs 6 free space 3587 owner EXTENT_TREE
  refs 2 lock (w:0 r:0 bw:0 br:0 sw:0 sr:0) lock_owner 0 current 5938
          item 0 key (12587008 168 4096) itemoff 3942 itemsize 53
                  extent refs 1 gen 9 flags 1
                  ref#0: extent data backref root 5 objectid 259 offset 0 count 1
          item 1 key (12591104 168 8192) itemoff 3889 itemsize 53
                  extent refs 1 gen 9 flags 1
                  ref#0: extent data backref root 5 objectid 271 offset 0 count 1
          item 2 key (12599296 168 4096) itemoff 3836 itemsize 53
                  extent refs 1 gen 9 flags 1
                  ref#0: extent data backref root 5 objectid 259 offset 4096 count 1
          item 3 key (29360128 169 0) itemoff 3803 itemsize 33
                  extent refs 1 gen 9 flags 2
                  ref#0: tree block backref root 5
          item 4 key (29368320 169 1) itemoff 3770 itemsize 33
                  extent refs 1 gen 9 flags 2
                  ref#0: tree block backref root 5
          item 5 key (29372416 169 0) itemoff 3737 itemsize 33
                  extent refs 1 gen 9 flags 2
                  ref#0: tree block backref root 5

Note that leaf 29421568 doesn't have its backref in the extent tree.
Thus extent allocator can re-allocate leaf 29421568 for other trees.

In short, the bug is caused by:

- Existing tree block gets allocated to log tree
  This got its generation bumped.

- Log tree balance cleaned dirty bit of offending tree block
  It will not be written back to disk, thus no WRITTEN flag.

- Original owner of the tree block gets COWed
  Since the tree block has higher transid, no WRITTEN flag, it's reused,
  and not traced by transaction::dirty_pages.

- Transaction aborted
  Tree blocks get cleaned according to transaction::dirty_pages. But the
  offending tree block is not recorded at all.

- Filesystem unmount
  All pages are assumed to be are clean, destroying all workqueue, then
  call iput(btree_inode).
  But offending tree block is still dirty, which triggers writeback, and
  causes use-after-free bug.

The detailed sequence looks like this:

- Initial status
  eb: 29421568, header=WRITTEN bflags_dirty=0, page_dirty=0, gen=8,
      not traced by any dirty extent_iot_tree.

- New tree block is allocated
  Since there is no backref for 29421568, it's re-allocated as new tree
  block.
  Keep in mind that tree block 29421568 is still referred by extent
  tree.

- Tree block 29421568 is filled for log tree
  eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9 << (gen bumped)
      traced by btrfs_root::dirty_log_pages

- Some log tree operations
  Since the fs is using node size 4096, the log tree can easily go a
  level higher.

- Log tree needs balance
  Tree block 29421568 gets all its content pushed to right, thus now
  it is empty, and we don't need it.
  btrfs_clean_tree_block() from __push_leaf_right() get called.

  eb: 29421568, header=0 bflags_dirty=0, page_dirty=0, gen=9
      traced by btrfs_root::dirty_log_pages

- Log tree write back
  btree_write_cache_pages() goes through dirty pages ranges, but since
  page of tree block 29421568 gets cleaned already, it's not written
  back to disk. Thus it doesn't have WRITTEN bit set.
  But ranges in dirty_log_pages are cleared.

  eb: 29421568, header=0 bflags_dirty=0, page_dirty=0, gen=9
      not traced by any dirty extent_iot_tree.

- Extent tree update when committing transaction
  Since tree block 29421568 has transid equal to running trans, and has
  no WRITTEN bit, should_cow_block() will use it directly without adding
  it to btrfs_transaction::dirty_pages.

  eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9
      not traced by any dirty extent_iot_tree.

  At this stage, we're doomed. We have a dirty eb not tracked by any
  extent io tree.

- Transaction gets aborted due to corrupted extent tree
  Btrfs cleans up dirty pages according to transaction::dirty_pages and
  btrfs_root::dirty_log_pages.
  But since tree block 29421568 is not tracked by neither of them, it's
  still dirty.

  eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9
      not traced by any dirty extent_iot_tree.

- Filesystem unmount
  Since all cleanup is assumed to be done, all workqueus are destroyed.
  Then iput(btree_inode) is called, expecting no dirty pages.
  But tree 29421568 is still dirty, thus triggering writeback.
  Since all workqueues are already freed, we cause use-after-free.

This shows us that, log tree blocks + bad extent tree can cause wild
dirty pages.

[FIX]
To fix the problem, don't submit any btree write bio if the filesytem
has any error.  This is the last safe net, just in case other cleanup
haven't caught catch it.

Link: https://github.com/bobfuzzer/CVE/tree/master/CVE-2019-19377
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[bwh: Backported to 4.19: fs_info variable already exists in
 btree_write_cache_pages()]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/extent_io.c |   34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3947,7 +3947,39 @@ retry:
 		end_write_bio(&epd, ret);
 		return ret;
 	}
-	ret = flush_write_bio(&epd);
+	/*
+	 * If something went wrong, don't allow any metadata write bio to be
+	 * submitted.
+	 *
+	 * This would prevent use-after-free if we had dirty pages not
+	 * cleaned up, which can still happen by fuzzed images.
+	 *
+	 * - Bad extent tree
+	 *   Allowing existing tree block to be allocated for other trees.
+	 *
+	 * - Log tree operations
+	 *   Exiting tree blocks get allocated to log tree, bumps its
+	 *   generation, then get cleaned in tree re-balance.
+	 *   Such tree block will not be written back, since it's clean,
+	 *   thus no WRITTEN flag set.
+	 *   And after log writes back, this tree block is not traced by
+	 *   any dirty extent_io_tree.
+	 *
+	 * - Offending tree block gets re-dirtied from its original owner
+	 *   Since it has bumped generation, no WRITTEN flag, it can be
+	 *   reused without COWing. This tree block will not be traced
+	 *   by btrfs_transaction::dirty_pages.
+	 *
+	 *   Now such dirty tree block will not be cleaned by any dirty
+	 *   extent io tree. Thus we don't want to submit such wild eb
+	 *   if the fs already has error.
+	 */
+	if (!test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
+		ret = flush_write_bio(&epd);
+	} else {
+		ret = -EUCLEAN;
+		end_write_bio(&epd, ret);
+	}
 	return ret;
 }
 



  parent reply	other threads:[~2020-11-09 13:08 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-09 12:54 [PATCH 4.19 00/71] 4.19.156-rc1 review Greg Kroah-Hartman
2020-11-09 12:54 ` [PATCH 4.19 01/71] drm/i915: Break up error capture compression loops with cond_resched() Greg Kroah-Hartman
2020-11-09 18:20   ` Pavel Machek
2020-11-09 12:54 ` [PATCH 4.19 02/71] tipc: fix use-after-free in tipc_bcast_get_mode Greg Kroah-Hartman
2020-11-09 12:54 ` [PATCH 4.19 03/71] ptrace: fix task_join_group_stop() for the case when current is traced Greg Kroah-Hartman
2020-11-09 12:54 ` [PATCH 4.19 04/71] cadence: force nonlinear buffers to be cloned Greg Kroah-Hartman
2020-11-09 12:54 ` [PATCH 4.19 05/71] chelsio/chtls: fix memory leaks caused by a race Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 06/71] chelsio/chtls: fix always leaking ctrl_skb Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 07/71] gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 08/71] gianfar: Account for Tx PTP timestamp in the skb headroom Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 09/71] net: usb: qmi_wwan: add Telit LE910Cx 0x1230 composition Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 10/71] sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian platforms Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 11/71] sfp: Fix error handing in sfp_probe() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 12/71] blktrace: fix debugfs use after free Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 13/71] btrfs: extent_io: Kill the forward declaration of flush_write_bio Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 14/71] btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 15/71] Revert "btrfs: flush write bio if we loop in extent_write_cache_pages" Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 16/71] btrfs: flush write bio if we loop in extent_write_cache_pages Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 17/71] btrfs: extent_io: Handle errors better in extent_write_full_page() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 18/71] btrfs: extent_io: Handle errors better in btree_write_cache_pages() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 19/71] btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io() Greg Kroah-Hartman
2020-11-11 12:44   ` Pavel Machek
2020-11-11 14:39     ` Ben Hutchings
2020-11-12 16:06       ` Sasha Levin
2020-11-09 12:55 ` [PATCH 4.19 20/71] Btrfs: fix unwritten extent buffers and hangs on future writeback attempts Greg Kroah-Hartman
2020-11-09 12:55 ` Greg Kroah-Hartman [this message]
2020-11-09 12:55 ` [PATCH 4.19 22/71] btrfs: Move btrfs_check_chunk_valid() to tree-check.[ch] and export it Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 23/71] btrfs: tree-checker: Make chunk item checker messages more readable Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 24/71] btrfs: tree-checker: Make btrfs_check_chunk_valid() return EUCLEAN instead of EIO Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 25/71] btrfs: tree-checker: Check chunk item at tree block read time Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 26/71] btrfs: tree-checker: Verify dev item Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 27/71] btrfs: tree-checker: Fix wrong check on max devid Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 28/71] btrfs: tree-checker: Enhance chunk checker to validate chunk profile Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 29/71] btrfs: tree-checker: Verify inode item Greg Kroah-Hartman
2020-11-11 13:13   ` Pavel Machek
2020-11-11 13:30     ` Qu Wenruo
2020-11-11 13:38       ` Pavel Machek
2020-11-11 14:04         ` Qu Wenruo
2020-11-09 12:55 ` [PATCH 4.19 30/71] btrfs: tree-checker: fix the error message for transid error Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 31/71] Fonts: Replace discarded const qualifier Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 32/71] ALSA: usb-audio: Add implicit feedback quirk for Zoom UAC-2 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 33/71] ALSA: usb-audio: add usb vendor id as DSD-capable for Khadas devices Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 34/71] ALSA: usb-audio: Add implicit feedback quirk for Qu-16 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 35/71] ALSA: usb-audio: Add implicit feedback quirk for MODX Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 36/71] mm: mempolicy: fix potential pte_unmap_unlock pte error Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 37/71] lib/crc32test: remove extra local_irq_disable/enable Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 38/71] kthread_worker: prevent queuing delayed work from timer_fn when it is being canceled Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 39/71] mm: always have io_remap_pfn_range() set pgprot_decrypted() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 40/71] gfs2: Wake up when sd_glock_disposal becomes zero Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 41/71] ring-buffer: Fix recursion protection transitions between interrupt context Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 42/71] ftrace: Fix recursion check for NMI test Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 43/71] ftrace: Handle tracing when switching between context Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 44/71] tracing: Fix out of bounds write in get_trace_buf Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 45/71] futex: Handle transient "ownerless" rtmutex state correctly Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 46/71] ARM: dts: sun4i-a10: fix cpu_alert temperature Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 47/71] x86/kexec: Use up-to-dated screen_info copy to fill boot params Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 48/71] of: Fix reserved-memory overlap detection Greg Kroah-Hartman
2020-11-11 12:53   ` Pavel Machek
2020-11-11 14:34     ` Vincent Whitchurch
2020-11-09 12:55 ` [PATCH 4.19 49/71] blk-cgroup: Fix memleak on error path Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 50/71] blk-cgroup: Pre-allocate tree node on blkg_conf_prep Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 51/71] scsi: core: Dont start concurrent async scan on same host Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 52/71] vsock: use ns_capable_noaudit() on socket create Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 53/71] drm/vc4: drv: Add error handding for bind Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 54/71] ACPI: NFIT: Fix comparison to -ENXIO Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 55/71] vt: Disable KD_FONT_OP_COPY Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 56/71] fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parent Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 57/71] serial: 8250_mtk: Fix uart_get_baud_rate warning Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 58/71] serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 59/71] USB: serial: cyberjack: fix write-URB completion race Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 60/71] USB: serial: option: add Quectel EC200T module support Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 61/71] USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 62/71] USB: serial: option: add Telit FN980 composition 0x1055 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 63/71] USB: Add NO_LPM quirk for Kingston flash drive Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 64/71] usb: mtu3: fix panic in mtu3_gadget_stop() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 65/71] ARC: stack unwinding: avoid indefinite looping Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 66/71] Revert "ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE" Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 67/71] PM: runtime: Resume the device earlier in __device_release_driver() Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 68/71] perf/core: Fix a memory leak in perf_event_parse_addr_filter() Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 69/71] tools: perf: Fix build error in v4.19.y Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 70/71] net: dsa: read mac address from DT for slave device Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 71/71] arm64: dts: marvell: espressobin: Add ethernet switch aliases Greg Kroah-Hartman
2020-11-09 19:22 ` [PATCH 4.19 00/71] 4.19.156-rc1 review Pavel Machek
2020-11-09 23:07 ` Guenter Roeck
2020-11-09 23:22 ` Shuah Khan
2020-11-10  7:44 ` Naresh Kamboju
2020-11-19  8:10 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201109125020.896825235@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=ben.hutchings@codethink.co.uk \
    --cc=dsterba@suse.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox