stable.vger.kernel.org archive mirror
From: Sasha Levin <sashal@kernel.org>
To: gregkh@linuxfoundation.org
Cc: fdmanana@suse.com, dsterba@suse.com, josef@toxicpanda.com,
	stable@vger.kernel.org
Subject: Re: FAILED: patch "[PATCH] Btrfs: fix race between shrinking truncate and fiemap" failed to apply to 4.19-stable tree
Date: Tue, 18 Feb 2020 11:43:00 -0500
Message-ID: <20200218164300.GQ1734@sasha-vm>
In-Reply-To: <158196656712413@kroah.com>

On Mon, Feb 17, 2020 at 08:09:27PM +0100, gregkh@linuxfoundation.org wrote:
>
>The patch below does not apply to the 4.19-stable tree.
>If someone wants it applied there, or to any other stable or longterm
>tree, then please email the backport, including the original git commit
>id to <stable@vger.kernel.org>.
>
>thanks,
>
>greg k-h
>
>------------------ original commit in Linus's tree ------------------
>
>From 28553fa992cb28be6a65566681aac6cafabb4f2d Mon Sep 17 00:00:00 2001
>From: Filipe Manana <fdmanana@suse.com>
>Date: Fri, 7 Feb 2020 12:23:09 +0000
>Subject: [PATCH] Btrfs: fix race between shrinking truncate and fiemap
>
>When there is a fiemap executing in parallel with a shrinking truncate
>we can end up in a situation where we have extent maps for which we no
>longer have corresponding file extent items. This is generally harmless
>and at the moment the only consequences are missing file extent items
>representing holes after we expand the file size again after the
>truncate operation removed the prealloc extent items, and stale
>information for future fiemap calls (reporting extents that no longer
>exist or may have been reallocated to other files for example).
>
>Consider the following example:
>
>1) Our inode has a size of 128KiB, one 128KiB extent at file offset 0
>   and a 1MiB prealloc extent at file offset 128KiB;
>
>2) Task A starts doing a shrinking truncate of our inode to reduce it to
>   a size of 64KiB. Before it searches the subvolume tree for file
>   extent items to delete, it drops all the extent maps in the range
>   from 64KiB to (u64)-1 by calling btrfs_drop_extent_cache();
>
>3) Task B starts doing a fiemap against our inode. When looking up for
>   the inode's extent maps in the range from 128KiB to (u64)-1, it
>   doesn't find any in the inode's extent map tree, since they were
>   removed by task A.  Because it didn't find any in the extent map
>   tree, it scans the inode's subvolume tree for file extent items, and
>   it finds the 1MiB prealloc extent at file offset 128KiB, then it
>   creates an extent map based on that file extent item and adds it to
>   inode's extent map tree (this ends up being done by
>   btrfs_get_extent() <- btrfs_get_extent_fiemap() <-
>   get_extent_skip_holes());
>
>4) Task A then drops the prealloc extent at file offset 128KiB and
>   shrinks the 128KiB extent at file offset 0 to a length of 64KiB. The
>   truncation operation finishes and we end up with an extent map
>   representing a 1MiB prealloc extent at file offset 128KiB, even
>   though that extent no longer exists;
>
>After this the two types of problems we have are:
>
>1) Future calls to fiemap always report that a 1MiB prealloc extent
>   exists at file offset 128KiB. This is stale information, no longer
>   correct;
>
>2) If the size of the file is increased, by a truncate operation that
>   increases the file size or by a write into a file offset > 64KiB for
>   example, we end up not inserting file extent items to represent holes
>   for any range between 128KiB and 128KiB + 1MiB, since the hole
>   expansion function, btrfs_cont_expand() will skip hole insertion for
>   any range for which an extent map exists that represents a prealloc
>   extent. This causes fsck to complain about missing file extent items
>   when not using the NO_HOLES feature.
>
>The second issue could be often triggered by test case generic/561 from
>fstests, which runs fsstress and duperemove in parallel, and duperemove
>does frequent fiemap calls.
>
>Essentially the problem happens because fiemap does not acquire the
>inode's lock while truncate does, and fiemap locks the file range in the
>inode's iotree while truncate does not. So fix the issue by making
>btrfs_truncate_inode_items() lock the file range from the new file size
>to (u64)-1, so that it serializes with fiemap.
>
>CC: stable@vger.kernel.org # 4.4+
>Reviewed-by: Josef Bacik <josef@toxicpanda.com>
>Signed-off-by: Filipe Manana <fdmanana@suse.com>
>Reviewed-by: David Sterba <dsterba@suse.com>
>Signed-off-by: David Sterba <dsterba@suse.com>
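
For anyone preparing a backport, the interleaving in steps 1) to 4) of the
quoted commit message can be modeled with a small single-threaded sketch.
This is not btrfs code: struct inode_model, drop_extent_cache(),
fiemap_from(), truncate_items() and simulate_truncate_vs_fiemap() are all
hypothetical names made up for illustration. The sketch also assumes that
dropping the cache removes every cached map overlapping the range, and it
models the range lock only by ordering (fiemap runs after the truncate
completes), not with real locking.

```c
#include <assert.h>
#include <stdbool.h>

#define KIB 1024ULL
#define MAX_EXTENTS 2

/* One extent; used for both file extent items and cached extent maps. */
struct extent {
	unsigned long long start;
	unsigned long long len;
	bool prealloc;
	bool present;
};

/* Hypothetical model of an inode: slot i of maps[] caches slot i of items[]. */
struct inode_model {
	struct extent items[MAX_EXTENTS]; /* stands in for the subvolume tree */
	struct extent maps[MAX_EXTENTS];  /* stands in for the extent map tree */
};

/* Step 1: 128KiB data extent at offset 0, 1MiB prealloc extent at 128KiB. */
static void init_inode(struct inode_model *ino)
{
	struct extent data = { 0, 128 * KIB, false, true };
	struct extent prealloc = { 128 * KIB, 1024 * KIB, true, true };

	assert(MAX_EXTENTS >= 2);
	ino->items[0] = data;
	ino->items[1] = prealloc;
	ino->maps[0] = data;
	ino->maps[1] = prealloc;
}

/* Step 2 (simplified): drop every cached map that extends past 'from'. */
static void drop_extent_cache(struct inode_model *ino, unsigned long long from)
{
	for (int i = 0; i < MAX_EXTENTS; i++)
		if (ino->maps[i].present &&
		    ino->maps[i].start + ino->maps[i].len > from)
			ino->maps[i].present = false;
}

/* Step 3: on a cache miss, fiemap rebuilds the map from the item. */
static void fiemap_from(struct inode_model *ino, unsigned long long from)
{
	for (int i = 0; i < MAX_EXTENTS; i++)
		if (ino->items[i].present && ino->items[i].start >= from &&
		    !ino->maps[i].present)
			ino->maps[i] = ino->items[i];
}

/* Step 4: delete items past the new size and trim the one that straddles it. */
static void truncate_items(struct inode_model *ino, unsigned long long new_size)
{
	for (int i = 0; i < MAX_EXTENTS; i++) {
		struct extent *it = &ino->items[i];

		if (!it->present)
			continue;
		if (it->start >= new_size)
			it->present = false;
		else if (it->start + it->len > new_size)
			it->len = new_size - it->start;
	}
}

/* Returns the number of cached maps left without a backing item. */
int simulate_truncate_vs_fiemap(bool range_locked)
{
	struct inode_model ino;
	int stale = 0;

	init_inode(&ino);
	drop_extent_cache(&ino, 64 * KIB);
	if (!range_locked)
		fiemap_from(&ino, 128 * KIB); /* races in mid-truncate */
	truncate_items(&ino, 64 * KIB);
	if (range_locked)
		fiemap_from(&ino, 128 * KIB); /* had to wait for the truncate */

	for (int i = 0; i < MAX_EXTENTS; i++)
		if (ino.maps[i].present && !ino.items[i].present)
			stale++;
	return stale;
}
```

Called from a small main(), simulate_truncate_vs_fiemap(false) leaves one
stale map (the 1MiB prealloc extent), while simulate_truncate_vs_fiemap(true)
leaves none, which is the effect the commit gets by locking the range from
the new file size to (u64)-1 in btrfs_truncate_inode_items().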

Note: since this patch has a fix that just hit upstream, and it needs
backporting to older kernels, I've dropped it from 5.5 and 5.4 for now
and will queue both this and its fix for the next release.

Backports of both to older kernels (<5.4) would be great to have.

-- 
Thanks,
Sasha
