From: Nikolay Borisov <nborisov@suse.com>
To: linux-btrfs@vger.kernel.org
Cc: Nikolay Borisov <nborisov@suse.com>
Subject: [PATCH] btrfs: Correctly free extent buffer in case btree_read_extent_buffer_pages fails
Date: Tue, 12 Mar 2019 15:18:47 +0200
Message-ID: <20190312131847.20311-1-nborisov@suse.com>
If an eb fails to be read for whatever reason (it's corrupted on disk and
the parent transid/key validations fail, or IO for the eb pages fails) then
this buffer must be removed from the buffer cache. Currently the code
calls free_extent_buffer if an error occurs. Unfortunately this doesn't
achieve the desired behavior since btrfs_find_create_tree_block returns
with eb->refs == 2. free_extent_buffer will only decrement the refs once,
leaving the buffer in the buffer cache radix tree. This enables later code
to look up the buffer from the cache and use it, potentially leading to a
crash.
The correct way to free the buffer is to call free_extent_buffer_stale.
This function will explicitly call atomic_dec for the buffer and
subsequently call release_extent_buffer, which will decrement the final
reference, thus correctly removing the invalid buffer from the buffer
cache. This change affects only newly allocated buffers since they have
eb->refs == 2.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202755
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
This patch was inspired by Qu's "btrfs: Check the first key and level for
cached extent buffer". Though fixing Qu's crash is only a side effect, what I
was aiming for is "correct behavior" - in this case immediately remove an eb
from the eb cache if it's detected broken. This, however, doesn't eliminate
the need for Qu's patch which adds the parent check in read_block_for_search.
I have validated it using the image from the bugzilla issue and reading
/MOUNT/foo/bar/stress/f7. Without my patch (or Qu's) it crashes; with either of
them I get:
cat: scratch/foo/bar/stress/f7: Input/output error
This also survived a full xfstests run with no regressions.
fs/btrfs/disk-io.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index f7010312d171..03df73de475c 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1043,12 +1043,12 @@ int reada_tree_block_flagged(struct btrfs_fs_info *fs_info, u64 bytenr,
 	ret = read_extent_buffer_pages(io_tree, buf, WAIT_PAGE_LOCK,
 				       mirror_num);
 	if (ret) {
-		free_extent_buffer(buf);
+		free_extent_buffer_stale(buf);
 		return ret;
 	}
 
 	if (test_bit(EXTENT_BUFFER_CORRUPT, &buf->bflags)) {
-		free_extent_buffer(buf);
+		free_extent_buffer_stale(buf);
 		return -EIO;
 	} else if (extent_buffer_uptodate(buf)) {
 		*eb = buf;
@@ -1102,7 +1102,7 @@ struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
 	ret = btree_read_extent_buffer_pages(fs_info, buf, parent_transid,
 					     level, first_key);
 	if (ret) {
-		free_extent_buffer(buf);
+		free_extent_buffer_stale(buf);
 		return ERR_PTR(ret);
 	}
 	return buf;
--
2.17.1