From: Filipe Manana <fdmanana@gmail.com>
To: robbieko <robbieko@synology.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>,
linux-btrfs-owner@vger.kernel.org
Subject: Re: [PATCH] btrfs: incremental send, fix BUG when parent commit_root changed
Date: Wed, 9 May 2018 09:29:24 +0100 [thread overview]
Message-ID: <CAL3q7H7gnDdxPF8Grn0J8vXRGfpCDto_eNmaUQ6akonmEf_idg@mail.gmail.com> (raw)
In-Reply-To: <bd8ca0719d5e9aa6dce8bea371533225@synology.com>
On Wed, May 9, 2018 at 2:10 AM, robbieko <robbieko@synology.com> wrote:
> Filipe Manana 於 2018-05-08 19:12 寫到:
>>
>> On Tue, May 8, 2018 at 11:30 AM, robbieko <robbieko@synology.com> wrote:
>>>
>>> Hi Filipe Manana,
>>>
>>> Although the snapshot is readonly, when the snapshot is created,
>>> in order to modify the last_snapshot, it will cause generation changes,
>>> and it will affect the commit_root modification.
>>
>>
>> So what you to say is that the problem happens when creating a
>> snapshot of snapshot that is being used by send.
>> You don't mention that anywhere, except in the example below.
>>
>> Just say that in the changelog, that the problem can happen if while
>> we are doing a send one of the snapshots used (parent or send) is
>> snapshotted, because snapshoting implies COWing the root of the source
>> subvolume/snaphot.
>> That is not mentioned anywhere in the changelog.
>
>
> OK, I will add this information to changelog.
>
>
>>
>> Now, why does this happen without your patch?
>> Before we use the commit roots, we call extent_buffer_get() on them to
>> make sure they are not released and we do that while holding the
>> semaphore commit_root_sem. Why isn't this enough to prevent
>> use-after-free problems?
>>
>
> Although we use extent_buffer_get to prevent the extent_buffer
> from being released, because of the snapshoting implies COWing
> the root, the commit_root original space (A) is released.
> Therefore, when other process need alloc the new tree block,
> it may use the space of A.
> However, alloc_extent_buffer is created by the bytenr.
> It will first find out if there is an existing extent_buffer through
> find_extent_buffer and cause the original extent_buffer to be modified.
> Eventually causing send process to access illegal memory.
> Thus extent_buffer_get can only prevent extent_buffer from being released,
> but it cannot prevent extent_buffer from being used by others.
>
> btrfs_alloc_tree_block
> --btrfs_reserve_extent (get A bytenr)
> --btrfs_init_new_buffer (use A bytenr)
> ----btrfs_find_create_tree_block
> ------alloc_extent_buffer
> --------find_extent_buffer (get A)
>
> So we fixed the problem by copy commit_root to avoid accessing illegal
> memory
Here it should be mentioned as well how the space gets released
despite the fact that the extent buffer has references.
I.e. that __btrfs_cow_block() calls btrfs_free_tree_block().
Please add all this information, in detail, to the changelog.
The changelog you provided had no useful information at all - didn't
mention how the problem can happen nor why.
thanks
>
>
>
>> That should be explained as well in the changelog.
>>
>> Those are the 2 pieces of information you need to put in the changelog.
>>
>>>
>>> Example:
>>> #btrfs sub show snap1
>>> Name: snap1
>>> UUID: 17ba3140-b79d-1649-a2e1-b238c3253a40
>>> Parent UUID: 6fe90a14-02f3-1b40-bdfc-be435060d023
>>> Received UUID: 17ba3140-b79d-1649-a2e1-b238c3253a40
>>> Creation time: 2018-04-27 18:04:11 +0800
>>> Subvolume ID: 514
>>> Generation: 4854
>>> Gen at creation: 506
>>> Parent ID: 511
>>> Top level ID: 511
>>> Flags: readonly
>>> Snapshot(s):
>>>
>>> #btrfs-debug-tree -t 514 /dev/md2 | more
>>> btrfs-progs v4.0
>>> file tree key (514 ROOT_ITEM 506)
>>> node 259127066624 level 2 items 192 free 301 generation 4854 owner 514
>>> fs uuid 081b4c28-007c-430b-be7f-ead57aa71548
>>> chunk uuid a48e7346-bd15-4eac-8957-b11a2bf9bbe2
>>> key (256 INODE_ITEM 0) block 258707652608 (15790262) gen 505
>>>
>>> #btrfs sub snap send_snap1 test_subvol
>>> #btrfs-debug-tree -t 514 /dev/md2 | more
>>> btrfs-progs v4.0
>>> file tree key (514 ROOT_ITEM 506)
>>> node 258444509184 level 2 items 192 free 301 generation 4881 owner 514
>>> fs uuid 081b4c28-007c-430b-be7f-ead57aa71548
>>> chunk uuid a48e7346-bd15-4eac-8957-b11a2bf9bbe2
>>> key (256 INODE_ITEM 0) block 258707652608 (15790262) gen 505
>>>
>>> According to the above example node and generation will be updated.
>>>
>>> This means that as long as the send or parent snapshot is taken during
>>> the btrfs send, commit_root will change, and send process may access
>>> invalid memory.
>>>
>>> Thanks.
>>> Robbie Ko
>>>
>>>
>>> Filipe Manana 於 2018-05-08 17:15 寫到:
>>>>
>>>>
>>>> On Tue, May 8, 2018 at 9:15 AM, robbieko <robbieko@synology.com> wrote:
>>>>
>>>>> From: Robbie Ko <robbieko@synology.com>
>>>>>
>>>>> [BUG]
>>>>> btrfs send BUG on parent commit_root node receive destination
>>>>> is at the same volume.
>>>>
>>>>
>>>>
>>>> I can't make sense of that sentence.
>>>>
>>>>>
>>>>> [Example]
>>>>> btrfs send -e -p /mnt/snap1/ /mnt/snap2/ | btrfs receive -e /mnt/temp
>>>>
>>>>
>>>>
>>>> So, is -e necessary?
>>>> So doing that, for any parent and send snapshots, triggers the bug?
>>>> (and regardless of -e)
>>>> If that was enough I'm pretty sure almost everyone would be reporting
>>>> send/receive as unusable.
>>>> Please clarify.
>>>>
>>>>>
>>>>> [REASON]
>>>>> 1. When send with the parent, the send process will get the parent
>>>>> commit_root(A), then send commit_root(B) and then compare the
>>>>> tree between A and B.
>>>>> 2. The receive process will take snapshot from parent, then the parent
>>>>> commit_root changes from A to C, so A will be released.
>>>>
>>>>
>>>>
>>>> I don't understand this. How can the commit_root change?
>>>> The snapshot is readonly, it's can't be modified after it's created.
>>>>
>>>>> 3. Then A will be assigned to other leaf nodes.
>>>>> 4. Therefore, the send process will use invalid A, resulting in
>>>>> invalid memory access.
>>>>>
>>>>> [FIX]
>>>>> We create new root and copy from parent/send commit_root to
>>>>> avoid invalid memory access.
>>>>>
>>>>> CallTrace looks like this:
>>>>> ------------[ cut here ]------------
>>>>> kernel BUG at fs/btrfs/ctree.c:1861!
>>>>> invalid opcode: 0000 [#1] SMP
>>>>> CPU: 6 PID: 24235 Comm: btrfs Tainted: P O 3.10.105 #23721
>>>>
>>>>
>>>>
>>>> 3.10 kernel....
>>>> You should reproduce any bugs you find on more recent kernels...
>>>>
>>>>> ffff88046652d680 ti: ffff88041b720000 task.ti: ffff88041b720000
>>>>> RIP: 0010:[<ffffffffa08dd0e8>] read_node_slot+0x108/0x110 [btrfs]
>>>>> RSP: 0018:ffff88041b723b68 EFLAGS: 00010246
>>>>> RAX: ffff88043ca6b000 RBX: ffff88041b723c50 RCX: ffff880000000000
>>>>> RDX: 000000000000004c RSI: ffff880314b133f8 RDI: ffff880458b24000
>>>>> RBP: 0000000000000000 R08: 0000000000000001 R09: ffff88041b723c66
>>>>> R10: 0000000000000001 R11: 0000000000001000 R12: ffff8803f3e48890
>>>>> R13: ffff8803f3e48880 R14: ffff880466351800 R15: 0000000000000001
>>>>> FS: 00007f8c321dc8c0(0000) GS:ffff88047fcc0000(0000)
>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> R2: 00007efd1006d000 CR3: 0000000213a24000 CR4: 00000000003407e0
>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>>> Stack:
>>>>> ffff88041b723c50 ffff8803f3e48880 ffff8803f3e48890 ffff8803f3e48880
>>>>> ffff880466351800 0000000000000001 ffffffffa08dd9d7 ffff88041b723c50
>>>>> ffff8803f3e48880 ffff88041b723c66 ffffffffa08dde85 a9ff88042d2c4400
>>>>> Call Trace:
>>>>> [<ffffffffa08dd9d7>] ? tree_move_down.isra.33+0x27/0x50 [btrfs]
>>>>> [<ffffffffa08dde85>] ? tree_advance+0xb5/0xc0 [btrfs]
>>>>> [<ffffffffa08e83d4>] ? btrfs_compare_trees+0x2d4/0x760 [btrfs]
>>>>> [<ffffffffa0982050>] ? finish_inode_if_needed+0x870/0x870 [btrfs]
>>>>> [<ffffffffa09841ea>] ? btrfs_ioctl_send+0xeda/0x1050 [btrfs]
>>>>> [<ffffffffa094bd3d>] ? btrfs_ioctl+0x1e3d/0x33f0 [btrfs]
>>>>> [<ffffffff81111133>] ? handle_pte_fault+0x373/0x990
>>>>> [<ffffffff8153a096>] ? atomic_notifier_call_chain+0x16/0x20
>>>>> [<ffffffff81063256>] ? set_task_cpu+0xb6/0x1d0
>>>>> [<ffffffff811122c3>] ? handle_mm_fault+0x143/0x2a0
>>>>> [<ffffffff81539cc0>] ? __do_page_fault+0x1d0/0x500
>>>>> [<ffffffff81062f07>] ? check_preempt_curr+0x57/0x90
>>>>> [<ffffffff8115075a>] ? do_vfs_ioctl+0x4aa/0x990
>>>>> [<ffffffff81034f83>] ? do_fork+0x113/0x3b0
>>>>> [<ffffffff812dd7d7>] ? trace_hardirqs_off_thunk+0x3a/0x6c
>>>>> [<ffffffff81150cc8>] ? SyS_ioctl+0x88/0xa0
>>>>> [<ffffffff8153e422>] ? system_call_fastpath+0x16/0x1b
>>>>> ---[ end trace 29576629ee80b2e1 ]---
>>>>>
>>>>> Signed-off-by: Robbie Ko <robbieko@synology.com>
>>>>> ---
>>>>> fs/btrfs/ctree.c | 26 ++++++++++++++++++++++++--
>>>>> 1 file changed, 24 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>>>>> index b88a79e..c9ce52f 100644
>>>>> --- a/fs/btrfs/ctree.c
>>>>> +++ b/fs/btrfs/ctree.c
>>>>> @@ -5398,6 +5398,8 @@ int btrfs_compare_trees(struct btrfs_root
>>>>> *left_root,
>>>>> u64 right_blockptr;
>>>>> u64 left_gen;
>>>>> u64 right_gen;
>>>>> + struct extent_buffer *left_root_node = NULL;
>>>>> + struct extent_buffer *right_root_node = NULL;
>>>>>
>>>>> left_path = btrfs_alloc_path();
>>>>> if (!left_path) {
>>>>> @@ -5416,6 +5418,20 @@ int btrfs_compare_trees(struct btrfs_root
>>>>> *left_root,
>>>>> goto out;
>>>>> }
>>>>>
>>>>> + left_root_node = alloc_dummy_extent_buffer(left_root->fs_info,
>>>>> 0);
>>>>> + if (!left_root_node) {
>>>>> + ret = -ENOMEM;
>>>>> + goto out;
>>>>> + }
>>>>> + extent_buffer_get(left_root_node);
>>>>> +
>>>>> + right_root_node = alloc_dummy_extent_buffer(left_root->fs_info,
>>>>> 0);
>>>>> + if (!right_root_node) {
>>>>> + ret = -ENOMEM;
>>>>> + goto out;
>>>>> + }
>>>>> + extent_buffer_get(right_root_node);
>>>>> +
>>>>> left_path->search_commit_root = 1;
>>>>> left_path->skip_locking = 1;
>>>>> right_path->search_commit_root = 1;
>>>>> @@ -5460,12 +5476,16 @@ int btrfs_compare_trees(struct btrfs_root
>>>>> *left_root,
>>>>> down_read(&fs_info->commit_root_sem);
>>>>> left_level = btrfs_header_level(left_root->commit_root);
>>>>> left_root_level = left_level;
>>>>> - left_path->nodes[left_level] = left_root->commit_root;
>>>>> + copy_extent_buffer_full(left_root_node,
>>>>> left_root->commit_root);
>>>>> + set_bit(EXTENT_BUFFER_UPTODATE, &left_root_node->bflags);
>>>>> + left_path->nodes[left_level] = left_root_node;
>>>>> extent_buffer_get(left_path->nodes[left_level]);
>>>>>
>>>>> right_level = btrfs_header_level(right_root->commit_root);
>>>>> right_root_level = right_level;
>>>>> - right_path->nodes[right_level] = right_root->commit_root;
>>>>> + copy_extent_buffer_full(right_root_node,
>>>>> right_root->commit_root);
>>>>> + set_bit(EXTENT_BUFFER_UPTODATE, &right_root_node->bflags);
>>>>> + right_path->nodes[right_level] = right_root_node;
>>>>> extent_buffer_get(right_path->nodes[right_level]);
>>>>> up_read(&fs_info->commit_root_sem);
>>>>>
>>>>> @@ -5613,6 +5633,8 @@ int btrfs_compare_trees(struct btrfs_root
>>>>> *left_root,
>>>>> out:
>>>>> btrfs_free_path(left_path);
>>>>> btrfs_free_path(right_path);
>>>>> + free_extent_buffer(left_root_node);
>>>>> + free_extent_buffer(right_root_node);
>>>>> kvfree(tmp_buf);
>>>>> return ret;
>>>>> }
>>>>> --
>>>>> 1.9.1
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
>>>>> in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>>
>
--
Filipe David Manana,
“Whether you think you can, or you think you can't — you're right.”
next prev parent reply other threads:[~2018-05-09 8:29 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-08 8:15 [PATCH] btrfs: incremental send, fix BUG when parent commit_root changed robbieko
2018-05-08 9:15 ` Filipe Manana
2018-05-08 9:21 ` robbieko
2018-05-08 10:30 ` robbieko
2018-05-08 11:12 ` Filipe Manana
2018-05-09 1:10 ` robbieko
2018-05-09 8:29 ` Filipe Manana [this message]
2018-05-09 8:36 ` robbieko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAL3q7H7gnDdxPF8Grn0J8vXRGfpCDto_eNmaUQ6akonmEf_idg@mail.gmail.com \
--to=fdmanana@gmail.com \
--cc=linux-btrfs-owner@vger.kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=robbieko@synology.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).