From: robbieko <robbieko@synology.com>
To: fdmanana@gmail.com
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>,
linux-btrfs-owner@vger.kernel.org
Subject: Re: [PATCH] btrfs: incremental send, fix BUG when parent commit_root changed
Date: Wed, 09 May 2018 16:36:29 +0800 [thread overview]
Message-ID: <9431a6574bbf4df48728cb254ba602cf@synology.com> (raw)
In-Reply-To: <CAL3q7H7gnDdxPF8Grn0J8vXRGfpCDto_eNmaUQ6akonmEf_idg@mail.gmail.com>
Hi Filipe Manana,
Ok. I will add all this information, in detail, to the changelog,
and than send a patch V2 later.
Thanks.
Robbie Ko
Filipe Manana 於 2018-05-09 16:29 寫到:
> On Wed, May 9, 2018 at 2:10 AM, robbieko <robbieko@synology.com> wrote:
>> Filipe Manana 於 2018-05-08 19:12 寫到:
>>>
>>> On Tue, May 8, 2018 at 11:30 AM, robbieko <robbieko@synology.com>
>>> wrote:
>>>>
>>>> Hi Filipe Manana,
>>>>
>>>> Although the snapshot is readonly, when the snapshot is created,
>>>> in order to modify the last_snapshot, it will cause generation
>>>> changes,
>>>> and it will affect the commit_root modification.
>>>
>>>
>>> So what you to say is that the problem happens when creating a
>>> snapshot of snapshot that is being used by send.
>>> You don't mention that anywhere, except in the example below.
>>>
>>> Just say that in the changelog, that the problem can happen if while
>>> we are doing a send one of the snapshots used (parent or send) is
>>> snapshotted, because snapshoting implies COWing the root of the
>>> source
>>> subvolume/snaphot.
>>> That is not mentioned anywhere in the changelog.
>>
>>
>> OK, I will add this information to changelog.
>>
>>
>>>
>>> Now, why does this happen without your patch?
>>> Before we use the commit roots, we call extent_buffer_get() on them
>>> to
>>> make sure they are not released and we do that while holding the
>>> semaphore commit_root_sem. Why isn't this enough to prevent
>>> use-after-free problems?
>>>
>>
>> Although we use extent_buffer_get to prevent the extent_buffer
>> from being released, because of the snapshoting implies COWing
>> the root, the commit_root original space (A) is released.
>> Therefore, when other process need alloc the new tree block,
>> it may use the space of A.
>> However, alloc_extent_buffer is created by the bytenr.
>> It will first find out if there is an existing extent_buffer through
>> find_extent_buffer and cause the original extent_buffer to be
>> modified.
>> Eventually causing send process to access illegal memory.
>> Thus extent_buffer_get can only prevent extent_buffer from being
>> released,
>> but it cannot prevent extent_buffer from being used by others.
>>
>> btrfs_alloc_tree_block
>> --btrfs_reserve_extent (get A bytenr)
>> --btrfs_init_new_buffer (use A bytenr)
>> ----btrfs_find_create_tree_block
>> ------alloc_extent_buffer
>> --------find_extent_buffer (get A)
>>
>> So we fixed the problem by copy commit_root to avoid accessing illegal
>> memory
>
> Here it should be mentioned as well how the space gets released
> despite the fact that the extent buffer has references.
> I.e. that __btrfs_cow_block() calls btrfs_free_tree_block().
>
> Please add all this information, in detail, to the changelog.
> The changelog you provided had no useful information at all - didn't
> mention how the problem can happen nor why.
>
> thanks
>
>>
>>
>>
>>> That should be explained as well in the changelog.
>>>
>>> Those are the 2 pieces of information you need to put in the
>>> changelog.
>>>
>>>>
>>>> Example:
>>>> #btrfs sub show snap1
>>>> Name: snap1
>>>> UUID: 17ba3140-b79d-1649-a2e1-b238c3253a40
>>>> Parent UUID: 6fe90a14-02f3-1b40-bdfc-be435060d023
>>>> Received UUID: 17ba3140-b79d-1649-a2e1-b238c3253a40
>>>> Creation time: 2018-04-27 18:04:11 +0800
>>>> Subvolume ID: 514
>>>> Generation: 4854
>>>> Gen at creation: 506
>>>> Parent ID: 511
>>>> Top level ID: 511
>>>> Flags: readonly
>>>> Snapshot(s):
>>>>
>>>> #btrfs-debug-tree -t 514 /dev/md2 | more
>>>> btrfs-progs v4.0
>>>> file tree key (514 ROOT_ITEM 506)
>>>> node 259127066624 level 2 items 192 free 301 generation 4854 owner
>>>> 514
>>>> fs uuid 081b4c28-007c-430b-be7f-ead57aa71548
>>>> chunk uuid a48e7346-bd15-4eac-8957-b11a2bf9bbe2
>>>> key (256 INODE_ITEM 0) block 258707652608 (15790262) gen 505
>>>>
>>>> #btrfs sub snap send_snap1 test_subvol
>>>> #btrfs-debug-tree -t 514 /dev/md2 | more
>>>> btrfs-progs v4.0
>>>> file tree key (514 ROOT_ITEM 506)
>>>> node 258444509184 level 2 items 192 free 301 generation 4881 owner
>>>> 514
>>>> fs uuid 081b4c28-007c-430b-be7f-ead57aa71548
>>>> chunk uuid a48e7346-bd15-4eac-8957-b11a2bf9bbe2
>>>> key (256 INODE_ITEM 0) block 258707652608 (15790262) gen 505
>>>>
>>>> According to the above example node and generation will be updated.
>>>>
>>>> This means that as long as the send or parent snapshot is taken
>>>> during
>>>> the btrfs send, commit_root will change, and send process may access
>>>> invalid memory.
>>>>
>>>> Thanks.
>>>> Robbie Ko
>>>>
>>>>
>>>> Filipe Manana 於 2018-05-08 17:15 寫到:
>>>>>
>>>>>
>>>>> On Tue, May 8, 2018 at 9:15 AM, robbieko <robbieko@synology.com>
>>>>> wrote:
>>>>>
>>>>>> From: Robbie Ko <robbieko@synology.com>
>>>>>>
>>>>>> [BUG]
>>>>>> btrfs send BUG on parent commit_root node receive destination
>>>>>> is at the same volume.
>>>>>
>>>>>
>>>>>
>>>>> I can't make sense of that sentence.
>>>>>
>>>>>>
>>>>>> [Example]
>>>>>> btrfs send -e -p /mnt/snap1/ /mnt/snap2/ | btrfs receive -e
>>>>>> /mnt/temp
>>>>>
>>>>>
>>>>>
>>>>> So, is -e necessary?
>>>>> So doing that, for any parent and send snapshots, triggers the bug?
>>>>> (and regardless of -e)
>>>>> If that was enough I'm pretty sure almost everyone would be
>>>>> reporting
>>>>> send/receive as unusable.
>>>>> Please clarify.
>>>>>
>>>>>>
>>>>>> [REASON]
>>>>>> 1. When send with the parent, the send process will get the parent
>>>>>> commit_root(A), then send commit_root(B) and then compare the
>>>>>> tree between A and B.
>>>>>> 2. The receive process will take snapshot from parent, then the
>>>>>> parent
>>>>>> commit_root changes from A to C, so A will be released.
>>>>>
>>>>>
>>>>>
>>>>> I don't understand this. How can the commit_root change?
>>>>> The snapshot is readonly, it's can't be modified after it's
>>>>> created.
>>>>>
>>>>>> 3. Then A will be assigned to other leaf nodes.
>>>>>> 4. Therefore, the send process will use invalid A, resulting in
>>>>>> invalid memory access.
>>>>>>
>>>>>> [FIX]
>>>>>> We create new root and copy from parent/send commit_root to
>>>>>> avoid invalid memory access.
>>>>>>
>>>>>> CallTrace looks like this:
>>>>>> ------------[ cut here ]------------
>>>>>> kernel BUG at fs/btrfs/ctree.c:1861!
>>>>>> invalid opcode: 0000 [#1] SMP
>>>>>> CPU: 6 PID: 24235 Comm: btrfs Tainted: P O 3.10.105
>>>>>> #23721
>>>>>
>>>>>
>>>>>
>>>>> 3.10 kernel....
>>>>> You should reproduce any bugs you find on more recent kernels...
>>>>>
>>>>>> ffff88046652d680 ti: ffff88041b720000 task.ti: ffff88041b720000
>>>>>> RIP: 0010:[<ffffffffa08dd0e8>] read_node_slot+0x108/0x110 [btrfs]
>>>>>> RSP: 0018:ffff88041b723b68 EFLAGS: 00010246
>>>>>> RAX: ffff88043ca6b000 RBX: ffff88041b723c50 RCX: ffff880000000000
>>>>>> RDX: 000000000000004c RSI: ffff880314b133f8 RDI: ffff880458b24000
>>>>>> RBP: 0000000000000000 R08: 0000000000000001 R09: ffff88041b723c66
>>>>>> R10: 0000000000000001 R11: 0000000000001000 R12: ffff8803f3e48890
>>>>>> R13: ffff8803f3e48880 R14: ffff880466351800 R15: 0000000000000001
>>>>>> FS: 00007f8c321dc8c0(0000) GS:ffff88047fcc0000(0000)
>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> R2: 00007efd1006d000 CR3: 0000000213a24000 CR4: 00000000003407e0
>>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>>>> Stack:
>>>>>> ffff88041b723c50 ffff8803f3e48880 ffff8803f3e48890
>>>>>> ffff8803f3e48880
>>>>>> ffff880466351800 0000000000000001 ffffffffa08dd9d7
>>>>>> ffff88041b723c50
>>>>>> ffff8803f3e48880 ffff88041b723c66 ffffffffa08dde85
>>>>>> a9ff88042d2c4400
>>>>>> Call Trace:
>>>>>> [<ffffffffa08dd9d7>] ? tree_move_down.isra.33+0x27/0x50 [btrfs]
>>>>>> [<ffffffffa08dde85>] ? tree_advance+0xb5/0xc0 [btrfs]
>>>>>> [<ffffffffa08e83d4>] ? btrfs_compare_trees+0x2d4/0x760 [btrfs]
>>>>>> [<ffffffffa0982050>] ? finish_inode_if_needed+0x870/0x870 [btrfs]
>>>>>> [<ffffffffa09841ea>] ? btrfs_ioctl_send+0xeda/0x1050 [btrfs]
>>>>>> [<ffffffffa094bd3d>] ? btrfs_ioctl+0x1e3d/0x33f0 [btrfs]
>>>>>> [<ffffffff81111133>] ? handle_pte_fault+0x373/0x990
>>>>>> [<ffffffff8153a096>] ? atomic_notifier_call_chain+0x16/0x20
>>>>>> [<ffffffff81063256>] ? set_task_cpu+0xb6/0x1d0
>>>>>> [<ffffffff811122c3>] ? handle_mm_fault+0x143/0x2a0
>>>>>> [<ffffffff81539cc0>] ? __do_page_fault+0x1d0/0x500
>>>>>> [<ffffffff81062f07>] ? check_preempt_curr+0x57/0x90
>>>>>> [<ffffffff8115075a>] ? do_vfs_ioctl+0x4aa/0x990
>>>>>> [<ffffffff81034f83>] ? do_fork+0x113/0x3b0
>>>>>> [<ffffffff812dd7d7>] ? trace_hardirqs_off_thunk+0x3a/0x6c
>>>>>> [<ffffffff81150cc8>] ? SyS_ioctl+0x88/0xa0
>>>>>> [<ffffffff8153e422>] ? system_call_fastpath+0x16/0x1b
>>>>>> ---[ end trace 29576629ee80b2e1 ]---
>>>>>>
>>>>>> Signed-off-by: Robbie Ko <robbieko@synology.com>
>>>>>> ---
>>>>>> fs/btrfs/ctree.c | 26 ++++++++++++++++++++++++--
>>>>>> 1 file changed, 24 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>>>>>> index b88a79e..c9ce52f 100644
>>>>>> --- a/fs/btrfs/ctree.c
>>>>>> +++ b/fs/btrfs/ctree.c
>>>>>> @@ -5398,6 +5398,8 @@ int btrfs_compare_trees(struct btrfs_root
>>>>>> *left_root,
>>>>>> u64 right_blockptr;
>>>>>> u64 left_gen;
>>>>>> u64 right_gen;
>>>>>> + struct extent_buffer *left_root_node = NULL;
>>>>>> + struct extent_buffer *right_root_node = NULL;
>>>>>>
>>>>>> left_path = btrfs_alloc_path();
>>>>>> if (!left_path) {
>>>>>> @@ -5416,6 +5418,20 @@ int btrfs_compare_trees(struct btrfs_root
>>>>>> *left_root,
>>>>>> goto out;
>>>>>> }
>>>>>>
>>>>>> + left_root_node =
>>>>>> alloc_dummy_extent_buffer(left_root->fs_info,
>>>>>> 0);
>>>>>> + if (!left_root_node) {
>>>>>> + ret = -ENOMEM;
>>>>>> + goto out;
>>>>>> + }
>>>>>> + extent_buffer_get(left_root_node);
>>>>>> +
>>>>>> + right_root_node =
>>>>>> alloc_dummy_extent_buffer(left_root->fs_info,
>>>>>> 0);
>>>>>> + if (!right_root_node) {
>>>>>> + ret = -ENOMEM;
>>>>>> + goto out;
>>>>>> + }
>>>>>> + extent_buffer_get(right_root_node);
>>>>>> +
>>>>>> left_path->search_commit_root = 1;
>>>>>> left_path->skip_locking = 1;
>>>>>> right_path->search_commit_root = 1;
>>>>>> @@ -5460,12 +5476,16 @@ int btrfs_compare_trees(struct btrfs_root
>>>>>> *left_root,
>>>>>> down_read(&fs_info->commit_root_sem);
>>>>>> left_level = btrfs_header_level(left_root->commit_root);
>>>>>> left_root_level = left_level;
>>>>>> - left_path->nodes[left_level] = left_root->commit_root;
>>>>>> + copy_extent_buffer_full(left_root_node,
>>>>>> left_root->commit_root);
>>>>>> + set_bit(EXTENT_BUFFER_UPTODATE, &left_root_node->bflags);
>>>>>> + left_path->nodes[left_level] = left_root_node;
>>>>>> extent_buffer_get(left_path->nodes[left_level]);
>>>>>>
>>>>>> right_level = btrfs_header_level(right_root->commit_root);
>>>>>> right_root_level = right_level;
>>>>>> - right_path->nodes[right_level] = right_root->commit_root;
>>>>>> + copy_extent_buffer_full(right_root_node,
>>>>>> right_root->commit_root);
>>>>>> + set_bit(EXTENT_BUFFER_UPTODATE, &right_root_node->bflags);
>>>>>> + right_path->nodes[right_level] = right_root_node;
>>>>>> extent_buffer_get(right_path->nodes[right_level]);
>>>>>> up_read(&fs_info->commit_root_sem);
>>>>>>
>>>>>> @@ -5613,6 +5633,8 @@ int btrfs_compare_trees(struct btrfs_root
>>>>>> *left_root,
>>>>>> out:
>>>>>> btrfs_free_path(left_path);
>>>>>> btrfs_free_path(right_path);
>>>>>> + free_extent_buffer(left_root_node);
>>>>>> + free_extent_buffer(right_root_node);
>>>>>> kvfree(tmp_buf);
>>>>>> return ret;
>>>>>> }
>>>>>> --
>>>>>> 1.9.1
>>>>>>
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>>>> linux-btrfs"
>>>>>> in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>
>>>>
>>>>
>>
prev parent reply other threads:[~2018-05-09 8:36 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-08 8:15 [PATCH] btrfs: incremental send, fix BUG when parent commit_root changed robbieko
2018-05-08 9:15 ` Filipe Manana
2018-05-08 9:21 ` robbieko
2018-05-08 10:30 ` robbieko
2018-05-08 11:12 ` Filipe Manana
2018-05-09 1:10 ` robbieko
2018-05-09 8:29 ` Filipe Manana
2018-05-09 8:36 ` robbieko [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9431a6574bbf4df48728cb254ba602cf@synology.com \
--to=robbieko@synology.com \
--cc=fdmanana@gmail.com \
--cc=linux-btrfs-owner@vger.kernel.org \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).