Re: [PATCH v2] btrfs: qgroup: Fix data leakage caused by race between writeback and truncate

public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed

From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Josef Bacik <josef@toxicpanda.com>, Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2] btrfs: qgroup: Fix data leakage caused by race between writeback and truncate
Date: Sat, 18 Jul 2020 07:38:27 +0800	[thread overview]
Message-ID: <dcc47e7f-53e0-e832-0e39-e8c1d82e318e@gmx.com> (raw)
In-Reply-To: <9b03ca60-e56f-442c-7558-3ca1b2b1df77@toxicpanda.com>


[-- Attachment #1.1: Type: text/plain, Size: 3766 bytes --]



On 2020/7/17 下午11:30, Josef Bacik wrote:
> On 7/17/20 3:12 AM, Qu Wenruo wrote:
>> [BUG]
>> When running tests like generic/013 on test device with btrfs quota
>> enabled, it can normally lead to data leakage, detected at unmount time:
>>
>>    BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type
>> 0 rsv 4096
>>    ------------[ cut here ]------------
>>    WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142
>> close_ctree+0x1dc/0x323 [btrfs]
>>    RIP: 0010:close_ctree+0x1dc/0x323 [btrfs]
>>    Call Trace:
>>     btrfs_put_super+0x15/0x17 [btrfs]
>>     generic_shutdown_super+0x72/0x110
>>     kill_anon_super+0x18/0x30
>>     btrfs_kill_super+0x17/0x30 [btrfs]
>>     deactivate_locked_super+0x3b/0xa0
>>     deactivate_super+0x40/0x50
>>     cleanup_mnt+0x135/0x190
>>     __cleanup_mnt+0x12/0x20
>>     task_work_run+0x64/0xb0
>>     __prepare_exit_to_usermode+0x1bc/0x1c0
>>     __syscall_return_slowpath+0x47/0x230
>>     do_syscall_64+0x64/0xb0
>>     entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>    ---[ end trace caf08beafeca2392 ]---
>>    BTRFS error (device dm-3): qgroup reserved space leaked
>>
>> [CAUSE]
>> In the offending case, the offending operations are:
>> 2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0
>> 2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0
>>
>> The following sequence of events could happen after the writev():
>>     CPU1 (writeback)        |        CPU2 (truncate)
>> -----------------------------------------------------------------
>> btrfs_writepages()            |
>> |- extent_write_cache_pages()        |
>>     |- Got page for 1003520        |
>>     |  1003520 is Dirty, no writeback    |
>>     |  So (!clear_page_dirty_for_io())   |
>>     |  gets called for it        |
>>     |- Now page 1003520 is Clean.    |
>>     |                    | btrfs_setattr()
>>     |                    | |- btrfs_setsize()
>>     |                    |    |- truncate_setsize()
>>     |                    |       New i_size is 18388
>>     |- __extent_writepage()        |
>>     |  |- page_offset() > i_size        |
>>        |- btrfs_invalidatepage()        |
>>      |- Page is clean, so no qgroup |
>>         callback executed
>>
>> This means, the qgroup reserved data space is not properly released in
>> btrfs_invalidatepage() as the page is Clean.
>>
>> [FIX]
>> Instead of checking the dirty bit of a page, call
>> btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage().
>>
>> As qgroup rsv are completely binded to the QGROUP_RESERVED bit of
>> io_tree, not binded to page status, thus we won't cause double freeing
>> anyway.
>>
>> Fixes: 0b34c261e235 ("btrfs: qgroup: Prevent qgroup->reserved from
>> going subzero")
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>
> 
> I don't understand how this is ok.  We can call invalidatepage via
> memory pressure, so what if we have started the write and have an
> ordered extent outstanding, and then we call into invalidate page and
> now unconditionally drop the qgroup reservation, even tho we still need
> it for the ordered extent.  Am I missing something here?  Thanks,

As long as the ordered extent as been started
(__btrfs_add_ordered_extent()), then the QGROUP_RESERVED bit is cleared,
either freed for NODATACOW write, or released for COW writes.

IIRC this recent change is suggested by you, and that paved the road for
this fix.

Thanks,
Qu
> 
> Josef


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

next prev parent reply	other threads:[~2020-07-17 23:38 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-17  7:12 [PATCH v2] btrfs: qgroup: Fix data leakage caused by race between writeback and truncate Qu Wenruo
2020-07-17 15:30 ` Josef Bacik
2020-07-17 23:38   ` Qu Wenruo [this message]
2020-07-17 23:56     ` Josef Bacik
2020-07-20 15:00 ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dcc47e7f-53e0-e832-0e39-e8c1d82e318e@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox