From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Josef Bacik <josef@toxicpanda.com>, Qu Wenruo <wqu@suse.com>,
linux-btrfs@vger.kernel.org
Cc: Hans van Kranenburg <hans@knorrie.org>
Subject: Re: [PATCH RFC 2/2] btrfs: space-info: Don't allow signal to interrupt ticket waiting
Date: Mon, 6 Jul 2020 22:05:29 +0800 [thread overview]
Message-ID: <13f7d3b6-e555-38cc-b73c-817bab70924c@gmx.com> (raw)
In-Reply-To: <d01c2c4a-0c3e-c599-fd64-715c000ad31f@toxicpanda.com>
[-- Attachment #1.1: Type: text/plain, Size: 4025 bytes --]
On 2020/7/6 下午9:53, Josef Bacik wrote:
> On 7/6/20 9:50 AM, Qu Wenruo wrote:
>>
>>
>> On 2020/7/6 下午9:45, Josef Bacik wrote:
>>> On 7/6/20 3:44 AM, Qu Wenruo wrote:
>>>> [BUG]
>>>> When balance receive a fatal signal, it can make the fs to read-only
>>>> mode if the timing is unlucky enough:
>>>>
>>>> BTRFS info (device xvdb): balance: start -d -m -s
>>>> BTRFS info (device xvdb): relocating block group 73001861120 flags
>>>> metadata
>>>> BTRFS info (device xvdb): found 12236 extents, stage: move data
>>>> extents
>>>> BTRFS info (device xvdb): relocating block group 71928119296 flags
>>>> data
>>>> BTRFS info (device xvdb): found 3 extents, stage: move data extents
>>>> BTRFS info (device xvdb): found 3 extents, stage: update data
>>>> pointers
>>>> BTRFS info (device xvdb): relocating block group 60922265600 flags
>>>> metadata
>>>> BTRFS: error (device xvdb) in btrfs_drop_snapshot:5505: errno=-4
>>>> unknown
>>>> BTRFS info (device xvdb): forced readonly
>>>> BTRFS info (device xvdb): balance: ended with status: -4
>>>>
>>>> [CAUSE]
>>>> This is caused by the fact that btrfs ticketing space system can be
>>>> interrupted, and cause all kind of -EINTR returned to various critical
>>>> section, where we never thought of -EINTR at all.
>>>>
>>>> Even for things like btrfs_start_transaction() can be affected by
>>>> signal:
>>>> btrfs_start_transaction()
>>>> |- start_transaction(flush = FLUSH_ALL)
>>>> |- btrfs_block_rsv_add()
>>>> |- btrfs_reserve_metadata_bytes()
>>>> |- __reserve_metadata_bytes()
>>>> |- handle_reserve_ticket()
>>>> |- wait_reserve_ticket()
>>>> |- prepare_to_wait_event(TASK_KILLABLE)
>>>> |- ticket->error = -EINTR;
>>>>
>>>> And all related callers get -EINTR error.
>>>>
>>>> In fact, there are really very limited call sites can really handle
>>>> that
>>>> -EINTR properly, above btrfs_drop_snapshot() is one case.
>>>>
>>>> [FIX]
>>>> Things like metadata allocation is really a critical section for btrfs,
>>>> we don't really want it to be that killable by some impatient users.
>>>>
>>>> In fact, for really long duration calls, it should have their own
>>>> checks
>>>> on signal, like balance, reflink, generic fiemap calls.
>>>>
>>>> So this patch will make ticket waiting uninterruptible, relying on each
>>>> long duration calls to handle their signals more properly.
>>>>
>>>
>>> Nope, everybody that calls start_transaction() should be able to handle
>>> -EINTR, so we need to find those callsites and fix them, not make it so
>>> we hang the box because we don't like fixing error paths. Thanks,
>>
>> Then we also need to handle btrfs_delalloc_reserve_metadata(),
>> btrfs_block_rsv_refill(), btrfs_use_block_rsv(), btrfs_block_rsv_add().
>>
>
> This only needs to be handled for FLUSH_ALL and FLUSH_STEAL. Anybody
> doing btrfs_start_transaction() should be able to fail with -EINTR,
> because they should be close to the syscall path. Balance needs to be
> fixed to deal with it, and I assume there might be a few other places,
> but by-in-large none of these places should flip read only. Thanks,
There are already ones existing, for btrfs_drop_snapshot().
This is mostly caused by btrfs_delalloc_reserve_metadata(), which always
use FLUSH_ALL unless there is a running trans or its space cache inode.
But still, for a lot of relocation code, we don't really want to bother
the EINTR and just want uninterruptible at least for now.
Any idea for that?
Or just rework how we handle errors in a lot of places?
It doesn't look correct for such a low level mechanism to return -EINTR
while most of callers doesn't really want to bother.
Thanks,
Qu
>
> Josef
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2020-07-06 14:05 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-06 7:44 [PATCH RFC 0/2] btrfs: make ticket wait uninterruptible to address unexpected RO during balance Qu Wenruo
2020-07-06 7:44 ` [PATCH RFC 1/2] btrfs: relocation: Allow signal to cancel balance Qu Wenruo
2020-07-06 13:45 ` Josef Bacik
2020-07-06 18:19 ` Hans van Kranenburg
2020-07-06 22:43 ` Qu Wenruo
2020-07-06 7:44 ` [PATCH RFC 2/2] btrfs: space-info: Don't allow signal to interrupt ticket waiting Qu Wenruo
2020-07-06 13:45 ` Josef Bacik
2020-07-06 13:50 ` Qu Wenruo
2020-07-06 13:53 ` Josef Bacik
2020-07-06 14:05 ` Qu Wenruo [this message]
2020-07-06 14:33 ` Josef Bacik
2020-07-07 16:16 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=13f7d3b6-e555-38cc-b73c-817bab70924c@gmx.com \
--to=quwenruo.btrfs@gmx.com \
--cc=hans@knorrie.org \
--cc=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox