Linux Btrfs filesystem development
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Christoph Hellwig <hch@infradead.org>,
	dsterba@suse.cz, Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH DRAFT] btrfs: RAID56J journal on-disk format draft
Date: Wed, 25 May 2022 17:13:11 +0800
Message-ID: <bd6ac4d4-41bf-f662-e7c0-7841895554a6@gmx.com>
In-Reply-To: <Yo3wRJO/h+Cx47bw@infradead.org>



On 2022/5/25 17:00, Christoph Hellwig wrote:
> On Tue, May 24, 2022 at 07:02:34PM +0200, David Sterba wrote:
>> Well, that does not sound encouraging. One option discussed in the past
>> how to fix the write hole was to always do full RMW cycle. Having a "not
>> fast journal at all" would require a format change and have probably a
>> comparable performance drop.
>
> So maybe I'm just dumb, but what is the problem with only using
> raid56 for data, forbidding nowcow for it and thus avoiding the
> problem entirely?

The problem is that we can have partial writes on RAID56 whether or not
we use NODATACOW.

For example, take a very typical 3-disk RAID5:

	0	32K	64K
Disk 1  |DDDDDDD|       |
Disk 2  |ddddddd|ddddddd|
Disk 3  |PPPPPPP|PPPPPPP|


D = old data, which has been there for a while.
d = new data, which we want to write.

Now the bios for disk 1 and disk 3 finish, but before the bio for disk 2
can finish, we hit a power loss.

Btrfs reverts to the old data, so we should still see only D, and none of
the new data d.

So far so good.

But what if disk 1 now disappears?

To read out the old data (DDDDDDD), we need to rebuild it using disk 2 and
disk 3.

But note that disk 3 now has the new parity, while disk 2 still holds its
old content.

So the recovered data will be wrong, and will not pass the btrfs csum check.

This is the write-hole problem: it doesn't screw up all our data all of a
sudden, but it corrupts our data a few bytes at a time, each time we hit a
power loss.
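
To make the sequence above concrete, here is a minimal sketch in Python.
It is only a toy model of XOR parity, not btrfs code: the 8-byte stripe
length, the dict of disks and all names are made up for illustration, and
it assumes the old content of disk 2 is zeroes.

```python
# Toy model of the RAID5 write hole described above (not btrfs code).
# Assumptions: 8-byte stripes, XOR parity, disk 2 initially all zeroes.

def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR, i.e. RAID5 parity for two data stripes."""
    return bytes(x ^ y for x, y in zip(a, b))

old_d1 = b"DDDDDDDD"   # disk 1: old data, committed long ago
old_d2 = bytes(8)      # disk 2: old content (assumed zeroes)
new_d2 = b"dddddddd"   # new data we want to write to disk 2

# Consistent state before the write: parity covers the old contents.
disks = {1: old_d1, 2: old_d2, 3: xor(old_d1, old_d2)}

# Partial write: the parity bio for disk 3 completes...
disks[3] = xor(old_d1, new_d2)
# ...but power is lost before the data bio for disk 2 completes,
# so disks[2] still holds its old content.

# With all disks present, the old data on disk 1 is still readable.
still_fine = disks[1] == old_d1

# Now disk 1 disappears: rebuild it from disk 2 and disk 3.
rebuilt_d1 = xor(disks[2], disks[3])
write_hole = rebuilt_d1 != old_d1   # True: old disk 2 XOR new parity

# For contrast: had the disk 2 bio completed, the rebuild would be right.
rebuilt_if_complete = xor(new_d2, disks[3])

print(still_fine, write_hole, rebuilt_if_complete == old_d1)
```

The rebuilt stripe is the XOR of the old disk 2 content and the new
parity, so it matches neither the old nor the new data, which is why the
csum check fails.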

Thanks,
Qu

