public inbox for linux-btrfs@vger.kernel.org
From: Goffredo Baroncelli <kreijack@libero.it>
To: Hubert Tonneau <hubert.tonneau@fullpliant.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Avoiding BRTFS RAID5 write hole
Date: Tue, 12 Nov 2019 20:49:33 +0100
Message-ID: <441de86e-a0ea-fdc9-7fce-bb2cf56d0be8@libero.it>
In-Reply-To: <0JG8IAF11@briare1.fullpliant.org>

On 12/11/2019 16.13, Hubert Tonneau wrote:
> Hi,
> 
> In order to close the RAID5 write hole, I propose to add a mount option that would change RAID5 (and RAID6) behaviour:
> 
> . When overwriting a RAID5 stripe, first convert it to RAID1 (convert it to RAID1C3 if it was RAID6)

You can't overwrite and convert an existing stripe, for two reasons:
1) you still have to protect the overwrite of the stripe itself from the write hole
2) depending on the layout, a raid1 stripe consumes more space than a raid5 stripe of equal "capacity"
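
To make the two points concrete, here is a minimal Python sketch. It is a toy model, not btrfs code: the 4-byte blocks, the single XOR parity and all function names are assumptions chosen for illustration. The first part shows how an in-place overwrite that is interrupted before the parity update corrupts an untouched block when it is later rebuilt; the second part compares the raw space that raid5 and raid1 need for the same amount of data.

```python
from functools import reduce

def xor_parity(blocks):
    """RAID5-style parity: byte-wise XOR of all blocks passed in."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild(surviving_blocks, parity):
    """Reconstruct one lost data block from the surviving blocks and parity."""
    return xor_parity(list(surviving_blocks) + [parity])

# Point 1: the write hole. A 3-disk stripe: two data blocks plus parity.
d0, d1 = b"AAAA", b"BBBB"
parity = xor_parity([d0, d1])

# Overwrite d0 in place, then "crash" before the parity is rewritten.
d0_new = b"CCCC"
stale_parity = parity  # the parity update never reached the disk

# If the disk holding d1 fails now, d1 is rebuilt from d0_new and stale parity.
d1_rebuilt = rebuild([d0_new], stale_parity)
write_hole_hit = d1_rebuilt != d1  # True: data that was never written is lost

# Point 2: raw space needed to store the same amount of data.
def raw_bytes_raid5(data_bytes, num_disks):
    """One parity element per stripe: raw = data * n / (n - 1)."""
    return data_bytes * num_disks / (num_disks - 1)

def raw_bytes_raid1(data_bytes, copies=2):
    """Every byte is stored `copies` times, whatever the disk count."""
    return data_bytes * copies
```

With 5 disks, for example, `raw_bytes_raid5(4, 5)` gives 5 units of raw space for 4 units of data, while `raw_bytes_raid1(4)` gives 8, so a stripe converted to raid1 does not fit in the space the raid5 stripe occupied.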

So you have to write the data (temporarily) to another place. This is not much different from what Qu proposed a few years ago:

https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg66472.html [Btrfs: Add journal for raid5/6 writes]

where he added a device for logging the writes.

Unfortunately, this means doubling the writes; for a COW filesystem (which already suffers from this kind of issue) that would be a big performance penalty.
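
The doubling can be seen with a small accounting sketch. The `CountingDevice` class and `journaled_stripe_update` function are hypothetical names invented for this illustration and do not reflect the actual patch; the only assumption is that every block of the update (data and parity) goes to the log first and is then written again to its final location.

```python
class CountingDevice:
    """Hypothetical block device that only counts the blocks written to it."""

    def __init__(self):
        self.blocks_written = 0

    def write(self, blocks):
        self.blocks_written += len(blocks)

def journaled_stripe_update(log_dev, array, data_blocks, parity_blocks):
    """Write the update to the log, then replay it to its final location."""
    payload = list(data_blocks) + list(parity_blocks)
    log_dev.write(payload)  # 1st write: make the update durable in the log
    array.write(payload)    # 2nd write: the same blocks, at their final place

log_dev, array = CountingDevice(), CountingDevice()
journaled_stripe_update(log_dev, array,
                        data_blocks=[b"d0", b"d1"], parity_blocks=[b"p"])

# Three blocks of payload cost six block writes in total.
total_blocks_written = log_dev.blocks_written + array.blocks_written
```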

Instead I would like to investigate the idea of COW-ing the stripe: instead of updating the stripe in place, why not write the new stripe to another place and then update the data extent to point to the new data? Of course this would work only for data and not for metadata.
Pros: the data is written only once
Cons: the pressure on the metadata would increase; fragmentation would increase
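
A rough sketch of the idea, under stated assumptions: `ToyRaid5Volume` is a hypothetical in-memory model, a "stripe" is just a tuple of data and parity elements, and the extent map stands in for the filesystem metadata. It is only meant to show why a crash in the middle of a COW update leaves the old stripe readable and consistent.

```python
class ToyRaid5Volume:
    """Hypothetical in-memory model of COW stripe updates; not btrfs code."""

    def __init__(self):
        self.stripes = {}      # physical location -> full stripe (data + parity)
        self.extent_map = {}   # logical extent id -> physical location
        self.next_free = 0
        self.stripe_writes = 0

    def _write_new_stripe(self, stripe):
        """Write a full stripe to a fresh location, never over live data."""
        location = self.next_free
        self.next_free += 1
        self.stripes[location] = stripe
        self.stripe_writes += 1
        return location

    def create(self, extent_id, stripe):
        self.extent_map[extent_id] = self._write_new_stripe(stripe)

    def cow_update(self, extent_id, new_stripe, crash_before_commit=False):
        new_location = self._write_new_stripe(new_stripe)
        if crash_before_commit:
            # The new stripe is orphaned (this toy never reclaims it), but the
            # old stripe is untouched and the extent still points to it.
            return
        old_location = self.extent_map[extent_id]
        self.extent_map[extent_id] = new_location  # the metadata update
        del self.stripes[old_location]             # old stripe can be freed

    def read(self, extent_id):
        return self.stripes[self.extent_map[extent_id]]

vol = ToyRaid5Volume()
vol.create("extent-A", ("old-d0", "old-d1", "old-parity"))

# Crash after writing the new stripe but before updating the extent.
vol.cow_update("extent-A", ("new-d0", "new-d1", "new-parity"),
               crash_before_commit=True)
after_crash = vol.read("extent-A")   # still the complete old stripe

# Retry without a crash: the extent now points to the new stripe.
vol.cow_update("extent-A", ("new-d0", "new-d1", "new-parity"))
after_commit = vol.read("extent-A")
```

Each successful update costs one stripe write plus one extent-map change, which matches the pros and cons above: the data is written once, and the extra work lands on the metadata.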


> 
> . Have a background process that converts RAID1 stripes to RAID5 (RAID1C3 to RAID6)
> 
> Expected advantages are :
> . the low level features set basically remains the same
> . the filesystem format remains the same
> . old kernels and btrfs-progs would not be disturbed
> 
> The end result would be a mixed filesystem where the active parts are RAID1 and the archived ones are RAID5.
> 
> Regards,
> Hubert Tonneau
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

