* Raid5 Write Hole: Is it worse than in MD?
@ 2020-10-13 9:34 Hendrik Friedel
2020-10-13 9:43 ` Johannes Thumshirn
2020-10-13 22:54 ` Zygo Blaxell
0 siblings, 2 replies; 4+ messages in thread
From: Hendrik Friedel @ 2020-10-13 9:34 UTC
To: linux-btrfs
Hello,
I recently read this article about the write-hole in md:
https://lwn.net/Articles/665299/
Whilst the article is focused on the journal as a fix for the write hole
(by the way: Is that possible with btrfs?), it made me wonder, if the
write hole in btrfs is any worse than in md?
Regards,
Hendrik
* Re: Raid5 Write Hole: Is it worse than in MD?
From: Johannes Thumshirn @ 2020-10-13 9:43 UTC
To: Hendrik Friedel, linux-btrfs@vger.kernel.org
On 13/10/2020 11:34, Hendrik Friedel wrote:
> Whilst the article is focused on the journal as a fix for the write hole
> (by the way: Is that possible with btrfs?), it made me wonder, if the
> write hole in btrfs is any worse than in md?
Not a direct answer to your question, but IMHO adding a journal isn't the
right fix for btrfs. The correct fix for the write hole (and other problems
we encountered with btrfs raid5/6) would be a raid stripe tree.
This is something I'm currently investigating.
For the other problems of raid56, Zygo once compiled a very comprehensive list,
but I don't have the link anymore.
Byte,
Johannes
* Re: Raid5 Write Hole: Is it worse than in MD?
From: Piotr Szymaniak @ 2020-10-13 13:46 UTC
To: Johannes Thumshirn; +Cc: Hendrik Friedel, linux-btrfs@vger.kernel.org
On Tue, Oct 13, 2020 at 09:43:25AM +0000, Johannes Thumshirn wrote:
> *snip*
> For the other problems of raid56, Zygo once compiled a very comprehensive list,
> but I don't have the link anymore.
These lists (user and dev versions):
https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@hungrycats.org/
https://lore.kernel.org/linux-btrfs/20200627030614.GW10769@hungrycats.org/
Best regards,
Piotr Szymaniak.
--
I think I have to get back to the shop now. Kelly is all right, but
sometimes switches off completely, and on top of that doesn't believe in
such a thing as responsibility. It has something to do with that sect
Kelly belongs to. Maharishi Water-on-the-brain or something like that.
-- Graham Masterton, "Mirror"
* Re: Raid5 Write Hole: Is it worse than in MD?
From: Zygo Blaxell @ 2020-10-13 22:54 UTC
To: Hendrik Friedel; +Cc: linux-btrfs
On Tue, Oct 13, 2020 at 09:34:50AM +0000, Hendrik Friedel wrote:
> Hello,
>
> I recently read this article about the write-hole in md:
> https://lwn.net/Articles/665299/
>
> Whilst the article is focused on the journal as a fix for the write hole (by
> the way: Is that possible with btrfs?), it made me wonder, if the write hole
> in btrfs is any worse than in md?
It is hard to compare them directly, because the write hole is only one
of several ways a raid5 array can fail on either mdadm or btrfs, and both
have significant shortcomings.
btrfs and mdadm have separate strengths and weaknesses in their raid5
implementations. For example, btrfs can often recover from data corruption
that is not reported by the drives, while mdadm can't detect or repair it.
On the other hand, mdadm has no problems reading a degraded non-corrupted
raid5 array that I know of, while btrfs has some known troubles there.
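To illustrate the corruption-recovery point above, here is a minimal Python sketch, not btrfs or mdadm code. It assumes a toy stripe of two data blocks plus XOR parity, and uses `zlib.crc32` as a stand-in checksum (btrfs defaults to crc32c, which is a different polynomial). All names are hypothetical. With only parity, a mismatch says that something in the stripe is wrong but not which block; a per-block checksum identifies the bad block, so it can be rebuilt from the others and verified.

```python
import zlib


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks (raid5 parity is the XOR of the data blocks)."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


# Toy stripe: two data blocks and one parity block (hypothetical contents).
d0 = b"data-0__"
d1 = b"data-1__"
parity = xor_blocks(d0, d1)

# Stand-in for a per-block checksum stored separately from the data.
csum_d0 = zlib.crc32(d0)

# The drive silently returns a corrupted d0 without reporting an error.
d0_on_disk = b"dXta-0__"

# Parity alone: the stripe is inconsistent, but which block is wrong is unknown.
parity_mismatch = xor_blocks(d0_on_disk, d1) != parity

# With a checksum: d0 is identified as bad, rebuilt from d1 and parity, then verified.
csum_mismatch = zlib.crc32(d0_on_disk) != csum_d0
repaired_d0 = xor_blocks(d1, parity)
repair_verified = zlib.crc32(repaired_d0) == csum_d0
```

The sketch only covers a single corrupted data block with intact parity; it says nothing about the degraded-read problems mentioned above.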
It's possible to implement a raid5 stripe update journal (or tree), but
it's not the only possible solution, and it may be only part of a complete
solution.
Other possible solutions include:
- adjust the allocator to minimize stripe RMW update operations
(effectively banning them outright for datacow and metadata), or
- throw out the existing raid5/6 implementation and start
over with an implementation that works in harmony with the
copy-on-write semantics, more like the way data compression in
btrfs works now (effectively solving the problem the same way
ZFS did).
These all have various performance and capability tradeoffs. Some of
them can even be combined (e.g. minimize RMW updates with allocator
changes, fall back to stripe log tree for the rest).
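For readers unfamiliar with why stripe RMW updates matter here, the following is a minimal Python sketch of the write hole itself. It is an illustration under simplifying assumptions (one stripe, two data blocks, XOR parity, hypothetical names), not how either btrfs or mdadm is implemented. An RMW update rewrites one data block and the parity; if a crash lands between the two writes and a device is later lost, rebuilding from the stale parity corrupts a block that was never being written.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks (raid5 parity is the XOR of the data blocks)."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


def rebuild_d1(d0: bytes, parity: bytes) -> bytes:
    """Reconstruct the missing block d1 from the surviving block and parity."""
    return xor_blocks(d0, parity)


# One stripe: d1 is previously committed data that is NOT being rewritten.
d0_old = b"AAAA"
d1 = b"BBBB"
parity_old = xor_blocks(d0_old, d1)

# RMW update of d0 only; a complete update must also write the new parity.
d0_new = b"CCCC"
parity_new = xor_blocks(d0_new, d1)

# Case 1: both writes reached disk, then the device holding d1 fails.
rebuilt_consistent = rebuild_d1(d0_new, parity_new)

# Case 2 (write hole): crash after d0_new was written but before parity_new,
# then the device holding d1 fails. The stale parity yields garbage for d1.
rebuilt_torn = rebuild_d1(d0_new, parity_old)

print(rebuilt_consistent)  # b'BBBB'
print(rebuilt_torn)        # b'@@@@'
```

An allocator that never places new writes in a stripe that already holds committed data avoids case 2 by construction, and a journal or stripe log makes the two writes recoverable as a unit. Those are the tradeoffs listed above.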
> Regards,
> Hendrik
>