* Raid5 Write Hole: Is it worse than in MD?
@ 2020-10-13 9:34 Hendrik Friedel
  2020-10-13 9:43 ` Johannes Thumshirn
  2020-10-13 22:54 ` Zygo Blaxell, @hungrycats.org
  0 siblings, 2 replies; 4+ messages in thread
From: Hendrik Friedel @ 2020-10-13 9:34 UTC
To: linux-btrfs

Hello,

I recently read this article about the write-hole in md:
https://lwn.net/Articles/665299/

Whilst the article is focused on the journal as a fix for the write hole
(by the way: Is that possible with btrfs?), it made me wonder, if the
write hole in btrfs is any worse than in md?

Regards,
Hendrik

* Re: Raid5 Write Hole: Is it worse than in MD?
From: Johannes Thumshirn @ 2020-10-13 9:43 UTC
To: Hendrik Friedel, linux-btrfs@vger.kernel.org

On 13/10/2020 11:34, Hendrik Friedel wrote:
> Whilst the article is focused on the journal as a fix for the write hole
> (by the way: Is that possible with btrfs?), it made me wonder, if the
> write hole in btrfs is any worse than in md?

Not a direct answer to your question, but IMHO adding a journal isn't the
right fix for btrfs. The correct fix for the write hole (and other problems
we encountered with btrfs raid5/6) would be a raid stripe tree. This is
something I'm currently investigating.

For the other problems of raid56, Zygo once compiled a very comprehensive
list, but I don't have the link anymore.

Byte,
Johannes

* Re: Raid5 Write Hole: Is it worse than in MD?
From: Piotr Szymaniak @ 2020-10-13 13:46 UTC
To: Johannes Thumshirn; +Cc: Hendrik Friedel, linux-btrfs@vger.kernel.org

On Tue, Oct 13, 2020 at 09:43:25AM +0000, Johannes Thumshirn wrote:
> *snip*
> For the other problems of raid56, Zygo once compiled a very comprehensive list,
> but I don't have the link anymore.

This list (both user/dev):
https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@hungrycats.org/
https://lore.kernel.org/linux-btrfs/20200627030614.GW10769@hungrycats.org/

Best regards,
Piotr Szymaniak.
-- 
I think I'd better get back to the shop. Kelly is all right, but can
switch off completely at times. And on top of that, doesn't believe in
such a thing as responsibility. It has something to do with that sect
Kelly belongs to. Maharishi Water-on-the-Brain or something like that.
  -- Graham Masterton, "Mirror"

* Re: Raid5 Write Hole: Is it worse than in MD?
From: Zygo Blaxell, @hungrycats.org @ 2020-10-13 22:54 UTC
To: Hendrik Friedel; +Cc: linux-btrfs

On Tue, Oct 13, 2020 at 09:34:50AM +0000, Hendrik Friedel wrote:
> Hello,
>
> I recently read this article about the write-hole in md:
> https://lwn.net/Articles/665299/
>
> Whilst the article is focused on the journal as a fix for the write hole (by
> the way: Is that possible with btrfs?), it made me wonder, if the write hole
> in btrfs is any worse than in md?

It is hard to compare them directly, because write hole is only one of
several ways a raid5 array can fail on either mdadm or btrfs, and both
have significant shortcomings.

btrfs and mdadm have separate strengths and weaknesses in their raid5
implementations. e.g. btrfs can often recover from data corruption that
is not reported by the drives, while mdadm can't detect or repair it.
On the other hand, mdadm has no problems reading a degraded non-corrupted
raid5 array that I know of, while btrfs has some known troubles there.

It's possible to implement a raid5 stripe update journal (or tree), but
it's not the only possible solution (or only part of a complete solution).
Other possible solutions include:

	- adjust the allocator to minimize stripe RMW update operations
	  (effectively banning them outright for datacow and metadata), or

	- throw out the existing raid5/6 implementation and start over
	  with an implementation that works in harmony with the copy-on-write
	  semantics, more like the way data compression in btrfs works now
	  (effectively solving the problem the same way ZFS did).

These all have various performance and capability tradeoffs. Some of
them can even be combined (e.g. minimize RMW updates with allocator
changes, fall back to stripe log tree for the rest).

> Regards,
> Hendrik
>
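
For readers new to the term, the write hole that this thread discusses can
be shown with a toy model. The following is only a minimal sketch in Python,
not btrfs or md code: the three-device stripe, the block contents, and the
names xor_blocks and simulate_crash are all made up for illustration. It
models a read-modify-write (RMW) update of one data block that is
interrupted before the parity block is rewritten, followed by the loss of
the other data device.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte; this is the RAID5 parity function."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def simulate_crash(parity_updated: bool) -> bytes:
    """Update d0 in a toy 2-data + 1-parity stripe, then lose the d1 device.

    Returns d1 as reconstructed from the surviving d0 and parity blocks.
    """
    d0, d1 = b"AAAA", b"BBBB"
    parity = xor_blocks(d0, d1)

    # Read-modify-write update: only d0 changes, d1 is never written.
    d0 = b"CCCC"
    if parity_updated:
        parity = xor_blocks(d0, d1)  # both writes reached the disks
    # else: crash between the data write and the parity write (stale parity)

    # The device holding d1 fails; rebuild it from what survives.
    return xor_blocks(d0, parity)


if __name__ == "__main__":
    print(simulate_crash(parity_updated=True))   # b'BBBB', d1 recovered
    print(simulate_crash(parity_updated=False))  # b'@@@@', d1 is garbage
```

Note that the damaged block in the second case is d1, which was never
written during the update. The approaches mentioned above attack the same
window from different sides: a stripe journal or tree makes the data and
parity writes recoverable as a unit, while allocator changes try to avoid
RMW updates of stripes that already hold committed data.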