* Planning out new fs. Am I missing anything?
@ 2020-05-25  1:13 Justin Engwer

From: Justin Engwer
To: linux-btrfs

Hi, I'm the guy who lost all his VMs due to a massive configuration
oversight.

I'm looking to implement the remaining 4 x 3 TB drives into a new fs
and just want someone to look over things. I'm intending to use them
for backup storage (Veeam).

CentOS 7, kernel 5.5.2-1.el7.elrepo.x86_64
btrfs-progs v4.9.1

mkfs.btrfs -m raid1c4 -d raid1 /dev/disk/by-id/ata-ST3000*-part1
echo "UUID=whatever /mnt/btrfs/ btrfs defaults,space_cache=v2 0 2" >> /etc/fstab
mount /mnt/btrfs

RAID1 data over 4 disks and RAID1C4 metadata, mounting with
space_cache=v2. Any other mount switches or btrfs creation switches I
should be aware of? Should I consider RAID5/6 instead? 6 TB should be
sufficient, so it's not like I'd get anything out of RAID5, but RAID6 I
suppose could provide a little more safety in the case of multiple
drive failures at once.

Cheers,
Justin
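For reference, a minimal sketch of the sequence above. Note that
raid1c4 requires kernel 5.5+ and btrfs-progs 5.5+, so the v4.9.1 progs
listed would need updating first; the device glob and UUID placeholder
below are assumptions, not verified values:

    # create the filesystem: raid1 data, raid1c4 metadata (needs progs >= 5.5)
    mkfs.btrfs -m raid1c4 -d raid1 /dev/disk/by-id/ata-ST3000*-part1

    # read back the new filesystem UUID from the member devices
    blkid -s UUID /dev/disk/by-id/ata-ST3000*-part1

    # add the fstab entry and mount (substitute the real UUID)
    mkdir -p /mnt/btrfs
    echo "UUID=<uuid> /mnt/btrfs btrfs defaults,space_cache=v2 0 2" >> /etc/fstab
    mount /mnt/btrfs

    # confirm the data/metadata profiles came out as intended
    btrfs filesystem usage /mnt/btrfs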
* Re: Planning out new fs. Am I missing anything?
@ 2020-05-26 12:47 Neal Gompa

From: Neal Gompa
To: Justin Engwer
Cc: Btrfs BTRFS

On Sun, May 24, 2020 at 9:35 PM Justin Engwer <justin@mautobu.com> wrote:
>
> Hi, I'm the guy who lost all his VMs due to a massive configuration
> oversight.
>
> I'm looking to implement the remaining 4 x 3 TB drives into a new fs
> and just want someone to look over things. I'm intending to use them
> for backup storage (Veeam).
>
> CentOS 7, kernel 5.5.2-1.el7.elrepo.x86_64
> btrfs-progs v4.9.1
>
> mkfs.btrfs -m raid1c4 -d raid1 /dev/disk/by-id/ata-ST3000*-part1
> echo "UUID=whatever /mnt/btrfs/ btrfs defaults,space_cache=v2 0 2" >> /etc/fstab
> mount /mnt/btrfs
>
> RAID1 data over 4 disks and RAID1C4 metadata, mounting with
> space_cache=v2. Any other mount switches or btrfs creation switches I
> should be aware of? Should I consider RAID5/6 instead? 6 TB should be
> sufficient, so it's not like I'd get anything out of RAID5, but RAID6 I
> suppose could provide a little more safety in the case of multiple
> drive failures at once.

In general, this looks fine, but I'd suggest that you switch to CentOS 8.
There's a COPR for btrfs-progs for EL8 that keeps in sync with Fedora:
https://copr.fedorainfracloud.org/coprs/ngompa/btrfs-progs-el8/

For CentOS 8, you should continue to plan to use ELRepo.org kernels. :)

--
真実はいつも一つ! / Always, there's only one truth!
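A sketch of the suggested EL8 setup, assuming the dnf copr plugin and
the 2020-era ELRepo release package URL (both are assumptions worth
verifying against current documentation):

    # enable the Fedora-synced btrfs-progs COPR on CentOS 8
    dnf install -y dnf-plugins-core
    dnf copr enable ngompa/btrfs-progs-el8
    dnf install -y btrfs-progs

    # btrfs kernel support on EL8 still comes from ELRepo (kernel-ml)
    dnf install -y https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm
    dnf --enablerepo=elrepo-kernel install -y kernel-ml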
* Re: Planning out new fs. Am I missing anything?
@ 2020-05-27  2:20 Chris Murphy

From: Chris Murphy
To: Justin Engwer
Cc: Btrfs BTRFS

On Sun, May 24, 2020 at 7:13 PM Justin Engwer <justin@mautobu.com> wrote:
>
> Hi, I'm the guy who lost all his VMs due to a massive configuration
> oversight.
>
> I'm looking to implement the remaining 4 x 3 TB drives into a new fs
> and just want someone to look over things. I'm intending to use them
> for backup storage (Veeam).
>
> CentOS 7, kernel 5.5.2-1.el7.elrepo.x86_64
> btrfs-progs v4.9.1

I suggest updating the btrfs-progs; that's old.

> mkfs.btrfs -m raid1c4 -d raid1 /dev/disk/by-id/ata-ST3000*-part1
> echo "UUID=whatever /mnt/btrfs/ btrfs defaults,space_cache=v2 0 2" >> /etc/fstab
> mount /mnt/btrfs

Add noatime. https://lwn.net/Articles/499293/

I don't recommend space_cache=v2 in fstab. Use it once manually with
clear_cache,space_cache=v2, and a feature flag will be set to use it
from that point on. Soon v2 will be the default and you won't have to
worry about this at all.

fs_passno should be 0 for btrfs. See man fsck.btrfs - it's a no-op and
not designed for unattended use during startup. XFS is the same.

> RAID1 data over 4 disks and RAID1C4 metadata, mounting with
> space_cache=v2. Any other mount switches or btrfs creation switches I
> should be aware of? Should I consider RAID5/6 instead? 6 TB should be
> sufficient, so it's not like I'd get anything out of RAID5, but RAID6 I
> suppose could provide a little more safety in the case of multiple
> drive failures at once.

single, dup, raid0, raid1 (all), and raid10 are safe and stable. raid56
has caveats, and you need to take precautions that kinda amount to hand
holding. If there is a crash or power failure on raid56, you need to do
a full filesystem scrub; that's a good idea with other profiles too,
but not strictly necessary. If you mount raid56 degraded, you seriously
need to consider not doing writes, or be very skeptical of depending on
those writes, because there's some evidence of degraded writes being
corrupted. You can check the archives for more information from Zygo
about raid56 pitfalls.

raid56 is stable on stable storage. But the point of any raid is to
withstand a non-stable situation like a device failure, and there's
still work needed on raid56 to get to that point without handholding.
If you need raid5, you might consider mdadm for the raid5 and then
format it with btrfs using defaults, which will get you DUP metadata
and single-copy data. You'll get cheap snapshots, faster scrubs, and
warnings for any corruption of metadata or data.

Also consider mkfs.btrfs --checksum=xxhash, but you definitely need
btrfs-progs 5.5 or newer and kernel 5.6 or newer. If those are too new
for your use case, skip it. crc32c is fine, but it is intended for
detection of casual incidental corruption and can't be used for dedup.
xxhash64 is about as fast, but has much better collision resistance.

--
Chris Murphy
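Pulling that advice together, a hedged sketch of the one-time
space_cache=v2 conversion and the resulting fstab entry (the UUID and
mount point are placeholders, not values from the thread):

    # one-time mount converts the free space cache and sets a persistent
    # feature flag, so the option doesn't need to live in fstab
    mount -o clear_cache,space_cache=v2 UUID=<uuid> /mnt/btrfs
    umount /mnt/btrfs

    # fstab entry with noatime, and fs_passno 0 since fsck.btrfs is a no-op
    echo "UUID=<uuid> /mnt/btrfs btrfs defaults,noatime 0 0" >> /etc/fstab
    mount /mnt/btrfs

    # optional, only with btrfs-progs 5.5+ / kernel 5.6+ as noted above:
    # mkfs.btrfs --csum xxhash -m raid1c4 -d raid1 <devices>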
* Re: Planning out new fs. Am I missing anything?
@ 2020-05-27  5:22 Andrei Borzenkov

From: Andrei Borzenkov
To: Chris Murphy, Justin Engwer
Cc: Btrfs BTRFS

On 27.05.2020 05:20, Chris Murphy wrote:
>
> single, dup, raid0, raid1 (all), and raid10 are safe and stable.

Until btrfs can reliably detect and automatically handle outdated
devices, I would not call any multi-device profiles "safe", at least
not unconditionally.
* Re: Planning out new fs. Am I missing anything?
@ 2020-05-27  6:25 Chris Murphy

From: Chris Murphy
To: Andrei Borzenkov
Cc: Justin Engwer, Btrfs BTRFS

On Tue, May 26, 2020 at 11:22 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>
> On 27.05.2020 05:20, Chris Murphy wrote:
> >
> > single, dup, raid0, raid1 (all), and raid10 are safe and stable.
>
> Until btrfs can reliably detect and automatically handle outdated
> devices, I would not call any multi-device profiles "safe", at least
> not unconditionally.

I agree.

--
Chris Murphy
* Re: Planning out new fs. Am I missing anything?
@ 2020-05-27 16:23 Goffredo Baroncelli

From: Goffredo Baroncelli
To: Chris Murphy, Andrei Borzenkov
Cc: Justin Engwer, Btrfs BTRFS

Hi All,

On 5/27/20 8:25 AM, Chris Murphy wrote:
> On Tue, May 26, 2020 at 11:22 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>>
>> Until btrfs can reliably detect and automatically handle outdated
>> devices, I would not call any multi-device profiles "safe", at least
>> not unconditionally.
>
> I agree.

Checking the generation of each device should be sufficient to detect
"outdated" devices. Why is this check not performed? Maybe I am missing
something?

Of course this only solves the "detection" part; the handling of
outdated devices is another story.

BR
G.Baroncelli

--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
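For what it's worth, the per-device generation is already visible from
userspace; a minimal sketch of a manual check, assuming an unmounted
filesystem and hypothetical device names:

    # print the superblock generation of each member device;
    # a lower number than its peers suggests an outdated member
    for dev in /dev/sda1 /dev/sdb1; do
        printf '%s: ' "$dev"
        btrfs inspect-internal dump-super "$dev" | grep '^generation'
    done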
* Re: Planning out new fs. Am I missing anything?
@ 2020-05-27 18:40 Chris Murphy

From: Chris Murphy
To: Goffredo Baroncelli
Cc: Andrei Borzenkov, Justin Engwer, Btrfs BTRFS

On Wed, May 27, 2020 at 10:23 AM Goffredo Baroncelli <kreijack@libero.it> wrote:
>
> Checking the generation of each device should be sufficient to detect
> "outdated" devices. Why is this check not performed? Maybe I am missing
> something?

But transid isn't unique enough except in isolation. Degraded volumes
are treated completely independently. So if I take a 2x raid1, mount
each device degraded on a separate computer, modify them, and then join
them back together, how can Btrfs resolve the differences? It's a mess.
Yes, that is obviously a kind of sabotage. But while not literal
sabotage, the effect is the same if you have alternating degraded
drives in successive boots.

So you just cannot use degraded in either fstab or rootflags. It's bad
advice no matter who gives it, and we need to be vigilant about
recommending against it. Maybe the man 5 btrfs page should expressly
say not to include degraded in fstab, or at least warn that there are
consequences.

--
Chris Murphy
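To illustrate the intended one-shot use of degraded (device names and
devid below are hypothetical): mount it manually once for recovery,
replace the failed disk, and never persist the option:

    # one-time manual recovery mount; do NOT put 'degraded' in fstab
    mount -o degraded /dev/sdb1 /mnt/btrfs

    # replace the missing device (devid 1 here) with a fresh disk
    btrfs replace start 1 /dev/sdc1 /mnt/btrfs
    btrfs replace status /mnt/btrfs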
* Re: Planning out new fs. Am I missing anything?
@ 2020-05-27 19:51 Goffredo Baroncelli

From: Goffredo Baroncelli
To: Chris Murphy
Cc: Andrei Borzenkov, Justin Engwer, Btrfs BTRFS

On 5/27/20 8:40 PM, Chris Murphy wrote:
> But transid isn't unique enough except in isolation. Degraded volumes
> are treated completely independently. So if I take a 2x raid1, mount
> each device degraded on a separate computer, modify them, and then join
> them back together, how can Btrfs resolve the differences? It's a mess.
> Yes, that is obviously a kind of sabotage. But while not literal
> sabotage, the effect is the same if you have alternating degraded
> drives in successive boots.

Even though we can't close all the holes, we can reduce the likelihood
of this issue.

Anyway, mounting a filesystem with mismatched generation numbers is
wrong. And the fact that we can't prevent every kind of mismatch
doesn't mean we shouldn't do anything.

I am thinking about adding an "opt in" check: if a mismatch happens,
btrfs should raise a warning. If a flag is passed at mount (like
mount -o prevent-generation-mismatch) and the generations don't match,
the mount fails.

Then, on the basis of the feedback returned, in the future we can
change the flag from "opt in" to "opt out" (mount -o
no-prevent-generation-mismatch).

> So you just cannot use degraded in either fstab or rootflags. It's bad
> advice no matter who gives it, and we need to be vigilant about
> recommending against it. Maybe the man 5 btrfs page should expressly
> say not to include degraded in fstab, or at least warn that there are
> consequences.

--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
* Re: Planning out new fs. Am I missing anything?
@ 2020-05-28  2:14 Chris Murphy

From: Chris Murphy
To: Goffredo Baroncelli
Cc: Andrei Borzenkov, Justin Engwer, Btrfs BTRFS

On Wed, May 27, 2020 at 1:51 PM Goffredo Baroncelli <kreijack@inwind.it> wrote:
>
> Even though we can't close all the holes, we can reduce the likelihood
> of this issue.
>
> Anyway, mounting a filesystem with mismatched generation numbers is
> wrong. And the fact that we can't prevent every kind of mismatch
> doesn't mean we shouldn't do anything.

Yep. You're right.

> I am thinking about adding an "opt in" check: if a mismatch happens,
> btrfs should raise a warning. If a flag is passed at mount (like
> mount -o prevent-generation-mismatch) and the generations don't match,
> the mount fails.

I wonder about using a compat_flag to mark a device as having been
mounted degraded. The next time a mount happens, all devices with the
degraded compat_flag set should have identical transids, or we know
something is screwy. If there is a device that does not have the
degraded flag and has an older transid, there could be some kind of
sanity check to make sure the last 1-3 transactions are the same (?),
and if so: (a) allow a non-degraded mount, (b) warn, or (c) "replay"
the transactions between stale and current so that all devices are
caught up, similar to the partial rebuild mdadm does using the
write-intent bitmap as the hint for what needs to be caught up.

--
Chris Murphy