* btrfs raid-1 uuid-fstab
@ 2015-02-14 2:31 James
2015-02-14 11:52 ` Chris Murphy
0 siblings, 1 reply; 8+ messages in thread
From: James @ 2015-02-14 2:31 UTC (permalink / raw)
To: linux-btrfs
Ok,
I have (2) identical 2T drives, btrfs + ext4 formatted. I want to use
UUIDs in the fstab. No swap for now (each system has 32G); if I need
swap later, can I just set up a file and use swapon? Usually I set up
"boot, root and swap", but it seems I am confused about how to do
that correctly with btrfs in a raid 1. What I want is that if a drive fails,
I can just replace it, or pull one drive out and replace it with a second
blank, new 2T drive, then move the removed drive into a second (identical)
system to build a cloned workstation. From what I've read, UUIDs
are supposed to be used with fstab + btrfs; PARTUUID is still flaky. But the
UUIDs do not appear unique (due to raid-1)? Do they only get listed once
in fstab?
So I'm finishing up a new install of btrfs-raid1 on a gentoo system.
The machine is in a chroot right now:
gdisk /dev/sdb
Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present
Found valid GPT with protective MBR; using GPT.
Number Start (sector) End (sector) Size Code Name
1 2048 8191 3.0 MiB EF02 grub2biosboot
2 8192 1024000 496.0 MiB 8300 boot
3 1026048 3907029134 1.8 TiB 8300 root
gdisk /dev/sda
Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present
Found valid GPT with protective MBR; using GPT.
Number Start (sector) End (sector) Size Code Name
1 2048 8191 3.0 MiB EF02 grub2biosboot
2 8192 1024000 496.0 MiB 8300 boot
3 1026048 3907029134 1.8 TiB 8300 root
# blkid
/dev/loop0: TYPE="squashfs"
/dev/sda1: UUID="85cd9d86-4f4d-4113-b14e-cf5339373e20" TYPE="ext4"
PARTLABEL="grub2biosboot" PARTUUID="f88a8259-a4e4-4db8-86df-e709d135fe47"
/dev/sda2: LABEL="BOOT" UUID="d67a8d19-64bc-4ee1-bebf-48c935b039fa"
UUID_SUB="3eb62dd8-3f07-440f-8606-0c6d99362f6e" TYPE="btrfs"
PARTLABEL="boot" PARTUUID="8a6f7b5f-28a8-4f87-938f-386a93ebe07f"
/dev/sda3: LABEL="BTROOT" UUID="b7753366-a9a9-4074-8e0e-3beea50fee56"
UUID_SUB="e546ce31-098f-4897-bffd-6c5628f6b62e" TYPE="btrfs"
PARTLABEL="root" PARTUUID="6a8fa54b-3d58-4ac5-8784-6d540f2e65fc"
/dev/sdb2: LABEL="BOOT" UUID="d67a8d19-64bc-4ee1-bebf-48c935b039fa"
UUID_SUB="02034edf-c537-4fc6-9375-1599e8af2737" TYPE="btrfs"
PARTLABEL="boot" PARTUUID="b7b88ea7-b59a-4a4d-b857-4f55a1be3830"
/dev/sdb3: LABEL="BTROOT" UUID="b7753366-a9a9-4074-8e0e-3beea50fee56"
UUID_SUB="8a76be85-6106-47ea-90ae-756fb8c37bf1" TYPE="btrfs"
PARTLABEL="root" PARTUUID="3c2c6f88-a1da-40de-83be-21af71a5ce26"
/dev/sr0: UUID="2014-08-28-06-08-20-22" LABEL="Gentoo Linux amd64 20140828"
TYPE="iso9660" PTUUID="1047d058" PTTYPE="dos"
/dev/sdb1: PARTLABEL="grub2biosboot"
PARTUUID="3c7a0935-57d4-4bff-a492-aaa261e62212"
UUID=d67a8d19-64bc-4ee1-bebf-48c935b039fa /boot btrfs noauto,noatime 1 2
UUID=b7753366-a9a9-4074-8e0e-3beea50fee56 / btrfs defaults,noatime,compress=lzo,space_cache 0 0
UUID=3c2c6f88-a1da-40de-83be-21af71a5ce26 /boot
UUID=d67a8d19-64bc-4ee1-bebf-48c935b039fa /
btrfs
UUID=b7753366-a9a9-4074-8e0e-3beea50fee56
UUID=85cd9d86-4f4d-4113-b14e-cf5339373e20 /grub2biosboot ext4 1 2
PARTUUID=3c7a0935-57d4-4bff-a492-aaa261e62212" /grub2biosboot ??? 0 0
First, I notice the last partition (sdb1) seems to be missing the ext4 file
system; I guess when I exit the chroot I can just fix that to match sda1.
So my fstab should look like this?:
You know, it's obvious to me that I have no idea how to create
the fstab for this installation. Any help or guidance would be keen,
to help salvage the installation and get a few partitions installed
with btrfs. Maybe I can somehow migrate to a raid-1 configuration
under btrfs.
James
* Re: btrfs raid-1 uuid-fstab
2015-02-14 2:31 btrfs raid-1 uuid-fstab James
@ 2015-02-14 11:52 ` Chris Murphy
2015-02-15 6:28 ` Duncan
0 siblings, 1 reply; 8+ messages in thread
From: Chris Murphy @ 2015-02-14 11:52 UTC (permalink / raw)
To: James; +Cc: Btrfs BTRFS
On Fri, Feb 13, 2015 at 7:31 PM, James <wireless@tampabay.rr.com> wrote:
>No swap for now (each system had 32G) if I need
> swap later, I can just setup a file and use swapon?
No. You should read the wiki.
https://btrfs.wiki.kernel.org/index.php/FAQ#Does_btrfs_support_swap_files.3F
> What I want is that if a drive fails,
> I can just replace it, or pull one drive out and replace it with a second
> blank, new 2T drive, then move the removed drive into a second (identical)
> system to build a cloned workstation. From what I've read, UUIDs
> are supposed to be used with fstab + btrfs; PARTUUID is still flaky. But the
> UUIDs do not appear unique (due to raid-1)? Do they only get listed once
> in fstab?
Once is enough. Kernel code will find both devices.
For degraded use, this gets tricky, you have to use boot param
rootflags=degraded to get it to mount, otherwise mount fails and
you'll be dropped to a pre-mount shell in the initramfs. Also, there's
a nasty little gotcha, there is no equivalent for mdadm bitmap. So
once one member drive is mounted degraded+rw, it's changed, and
there's no way to "catch up" the other drive - if you reconnect, it
might seem things are OK but there's a good chance of corruption in
such a case. You have to make sure you wipe the "lost" drive (the
older version one). wipefs -a should be sufficient, then use 'device
add' and 'device delete missing' to rebuild it.
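The rebuild sequence described above could be sketched roughly as follows. This is only an outline under assumptions: the device names and mount point are placeholders, the run helper is a hypothetical dry-run wrapper that prints the commands instead of executing them, and if the filesystem is already mounted degraded the mount step would be skipped.

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# $1: surviving member, $2: the "lost" (older) member, $3: mount point.
# All three are placeholders, not values taken from a real system.
rebuild_raid1() {
    survivor=$1
    stale=$2
    mnt=$3
    run wipefs -a "$stale"                    # drop the old btrfs signature
    run mount -o degraded "$survivor" "$mnt"  # mount the surviving side
    run btrfs device add "$stale" "$mnt"      # rejoin it as a brand-new device
    run btrfs device delete missing "$mnt"    # drop the missing member, re-mirror
}

rebuild_raid1 /dev/sda3 /dev/sdb3 /mnt
```

With the default dry-run setting this only prints the four commands, which makes it easy to check the device names before running anything for real.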
This should not be formatted ext4, it's strictly for GRUB, it doesn't
get a file system. You should use wipefs -a on this.
This fstab has lots of problems. Based on your partition scheme it
should only have two entries total. A btrfs /boot UUID="d67a... and a
btrfs / UUID="b7753... There is no mountpoint for biosboot, it's used
by GRUB and is never formatted or mounted.
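As a rough sketch of such a two-entry fstab: the UUIDs below are the filesystem UUIDs from the blkid output earlier in the thread, the mount options are simply carried over from the draft and are not a recommendation, and the 0 0 dump/pass fields are an assumption (btrfs does not need a boot-time fsck pass). The fstab_line helper is hypothetical and only prints the lines.

```shell
# Print one fstab line: UUID, mount point, fs type, options, dump, pass.
fstab_line() {
    printf 'UUID=%s %s %s %s %s %s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# One entry per filesystem, not per device: both raid1 members share the UUID.
fstab_line d67a8d19-64bc-4ee1-bebf-48c935b039fa /boot btrfs noauto,noatime 0 0
fstab_line b7753366-a9a9-4074-8e0e-3beea50fee56 / btrfs defaults,noatime,compress=lzo,space_cache 0 0
```

Note that it is the UUID field from blkid that goes here, not UUID_SUB (which differs per device) and not PARTUUID.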
> First I notice the last partition (sdb1) seems to be missing the ext4 file
> system I guess when I exit the chroot I can just fix that to match sda1.
No, the problem is that sda1 is wrongly formatted ext4; you should use
wipefs -a on it.
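A minimal sketch of that cleanup, assuming /dev/sda1 really is the BIOS boot partition as in the gdisk listing; the run helper is a hypothetical dry-run wrapper, so nothing is erased unless DRY_RUN=0 is set.

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Clear a stray filesystem signature from the BIOS boot partition.
clean_biosboot() {
    part=$1
    run wipefs "$part"      # without options, wipefs only lists signatures
    run wipefs -a "$part"   # erase every signature it found
}

clean_biosboot /dev/sda1
```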
> Any help or guidance would be keen,
> to help salvage the installation and get a few partitions installed
> with btrfs. Maybe I can somehow migrate to a raid-1 configuration
> under btrfs.
Good luck. Make backups often. Btrfs raid1 is not a backup. Btrfs
snapshots are not a backup. And use recent kernels. Recent on this
list means 3.18.3 or newer, and is listed unstable on this list
http://packages.gentoo.org/package/sys-kernel/gentoo-sources Based on
the kernel.org change log, you'd probably be fine running 3.14.31, but
if you have problems and ask about it on this list, there's a decent
chance the first question will be "can you reproduce the problem on a
current kernel?"
Anyway, I suggest reading the entire btrfs wiki.
* Re: btrfs raid-1 uuid-fstab
2015-02-14 11:52 ` Chris Murphy
@ 2015-02-15 6:28 ` Duncan
2015-02-15 11:11 ` Kai Krakow
2015-02-16 5:29 ` Chris Murphy
0 siblings, 2 replies; 8+ messages in thread
From: Duncan @ 2015-02-15 6:28 UTC (permalink / raw)
To: linux-btrfs
Chris Murphy posted on Sat, 14 Feb 2015 04:52:12 -0700 as excerpted:
> On Fri, Feb 13, 2015 at 7:31 PM, James <wireless@tampabay.rr.com> wrote:
>> What I want is that if a drive fails,
>> I can just replace it, or pull one drive out and replace it with a second
>> blank, new 2T drive, then move the removed drive into a second
>> (identical) system to build a cloned workstation. From what I've read,
>> UUIDs are supposed to be used with fstab + btrfs; PARTUUID is still
>> flaky. But the UUIDs do not appear unique (due to raid-1)? Do they
>> only get listed once in fstab?
>
> Once is enough. Kernel code will find both devices.
[Preliminary note. FWIW, gentooer here too, running a btrfs raid1 root,
altho I strongly prefer several smaller filesystems over a single large
filesystem, so all my data eggs aren't in the same filesystem basket if
the proverbial bottom drops out of it. So /home is a separate
filesystem, as is /var/log, as is my updates stuff (gentoo and other
repos, including kernel, sources, binpkgs, ccache, everything I use to
update the system on a single filesystem, kept unmounted unless I'm
updating), as is my media partition, and of course /tmp, which is tmpfs.
But of interest here is that I'm running a btrfs raid1 root.]
CM is correct. =:^)
But in addition, for a btrfs raid1 root (or any multi-device btrfs root,
for that matter), you *WILL* need an initr*, because normally a userspace
btrfs device scan must be run (from the initr*, before root is mounted)
before the kernel can actually assemble a multi-device btrfs properly. As I don't believe
Chris is a gentooer, I'm guessing he's used to an initr* and thus forgot
about this requirement, which can be a big one for a gentooer, since we
build our own kernels and often build in at least the modules required to
mount root, thus in many cases making an initr* unnecessary.
Unfortunately, for a multi-device btrfs root, it's necessary. =:^(
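In outline, the job the initr* has to do before switching root looks something like this. It is a sketch only: the /sysroot path is an assumption (it is the name dracut happens to use), the UUID is the root filesystem UUID from the blkid output, and the run helper is a hypothetical dry-run wrapper.

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# $1: filesystem UUID of the multi-device root, $2: where to mount it.
mount_multidev_root() {
    uuid=$1
    newroot=$2
    run btrfs device scan                       # register all btrfs members with the kernel
    run mount -t btrfs "UUID=$uuid" "$newroot"  # one UUID now stands for the whole set
}

mount_multidev_root b7753366-a9a9-4074-8e0e-3beea50fee56 /sysroot
```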
While in theory btrfs has the device= mount option, and the kernel has
rootflags= to tell it what mount options to use, at least last I checked
a few kernel cycles ago (I'd say last summer, so 3-5 kernel cycles ago),
for some reason rootflags=device= doesn't appear to work correctly. My
theory is that the kernel commandline parser breaks at the second/last =
instead of the first, so instead of seeing settings for the rootflags
parameter, it sees settings for the rootflags=device parameter, which of
course makes no sense to the kernel and is ignored. But that's just my
best theory. All I know for sure is that the subject has come up a
number of times here and has been acknowledged by the btrfs devs, I had
to set up an initr* to get a raid1 btrfs root to mount when I originally
set it up here, and some time later when I decided to try an initr*-less
rootflags= boot again and see if the problem had been fixed, it still
didn't work.
So for a multi-device btrfs root, plan on that initr*. If you'd never
really learned how to set one up, as was the case here, you will probably
either have to learn, or skip the idea of a multi-device btrfs root until
the problem is, eventually/hopefully, fixed.
FWIW, I use dracut to create my initr* here, and have the kernel options
set such that the dracut-pre-created initr* is attached to each kernel I
build as an initramfs, so I don't have to have an initr* setting in grub2
-- each kernel image has its own, attached.
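For anyone wanting to try the same thing, a dracut invocation along these lines should produce an image with the btrfs module included. Treat it as a sketch: the image path and naming are assumptions, the run helper is a hypothetical dry-run wrapper, and attaching the result to the kernel image (presumably via the kernel's CONFIG_INITRAMFS_SOURCE option) is a separate step not shown here.

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# $1: kernel version to build the initramfs for.
build_initramfs() {
    kver=$1
    # --force overwrites an existing image; --add pulls in the btrfs module.
    run dracut --force --add btrfs "/boot/initramfs-$kver.img" "$kver"
}

build_initramfs "$(uname -r)"
```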
And FWIW, when I first setup the btrfs root (and dracut-based initr*), I
was running openrc (and thus using sysv-init as my init). I've since
switched to systemd and activated the appropriate dracut systemd module.
So I know from personal experience, a dracut-based initr* can be setup to
boot either openrc/sysvinit, or systemd. Both work. =:^)
> For degraded use, this gets tricky, you have to use boot param
> rootflags=degraded to get it to mount, otherwise mount fails and you'll
> be dropped to a pre-mount shell in the initramfs.
See, assumed initr*. =:^\
But while on the topic of rootflags=degraded, in my experimentation,
without an initr* with its pre-mount btrfs device scan, since it /was/ a
two-device btrfs raid1 both data and metadata, thus with copies of
everything on each device, the only way to boot without an initr* was to
set rootflags=degraded, since the kernel would only know about the root=
device in that case.
And that worked, so the kernel certainly could parse rootflags= and pass
the mount options to btrfs as it should. It simply broke when device=
was passed in those rootflags. Thus my theory about the parser breaking
at the wrong =.
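To make the two cases concrete, here is a small sketch that composes the kernel command line with and without rootflags=degraded. The kernel_cmdline helper and the /dev/sda3 device name are illustrative assumptions; without an initr* the root= value has to be something the kernel can resolve by itself, such as a device node.

```shell
# Compose a kernel command line for a btrfs root.
# $1: root device, $2: optional root mount options (for example "degraded").
kernel_cmdline() {
    root=$1
    flags=${2:-}
    line="root=$root rootfstype=btrfs"
    if [ -n "$flags" ]; then
        line="$line rootflags=$flags"
    fi
    echo "$line"
}

kernel_cmdline /dev/sda3            # normal boot
kernel_cmdline /dev/sda3 degraded   # boot from a single raid1 member
```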
> Also, there's a nasty
> little gotcha, there is no equivalent for mdadm bitmap. So once one
> member drive is mounted degraded+rw, it's changed, and there's no way to
> "catch up" the other drive - if you reconnect, it might seem things are
> OK but there's a good chance of corruption in such a case. You have to
> make sure you wipe the "lost" drive (the older version one). wipefs -a
> should be sufficient, then use 'device add' and 'device delete missing'
> to rebuild it.
I caught this in my initial btrfs experimentation, before I set it up
permanently. It's worth repeating for emphasis, with a bit more
information as well.
*** If you break up a btrfs raid1 and attempt to recombine afterward, be
*SURE* you *ONLY* mount the one side writable after that. As long as
ONLY one side is written to, that one side will consistently have a later
generation than the device that was dropped out, and you can add the
dropped device back in, with the caveat that you should then immediately
run a btrfs scrub, which will scan both the updated devices and the
behind one, and catch up the behind one.
Never, ever, separately mount both devices writable, and then try to
recombine them, without first wiping the one.
Because at least in theory (that is, barring bugs), if one device had
more transactions and is thus at a later transaction generation (an
integral part of btrfs and tracked in the superblock), the filesystem
should pick the later generation and a scrub will update the older one as
necessary. That is how btrfs picks which side to consider valid, whether
only one side was written to or both were.
However, if the two sides were both written to separately, and the
generation happens to be the same on both, the filesystem will consider
them both valid even tho they differ, and "bad things can happen."
The best way to avoid those "bad things" is to avoid splitting and
recombining where possible. If it must be done, be sure btrfs only sees
one side updated since the split, either by only mounting the one side
writable and doing a scrub after recombine to update the other one, or if
for some reason they were both mounted writable, wipe the one before
reattaching it, so btrfs never sees the diverged writes and there's never
a chance of corruption as a result.
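For the safe case (only one side was written to after the split), the catch-up step could be sketched as below. The mount point is a placeholder and the run helper is a hypothetical dry-run wrapper; this does not cover the case where both sides were written, which needs the wipe instead.

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# $1: mount point of the recombined raid1 filesystem.
catch_up_member() {
    mnt=$1
    run btrfs filesystem show "$mnt"   # confirm both members are present again
    run btrfs scrub start -B "$mnt"    # -B: stay in the foreground until done
    run btrfs scrub status "$mnt"      # summary, including corrected errors
}

catch_up_member /
```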
> This should not be formatted ext4, it's strictly for GRUB, it doesn't
> get a file system. You should use wipefs -a on this.
"This" referring of course to the grub2 bios boot.
What grub2 actually uses this for is to store the grub-core, with the
various modules it needs to read /boot builtin. This was what grub1
called stage-1.5.
On a BIOS system, the firmware reads and loads the boot sector, but
that's only 512 bytes, far too small to contain the main grub binary.
All it has room for is a small stub and a pointer to a larger core.
On the simplest /boot filesystems, this pointer can be directly to the
binary on /boot, but that only works as long as the filesystem doesn't
move that binary around (defrag or for btrfs, balance), and as long as
that binary was stored serially, in terms of device LBA addressing. In
the grub1 era, these filesystems were the ones that didn't require a
stage-1.5, with the grub binary on /boot being the stage2.
With now legacy mbr-based partitioning, the only place grub could put a
stage-1.5, if needed to read the stage-2 on /boot, was in the clear space
many partitioners left between the boot sector and the first partition.
With grub2 and gpt partitioning, as long as there's a grub2biosboot
partition reserved, that's where grub2 now places this core, formerly
stage-1.5, with grub2 updated to dynamically add any grub modules (for
gpt, the filesystem, raid, lvm, etc) necessary to access /boot to the
core, before it places it in this reserved partition.
But the gpt reserved biosboot partition should not have a filesystem and
is never mounted -- grub2 writes the core-plus-necessary-modules binary
directly to the reserved partition without a filesystem, in LBA address
order so it can be read serially by the very simple code that's still
held in that 512-byte boot sector.
In fact, that very simple 512-byte boot-sector code knows nothing about
gpt, it simply knows how to read the pointer that points to the LBA
address of the first grubcore sector, and starts reading from there until
it hits the magic sequence that tells it to stop. Only after it has read
and loaded that grub2-core code, does grub as we know it start to execute.
And in fact, as long as the grub2-core code can be read and loaded, even
if grub can't find and load its config file and the other modules on
/boot for some reason, you'll still get a rescue shell, and with a bit of
grub knowledge, can point grub either at its /boot config and additional
modules manually, or at a backup /boot, possibly on another device, and
load normal mode and hopefully be able to continue booting normally, from
there.
What's nice about gpt is that it has a dedicated bios-boot reserved
partition for grub2, or other boot loader, to use. This is far more
reliable than hoping the partitioner left enough room before the first
partition to store the stage-1.5, as grub1 used to
have to do, and as grub2 still has to do on legacy mbr-formatted systems.
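On a blank gpt disk, reserving that partition and installing grub2 to it might look roughly like this. The sector range, type code and name mirror the gdisk listing at the top of the thread; the disk names are placeholders, the grub-install command may be named grub2-install on some installs, and the run helper is a hypothetical dry-run wrapper.

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# $1: whole-disk device to prepare (assumed blank, gpt).
setup_biosboot() {
    disk=$1
    # Partition 1, sectors 2048-8191 (3 MiB), type EF02 = BIOS boot partition.
    run sgdisk -n 1:2048:8191 -t 1:EF02 -c 1:grub2biosboot "$disk"
    # No mkfs step: grub writes its core image straight into the partition.
    run grub-install "$disk"
}

for disk in /dev/sda /dev/sdb; do
    setup_biosboot "$disk"
done
```

Doing this on both disks means either one can still boot if the other is pulled.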
> This fstab has lots of problems. Based on your partition scheme it
> should only have two entries total. A btrfs /boot UUID="d67a... and a
> btrfs / UUID="b7753... There is no mountpoint for biosboot, it's used by
> GRUB and is never formatted or mounted.
Spot on.
>> First I notice the last partition (sdb1) seems to be missing the ext4
>> file system I guess when I exit the chroot I can just fix that to match
>> sda1.
>
> No the problem is sda1 is wrongly formatted ext4, you should use wipefs
> -a on it.
Spot on.
>> Any help or guidance would be keen,
>> to help salvage the installation and get a few partitions installed
>> with btrfs. Maybe I can somehow migrate to a raid-1 configuration under
>> btrfs.
>
> Good luck. Make backups often. Btrfs raid1 is not a backup. Btrfs
> snapshots are not a backup. And use recent kernels. Recent on this list
> means 3.18.3 or newer, and is listed unstable on this list
> http://packages.gentoo.org/package/sys-kernel/gentoo-sources Based on
> the kernel.org change log, you'd probably be fine running 3.14.31, but
> if you have problems and ask about it on this list, there's a decent
> chance the first question will be "can you reproduce the problem on a
> current kernel?"
>
> Anyway, I suggest reading the entire btrfs wiki.
Absolutely. Well, the entire user documentation section, anyway. If
you're not a dev, you can skip that stuff unless you're curious.
Just as reading the rest of the gentoo handbook, not just the install
section, can save you a lot of needlessly wasted time and headaches on
gentoo, so reading the entire user documentation section on the btrfs
wiki can save you lots of wasted time and headaches, and since it's a
filesystem on which you're placing data presumably of some value, very
possibly needlessly lost data, as well.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: btrfs raid-1 uuid-fstab
2015-02-15 6:28 ` Duncan
@ 2015-02-15 11:11 ` Kai Krakow
2015-02-16 5:50 ` Duncan
2015-02-16 5:29 ` Chris Murphy
1 sibling, 1 reply; 8+ messages in thread
From: Kai Krakow @ 2015-02-15 11:11 UTC (permalink / raw)
To: linux-btrfs
Duncan <1i5t5.duncan@cox.net> schrieb:
> While in theory btrfs has the device= mount option, and the kernel has
> rootflags= to tell it what mount options to use, at least last I checked
> a few kernel cycles ago (I'd say last summer, so 3-5 kernel cycles ago),
> for some reason rootflags=device= doesn't appear to work correctly. My
> theory is that the kernel commandline parser breaks at the second/last =
> instead of the first, so instead of seeing settings for the rootflags
> parameter, it sees settings for the rootflags=device parameter, which of
> course makes no sense to the kernel and is ignored. But that's just my
> best theory.
Gentoo here, too. I tried to fiddle around with the exact same issue
some kernel versions back and didn't get it to work, so I went with dracut,
which works pretty well for me. Combined with grub2, multi-device detection
works pretty well, tho you sometimes need rootdelay={1,2,3} to wait up to
three seconds for btrfs to figure out its setup. It looks like btrfs devices are
assembled with a delay by the kernel, and at the point you try to mount one
of the compound devices, if done too early, the kernel code cannot yet find
all the other devices of the set. Maybe "rootwait" would also do, tho I
haven't tried that yet (it probably won't, as the root device is initrd
initially). It may be a side-effect of the kernel doing async SCSI device
detection. It may be worth trying to turn that option off.
But about your theory: I don't think the cmdline parser works incorrectly,
because rootflags=subvol=something works. It's probably just a flaw that
btrfs device composition comes up later and the kernel tries too early to
mount root. "rootwait" probably won't help here, either. But "rootdelay" may
help in that case, tho I myself don't have the ambition to experiment with it.
My dracut initrd setup works fine and has some benefits like early debug
shell to investigate problems without resorting to rescue systems or
bootable USB sticks.
--
Replies to list only preferred.
* Re: btrfs raid-1 uuid-fstab
2015-02-15 6:28 ` Duncan
2015-02-15 11:11 ` Kai Krakow
@ 2015-02-16 5:29 ` Chris Murphy
1 sibling, 0 replies; 8+ messages in thread
From: Chris Murphy @ 2015-02-16 5:29 UTC (permalink / raw)
To: Btrfs BTRFS
On Sat, Feb 14, 2015 at 11:28 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> Chris Murphy posted on Sat, 14 Feb 2015 04:52:12 -0700 as excerpted:
>> Also, there's a nasty
>> little gotcha, there is no equivalent for mdadm bitmap. So once one
>> member drive is mounted degraded+rw, it's changed, and there's no way to
>> "catch up" the other drive - if you reconnect, it might seem things are
>> OK but there's a good chance of corruption in such a case. You have to
>> make sure you wipe the "lost" drive (the older version one). wipefs -a
>> should be sufficient, then use 'device add' and 'device delete missing'
>> to rebuild it.
>
> I caught this in my initial btrfs experimentation, before I set it up
> permanently. It's worth repeating for emphasis, with a bit more
> information as well.
>
> *** If you break up a btrfs raid1 and attempt to recombine afterward, be
> *SURE* you *ONLY* mount the one side writable after that. As long as
> ONLY one side is written to, that one side will consistently have a later
> generation than the device that was dropped out, and you can add the
> dropped device back in,
Right. I left out the distinguishing factor in whether or not it
corrupts. I'm uncertain how bad this corruption is, I've never tried
reproducing it.
--
Chris Murphy
* Re: btrfs raid-1 uuid-fstab
2015-02-15 11:11 ` Kai Krakow
@ 2015-02-16 5:50 ` Duncan
2015-02-16 23:15 ` Kai Krakow
0 siblings, 1 reply; 8+ messages in thread
From: Duncan @ 2015-02-16 5:50 UTC (permalink / raw)
To: linux-btrfs
Kai Krakow posted on Sun, 15 Feb 2015 12:11:56 +0100 as excerpted:
> Duncan <1i5t5.duncan@cox.net> schrieb:
>
> Gentoo here, too. I tried to fiddle around with the exact same issue
> some kernel versions back and didn't get it to work, so I went with
> dracut, which works pretty well for me. Combined with grub2,
> multi-device detection works pretty well, tho you sometimes need
> rootdelay={1,2,3} to wait up to three seconds for btrfs to figure out its
> setup. It looks like btrfs devices are assembled with a delay by the kernel,
> and at the point you try to mount one of the compound devices, if done
> too early, the kernel code cannot yet find all the other devices of the
> set. Maybe "rootwait" would also do, tho I haven't tried that yet (it
> probably won't, as the root device is initrd initially). It may be a
> side-effect of the kernel doing async SCSI device detection. It may be
> worth trying to turn that option off.
Interesting. I had forgotten I had rootwait set as a builtin kernel
commandline-option, and was about to reply that I had SCSI_ASYNC_SCAN
turned on and had never seen problems, but then I remembered having to
turn on rootwait.
Actually, I had tried rootdelay=N some years ago, perhaps before rootwait
actually became a kernel commandline option, certainly before I knew of
it. I used it with mdraid (initr*-less) too. But eventually I got tired
of having to play with rootdelay timeouts, and when I came across rootwait
I decided to try it, and that solved my timeouts issue once and for all.
So I can confirm that rootwait seems to work for multi-device btrfs as
well, which of course requires an initr*. But that actually might be
dracut reading the kernel commandline and applying the same option at the
initr* level, and thus not work with other initr*-generators, if they
don't do the same thing. I'm actually not sure.
What I can say, however, is that after I set rootwait here, I've had no
more block-device-detection-timing issues. It has "just worked" in terms
of timing.
And what's nice is that rootwait actually appears to go into a loop,
checking for a mountable root, as well, and will continue immediately
upon finding it. So the delay is exactly as long as it needs to be, and
no longer. (I don't remember whether rootdelay=N could terminate the
delay early if it found all necessary devices, or not, but certainly,
rootwait does.)
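A quick way to see which of the two options a given command line carries is sketched below. The has_param helper and the example command line are hypothetical; on a running system the string would come from /proc/cmdline instead.

```shell
# Succeed if a kernel command line contains the given parameter,
# either bare ("rootwait") or with a value ("rootdelay=3").
has_param() {
    for word in $1; do
        case $word in
            "$2"|"$2"=*) return 0 ;;
        esac
    done
    return 1
}

cmdline="root=/dev/sda3 rootfstype=btrfs rootwait"   # hypothetical example

# rootwait: keep waiting until the root device shows up, then continue at once.
has_param "$cmdline" rootwait && echo "rootwait is set"
# rootdelay=N: always pause N seconds before trying to mount root.
has_param "$cmdline" rootdelay || echo "rootdelay is not set"
```

To check the running system, call it as has_param "$(cat /proc/cmdline)" rootwait.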
> But about your theory: I don't think the cmdline parser works incorrectly,
> because rootflags=subvol=something works.
Well, so much for /that/ theory, then. I /thought/ the kernel devs were
too smart to have let a bug that simple, especially where it was likely
to be triggered by other = options as well, remain for as long as this
has. But that was what I came up with as a possible explanation. I
think your theory below makes more sense.
> It's probably just a flaw that
> btrfs device composition comes up later and the kernel tries too early to
> mount root. "rootwait" probably won't help here, either. But "rootdelay"
> may help in that case, tho I myself don't have the ambition to experiment
> with it. My dracut initrd setup works fine and has some benefits like
> early debug shell to investigate problems without resorting to rescue
> systems or bootable USB sticks.
FWIW, my root backup and rescue solution are one and the same, an
occasional (every few kernel cycles) "snapshot" copy (not btrfs snapshot,
a full copy) of my root filesystem, made when things seem reasonably
stable and have been working for awhile, to an identically sized "backup
root filesystem" located elsewhere. That way, I have effectively a fully
operational system "snapshot" copy, taken when the system was known to be
operational, complete with everything I normally use, X, KDE, firefox,
media players, games, everything, and of course tested to boot and run as
normal. No crippled semi-functional rescue media for me! =:^)
With a root filesystem of 8 GiB, that's easy enough, and I keep several
backup copies available: the first one on another pair of 8 GiB partitions,
again a two-device btrfs raid1 on the same physical pair of SSDs, with a
second and third 8 GiB root backup on reiserfs on spinning rust, in case the
pair of SSD physical devices fails, or if btrfs itself gets majorly bugged out,
such that booting to the first backup kills it just like it did the
working copy.
And I have my grub2 menu setup with the root= boot option assigned a
variable, and menu options to set that variable to point to any of the
backups as necessary. So to boot a particular backup, I just select the
option to set the pointer variable appropriately, and then select boot.
Similarly with other kernel commandline options, including the kernel
choice and init=. They're all loaded into pointer variables, and if I
want to choose a different one, I simply select the menu option that sets
the pointer variable appropriately, and then select boot.
Very flexible, this grub2 is! =:^)
Meanwhile, grub2 is setup on both ssds (which have identical partition
layouts) and on the spinning rust, with each one having its own /boot,
thus giving me backup /boots as well, and of course I can select any of
them from the BIOS to boot, so I'm pretty well set as long as I don't
lose all three devices at once.
If I lose all three devices at once, I figure it's quite likely I'm
dealing with a rather larger disaster, say a fire or flood or the like,
and will probably have my hands full just surviving for awhile. When I
do get back to worrying about the computer, likely after replacing what I
lost in the disaster, it won't be that big a deal to start over
downloading a live image and doing a new install from the stage-3
starter. After all, the *REAL* important backup is in my head, and if I
lose that, I guess I won't be worrying much about computers any more,
even if I'm still "alive" in some facility somewhere. Tho I /do/ have
some stuff backed up on USB thumb drive and the like as well. But I
don't put much priority in it, because I figure if I'm having to restore
from that backup in the first place, I'm pretty much screwed in any case,
and the /last/ thing I'm likely to be worried about is having to start
over with a new computer install.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: btrfs raid-1 uuid-fstab
2015-02-16 5:50 ` Duncan
@ 2015-02-16 23:15 ` Kai Krakow
2015-02-18 1:12 ` Duncan
0 siblings, 1 reply; 8+ messages in thread
From: Kai Krakow @ 2015-02-16 23:15 UTC (permalink / raw)
To: linux-btrfs
Duncan <1i5t5.duncan@cox.net> schrieb:
>> It's probably just a flaw that
>> btrfs device composition comes up later and the kernel tries too early to
>> mount root. "rootwait" probably won't help here, either. But "rootdelay"
>> may help in that case, tho I myself don't have the ambition to experiment
>> with it. My dracut initrd setup works fine and has some benefits like
>> early debug shell to investigate problems without resorting to rescue
>> systems or bootable USB sticks.
>
> FWIW, my root backup and rescue solution are one and the same, an
> occasional (every few kernel cycles) "snapshot" copy (not btrfs snapshot,
> a full copy) of my root filesystem, made when things seem reasonably
> stable and have been working for awhile, to an identically sized "backup
> root filesystem" located elsewhere. That way, I have effectively a fully
> operational system "snapshot" copy, taken when the system was known to be
> operational, complete with everything I normally use, X, KDE, firefox,
> media players, games, everything, and of course tested to boot and run as
> normal. No crippled semi-functional rescue media for me! =:^)
I accidentally forced myself into using my USB3 backup drive as my rootfs due
to fiddling around with dracut build options without thinking about it too
much while waiting for my btrfs device add/del disk jockeying to migrate to
bcache. Long story short: I managed to strip dracut down to too few modules
and it lost its ability to mount anything and even could not spawn a shell.
*gnarf
And when that wasn't fun enough, my BIOS decided to no longer initialize USB
so I could neither get into BIOS nor into Grub shell. I don't know when that
problem started. It had probably been like that for a while and I never
noticed. It's just that it went a lot slower through BIOS after I managed to
convince it to initialize USB again (by opening the case and shorting the
reset jumper).
The next fun part was: My backup was incomplete in a special way: It had no
directories dev, proc, run, sys and friends... Don't ask me how I solved
that, probably by "init=/bin/bash". It happened because I used "rsync" with
the option to exclude those dirs. But well: In the end my backup was tested
bootable. :-)
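One common way around that particular trap is to exclude the contents of those directories rather than the directories themselves, so the empty mount points still end up in the backup. The sketch below assumes a backup target of /mnt/backup and uses a hypothetical dry-run wrapper; it is not the exact command used above.

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# $1: destination directory for the root filesystem copy.
backup_root() {
    dest=$1
    # "/dev/*" skips the contents but still creates the empty directory,
    # whereas excluding "/dev" would drop the mount point itself.
    run rsync -aAXH --one-file-system \
        --exclude='/dev/*' --exclude='/proc/*' --exclude='/sys/*' \
        --exclude='/run/*' --exclude='/tmp/*' \
        / "$dest"
}

backup_root /mnt/backup
```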
I fixed my dracut setup and in the same procedure also fixed a long-standing
issue with "btrfs check" reporting nlink errors. Luckily, this newer
version could tell me the paths, and I just deleted those files in the chrome
profile and var/lib/bluetooth directory. I wonder if those errors were
what caused chrome to freeze the PC and bluetooth to stop working
sometimes.
And BTW: bcache is pretty fast, booting to graphical.target within 3-8
seconds (mostly around 5). Now I wonder what I need the resume swap for
which I created in the process: It takes longer to resume from swap than
just booting to complete KDE desktop. Well, without the benefit of having a
fully running session at least.
> Very flexible, this grub2 is! =:^)
I've been waiting long before doing the switch. But I had to use it when I
migrated from legacy to UEFI boot mode. Although every configuration bit
looked confusing and cumbersome, everything worked automatically out of the
box. Very surprising it is. :-)
> If I lose all three devices at once, I figure it's quite likely I'm
> dealing with a rather larger disaster, say a fire or flood or the like,
> and will probably have my hands full just surviving for awhile. When I
> do get back to worrying about the computer, likely after replacing what I
> lost in the disaster, it won't be that big a deal to start over
> downloading a live image and doing a new install from the stage-3
> starter. After all, the *REAL* important backup is in my head, and if I
> lose that, I guess I won't be worrying much about computers any more,
> even if I'm still "alive" in some facility somewhere. Tho I /do/ have
> some stuff backed up on USB thumb drive and the like as well. But I
> don't put much priority in it, because I figure if I'm having to restore
> from that backup in the first place, I'm pretty much screwed in any case,
> and the /last/ thing I'm likely to be worried about is having to start
> over with a new computer install.
From my own experience, the head is not a very good backup. While there are
things which you simply cannot remember to rebuild, there are other things
which, when rebuilt from scratch, probably get better and more well thought
about but very frustrating to rebuild and thus never reach the same stage of
completeness again. So, no: Not a good backup. It's no fun even when I had
no other stuff to deal with...
But to get back to the multi-device btrfs booting issue: Thanks for
recommending "rootwait", I will try that. I had thought it would have no
effect if booting from initrd. Let's see if dracut+systemd with rootwait
will work for me, too.
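For reference, a minimal sketch of where "rootwait" would go on a grub2 system, assuming the stock /etc/default/grub mechanism is in use. Whether an initrd-based boot actually honours it is exactly the open question here.

```shell
# /etc/default/grub (fragment; grub's config generator sources this file
# as shell). Assumption: no other kernel parameters are set here. If some
# already are, keep them in the same quoted string.
GRUB_CMDLINE_LINUX="rootwait"
```

The grub configuration has to be regenerated afterwards before the parameter reaches the kernel command line.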
--
Replies to list only preferred.
* Re: btrfs raid-1 uuid-fstab
2015-02-16 23:15 ` Kai Krakow
@ 2015-02-18 1:12 ` Duncan
0 siblings, 0 replies; 8+ messages in thread
From: Duncan @ 2015-02-18 1:12 UTC (permalink / raw)
To: linux-btrfs
Kai Krakow posted on Tue, 17 Feb 2015 00:15:50 +0100 as excerpted:
> Long story short: I managed to strip dracut down to
> too few modules and it lost its ability to mount anything and even could
> not spawn a shell. *gnarf
Ouch!
FWIW, that's why I use a kernel built-in initramfs. If I upgrade dracut
or change its config and it fails to work, just as if the new kernel the
initramfs is appended to fails to work, I simply boot an older kernel...
with a known-working dracut-created initramfs.
Tho I /did/ have trouble with an older dracut locking to a particular
default-root UUID at one point, so it would boot any root= I pointed it
at, but *ONLY* as long as that particular UUID continued to exist!
Which is pretty hard to test for, since until you actually mkfs the
existing default-root, its UUID will continue to exist, and you'll never
know that your boot to the backup root using root= is working now, but
will fail as soon as the default-root ceases to exist, until you're
actually in the situation and can't boot, using any kernel/dracut
combination!
That did drop me to the dracut/initramfs shell, but I was new enough with
dracut at the time that I didn't really know how to fix it from there,
nor could I properly edit a file or even view an entire file (cat worked,
but that only let me see the last N lines and I didn't have a pager in
the initramfs), to try to read documentation and fix the issue.
What I finally did to get out of that hole was manually ln -s the
/dev/disk/by-uuid/* symlink that the dracut/initramfs scripts were looking
for based on the error, pointing it at an existing /dev/sdXN. It didn't have
to point at the root device, it could point at any device-block file, as
long as that device-block file actually existed.
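A rough sketch of that workaround as a shell function, under the assumption that the only thing the scripts wait for is the existence of the by-uuid link. The function name is hypothetical, and the UUID and device arguments are placeholders to be read off the actual error message and the actual system.

```shell
#!/bin/sh
# Hypothetical helper: satisfy a stale by-uuid lookup by creating the
# missing symlink by hand.
# $1 = directory holding the links (/dev/disk/by-uuid on a real system)
# $2 = UUID the initramfs scripts are waiting for (from the error message)
# $3 = any existing device node, e.g. /dev/sdXN
link_missing_uuid() {
    by_uuid_dir=$1
    wanted_uuid=$2
    existing_dev=$3
    link="$by_uuid_dir/$wanted_uuid"
    mkdir -p "$by_uuid_dir"
    # Only create the link if nothing is there, so a real link made by
    # udev is never replaced.
    if [ ! -e "$link" ] && [ ! -L "$link" ]; then
        ln -s "$existing_dev" "$link"
    fi
}
```

This only papers over the check. The root= parameter still decides which filesystem actually gets mounted.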
I didn't originally file a bug on that as the host-only option
documentation warned about it being host-specific, so I figured it was
/designed/ to do that. Only later, when host-only was being discussed as
the gentoo-recommended default on gentoo-dev and I explained that it
wasn't always suitable as it broke if/when you blew away your
default-root and recreated it with a new UUID, and the gentoo dracut maintainer
asked why I hadn't filed a bug, did I figure out it /was/ a bug, not a
"confusingly documented feature". So I filed a bug and the gentoo
maintainer filed one upstream as well, and it was apparently fixed. But
of course by then I had long since worked around the problem with more
specific dracut-module include and exclude statements in the config,
instead of using host-only, and that was working and continues to work,
so I've never had reason to go back and test the more loosely specified
host-only mode, and thus have never confirmed whether the bug was
actually fixed or not, since I don't use that mode any more.
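To illustrate the kind of config meant here, a sketch of a dracut.conf fragment that turns host-only off and names modules explicitly. The particular module lists are assumptions for a btrfs root without LVM, mdraid or a boot splash, not a copy of my actual config.

```shell
# /etc/dracut.conf (fragment; module lists below are illustrative only)

# Build a generic image instead of one tied to the current host's devices.
hostonly="no"

# Modules to include in addition to the defaults.
add_dracutmodules+=" btrfs "

# Modules to leave out even if they would otherwise be pulled in.
omit_dracutmodules+=" plymouth lvm mdraid "
```

The point of the design is that nothing in the image depends on what happened to exist on the build host, at the cost of having to maintain the lists by hand.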
> And when that wasn't fun enough, my BIOS decided to no longer initialize
> USB so I could neither get into BIOS nor into Grub shell. I don't know
> when that problem happened. Probably been that for a while and I never
> noticed. Just that it went a lot slower through BIOS after I managed to
> convince it to initialize USB again (by opening the case and shorting
> the reset jumper).
Ouch. FWIW my mobo has dual-bios, which is nice, but I've been down the
bios-reset road before, several times.
I even had a BIOS update go bad once (due to bad RAM), screwed up the
last-ditch bios-rescue it offered as I didn't know what I was doing, and
had to use my netbook to setup a webmail account (didn't have the
passwords to my normal email as I don't normally keep anything private on
the netbook at all, in case I lose it, and couldn't access my other disks
without a device to convert them to external/USB) and order a new BIOS
shipped to me.
That is of course the big reason my new machine is dual-bios! =:^) Tho
it's not an absolute cure-all, as once it successfully boots from the
main BIOS it auto-overwrites the second one, if different. I'd actually
rather make the auto-overwrite bit manual, so I could update it only when
I was sufficiently sure it worked _reliably_, but oh, well, better than
not having a backup BIOS at all, as I learned from experience!
> The next fun part was: My backup was incomplete in a special way: It had
> no directories dev, proc, run, sys and friends... Don't ask me how I
> solved that, probably by "init=/bin/bash".
init=/bin/bash is indeed a very handy tool to have as a sysadmin. =:^)
I think I mentioned that setting that (via grub var) is actually one of
my grub2 menu options, in the backup menu, FWIW.
> It happens, because I used
> "rsync" with the option to exclude those dirs. But well: In the end my
> backup was tested bootable. :-)
>
> I fixed my dracut setup and in the same procedure also fixed a
> long-standing issue with "btrfs check" telling me nlink errors. Luckily,
> this newer version could tell me the paths and I just deleted those files
> in the chrome profile and var/lib/bluetooth directory. I wonder if those
> errors were causing me issues with chrome freezing the PC and bluetooth
> stopped working sometimes.
Likely.
With all my filesystems being rather small, and having (tested) backup
versions of them available, I'd probably just ensure that I had a current
backup, blow the filesystem away and recreate it fresh, restoring from
backup, at the first sign of nlink errors or the like.
> And BTW: bcache is pretty fast, booting to graphical.target within 3-8
> seconds (mostly around 5). Now I wonder what I need the resume swap for
> which I created in the process: It takes longer to resume from swap than
> just booting to complete KDE desktop. Well, without the benefit of
> having a fully running session at least.
Since my main filesystems are all on ssd, I get that too. Tho I can say
I was rather surprised at how much faster systemd was than even
parallelized openrc. Systemd's demand-based socket setup, not making
final initialization of a daemon thru opening the socket a necessity
before starting other services that depend on that socket, probably has a
lot to do with that. Openrc can parallelize, but if a daemon must be up
and a socket it creates usable before another service depending on it can
start, it must be, and openrc doesn't have the demand-based socket
activation available to shortcut that, as systemd does. That makes more
difference than I expected.
As for resume-swap, back when my system was still on spinning rust, yes,
resume took longer, dramatically longer if I made the resume image big
enough to save and restore everything in cache, but it was still worth
it, because dropping cache as one did on reboot was EXPENSIVE, due to
having to reload all that stuff from slow spinning-rust over time.
But now that most of the system's on ssd (basically everything but the
media partition, with media being primarily serially accessed big files
anyway, such that speed doesn't make so much difference as long as it's
faster than the play-rate consumption, as even spinning-rust is for most
media... tho full 4k video may change that) cache is still faster, but
more like an order of magnitude faster instead of about three orders of
magnitude faster.
So dropping and having to reread gigabytes of cached files off of ssd
isn't the big deal it was when it was gigabytes of cached files off of
spinning rust.
But bcache will get most of that benefit, too. I went full ssd instead,
both because I did it a bit earlier, before bcache was mature, and
because ssds became cheap enough for my limited non-media requirements
that I decided it was worth throwing the required money at it, to avoid
the hassle of ssd-cache-of-still-spinning-rust vs. just ssd pretty much
everything.
But bigger ssds have continued to drop in price, and even my media files
requirements aren't /that/ big, particularly as local storage
increasingly is simply cache for media otherwise streamed from the net
anyway (they're advertising "gigablast" here, now, tho my own connection
remains under single-digit MiByte/sec, in-single-digit Mbit/sec), so I'll
probably just go full ssd, even for media files, at some point, if only
to be able to kill off the constant incremental power draw and noise of
spinning-rust, entirely.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
end of thread, other threads:[~2015-02-18 1:12 UTC | newest]
Thread overview: 8+ messages
2015-02-14 2:31 btrfs raid-1 uuid-fstab James
2015-02-14 11:52 ` Chris Murphy
2015-02-15 6:28 ` Duncan
2015-02-15 11:11 ` Kai Krakow
2015-02-16 5:50 ` Duncan
2015-02-16 23:15 ` Kai Krakow
2015-02-18 1:12 ` Duncan
2015-02-16 5:29 ` Chris Murphy