Date: Tue, 30 Jan 2018 16:24:16 +0100
From: Tomasz Pala
To: Btrfs BTRFS
Subject: Re: degraded permanent mount option
Message-ID: <20180130152415.GC7126@polanet.pl>
References: <20180127132641.mhmdhpokqrahgd4n@angband.pl> <20180128003910.GA31699@polanet.pl> <20180128223946.GA26726@polanet.pl> <20180129085404.GA2500@polanet.pl> <20180129112456.r7ksq5mwp3ie6gmg@angband.pl>

On Mon, Jan 29, 2018 at 14:00:53 -0500, Austin S. Hemmelgarn wrote:

> We already do so in the accepted standard manner. If the mount fails
> because of a missing device, you get a very specific message in the
> kernel log about it, as is the case for most other common errors (for
> uncommon ones you usually just get a generic open_ctree error). This is
> really the only option too, as the mount() syscall (which the mount
> command calls) returns only 0 on success or -1 and an appropriate errno
> value on failure, and we can't exactly go about creating a half dozen
> new error numbers just for this (well, technically we could, but I very
> much doubt that they would be accepted upstream, which defeats the purpose).

This is exactly why a separate communication channel, the ioctl, is
currently used. And I really don't understand why you fight against
expanding this ioctl's response.

> With what you're proposing for BTRFS however, _everything_ is a
> complicated decision, namely:
> 1. Do you retry at all? During boot, the answer should usually be yes,
> but during normal system operation it should normally be no (because we
> should be letting the user handle issues at that point).

This is exactly why I propose introducing an ioctl in btrfs.ko that
accepts userspace-configured expectations (as a per-volume policy).

> 2. How long should you wait before you retry? There is no right answer
> here that will work in all cases (I've seen systems which take multiple
> minutes for devices to become available on boot), especially considering
> those of us who would rather have things fail early.

A btrfs-last-resort@.timer, by analogy with mdadm-last-resort@.timer.

> 3. If the retry fails, do you retry again? How many times before it
> just outright fails? This is going to be system specific policy. On
> systems where devices may take a while to come online, the answer is
> probably yes and some reasonably large number, while on systems where
> devices are known to reliably be online immediately, it makes no sense
> to retry more than once or twice.

All of this is a job for systemd timers/services.

> 4. If you are going to retry, should you try a degraded mount? Again,
> this is going to be system specific policy (regular users would probably
> want this to be a yes, while people who care about data integrity over
> availability would likely want it to be a no).

Just like above: easily user-configured in systemd timers/services.

> 5. Assuming you do retry with the degraded mount, how many times should
> a normal mount fail before things go degraded? This ties in with 3 and
> has the same arguments about variability I gave there.

As above.

> 6. How many times do you try a degraded mount before just giving up?
> Again, similar variability to 3.
> 7. Should each attempt try first a regular mount and then a degraded
> one, or do you try just normal a couple times and then switch to
> degraded, or even start out trying normal and then start alternating?
> Any of those patterns has valid arguments both for and against it, so
> this again needs to be user configurable policy.
>
> Altogether, that's a total of 7 policy decisions that should be user
> configurable.
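
As a rough illustration of how the seven decisions quoted above could live entirely in userspace, here is a minimal Python sketch. It is only a model of the policy, not an existing tool: `MountPolicy`, `plan_attempts`, `mount_with_policy` and the `try_mount` callback are all hypothetical names, and the real thing would more likely be a set of systemd timer/service units than a script. It assumes the simplest ordering for decision 7 (all normal attempts first, then degraded ones).

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class MountPolicy:
    """Hypothetical per-volume policy; every field is an illustrative assumption."""
    retry: bool = True            # decision 1: retry at all?
    delay_s: float = 5.0          # decision 2: how long to wait between attempts
    normal_attempts: int = 3      # decisions 3 and 5: normal tries before degrading
    allow_degraded: bool = False  # decision 4: fall back to a degraded mount?
    degraded_attempts: int = 1    # decision 6: degraded tries before giving up


def plan_attempts(policy: MountPolicy) -> List[str]:
    """Ordered mount options to try (decision 7: normal first, then degraded)."""
    normal = policy.normal_attempts if policy.retry else 1
    plan = ["defaults"] * max(normal, 1)
    if policy.allow_degraded:
        plan += ["degraded"] * policy.degraded_attempts
    return plan


def mount_with_policy(
    policy: MountPolicy,
    try_mount: Callable[[str], bool],
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Run the plan; return the options that succeeded, or None on failure.

    try_mount is a caller-supplied callback (an assumption of this sketch)
    that attempts one mount with the given options and reports success.
    """
    plan = plan_attempts(policy)
    for i, options in enumerate(plan):
        if try_mount(options):
            return options
        if i + 1 < len(plan):
            sleep(policy.delay_s)  # wait only if another attempt follows
    return None
```

The kernel side never appears here: the sketch only needs a yes/no answer from each attempt, which is the kind of state that an extended ioctl response could report.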

All of them are easy to implement if btrfs.ko could accept a per-volume
'allow-degraded' instruction and return 'try-degraded' in the ioctl.

> Having a config file other than /etc/fstab for the mount
> command should probably be avoided for sanity reasons (again, BTRFS is a
> filesystem, not a volume manager), so they would all have to be handled
> through mount options. The kernel will additionally have to understand
> that those options need to be ignored (things do try to mount
> filesystems without calling a mount helper, most notably the kernel when
> it mounts the root filesystem on boot if you're not using an initramfs).
> All in all, this type of thing gets out of hand _very_ fast.

You need to think about the two separately:
1. tracking STATE - for now, this means remembering the 'allow-degraded' option,
2. configured POLICY - this is to be handled by the init system.

-- 
Tomasz Pala