From: David Brown <david.brown@hesbynett.no>
To: Francis Moreau <francis.moro@gmail.com>,
	Chris Murphy <lists@colorremedies.com>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Soft RAID and EFI systems
Date: Tue, 04 Feb 2014 10:35:15 +0100
Message-ID: <52F0B453.3070108@hesbynett.no>
In-Reply-To: <52F0AD83.4030300@gmail.com>

On 04/02/14 10:06, Francis Moreau wrote:
> On 02/04/2014 09:57 AM, David Brown wrote:
>> On 04/02/14 09:32, Francis Moreau wrote:
>>> On 02/02/2014 11:30 PM, Chris Murphy wrote:
>>>>
>>>> On Feb 2, 2014, at 2:34 PM, Francis Moreau <francis.moro@gmail.com>
>>>> wrote:
>>>>>
>>>>> That's funny because one of the reasons I want to use UEFI
>>>>> firmware is to get rid of grub (I don't like it and the way it
>>>>> has become such a bloated beast): since /boot is vfat and has its
>>>>> own partition, I prefer to use a much simpler bootloader such as
>>>>> gummiboot.
>>>>
>>>> It might be possible to do what you want with mdadm metadata
>>>> version 1.0. Typically bootable raid1 is ext4 on md raid1 using
>>>> metadata format 1.0, and an internal bitmap. When the partitions
>>>> are not assembled, they each appear as separate ext4 partitions. If
>>>> FAT32 on md raid1 with metadata 1.0 still looks like FAT32 as a
>>>> separate partition, and the mdadm v1.0 metadata at the end of the
>>>> partition doesn't confuse the firmware, what should happen is any
>>>> ESP can boot the system. Once the kernel and initramfs are loaded,
>>>> mdadm will locate the mdadm metadata on each partition and assemble
>>>> them into a single md device, and fstab mounts the md device at
>>>> /boot. So prior to boot they are separate ESPs, and after boot it's
>>>> a single ESP (mirrored). But I haven't tested this arrangement with
>>>> ESPs and UEFI.
>>>
>>> I'll test this configuration and see if it works soon.
>>>
>>>>
>>>> The easiest scenario I've found for resilient boot on EFI systems
>>>> is, well, not easy. First, I put shim and grub package files onto
>>>> each ESP along with the previously posted grub.cfg snippet. Those
>>>> grub.cfgs are one time, non-updatable files, that point to
>>>> /boot/grub2/grub.cfg (produced with grub2-mkconfig on Fedora) on
>>>> Btrfs raid1. That's about as reliable as it gets because the only
>>>> dependencies are grub (which understands Btrfs multiple devices)
>>>> and dracut baking the btrfs module into initramfs. It gets
>>>> essentially fool proof if btrfs is compiled into the kernel. Other
>>>> combinations are easier to break. I basically want ESPs that aren't
>>>> being modified if at all avoidable because FAT32 breaks easily if
>>>> anything is being written to it and there is a crash or power
>>>> failure.
>>>>
>>>
>>> I agree that FAT32 can break during power failure, that's the reason
>>> why I'm trying to make it mirrored. But I want to get rid of grub as
>>> much as possible so I would prefer to use the first solution.
>>
>> Mirroring will not help FAT32 during power failure - you have a good
>> chance of getting two copies of the same error.  And if your power
>> failure hits during writes, you also have a good chance of the two disks
>> having /different/ errors and inconsistencies.  The problem lies in FAT32
>> having no log, and no barriers or ordering when it makes changes -
>> updates to the file data, the directory structure, and the FAT can
>> happen in different orders, and a power failure can leave one part
>> updated and the other part with old data.  RAID cannot help with this
>> problem.
> 
> OK, so basically RAID helps only in case of disk failure, right?

Exactly correct (where "disk failure" includes both complete failure of
the disk and unrecoverable read errors).  RAID does not help against
corruption due to power failures (a RAID card with a battery backup,
combined with a journalling filesystem, should help here), and it
does not help against the most common cause of data loss - human error!

> 
> It seems odd to have chosen FAT32 in the first place then.

FAT32 is the worst possible choice of filesystem, except in three
respects - it is quite simple and can be implemented in a small amount of
code (such as in EFI or a bootloader), it is usable on small disks or
partitions, and it is supported by brain-dead OSes that don't understand
better alternatives.  (NTFS has journalling, but is a monster to implement
in something the size of EFI.)

It's a crap filesystem, but it is the "industry standard" for small
disks and small systems.

> 
>>
>> The most important way to protect your FAT32 system is simply to avoid
>> writing to it except when absolutely necessary.  If it is mounted
>> read-only, and only updated when changing grub or updating the kernel,
>> then just make sure you don't power-cycle your machine at that time.
> 
> Well, the problem is that you never know when power failures will
> happen, at least for me with a small server without any power backup.

The answer here is staring you in the face... get a UPS.  A small one
is not expensive - you only need it to run the server for a couple of
minutes.  Even though journalled filesystems can keep their /metadata/
consistency after a power failure, they don't normally guarantee /data/
consistency, and certainly cannot guarantee /application level/
consistency.  You get that from doing a proper shutdown.  And remember
also that after an unclean shutdown, restarts involve long consistency
checks at the RAID level and at the filesystem level - a UPS will let
you avoid that.

> 
>> The smaller the critical window, the smaller the chances of problems.
>>
>> If you need to do updates more regularly, then your best bet is to have
>> independent FAT32 partitions on the two disks.  Make your updates on one
>> disk, and when it is finished copy the changes onto the other disk.
>> Then you always have a good copy - if you get a crash while the first
>> disk is being updated, then when you re-start the computer, use its boot
>> menu to choose booting from the second disk.
> 
> That seems the best thing to do then.
> 
> Thanks.
> 
> 
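
For reference, the "update one disk, then copy to the other" approach
quoted above could be sketched roughly as follows.  This is only an
illustration, not a tested tool: the function names and the mount points
in the usage note (/boot/efi and /boot/efi-backup) are hypothetical, and
it assumes both FAT32 partitions are already mounted read-write.

```python
import filecmp
import os
import shutil
from pathlib import Path


def mirror_tree(src: Path, dst: Path) -> None:
    """Make dst hold the same files as src (contents only, no metadata)."""
    # Copy every file from the primary partition onto the backup.
    for root, _dirs, files in os.walk(src):
        rel = Path(root).relative_to(src)
        (dst / rel).mkdir(parents=True, exist_ok=True)
        for name in files:
            shutil.copyfile(Path(root) / name, dst / rel / name)
    # Walk the backup bottom-up and drop anything the primary no longer has.
    for root, dirs, files in os.walk(dst, topdown=False):
        rel = Path(root).relative_to(dst)
        for name in files:
            if not (src / rel / name).is_file():
                (Path(root) / name).unlink()
        for name in dirs:
            if not (src / rel / name).is_dir():
                (Path(root) / name).rmdir()
    # Ask the kernel to flush pending writes before trusting the backup.
    if hasattr(os, "sync"):
        os.sync()


def trees_match(src: Path, dst: Path) -> bool:
    """Check that every file under src exists under dst with identical bytes."""
    for root, _dirs, files in os.walk(src):
        rel = Path(root).relative_to(src)
        for name in files:
            other = dst / rel / name
            if not other.is_file():
                return False
            if not filecmp.cmp(Path(root) / name, other, shallow=False):
                return False
    return True
```

Typical use would be to finish and check the update on the first disk,
then call mirror_tree(Path("/boot/efi"), Path("/boot/efi-backup")) and
confirm that trees_match() returns True.  The copy window on the second
disk is still a critical window, but the first disk is known good while
it is open.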

