Migration to BTRFS

* Migration to BTRFS
@ 2019-04-28 19:35 Hendrik Friedel
  2019-04-28 20:14 ` Andrei Borzenkov
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Hendrik Friedel @ 2019-04-28 19:35 UTC
  To: linux-btrfs@vger.kernel.org

Hello,

I intend to move to BTRFS and of course I have some data already.
I currently have several single 4TB drives and I would like to move the 
data onto new drives (2*8TB). I need no raid, as I prefer a backup. 
Nevertheless, having raid is nice for availability, so why not. 
I currently use ~6TB, so it may work, but I would like to be able to 
remove the redundancy later.

So, if I understand correctly, today I want
-m raid1 -d raid1

whereas later, I want
-m raid1 -d single

What is very important to me is that, with one failing drive, I run no 
risk of losing the whole filesystem, only the data on the affected drive. 
Is that possible with both of these variants?

Is it possible to move between the two (doing a balance, of course)?
Any other thoughts/recommendations?

Greetings,
Hendrik


* Re: Migration to BTRFS
  2019-04-28 19:35 Migration to BTRFS Hendrik Friedel
@ 2019-04-28 20:14 ` Andrei Borzenkov
  2019-04-29 11:43   ` Austin S. Hemmelgarn
  2019-04-28 20:46 ` waxhead
  2019-05-25 13:21 ` Hendrik Friedel
  2 siblings, 1 reply; 9+ messages in thread
From: Andrei Borzenkov @ 2019-04-28 20:14 UTC
  To: Hendrik Friedel, linux-btrfs@vger.kernel.org

28.04.2019 22:35, Hendrik Friedel wrote:
> Hello,
> 
> I intend to move to BTRFS and of course I have some data already.
> I currently have several single 4TB drives and I would like to move the
> data onto new drives (2*8TB). I need no raid, as I prefer a backup.
> Nevertheless, having raid is nice for availability, so why not.
> I currently use ~6TB, so it may work, but I would like to be able to
> remove the redundancy later.
> 
> So, if I understand correctly, today I want
> -m raid1 -d raid1
> 
> whereas later, I want
> -m raid1 -d single
> 
> What is very important to me is that, with one failing drive, I run no
> risk of losing the whole filesystem, only the data on the affected drive.
> Is that possible with both of these variants?
> 

With "single" data profile you won't lose filesystem, but you will
irretrievably lose any data on the missing drive. Also "single" profile
does not support auto-healing (repairing of bad copy from good copy). If
this is acceptable to you, then yes, both variants will do what you want.

> Is it possible to move between the two (doing a balance, of course)?

Yes as long as you have sufficient free space for target profile.

> Any other thoughts/recommendations?
> 

As of today there is no provision for automatic mounting of incomplete
multi-device btrfs in degraded mode. Actually, with systemd it is flat
impossible to mount incomplete btrfs because standard framework only
proceeds to mount it after all devices have been seen. As long as you do
not use systemd in initramfs you may be able to boot by passing suitable
root mount flags on kernel command line.


* Re: Migration to BTRFS
  2019-04-28 20:14 ` Andrei Borzenkov
@ 2019-04-29 11:43   ` Austin S. Hemmelgarn
       [not found]     ` <em12ddda3f-4221-4678-aa1c-0854489007e1@ryzen>
  0 siblings, 1 reply; 9+ messages in thread
From: Austin S. Hemmelgarn @ 2019-04-29 11:43 UTC
  To: Andrei Borzenkov, Hendrik Friedel, linux-btrfs@vger.kernel.org

On 2019-04-28 16:14, Andrei Borzenkov wrote:
> 28.04.2019 22:35, Hendrik Friedel wrote:
>> Hello,
>>
>> I intend to move to BTRFS and of course I have some data already.
>> I currently have several single 4TB drives and I would like to move the
>> data onto new drives (2*8TB). I need no raid, as I prefer a backup.
>> Nevertheless, having raid is nice for availability, so why not.
>> I currently use ~6TB, so it may work, but I would like to be able to
>> remove the redundancy later.
>>
>> So, if I understand correctly, today I want
>> -m raid1 -d raid1
>>
>> whereas later, I want
>> -m raid1 -d single
>>
>> What is very important to me is that, with one failing drive, I run no
>> risk of losing the whole filesystem, only the data on the affected drive.
>> Is that possible with both of these variants?
>>
> 
> With "single" data profile you won't lose filesystem, but you will
> irretrievably lose any data on the missing drive. Also "single" profile
> does not support auto-healing (repairing of bad copy from good copy). If
> this is acceptable to you, then yes, both variants will do what you want.
Actually, it's a bit worse than this potentially.  You may lose 
individual files if you lose one disk with the proposed setup, but you 
may also lose _parts_ of individual files, especially if you have lots 
of large (>1-5GB in size) files.  And on top of this, finding what data 
went missing will essentially require trying to read every byte of every 
file in the volume.
> 
>> Is it possible to move between the two (doing a balance, of course)?
> 
> Yes as long as you have sufficient free space for target profile.
> 
>> Any other thoughts/recommendations?
>>
> 
> As of today there is no provision for automatic mounting of incomplete
> multi-device btrfs in degraded mode. Actually, with systemd it is flat
> impossible to mount incomplete btrfs because standard framework only
> proceeds to mount it after all devices have been seen. As long as you do
> not use systemd in initramfs you may be able to boot by passing suitable
> root mount flags on kernel command line.
> 



[parent not found: <em12ddda3f-4221-4678-aa1c-0854489007e1@ryzen>]

* Re: Migration to BTRFS
       [not found]     ` <em12ddda3f-4221-4678-aa1c-0854489007e1@ryzen>
@ 2019-04-29 17:20       ` Austin S. Hemmelgarn
  2019-04-29 17:31         ` Andrei Borzenkov
  0 siblings, 1 reply; 9+ messages in thread
From: Austin S. Hemmelgarn @ 2019-04-29 17:20 UTC
  To: Hendrik Friedel, Andrei Borzenkov, linux-btrfs@vger.kernel.org

On 2019-04-29 12:16, Hendrik Friedel wrote:
> Hello,
>>> With "single" data profile you won't lose filesystem, but you will
>>> irretrievably lose any data on the missing drive. Also "single" profile
>>> does not support auto-healing (repairing of bad copy from good copy). If
>>> this is acceptable to you, then yes, both variants will do what you want.
>> Actually, it's a bit worse than this potentially. You may lose 
>> individual files if you lose one disk with the proposed setup, but you 
>> may also lose _parts_ of individual files, especially if you have lots 
>> of large (>1-5GB in size) files.
> You mean if parts of the files are on the failed drive, or what do you 
> have in mind?
Yes, it's if parts of the files are on the failed drive. Essentially, if 
a file has more than one extent, then with the single profile those 
extents may be stored on different drives.  The common case for this is 
dealing with files larger than the data chunk size for the filesystem 
(typically between 1-5GB on most reasonably sized volumes), because an 
extent can't be larger than a chunk.
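
The effect described above can be illustrated with a small toy model in Python. This is only a sketch, not the real btrfs allocator: it assumes fixed 1 GiB data chunks and assumes each new chunk goes to the device with the most unallocated space. The function name `place_file` is made up for this example.

```python
# Toy model of the "single" data profile: NOT the real btrfs allocator.
# Assumptions: fixed 1 GiB data chunks, and each new chunk is placed on
# the device that currently has the most unallocated space.

CHUNK_GIB = 1

def place_file(size_gib, free_gib):
    """Return the device index used for each chunk of one file.

    free_gib is a mutable list of unallocated GiB per device; it is
    updated in place as chunks are handed out.
    """
    devices = []
    remaining = size_gib
    while remaining > 0:
        # Pick the device with the most unallocated space (first one on ties).
        dev = max(range(len(free_gib)), key=lambda i: free_gib[i])
        if free_gib[dev] < CHUNK_GIB:
            raise OSError("no space left for another chunk")
        free_gib[dev] -= CHUNK_GIB
        devices.append(dev)
        remaining -= CHUNK_GIB
    return devices

free = [8, 8]                  # two equally sized, empty devices
layout = place_file(4, free)   # a 4 GiB file needs four chunks
print(layout)
print("spans devices:", sorted(set(layout)))
```

In this model the chunks of a single 4 GiB file alternate between the two devices, so losing either device would take part of the file with it.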
> 
>> And on top of this, finding what data went missing will essentially 
>> require trying to read every byte of every file in the volume.
> Why is that and how would it be done (scrub, I suppose?)
There's no other way short of scanning the filesystem internals to 
figure out what chunks would be present on a missing disk and then map 
the contents of those chunks to the files they are part of.  Ideally, 
this wouldn't be the case, but it's an unusual enough situation that it's 
just not been a priority to provide a tool to do it.

As far as the actual process itself, scrub is one way to do it, but it 
requires using a separate tool to map the inode numbers spit out by the 
scrub messages in the kernel logs to actual file names.  There are a 
bunch of other ways to do it too though.  Personally, I'd probably 
throw something together in Python to try and read each file all the 
way through, bail if it hit _any_ IO error, and then log the names of 
files it found IO errors in, though even something just chaining `find` 
and `cat` together and then watching the kernel log for IO error 
messages would be enough.
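
A minimal sketch of the Python approach described above might look like the following. It is an illustration, not a tested recovery tool: the function name `find_unreadable_files` and the 1 MiB read size are arbitrary choices, and any `OSError` (including permission errors, not only I/O errors) causes a file to be reported.

```python
import os
import sys

def find_unreadable_files(root, block_size=1 << 20):
    """Read every regular file under root; return paths that raise OSError."""
    damaged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks, sockets, device nodes, etc.
            try:
                with open(path, "rb") as handle:
                    # Read to the end; the data itself is discarded.
                    while handle.read(block_size):
                        pass
            except OSError as exc:
                # Stop reading this file at the first error and log its name.
                damaged.append(path)
                print(f"{path}: {exc}", file=sys.stderr)
    return damaged

# Example call (the mount point is a placeholder):
#   for path in find_unreadable_files("/mnt/volume"):
#       print(path)
```

On a large volume this takes as long as reading all of the data once, which is the cost mentioned earlier.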
> I am wondering why the design of 'single' is that way. It seems to me 
> that this unnecessarily increases the failure probability. My 
> thinking: if I have two separate filesystems, I have a FP of Z, with Z 
> the probability of one drive failing. If I have one btrfs filesystem in 
> single profile, I have a FP of Z^N, whereas it could, with a different 
> design, still be Z, no?
Yes, it is technically possible, you just place each file entirely on 
one device.  In fact, you can see this as a placement option in many 
distributed filesystems.  There are a couple of reasons it's not done 
with local filesystems backed with conventional block storage:

* It adds an extra layer of complexity.  In a distributed filesystem, or 
even with mhddfs, you already have a nice, easy to use filesystem 
interface (or an object-storage interface) so you don't have to handle 
block mapping.  With a local filesystem though, you still have to do 
block translation, which then becomes far more complicated because of 
the new, extra, constraint on where each block can go.
* It is very good at confusing regular end-users.  Assume you have to 
place a 4GB file on a volume arranged like this, but only have 2GB of 
space left on each disk.  You still technically have 4GB of free space, 
but you can't put the file on the volume because there isn't enough 
space on either disk for it.  This type of situation is extremely 
confusing for normal users, and is not all that uncommon in desktop 
usage scenarios.  BTRFS also already has issues like this to begin with, 
and adding another source for them is not a good idea.
* The exact benefits of this usually don't matter for (comparatively) 
small local storage devices.  The primary reason it's done at all is for 
big hosting companies so that they can trivially guarantee that services 
will be fully functional if they can actually see all the files.  For a 
regular user on a small desktop, it just doesn't matter in most cases.
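
The 4GB-file example from the second point can be written down as a short Python sketch. The two function names are hypothetical and only model the two placement policies: whole files on one disk versus extents that may span disks.

```python
def can_place_whole_file(size_gb, free_gb_per_disk):
    """True if some single disk can hold the entire file."""
    return any(free >= size_gb for free in free_gb_per_disk)

def can_place_split_file(size_gb, free_gb_per_disk):
    """True if the file fits when its extents may span disks."""
    return sum(free_gb_per_disk) >= size_gb

free = [2, 2]  # 2GB left on each of two disks
print(can_place_whole_file(4, free))  # False: no single disk has 4GB free
print(can_place_split_file(4, free))  # True: 4GB is free in total
```

With per-file placement the volume reports 4GB free yet rejects a 4GB file, which is the confusing case described above.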

>>> As of today there is no provision for automatic mounting of incomplete
>>> multi-device btrfs in degraded mode. Actually, with systemd it is flat
>>> impossible to mount incomplete btrfs because standard framework only
>>> proceeds to mount it after all devices have been seen.
> Do you talk about the mount during boot or about mounting in general?
Both, unless you do some heavy modifications of some of the standard 
installed files (you need to disable some specific udev rules and then 
replace the standard `mount.btrfs` wrapper that systemd uses).
> 
>  > If I were you, with your use case I would consider using mhddfs
>  > https://romanrm.net/mhddfs which is a filesystem-agnostic layer on top
>  > of 2x [-m DUP, -d SINGLE] BTRFS drives. Last time I tested mhddfs
>  > (about 5+ years ago) it was dead slow, but that might not be very
>  > important to you. For what it does it works great!
> 
> In fact, that is what I am using today. But when using snapshots, this 
> would become a bit messy (having to do the snapshot on each device 
> separately, but identically).
> 
>  > remember that backup is not a backup unless it has an extra backup
> 
> I do have two backups (one offsite) of all data that is irreplaceable and 
> one of data that is nice to have (TV recordings).
> 
> 
> Greetings,
> Hendrik
> 



* Re: Migration to BTRFS
  2019-04-29 17:20       ` Austin S. Hemmelgarn
@ 2019-04-29 17:31         ` Andrei Borzenkov
  2019-04-29 18:25           ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 9+ messages in thread
From: Andrei Borzenkov @ 2019-04-29 17:31 UTC
  To: Austin S. Hemmelgarn, Hendrik Friedel,
	linux-btrfs@vger.kernel.org

29.04.2019 20:20, Austin S. Hemmelgarn wrote:
> 
>>>> As of today there is no provision for automatic mounting of incomplete
>>>> multi-device btrfs in degraded mode. Actually, with systemd it is flat
>>>> impossible to mount incomplete btrfs because standard framework only
>>>> proceeds to mount it after all devices have been seen.
>> Do you talk about the mount during boot or about mounting in general?
> Both,

Sorry for chiming in, but the quoted part was mine, and I was speaking
about automatic mount during boot. Manual mount using "mount" command
after boot is of course possible (and does not involve systemd in any
way). There is systemd-mount tool which will likely have the same issue.




* Re: Migration to BTRFS
  2019-04-29 17:31         ` Andrei Borzenkov
@ 2019-04-29 18:25           ` Austin S. Hemmelgarn
  2019-04-30  3:27             ` Andrei Borzenkov
  0 siblings, 1 reply; 9+ messages in thread
From: Austin S. Hemmelgarn @ 2019-04-29 18:25 UTC
  To: Andrei Borzenkov, Hendrik Friedel, linux-btrfs@vger.kernel.org

On 2019-04-29 13:31, Andrei Borzenkov wrote:
> 29.04.2019 20:20, Austin S. Hemmelgarn wrote:
>>
>>>>> As of today there is no provision for automatic mounting of incomplete
>>>>> multi-device btrfs in degraded mode. Actually, with systemd it is flat
>>>>> impossible to mount incomplete btrfs because standard framework only
>>>>> proceeds to mount it after all devices have been seen.
>>> Do you talk about the mount during boot or about mounting in general?
>> Both,
> 
> Sorry for chiming in, but the quoted part was mine, and I was speaking
> about automatic mount during boot. Manual mount using "mount" command
> after boot is of course possible (and does not involve systemd in any
> way). There is systemd-mount tool which will likely have the same issue.
> 
Based on my own experience, it still has issues in some cases, even if 
mounted manually.  In the past, I've had systemd _unmount_ degraded 
BTRFS volumes I had just manually mounted because it thought they 
shouldn't be mounted (because devices were missing, therefore the device 
ready ioctl was returning false).  Only ever seems to happen for volumes 
in `/etc/fstab` or managed as native mount units, but still an issue.



* Re: Migration to BTRFS
  2019-04-29 18:25           ` Austin S. Hemmelgarn
@ 2019-04-30  3:27             ` Andrei Borzenkov
  0 siblings, 0 replies; 9+ messages in thread
From: Andrei Borzenkov @ 2019-04-30  3:27 UTC
  To: Austin S. Hemmelgarn, Hendrik Friedel,
	linux-btrfs@vger.kernel.org

29.04.2019 21:25, Austin S. Hemmelgarn wrote:
> On 2019-04-29 13:31, Andrei Borzenkov wrote:
>> 29.04.2019 20:20, Austin S. Hemmelgarn wrote:
>>>
>>>>>> As of today there is no provision for automatic mounting of
>>>>>> incomplete
>>>>>> multi-device btrfs in degraded mode. Actually, with systemd it is
>>>>>> flat
>>>>>> impossible to mount incomplete btrfs because standard framework only
>>>>>> proceeds to mount it after all devices have been seen.
>>>> Do you talk about the mount during boot or about mounting in general?
>>> Both,
>>
>> Sorry for chiming in, but the quoted part was mine, and I was speaking
>> about automatic mount during boot. Manual mount using "mount" command
>> after boot is of course possible (and does not involve systemd in any
>> way). There is systemd-mount tool which will likely have the same issue.
>>
> Based on my own experience, it still has issues in some cases, even if
> mounted manually.  In the past, I've had systemd _unmount_ degraded
> BTRFS volumes I had just manually mounted because it thought they
> shouldn't be mounted (because devices were missing, therefore the device
> ready ioctl was returning false).  Only ever seems to happen for volumes
> in `/etc/fstab` or managed as native mount units, but still an issue.
> 

Ah, OK, that's true and has been plaguing systemd users for quite some
time. It should be fixed in current systemd, which hopefully no longer
decides to unmount a filesystem even when it believes the underlying device
is not present. Here is the initial commit:

commit 628c89cc68ab96fce2de7ebba5933725d147aecc
Author: Lennart Poettering <lennart@poettering.net>
Date:   Fri Feb 27 21:55:08 2015 +0100

    core: rework device state logic

    This change introduces a new state "tentative" for device units. Device
    units are considered "plugged" when udev announced them, "dead" when
    they are not available in the kernel, and "tentative" when they are
    referenced in /proc/self/mountinfo or /proc/swaps but not (yet)
    announced via udev.

    This should fix a race when device nodes (like loop devices) are created
    and immediately mounted. Previously, systemd might end up seeing the
    mount unit before the device, and would thus pull down the mount because
    its BindTo dependency on the device would not be fulfilled.



* Re: Migration to BTRFS
  2019-04-28 19:35 Migration to BTRFS Hendrik Friedel
  2019-04-28 20:14 ` Andrei Borzenkov
@ 2019-04-28 20:46 ` waxhead
  2019-05-25 13:21 ` Hendrik Friedel
  2 siblings, 0 replies; 9+ messages in thread
From: waxhead @ 2019-04-28 20:46 UTC
  To: Hendrik Friedel, linux-btrfs@vger.kernel.org

Hendrik Friedel wrote:
> Hello,
> 
> I intend to move to BTRFS and of course I have some data already.
> I currently have several single 4TB drives and I would like to move the 
> data onto new drives (2*8TB). I need no raid, as I prefer a backup. 
> Nevertheless, having raid is nice for availability, so why not. 
> I currently use ~6TB, so it may work, but I would like to be able to 
> remove the redundancy later.
> 
> So, if I understand correctly, today I want
> -m raid1 -d raid1
> 
> whereas later, I want
> -m raid1 -d single
> 
> What is very important to me is that, with one failing drive, I run no 
> risk of losing the whole filesystem, only the data on the affected drive. 
> Is that possible with both of these variants?
> 
> Is it possible to move between the two (doing a balance, of course)?
> Any other thoughts/recommendations?
> 
If I were you, with your use case I would consider using mhddfs 
https://romanrm.net/mhddfs which is a filesystem-agnostic layer on top of 
2x [-m DUP, -d SINGLE] BTRFS drives. Last time I tested mhddfs (about 5+ 
years ago) it was dead slow, but that might not be very important to 
you. For what it does it works great!

If you label your device DISK_A and DISK_B and then your backups 
BACKUP_A and BACKUP_B you just have to copy back the A or B set if one 
disk fails.

And before Duncan jumps in, remember that a backup is not a backup unless 
it has an extra backup ... But seriously (really, seriously!!), read any 
of Duncan's posts. He does a stellar job of explaining why you need to 
have tested, working backups of data you care about!


> Greetings,
> Hendrik
> 
> 


* Re: Migration to BTRFS
  2019-04-28 19:35 Migration to BTRFS Hendrik Friedel
  2019-04-28 20:14 ` Andrei Borzenkov
  2019-04-28 20:46 ` waxhead
@ 2019-05-25 13:21 ` Hendrik Friedel
  2 siblings, 0 replies; 9+ messages in thread
From: Hendrik Friedel @ 2019-05-25 13:21 UTC
  To: linux-btrfs@vger.kernel.org

Hello,

now that the filesystem has worked fine on a single drive for a while, I'd 
like to add the second device.

Status:
btrfs fi show .
Label: 'DataPool1' uuid: c4a6a2c9-5cf0-49b8-812a-0784953f9ba3
         Total devices 1 FS bytes used 6.61TiB
         devid 1 size 7.28TiB used 6.89TiB path /dev/sdh1


>I intend to move to BTRFS and of course I have some data already.
>I currently have several single 4TB drives and I would like to move the data onto new drives (2*8TB). I need no raid, as I prefer a backup. Nevertheless, having raid is nice for availability, so why not. I currently use ~6TB, so it may work, but I would like to be able to remove the redundancy later.
>
>So, if I understand correctly, today I want
>-m raid1 -d raid1
>
>whereas later, I want
>-m raid1 -d single
>
>What is very important to me is that, with one failing drive, I run no risk of losing the whole filesystem, only the data on the affected drive. Is that possible with both of these variants?

So, now I'd like to go this step:
-m raid1 -d raid1

Is it correct to:
btrfs device add /dev/sdd /srv/DataPool
btrfs balance start -dconvert=raid1 -mconvert=raid1

Or is there anything else that I need to take care of?

There is not so much space left. Is it sufficient for the balance?
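
As a rough aid for the space question, the figures from the `btrfs fi show` output above can be put through some simple arithmetic in Python. This is a back-of-the-envelope sketch, not a statement of what balance will do: it assumes the new device is the same size as the existing one (7.28TiB) and that two-device raid1 keeps one full copy on each device. Note also that `btrfs balance start` takes the mount point as its final argument.

```python
# Figures from the `btrfs fi show` output above, in TiB.
DEVICE_SIZE = 7.28   # size of the existing device
ALLOCATED = 6.89     # space already allocated to chunks on it
USED = 6.61          # bytes actually used by the filesystem

# Assumption: the new device is the same size as the existing one.
NEW_DEVICE_SIZE = 7.28

def raid1_capacity(sizes):
    """Usable raid1 capacity for two devices: bounded by the smaller one."""
    assert len(sizes) == 2
    return min(sizes)

capacity = raid1_capacity([DEVICE_SIZE, NEW_DEVICE_SIZE])
headroom = capacity - USED               # free space once everything is raid1
unallocated_now = DEVICE_SIZE - ALLOCATED  # room for new chunks on the old disk

print(f"raid1 capacity:  {capacity:.2f} TiB")
print(f"headroom after:  {headroom:.2f} TiB")
print(f"unallocated now: {unallocated_now:.2f} TiB")
print("fits as raid1:", USED <= capacity)
```

Under these assumptions the data fits as raid1 with about 0.67TiB to spare, and roughly 0.39TiB is currently unallocated on the existing device for the conversion to work with.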

Regards,
Hendrik

