* Purposely using btrfs RAID1 in degraded mode ?
@ 2016-01-04 17:00 Alphazo
2016-01-04 17:41 ` Chris Murphy
2016-01-05 16:34 ` Psalle
0 siblings, 2 replies; 11+ messages in thread
From: Alphazo @ 2016-01-04 17:00 UTC (permalink / raw)
To: linux-btrfs
Hello,
My picture library currently lives on an external hard drive that I
sync regularly with a couple of servers and other external drives. I'm
interested in the on-the-fly checksumming that btrfs brings, and would
like your opinion on the following unusual use case, which I have
tested:
- Create a btrfs filesystem with RAID1 across the two drives.
- When at home, I work with both drives connected, so I can enjoy the
self-healing feature if a bit flips, and only back up perfect copies
to my backup servers.
- When away from home, I bring only one external drive and manually
mount it in degraded mode, so I can keep working on my pictures while
still having checksum error detection (but not correction).
- When back home, I plug the second drive back in and initiate a scrub
or balance to bring the second drive back in sync.
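The workflow above maps onto a short command sequence. This is only a
sketch: it needs root, the device names (/dev/sdb, /dev/sdc) and the
/mnt/photos mount point are placeholders.

```shell
# One-time setup: mirror both data and metadata across the two drives.
mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc

# At home: a normal mount sees both devices.
mount /dev/sdb /mnt/photos

# Away: only one drive is present, so an explicit degraded mount is needed.
mount -o degraded /dev/sdb /mnt/photos

# Back home with both drives attached: mount normally, then scrub so
# stale or missing copies on the second drive are detected and rewritten.
mount /dev/sdb /mnt/photos
btrfs scrub start -B /mnt/photos
```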
I have tested the above use case with a couple of USB flash drives, and
even with btrfs over dm-crypt partitions, and it seemed to work fine,
but I wanted some advice from the community on whether this is bad
practice that should be avoided in the long run. Is there any
limitation or risk in reading from and writing to a degraded
filesystem, knowing it will be re-synced later?
Thanks
alphazo
PS: I have also investigated RAID1 on a single drive with two
partitions, but I cannot afford the halved capacity that approach
entails.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Purposely using btrfs RAID1 in degraded mode ?
2016-01-04 17:00 Purposely using btrfs RAID1 in degraded mode ? Alphazo
@ 2016-01-04 17:41 ` Chris Murphy
2016-01-06 12:30 ` Alphazo
2016-01-09 10:08 ` Duncan
2016-01-05 16:34 ` Psalle
1 sibling, 2 replies; 11+ messages in thread
From: Chris Murphy @ 2016-01-04 17:41 UTC (permalink / raw)
To: Alphazo; +Cc: Btrfs BTRFS
On Mon, Jan 4, 2016 at 10:00 AM, Alphazo <alphazo@gmail.com> wrote:
> I have tested the above use case with a couple of USB flash drives, and
> even with btrfs over dm-crypt partitions, and it seemed to work fine,
> but I wanted some advice from the community on whether this is bad
> practice that should be avoided in the long run. Is there any
> limitation or risk in reading from and writing to a degraded
> filesystem, knowing it will be re-synced later?
As long as you realize you're testing a sort of edge case, but an
important one (it should work, that's the point of rw degraded mounts
being possible), then I think it's fine.
The warning though is, you need to designate a specific drive for the
rw,degraded mounts. If you were to separately rw,degraded mount the
two drives, the fs will become irreparably corrupt if they are
rejoined. And you'll probably lose everything on the volume. The other
thing is that to "resync" you have to manually initiate a scrub, it's
not going to resync automatically, and it has to read everything on
both drives to compare and fix what's missing. There is no equivalent
to a write intent bitmap on Btrfs like with mdadm (the information
ostensibly could be inferred from btrfs generation metadata similar to
how incremental snapshot send/receive works) but that work isn't done.
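A minimal sketch of the manual resync Chris describes (the mount point
is a placeholder; the scrub must be started by hand after the second
device is reattached):

```shell
# With both devices attached again, mount normally (not degraded).
mount /dev/sdb /mnt/photos

# -B runs in the foreground, -d prints per-device statistics, so you
# can see how much had to be corrected on the stale drive.
btrfs scrub start -Bd /mnt/photos

# Cumulative per-device error counters for the filesystem.
btrfs device stats /mnt/photos
```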
--
Chris Murphy
* Re: Purposely using btrfs RAID1 in degraded mode ?
2016-01-04 17:00 Purposely using btrfs RAID1 in degraded mode ? Alphazo
2016-01-04 17:41 ` Chris Murphy
@ 2016-01-05 16:34 ` Psalle
2016-01-06 12:34 ` Alphazo
1 sibling, 1 reply; 11+ messages in thread
From: Psalle @ 2016-01-05 16:34 UTC (permalink / raw)
To: Alphazo, linux-btrfs
Hello Alphazo,
I am a mere btrfs user, but given the discussions I regularly see here
about difficulties with degraded filesystems I wouldn't rely on this
(yet?) as a regular work strategy, even if it's supposed to work.
If you're familiar with git, perhaps git-annex could be an alternative.
-Psalle.
On 04/01/16 18:00, Alphazo wrote:
> Hello,
>
> My picture library currently lives on an external hard drive that I
> sync regularly with a couple of servers and other external drives. I'm
> interested in the on-the-fly checksumming that btrfs brings, and would
> like your opinion on the following unusual use case, which I have
> tested:
> - Create a btrfs filesystem with RAID1 across the two drives.
> - When at home, I work with both drives connected, so I can enjoy the
> self-healing feature if a bit flips, and only back up perfect copies
> to my backup servers.
> - When away from home, I bring only one external drive and manually
> mount it in degraded mode, so I can keep working on my pictures while
> still having checksum error detection (but not correction).
> - When back home, I plug the second drive back in and initiate a scrub
> or balance to bring the second drive back in sync.
>
> I have tested the above use case with a couple of USB flash drives, and
> even with btrfs over dm-crypt partitions, and it seemed to work fine,
> but I wanted some advice from the community on whether this is bad
> practice that should be avoided in the long run. Is there any
> limitation or risk in reading from and writing to a degraded
> filesystem, knowing it will be re-synced later?
>
> Thanks
> alphazo
>
> PS: I have also investigated RAID1 on a single drive with two
> partitions, but I cannot afford the halved capacity that approach
> entails.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Purposely using btrfs RAID1 in degraded mode ?
2016-01-04 17:41 ` Chris Murphy
@ 2016-01-06 12:30 ` Alphazo
2016-01-09 10:08 ` Duncan
1 sibling, 0 replies; 11+ messages in thread
From: Alphazo @ 2016-01-06 12:30 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
Thanks, Chris, for the warning. I agree that mounting both drives
separately in degraded r/w would lead to very funky results when
trying to scrub them once they are put back together.
On Mon, Jan 4, 2016 at 6:41 PM, Chris Murphy <lists@colorremedies.com> wrote:
> On Mon, Jan 4, 2016 at 10:00 AM, Alphazo <alphazo@gmail.com> wrote:
>
>> I have tested the above use case with a couple of USB flash drives, and
>> even with btrfs over dm-crypt partitions, and it seemed to work fine,
>> but I wanted some advice from the community on whether this is bad
>> practice that should be avoided in the long run. Is there any
>> limitation or risk in reading from and writing to a degraded
>> filesystem, knowing it will be re-synced later?
>
> As long as you realize you're testing a sort of edge case, but an
> important one (it should work, that's the point of rw degraded mounts
> being possible), then I think it's fine.
>
> The warning though is, you need to designate a specific drive for the
> rw,degraded mounts. If you were to separately rw,degraded mount the
> two drives, the fs will become irreparably corrupt if they are
> rejoined. And you'll probably lose everything on the volume. The other
> thing is that to "resync" you have to manually initiate a scrub, it's
> not going to resync automatically, and it has to read everything on
> both drives to compare and fix what's missing. There is no equivalent
> to a write intent bitmap on Btrfs like with mdadm (the information
> ostensibly could be inferred from btrfs generation metadata similar to
> how incremental snapshot send/receive works) but that work isn't done.
>
>
>
>
> --
> Chris Murphy
* Re: Purposely using btrfs RAID1 in degraded mode ?
2016-01-05 16:34 ` Psalle
@ 2016-01-06 12:34 ` Alphazo
2016-01-07 12:57 ` Psalle
0 siblings, 1 reply; 11+ messages in thread
From: Alphazo @ 2016-01-06 12:34 UTC (permalink / raw)
To: Psalle; +Cc: Btrfs BTRFS
Thanks, Psalle. This is the kind of feedback I was looking for. I do
realize that using a filesystem in degraded mode is not the wisest
thing to do. While I looked at git-annex, I'm not sure it helps with
bit-rot detection. I have since noticed that my current backup
solution, borgbackup, also has a checksum verification feature, so I
can at least detect errors. In addition, it provides incremental,
deduplicated backups, so it should have me covered if I discover that
something went wrong.
alphazo
On Tue, Jan 5, 2016 at 5:34 PM, Psalle <psalleetsile@gmail.com> wrote:
> Hello Alphazo,
>
> I am a mere btrfs user, but given the discussions I regularly see here about
> difficulties with degraded filesystems I wouldn't rely on this (yet?) as a
> regular work strategy, even if it's supposed to work.
>
> If you're familiar with git, perhaps git-annex could be an alternative.
>
> -Psalle.
>
>
> On 04/01/16 18:00, Alphazo wrote:
>>
>> Hello,
>>
>> My picture library currently lives on an external hard drive that I
>> sync regularly with a couple of servers and other external drives. I'm
>> interested in the on-the-fly checksumming that btrfs brings, and would
>> like your opinion on the following unusual use case, which I have
>> tested:
>> - Create a btrfs filesystem with RAID1 across the two drives.
>> - When at home, I work with both drives connected, so I can enjoy the
>> self-healing feature if a bit flips, and only back up perfect copies
>> to my backup servers.
>> - When away from home, I bring only one external drive and manually
>> mount it in degraded mode, so I can keep working on my pictures while
>> still having checksum error detection (but not correction).
>> - When back home, I plug the second drive back in and initiate a scrub
>> or balance to bring the second drive back in sync.
>>
>> I have tested the above use case with a couple of USB flash drives, and
>> even with btrfs over dm-crypt partitions, and it seemed to work fine,
>> but I wanted some advice from the community on whether this is bad
>> practice that should be avoided in the long run. Is there any
>> limitation or risk in reading from and writing to a degraded
>> filesystem, knowing it will be re-synced later?
>>
>> Thanks
>> alphazo
>>
>> PS: I have also investigated RAID1 on a single drive with two
>> partitions, but I cannot afford the halved capacity that approach
>> entails.
* Re: Purposely using btrfs RAID1 in degraded mode ?
2016-01-06 12:34 ` Alphazo
@ 2016-01-07 12:57 ` Psalle
2016-01-07 13:09 ` Alphazo
0 siblings, 1 reply; 11+ messages in thread
From: Psalle @ 2016-01-07 12:57 UTC (permalink / raw)
To: Alphazo; +Cc: Btrfs BTRFS
On 06/01/16 13:34, Alphazo wrote:
> Thanks, Psalle. This is the kind of feedback I was looking for. I do
> realize that using a filesystem in degraded mode is not the wisest
> thing to do. While I looked at git-annex, I'm not sure it helps with
> bit-rot detection. I have since noticed that my current backup
> solution, borgbackup, also has a checksum verification feature, so I
> can at least detect errors. In addition, it provides incremental,
> deduplicated backups, so it should have me covered if I discover that
> something went wrong.
Is that bup? I see it isn't, I guess they're similar. That is
interesting too.
Git (or hg or similar) helps with bit rot because 'git fsck' will check
the hashes of the objects in the repository. If you detected a problem
you could re-clone from the good copy (assuming you have two drives with
the same repository in each one). Admittedly, it's a purely manual
method but is better than being unable to detect problems at all.
git-annex is a layer on top of git that automates things to some extent
and is tailored to large files, although the learning curve is not
shallow in my experience.
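As a concrete illustration of the 'git fsck' approach Psalle mentions
(a sketch using a throwaway repository; the file name is made up, and
bit rot is simulated by deliberately damaging a loose object):

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo" && cd "$repo"
echo "pretend this is a photo" > IMG_0001.raw
git add IMG_0001.raw
git -c user.name=me -c user.email=me@example.com commit -qm "import"

# A healthy repository passes the object-hash check.
git fsck --full

# Simulate bit rot by damaging a loose object on disk...
obj=$(find .git/objects -type f | head -n 1)
chmod u+w "$obj" && truncate -s 8 "$obj"

# ...and fsck now reports the corruption with a non-zero exit code.
git fsck --full || echo "corruption detected"
```

On detection you would re-clone or re-fetch the damaged objects from
the good copy, exactly as described above.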
-Psalle.
>
> alphazo
>
> On Tue, Jan 5, 2016 at 5:34 PM, Psalle <psalleetsile@gmail.com> wrote:
>> Hello Alphazo,
>>
>> I am a mere btrfs user, but given the discussions I regularly see here about
>> difficulties with degraded filesystems I wouldn't rely on this (yet?) as a
>> regular work strategy, even if it's supposed to work.
>>
>> If you're familiar with git, perhaps git-annex could be an alternative.
>>
>> -Psalle.
>>
>>
>> On 04/01/16 18:00, Alphazo wrote:
>>> Hello,
>>>
>>> My picture library today lies on an external hard drive that I sync on
>>> a regular basis with a couple of servers and other external drives.
>>> I'm interested by the on-the-fly checksum brought by btrfs and would
>>> like to get your opinion on the following unusual use case that I have
>>> tested:
>>> - Create a btrfs with the two drives with RAID1
>>> - When at home I can work with the two drives connected so I can enjoy
>>> the self-healing feature if a bit goes mad so I only backup perfect
>>> copies to my backup servers.
>>> - When not at home I only bring one external drive and manually mount
>>> it in degraded mode so I can continue working on my pictures while
>>> still having checksum error detection (but not correction).
>>> - When coming back home I can plug-back the seconde drive and initiate
>>> a scrub or balance to get the second drive duplicated.
>>>
>>> I have tested the above use case with a couple of USB flash drive and
>>> even used btrfs over dm-crypt partitions and it seemed to work fine
>>> but I wanted to get some advices from the community if this is really
>>> a bad practice that should not be used on the long run. Is there any
>>> limitation/risk to read/write to/from a degraded filesystem knowing it
>>> will be re-synced later?
>>>
>>> Thanks
>>> alphazo
>>>
>>> PS: I have also investigated the RAID1 on a single drive with two
>>> partitions but I cannot afford the half capacity resulting from that
>>> approach.
* Re: Purposely using btrfs RAID1 in degraded mode ?
2016-01-07 12:57 ` Psalle
@ 2016-01-07 13:09 ` Alphazo
2016-01-07 17:34 ` Sree Harsha Totakura
2016-01-11 14:25 ` Psalle
0 siblings, 2 replies; 11+ messages in thread
From: Alphazo @ 2016-01-07 13:09 UTC (permalink / raw)
To: Psalle; +Cc: Btrfs BTRFS
I'm a former bup user, but I switched to borgbackup
(https://borgbackup.readthedocs.org/en/stable/), a more active fork of
Attic that solves two issues I had with bup: the increasing time
required to perform incremental backups on large datasets with only a
few modifications and, more importantly, the impossibility of pruning
older backups. borgbackup also natively supports encryption (AES-256)
and authentication (HMAC-SHA256).

For offline long-term backups I also used to work with hashdeep to
compute and store a hash of every file, and I recently started playing
with FIM (https://evrignaud.github.io/fim/), which is similar but uses
a git backend for storing history. Don't be fooled by FIM being a Java
application; it easily outperformed hashdeep on large datasets.
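For reference, the borgbackup workflow described here maps onto
commands like these (a sketch per borg 1.x syntax; repository and
source paths are placeholders):

```shell
# Create an encrypted, authenticated repository.
borg init --encryption=repokey /backup/photos.borg

# Incremental, deduplicated backup of the picture library.
borg create /backup/photos.borg::photos-{now} ~/pictures

# Verify repository consistency, re-hashing the stored data.
borg check --verify-data /backup/photos.borg

# Prune old archives, the operation bup could not do.
borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 /backup/photos.borg
```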
Alphazo
On Thu, Jan 7, 2016 at 1:57 PM, Psalle <psalleetsile@gmail.com> wrote:
> On 06/01/16 13:34, Alphazo wrote:
>>
>> Thanks, Psalle. This is the kind of feedback I was looking for. I do
>> realize that using a filesystem in degraded mode is not the wisest
>> thing to do. While I looked at git-annex, I'm not sure it helps with
>> bit-rot detection. I have since noticed that my current backup
>> solution, borgbackup, also has a checksum verification feature, so I
>> can at least detect errors. In addition, it provides incremental,
>> deduplicated backups, so it should have me covered if I discover that
>> something went wrong.
>
>
> Is that bup? I see it isn't, I guess they're similar. That is interesting
> too.
>
> Git (or hg or similar) helps with bit rot because 'git fsck' will check the
> hashes of the objects in the repository. If you detected a problem you could
> re-clone from the good copy (assuming you have two drives with the same
> repository in each one). Admittedly, it's a purely manual method but is
> better than being unable to detect problems at all. git-annex is a layer on
> top of git that automates things to some extent and is tailored to large
> files, although the learning curve is not shallow in my experience.
>
> -Psalle.
>
>
>>
>> alphazo
>>
>> On Tue, Jan 5, 2016 at 5:34 PM, Psalle <psalleetsile@gmail.com> wrote:
>>>
>>> Hello Alphazo,
>>>
>>> I am a mere btrfs user, but given the discussions I regularly see here
>>> about
>>> difficulties with degraded filesystems I wouldn't rely on this (yet?) as
>>> a
>>> regular work strategy, even if it's supposed to work.
>>>
>>> If you're familiar with git, perhaps git-annex could be an alternative.
>>>
>>> -Psalle.
>>>
>>>
>>> On 04/01/16 18:00, Alphazo wrote:
>>>>
>>>> Hello,
>>>>
>>>> My picture library currently lives on an external hard drive that I
>>>> sync regularly with a couple of servers and other external drives. I'm
>>>> interested in the on-the-fly checksumming that btrfs brings, and would
>>>> like your opinion on the following unusual use case, which I have
>>>> tested:
>>>> - Create a btrfs filesystem with RAID1 across the two drives.
>>>> - When at home, I work with both drives connected, so I can enjoy the
>>>> self-healing feature if a bit flips, and only back up perfect copies
>>>> to my backup servers.
>>>> - When away from home, I bring only one external drive and manually
>>>> mount it in degraded mode, so I can keep working on my pictures while
>>>> still having checksum error detection (but not correction).
>>>> - When back home, I plug the second drive back in and initiate a scrub
>>>> or balance to bring the second drive back in sync.
>>>>
>>>> I have tested the above use case with a couple of USB flash drives, and
>>>> even with btrfs over dm-crypt partitions, and it seemed to work fine,
>>>> but I wanted some advice from the community on whether this is bad
>>>> practice that should be avoided in the long run. Is there any
>>>> limitation or risk in reading from and writing to a degraded
>>>> filesystem, knowing it will be re-synced later?
>>>>
>>>> Thanks
>>>> alphazo
>>>>
>>>> PS: I have also investigated RAID1 on a single drive with two
>>>> partitions, but I cannot afford the halved capacity that approach
>>>> entails.
* Re: Purposely using btrfs RAID1 in degraded mode ?
2016-01-07 13:09 ` Alphazo
@ 2016-01-07 17:34 ` Sree Harsha Totakura
2016-01-11 14:25 ` Psalle
1 sibling, 0 replies; 11+ messages in thread
From: Sree Harsha Totakura @ 2016-01-07 17:34 UTC (permalink / raw)
To: Alphazo, Psalle; +Cc: Btrfs BTRFS
Thank you Alphazo; I have been looking for something similar for a
while, but didn't know where/what to look at.
Your pointers solve my problem.
Regards,
Sree
On 01/07/2016 02:09 PM, Alphazo wrote:
> I'm a former bup user but I switched to borgbackup
> https://borgbackup.readthedocs.org/en/stable/ which is a more active
> fork of Attic and that solves two issues I had with bup: increasing
> time required to perform the incremental backup on large dataset with
> only few modifications and more importantly the impossibility to prune
> older backups. Also borgbackup natively supports encryption (AES256)
> and authentication (HMAC-SHA256).
>
> For offline long term backups I also used to work with hashdeep to
> perform and store a hash of all the files and recently started playing
> with FIM https://evrignaud.github.io/fim/ which is similar but with a
> git backend for storing history. Don't get fooled by fim being a java
> application. It easily outperformed hashdeep on large datasets.
>
> Alphazo
* Re: Purposely using btrfs RAID1 in degraded mode ?
2016-01-04 17:41 ` Chris Murphy
2016-01-06 12:30 ` Alphazo
@ 2016-01-09 10:08 ` Duncan
2016-01-11 22:17 ` Alphazo
1 sibling, 1 reply; 11+ messages in thread
From: Duncan @ 2016-01-09 10:08 UTC (permalink / raw)
To: linux-btrfs
Chris Murphy posted on Mon, 04 Jan 2016 10:41:09 -0700 as excerpted:
> On Mon, Jan 4, 2016 at 10:00 AM, Alphazo <alphazo@gmail.com> wrote:
>
>> I have tested the above use case with a couple of USB flash drives, and
>> even with btrfs over dm-crypt partitions, and it seemed to work fine,
>> but I wanted some advice from the community on whether this is bad
>> practice that should be avoided in the long run. Is there any
>> limitation or risk in reading from and writing to a degraded
>> filesystem, knowing it will be re-synced later?
>
> As long as you realize you're testing a sort of edge case, but an
> important one (it should work, that's the point of rw degraded mounts
> being possible), then I think it's fine.
>
> The warning though is, you need to designate a specific drive for the
> rw,degraded mounts. If you were to separately rw,degraded mount the two
> drives, the fs will become irreparably corrupt if they are rejoined. And
> you'll probably lose everything on the volume. The other thing is that
> to "resync" you have to manually initiate a scrub, it's not going to
> resync automatically, and it has to read everything on both drives to
> compare and fix what's missing. There is no equivalent to a write intent
> bitmap on Btrfs like with mdadm (the information ostensibly could be
> inferred from btrfs generation metadata similar to how incremental
> snapshot send/receive works) but that work isn't done.
In addition to what CMurphy says above (which I see you/Alphazo acked),
be aware that btrfs' chunk-writing behavior isn't particularly well
suited to this sort of split-raid1 application.
In general, btrfs allocates space in two steps. First, it allocates
rather large "chunks" of space, data chunks separately from metadata
(unless you use --mixed mode, when you first setup the filesystem with
mkfs.btrfs, then data and metadata are mixed together in the same
chunks). Data chunks are typically 1 GiB in size except on filesystems
over 100 GiB (where they're larger), while metadata chunks are typically
256 MiB (as are mixed-mode chunks).
Then btrfs uses space from these chunks until they get full, at which
point it will attempt to allocate more chunks.
Older btrfs (before kernel 3.17, IIRC) could allocate chunks, but didn't
know how to deallocate chunks when empty, so a common problem back then
was that over time, all free space would be allocated to empty data
chunks, and people would run into ENOSPC errors when metadata chunks ran
out of space, but more couldn't be created because all the empty space
was in data chunks.
Newer btrfs automatically reclaims empty chunks, so this doesn't happen
so often.
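The two-step allocation described above can be observed at any time
with the filesystem-level df command. A sketch follows; the figures in
the comments are made-up illustrative values, not real output:

```shell
# Shows space at the chunk level: 'total' is the space allocated to
# chunks, 'used' is what is actually consumed inside those chunks.
btrfs filesystem df /mnt/photos
# Hypothetical output on a healthy two-device raid1 filesystem:
#   Data, RAID1: total=3.00GiB, used=2.41GiB
#   System, RAID1: total=32.00MiB, used=16.00KiB
#   Metadata, RAID1: total=256.00MiB, used=180.50MiB
```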
But here comes the problem for the use-case you've described. Btrfs
can't allocate raid1 chunks if there's only a single device, because
raid1 requires two devices.
So what's likely to happen is that at some point, you'll be away from
home and the existing raid1 chunks, either data or metadata, will fill
up, and btrfs will try to allocate more. But you'll be running in
degraded mode with only a single device, and it wouldn't be able to
allocate raid1 chunks with just that single device.
Oops! Big problem!
Now until very recently (I believe through current 4.3), what would happen
in this case is that btrfs would find that it couldn't create a new chunk
in raid1 mode, and if operating degraded, would then fall back to
creating it in single mode. Which lets you continue writing, so all is
well. Except... once you unmounted and attempted to mount the device
again, still degraded, it would see the single-mode chunks on a
filesystem that was supposed to have two devices, and would refuse to
mount degraded,rw again. You could only mount degraded,ro. Of course in
your use-case, you could still wait until you got home and mount
undegraded again, which would allow you to mount writable.
But a scrub wouldn't sync the single chunks. For that, after the scrub,
you'd need to run a filtered balance-convert, to convert the single
chunks back to raid1. Something like this (one command):
btrfs balance start -dprofiles=single,convert=raid1 \
    -mprofiles=single,convert=raid1 /mountpoint
There are very new patches that should solve the problem of not being
able to mount degraded,rw after single mode chunks are found, provided
all those single mode chunks actually exist on the found device(s). I
think, but I'm not sure, that they're in 4.4. That would give you more
flexibility in terms of mounting degraded,rw after single chunks have
been created on the device you have with you, but you'd still need to run
both a scrub, to sync the raid1 chunks, and a balance, to convert the
single chunks to raid1 and sync them, once you had both devices connected.
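Putting the recovery steps together, a sketch of the sequence once
both devices are reconnected (device name and mount point are
placeholders; 'soft' restricts the convert to chunks that are not
already raid1):

```shell
# Mount with both devices present again (no 'degraded' option).
mount /dev/sdb /mnt/photos

# Step 1: scrub replicates missing raid1 copies onto the re-added drive.
btrfs scrub start -B /mnt/photos

# Step 2: balance converts any chunks that were written as 'single'
# while degraded back to raid1; 'soft' skips chunks already raid1.
btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt/photos
```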
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Purposely using btrfs RAID1 in degraded mode ?
2016-01-07 13:09 ` Alphazo
2016-01-07 17:34 ` Sree Harsha Totakura
@ 2016-01-11 14:25 ` Psalle
1 sibling, 0 replies; 11+ messages in thread
From: Psalle @ 2016-01-11 14:25 UTC (permalink / raw)
To: Alphazo; +Cc: Btrfs BTRFS
On 07/01/16 14:09, Alphazo wrote:
> I'm a former bup user, but I switched to borgbackup
> (https://borgbackup.readthedocs.org/en/stable/), a more active fork of
> Attic that solves two issues I had with bup: the increasing time
> required to perform incremental backups on large datasets with only a
> few modifications and, more importantly, the impossibility of pruning
> older backups. borgbackup also natively supports encryption (AES-256)
> and authentication (HMAC-SHA256).
>
> For offline long-term backups I also used to work with hashdeep to
> compute and store a hash of every file, and I recently started playing
> with FIM (https://evrignaud.github.io/fim/), which is similar but uses
> a git backend for storing history. Don't be fooled by FIM being a Java
> application; it easily outperformed hashdeep on large datasets.
Interesting pointers, thanks.
-Psalle.
>
> Alphazo
>
> On Thu, Jan 7, 2016 at 1:57 PM, Psalle <psalleetsile@gmail.com> wrote:
>> On 06/01/16 13:34, Alphazo wrote:
>>> Thanks, Psalle. This is the kind of feedback I was looking for. I do
>>> realize that using a filesystem in degraded mode is not the wisest
>>> thing to do. While I looked at git-annex, I'm not sure it helps with
>>> bit-rot detection. I have since noticed that my current backup
>>> solution, borgbackup, also has a checksum verification feature, so I
>>> can at least detect errors. In addition, it provides incremental,
>>> deduplicated backups, so it should have me covered if I discover that
>>> something went wrong.
>>
>> Is that bup? I see it isn't, I guess they're similar. That is interesting
>> too.
>>
>> Git (or hg or similar) helps with bit rot because 'git fsck' will check the
>> hashes of the objects in the repository. If you detected a problem you could
>> re-clone from the good copy (assuming you have two drives with the same
>> repository in each one). Admittedly, it's a purely manual method but is
>> better than being unable to detect problems at all. git-annex is a layer on
>> top of git that automates things to some extent and is tailored to large
>> files, although the learning curve is not shallow in my experience.
>>
>> -Psalle.
>>
>>
>>> alphazo
>>>
>>> On Tue, Jan 5, 2016 at 5:34 PM, Psalle <psalleetsile@gmail.com> wrote:
>>>> Hello Alphazo,
>>>>
>>>> I am a mere btrfs user, but given the discussions I regularly see here
>>>> about
>>>> difficulties with degraded filesystems I wouldn't rely on this (yet?) as
>>>> a
>>>> regular work strategy, even if it's supposed to work.
>>>>
>>>> If you're familiar with git, perhaps git-annex could be an alternative.
>>>>
>>>> -Psalle.
>>>>
>>>>
>>>> On 04/01/16 18:00, Alphazo wrote:
>>>>> Hello,
>>>>>
>>>>> My picture library currently lives on an external hard drive that I
>>>>> sync regularly with a couple of servers and other external drives. I'm
>>>>> interested in the on-the-fly checksumming that btrfs brings, and would
>>>>> like your opinion on the following unusual use case, which I have
>>>>> tested:
>>>>> - Create a btrfs filesystem with RAID1 across the two drives.
>>>>> - When at home, I work with both drives connected, so I can enjoy the
>>>>> self-healing feature if a bit flips, and only back up perfect copies
>>>>> to my backup servers.
>>>>> - When away from home, I bring only one external drive and manually
>>>>> mount it in degraded mode, so I can keep working on my pictures while
>>>>> still having checksum error detection (but not correction).
>>>>> - When back home, I plug the second drive back in and initiate a scrub
>>>>> or balance to bring the second drive back in sync.
>>>>>
>>>>> I have tested the above use case with a couple of USB flash drives, and
>>>>> even with btrfs over dm-crypt partitions, and it seemed to work fine,
>>>>> but I wanted some advice from the community on whether this is bad
>>>>> practice that should be avoided in the long run. Is there any
>>>>> limitation or risk in reading from and writing to a degraded
>>>>> filesystem, knowing it will be re-synced later?
>>>>>
>>>>> Thanks
>>>>> alphazo
>>>>>
>>>>> PS: I have also investigated RAID1 on a single drive with two
>>>>> partitions, but I cannot afford the halved capacity that approach
>>>>> entails.
* Re: Purposely using btrfs RAID1 in degraded mode ?
2016-01-09 10:08 ` Duncan
@ 2016-01-11 22:17 ` Alphazo
0 siblings, 0 replies; 11+ messages in thread
From: Alphazo @ 2016-01-11 22:17 UTC (permalink / raw)
To: Duncan; +Cc: Btrfs BTRFS
Hi Duncan,
Awesome! Thanks for taking the time to go over the details. This was
a very informative reading.
Alphazo
On Sat, Jan 9, 2016 at 11:08 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> Chris Murphy posted on Mon, 04 Jan 2016 10:41:09 -0700 as excerpted:
>
>> On Mon, Jan 4, 2016 at 10:00 AM, Alphazo <alphazo@gmail.com> wrote:
>>
>>> I have tested the above use case with a couple of USB flash drives, and
>>> even with btrfs over dm-crypt partitions, and it seemed to work fine,
>>> but I wanted some advice from the community on whether this is bad
>>> practice that should be avoided in the long run. Is there any
>>> limitation or risk in reading from and writing to a degraded
>>> filesystem, knowing it will be re-synced later?
>>
>> As long as you realize you're testing a sort of edge case, but an
>> important one (it should work, that's the point of rw degraded mounts
>> being possible), then I think it's fine.
>>
>> The warning though is, you need to designate a specific drive for the
>> rw,degraded mounts. If you were to separately rw,degraded mount the two
>> drives, the fs will become irreparably corrupt if they are rejoined. And
>> you'll probably lose everything on the volume. The other thing is that
>> to "resync" you have to manually initiate a scrub, it's not going to
>> resync automatically, and it has to read everything on both drives to
>> compare and fix what's missing. There is no equivalent to a write intent
>> bitmap on Btrfs like with mdadm (the information ostensibly could be
>> inferred from btrfs generation metadata similar to how incremental
>> snapshot send/receive works) but that work isn't done.
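[Editorial note: the discipline Chris describes, always degraded-mounting the same designated drive and scrubbing after rejoining, can be sketched as follows. Device names and the mountpoint are placeholders, not from the thread.]

```shell
# Away from home: always mount the SAME designated drive degraded.
# (/dev/sdb and /mnt/pictures are hypothetical names.)
mount -o degraded /dev/sdb /mnt/pictures

# Never separately rw,degraded-mount the OTHER drive (/dev/sdc):
# rejoining two independently written halves corrupts the filesystem.

# Back home, with both drives connected, mount normally and resync.
# The scrub reads everything on both drives and repairs the stale copy;
# -B keeps it in the foreground so you can see the result.
mount /dev/sdb /mnt/pictures
btrfs scrub start -B /mnt/pictures
```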
>
> In addition to what CMurphy says above (which I see you/Alphazo acked),
> be aware that btrfs' chunk-writing behavior isn't particularly well
> suited to this sort of split-raid1 application.
>
> In general, btrfs allocates space in two steps. First, it allocates
> rather large "chunks" of space, data chunks separately from metadata
> (unless you use --mixed mode, when you first setup the filesystem with
> mkfs.btrfs, then data and metadata are mixed together in the same
> chunks). Data chunks are typically 1 GiB in size except on filesystems
> over 100 GiB (where they're larger), while metadata chunks are typically
> 256 MiB (as are mixed-mode chunks).
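[Editorial note: the chunk allocation Duncan describes can be inspected directly; a sketch, with a hypothetical mountpoint.]

```shell
# Show how much space is allocated to data vs. metadata chunks,
# and which profile each uses (single, RAID1, DUP, ...):
btrfs filesystem df /mnt/pictures

# Newer btrfs-progs also offer a per-device breakdown:
btrfs filesystem usage /mnt/pictures
```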
>
> Then btrfs uses space from these chunks until they get full, at which
> point it will attempt to allocate more chunks.
>
> Older btrfs (before kernel 3.17, IIRC) could allocate chunks, but didn't
> know how to deallocate chunks when empty, so a common problem back then
> was that over time, all free space would be allocated to empty data
> chunks, and people would run into ENOSPC errors when metadata chunks ran
> out of space, but more couldn't be created because all the empty space
> was in data chunks.
>
> Newer btrfs automatically reclaims empty chunks, so this doesn't happen
> so often.
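[Editorial note: on those older kernels the usual manual workaround was a filtered balance that touches only empty or nearly empty chunks; a sketch, mountpoint hypothetical.]

```shell
# Deallocate completely empty data chunks (cheap; rewrites no data):
btrfs balance start -dusage=0 /mnt/pictures

# If metadata is still starved, also compact mostly-empty chunks:
btrfs balance start -dusage=10 /mnt/pictures
```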
>
> But here comes the problem for the use-case you've described. Btrfs
> can't allocate raid1 chunks if there's only a single device, because
> raid1 requires two devices.
>
> So what's likely to happen is that at some point, you'll be away from
> home and the existing raid1 chunks, either data or metadata, will fill
> up, and btrfs will try to allocate more. But you'll be running in
> degraded mode with only a single device, and it wouldn't be able to
> allocate raid1 chunks with just that single device.
>
> Oops! Big problem!
>
> Now until very recently (I believe thru current 4.3), what would happen
> in this case is that btrfs would find that it couldn't create a new chunk
> in raid1 mode, and if operating degraded, would then fall back to
> creating it in single mode. Which lets you continue writing, so all is
> well. Except... once you unmounted and attempted to mount the device
> again, still degraded, it would see the single-mode chunks on a
> filesystem that was supposed to have two devices, and would refuse to
> mount degraded,rw again. You could only mount degraded,ro. Of course in
> your use-case, you could still wait until you got home and mount
> undegraded again, which would allow you to mount writable.
>
> But a scrub wouldn't sync the single chunks. For that, after the scrub,
> you'd need to run a filtered balance-convert, to convert the single
> chunks back to raid1. Something like this (one command):
>
> btrfs balance start -dprofiles=single,convert=raid1 \
>   -mprofiles=single,convert=raid1 <mountpoint>
>
> There are very new patches that should solve the problem of not being
> able to mount degraded,rw after single mode chunks are found, provided
> all those single mode chunks actually exist on the found device(s). I
> think but I'm not sure, that they're in 4.4. That would give you more
> flexibility in terms of mounting degraded,rw after single chunks have
> been created on the device you have with you, but you'd still need to run
> both a scrub, to sync the raid1 chunks, and a balance, to convert the
> single chunks to raid1 and sync them, once you had both devices connected.
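[Editorial note: putting Duncan's recovery steps together, the back-home sequence might look like this; device and mountpoint names are assumptions.]

```shell
# With BOTH devices attached, mount undegraded:
mount /dev/sdb /mnt/pictures

# 1) Scrub: resync the raid1 chunks written while degraded:
btrfs scrub start -B /mnt/pictures

# 2) Filtered balance: convert any single-profile chunks created
#    while degraded back to raid1 (note the 'profiles' filter name):
btrfs balance start -dprofiles=single,convert=raid1 \
                    -mprofiles=single,convert=raid1 /mnt/pictures
```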
>
> --
> Duncan - List replies preferred. No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master." Richard Stallman
>
Thread overview: 11+ messages
2016-01-04 17:00 Purposely using btrfs RAID1 in degraded mode ? Alphazo
2016-01-04 17:41 ` Chris Murphy
2016-01-06 12:30 ` Alphazo
2016-01-09 10:08 ` Duncan
2016-01-11 22:17 ` Alphazo
2016-01-05 16:34 ` Psalle
2016-01-06 12:34 ` Alphazo
2016-01-07 12:57 ` Psalle
2016-01-07 13:09 ` Alphazo
2016-01-07 17:34 ` Sree Harsha Totakura
2016-01-11 14:25 ` Psalle