linux-btrfs.vger.kernel.org archive mirror
* btrfs and mdadm raid 6
@ 2012-08-20 16:22 Curtis Jones
  2012-08-20 17:06 ` Roman Mamedov
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Curtis Jones @ 2012-08-20 16:22 UTC (permalink / raw)
  To: linux-btrfs

Hi. I'm considering an imminent switch from ext4 to btrfs and I'm hoping that someone can lend me advice before I do something unsupported.

I have a software raid 6 array configured via mdadm. It was sitting at 8 x 3TB until I recently doubled that, grew the array and found that ext4 doesn't want to resize. So, I'm looking to:

	1. convert from ext4 to btrfs
	2. grow the fs to the full array size

My concerns are:

	1. is btrfs-convert on /dev/md0 stable/reliable/tested/not-a-stupid-thing-to-do?
	2. based on the reading I've done, resizing btrfs is supported. can you confirm?
	3. there aren't any known compatibility or other issues with running btrfs on top of mdadm (raid 6)
	4. any other caveats I might want to consider?

I just upgraded from kernel v3.5.1 to v3.5.2 and I have the btrfs-tools (v0.19) compiled straight from git.

Any words of wisdom would be appreciated.

Thanks!

--
Curtis Jones



* Re: btrfs and mdadm raid 6
  2012-08-20 16:22 btrfs and mdadm raid 6 Curtis Jones
@ 2012-08-20 17:06 ` Roman Mamedov
  2012-08-20 22:24   ` Curtis Jones
  2012-08-21 14:51   ` David Sterba
  2012-08-20 17:09 ` Roman Mamedov
  2012-08-21 19:11 ` Jeremy Sanders
  2 siblings, 2 replies; 7+ messages in thread
From: Roman Mamedov @ 2012-08-20 17:06 UTC (permalink / raw)
  To: Curtis Jones; +Cc: linux-btrfs


On Mon, 20 Aug 2012 12:22:31 -0400
Curtis Jones <curtis.jones@gmail.com> wrote:

> 	1. is btrfs-convert on /dev/md0 stable/reliable/tested/not-a-stupid-thing-to-do?

btrfs-convert does not care what kind of block device the FS resides on, so it's OK.

> 	2. based on the reading I've done, resizing btrfs is supported. can you confirm?

Yes, both growing and shrinking.
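
For the grow step in particular, a minimal sketch, assuming the md array has already been grown and the filesystem is mounted at the hypothetical path /mnt/array. `btrfs filesystem resize max` asks btrfs to expand to the full size of the underlying device. The `run` helper is purely illustrative: it only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: grow a mounted btrfs filesystem to fill its block device.
# MOUNT_POINT is a hypothetical path; adjust it for your system.
MOUNT_POINT="${MOUNT_POINT:-/mnt/array}"

# Dry-run helper (illustrative): print the command unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

grow_to_full_size() {
    # $1 = mount point of the btrfs filesystem
    run btrfs filesystem resize max "$1"
}

grow_to_full_size "$MOUNT_POINT"
```

Run it once as-is to see the command, then again with RUN=1 as root to apply it.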

> 	3. there aren't any known compatibility or other issues with running btrfs on top of mdadm (raid 6)

Not that I know of.

But... if we were a year into the future and there was working btrfs RAID6,
then that configuration (btrfs native RAID6 rather than single-device btrfs on
top of mdadm) would provide more resilience, as blocks with failed checksums
could be automatically reconstructed from 'good' data on other devices in the
array.

In the current situation though, btrfs checksums will only tell you that you
lost data due to some corruption underneath, in the (unlikely) case that it
happens and mdadm lets it through.

>	4. any other caveats I might want to consider?

1) AFAIK the patch [1] is still not in mainline, so you'll either have to
apply it to your kernels yourself, or you will end up with a truly
enormous metadata allocation: if I'm counting correctly, on your array with
42 TB of usable space you will have 840 GB * 2 = 1680 GB (about 1.7 TB)
reserved for metadata.

[1] http://comments.gmane.org/gmane.comp.file-systems.btrfs/19200
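
The arithmetic behind those figures can be sketched as follows. This assumes the array is now 16 x 3 TB (8 drives, doubled) and that the 840 GB figure corresponds to 2% of usable space, which is inferred from the numbers above rather than taken from the kernel source. The function names are made up for illustration.

```shell
#!/bin/sh
# Usable capacity of a RAID 6 array: two drives' worth of space goes to parity.
raid6_usable_tb() {
    # $1 = number of drives, $2 = size of each drive in TB
    echo $(( ($1 - 2) * $2 ))
}

# Estimated metadata reservation: 2% of usable space (an inferred ratio),
# doubled because the DUP profile keeps two copies of metadata.
metadata_reserved_gb() {
    # $1 = usable capacity in TB (1 TB counted as 1000 GB)
    echo $(( $1 * 1000 * 2 / 100 * 2 ))
}

usable=$(raid6_usable_tb 16 3)
echo "usable: ${usable} TB"
echo "metadata reserved: $(metadata_reserved_gb "$usable") GB"
```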

2) On a filesystem converted with btrfs-convert the metadata allocation is
unnecessarily large for other, conversion-related reasons; but this
can be fixed with "btrfs filesystem balance start -musage=5 /mount/point" (do
several runs, increasing the value from 5 to 10, 20 or more, if it fails to
free up a sufficient amount of space). This will defragment metadata and free
up chunks that end up completely unused (there will be a lot of them),
but only down to the kernel's desired minimum allocation; see point #1.
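
The repeated balance runs could be scripted roughly like this. It is a sketch only: the thresholds 5, 10 and 20 come from the text above, the `run` helper is an illustrative dry-run wrapper, and the loop simply runs every pass rather than stopping once enough space has been freed (checking `btrfs filesystem df` between passes and stopping early is left to the operator).

```shell
#!/bin/sh
# Sketch: balance metadata chunks with increasing usage thresholds.
MOUNT_POINT="${MOUNT_POINT:-/mount/point}"

# Dry-run helper (illustrative): print the command unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

balance_metadata_passes() {
    # $1 = mount point; remaining arguments = usage thresholds in percent
    mp=$1
    shift
    for pct in "$@"; do
        # Rewrites only metadata chunks that are at most $pct percent full.
        run btrfs filesystem balance start "-musage=${pct}" "$mp" || return 1
    done
}

balance_metadata_passes "$MOUNT_POINT" 5 10 20
```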

3) Because of point #1, and for performance reasons in general, and considering
that you're already running on top of a parity-protected RAID, you might
want to switch the metadata profile from DUP to single (i.e. just
one copy of metadata on the device, not two):
"btrfs fi balance start -mconvert=single /mnt/point"

Regarding balance, see https://btrfs.wiki.kernel.org/index.php/Balance_Filters
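
A possible way to wrap the profile switch, showing the allocation before and after so the effect is visible. This is a sketch under assumptions: /mnt/point is a placeholder, and `run` is an illustrative dry-run helper that prints commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: convert metadata from DUP to single and compare allocation.
MOUNT_POINT="${MOUNT_POINT:-/mnt/point}"

# Dry-run helper (illustrative): print the command unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

convert_metadata_to_single() {
    # $1 = mount point of the btrfs filesystem
    run btrfs filesystem df "$1" &&                        # allocation before
    run btrfs filesystem balance start -mconvert=single "$1" &&
    run btrfs filesystem df "$1"                           # allocation after
}

convert_metadata_to_single "$MOUNT_POINT"
```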

> I just upgraded from kernel v3.5.1 to v3.5.2 and I have the btrfs-tools (v0.19) compiled straight from git.

You're doing great :)

Also, btw, I hope you have a full backup of everything you care about.

-- 
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."


* Re: btrfs and mdadm raid 6
  2012-08-20 16:22 btrfs and mdadm raid 6 Curtis Jones
  2012-08-20 17:06 ` Roman Mamedov
@ 2012-08-20 17:09 ` Roman Mamedov
  2012-08-21 19:11 ` Jeremy Sanders
  2 siblings, 0 replies; 7+ messages in thread
From: Roman Mamedov @ 2012-08-20 17:09 UTC (permalink / raw)
  To: Curtis Jones; +Cc: linux-btrfs


On Mon, 20 Aug 2012 12:22:31 -0400
Curtis Jones <curtis.jones@gmail.com> wrote:

> 	4. any other caveats I might want to consider?

One more thing: if you do not fancy waiting for days/weeks for btrfs-convert
to checksum all your existing data, you might want to use
 
  btrfs-convert -d

so that only newly-written data will be checksummed.
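
Spelled out with a device argument, the conversion might look like the sketch below. /dev/md0 is the device named earlier in the thread; running a forced e2fsck first is an added precaution, not something stated above, and the filesystem is assumed to be unmounted. The `run` helper is illustrative and only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: convert ext4 to btrfs without checksumming existing data.
DEVICE="${DEVICE:-/dev/md0}"

# Dry-run helper (illustrative): print the command unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

convert_without_datasum() {
    # $1 = unmounted block device holding the ext4 filesystem
    # e2fsck exits non-zero if it had to fix anything; in that case the
    # conversion is skipped so the result can be reviewed and the script rerun.
    run e2fsck -f "$1" &&
    run btrfs-convert -d "$1"
}

convert_without_datasum "$DEVICE"
```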


-- 
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."


* Re: btrfs and mdadm raid 6
  2012-08-20 17:06 ` Roman Mamedov
@ 2012-08-20 22:24   ` Curtis Jones
  2012-08-21  1:21     ` Chris Samuel
  2012-08-21 14:51   ` David Sterba
  1 sibling, 1 reply; 7+ messages in thread
From: Curtis Jones @ 2012-08-20 22:24 UTC (permalink / raw)
  To: linux-btrfs@vger.kernel.org

Roman,

Thanks a lot for your response.

Through no small miracle, I am in a position to start over without risking my data. You mentioned that btrfs was going to set aside a ton of space for metadata. Is that entirely due to going ext4 -> btrfs? Since I can now create a btrfs file system from scratch, is it a non-issue or is there a parameter I can use to avoid that - without having to recompile my kernel with that patch?

Thanks again.

--
Curtis Jones
curtisjones.us
404.492.6437





* Re: btrfs and mdadm raid 6
  2012-08-20 22:24   ` Curtis Jones
@ 2012-08-21  1:21     ` Chris Samuel
  0 siblings, 0 replies; 7+ messages in thread
From: Chris Samuel @ 2012-08-21  1:21 UTC (permalink / raw)
  To: Curtis Jones; +Cc: linux-btrfs@vger.kernel.org

On 21/08/12 08:24, Curtis Jones wrote:

> You mentioned that btrfs was going to set aside a ton of space
> for metadata. Is that entirely due to going ext4 -> btrfs?

No, I believe that's a regression in btrfs in recent kernels.

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


* Re: btrfs and mdadm raid 6
  2012-08-20 17:06 ` Roman Mamedov
  2012-08-20 22:24   ` Curtis Jones
@ 2012-08-21 14:51   ` David Sterba
  1 sibling, 0 replies; 7+ messages in thread
From: David Sterba @ 2012-08-21 14:51 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: Curtis Jones, linux-btrfs

On Mon, Aug 20, 2012 at 11:06:03PM +0600, Roman Mamedov wrote:
> 2) On filesystem converted with btrfs-convert the metadata allocation is
> unnecessarily large due to some other, conversion-related reasons; but this
> can be fixed with "btrfs filesystem balance start -musage=5 /mount/point" (do
> several runs increasing the value from 5 to 10, 20 or more, if it fails to
> free up a sufficient amount of space). This will defragment metadata and free
> up chunks which end up being completely unused (which will be a lot of them),
> but only down to the kernel's desired minimum allocation, see point #1.

There's one recommended preceding step -- remove the saved
ext2_subvol/image. (General note: after that, rollback to ext4 is
impossible; that concern does not apply in this case.)

The data blocks will otherwise inherit the layout from ext4 and are
(probably and naturally) allocated using different assumptions and
needs.
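
As a sketch of that ordering: delete the saved image first, then balance. The subvolume name ext2_saved is an assumption here (it is the name btrfs-convert is generally documented to use); listing subvolumes first is a way to confirm it. The mount point is a placeholder, and `run` is an illustrative dry-run helper that prints commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: drop the saved ext image, then balance metadata.
MOUNT_POINT="${MOUNT_POINT:-/mount/point}"
SAVED_SUBVOL="${SAVED_SUBVOL:-ext2_saved}"   # assumed name; verify before deleting

# Dry-run helper (illustrative): print the command unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

cleanup_after_convert() {
    # $1 = mount point, $2 = name of the subvolume holding the saved image
    run btrfs subvolume list "$1" &&            # confirm the subvolume name
    run btrfs subvolume delete "$1/$2" &&       # rollback is impossible after this
    run btrfs filesystem balance start -musage=5 "$1"
}

cleanup_after_convert "$MOUNT_POINT" "$SAVED_SUBVOL"
```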


david


* Re: btrfs and mdadm raid 6
  2012-08-20 16:22 btrfs and mdadm raid 6 Curtis Jones
  2012-08-20 17:06 ` Roman Mamedov
  2012-08-20 17:09 ` Roman Mamedov
@ 2012-08-21 19:11 ` Jeremy Sanders
  2 siblings, 0 replies; 7+ messages in thread
From: Jeremy Sanders @ 2012-08-21 19:11 UTC (permalink / raw)
  To: linux-btrfs

Curtis Jones wrote:

> 1. is btrfs-convert on /dev/md0
>    stable/reliable/tested/not-a-stupid-thing-to-do?
> 2. based on the reading I've done, resizing btrfs is supported. can you
>    confirm?
> 3. there aren't any known compatibility or other issues with running btrfs
>    on top of mdadm (raid 6)
> 4. any other caveats I might want to consider?

We've been running btrfs on mdadm for a year or so with no problems (it was 
a fresh file system, though).

Jeremy


