* btrfs and mdadm raid 6
From: Curtis Jones @ 2012-08-20 16:22 UTC
To: linux-btrfs

Hi.

I'm considering an imminent switch from ext4 to btrfs and I'm hoping that someone can lend me advice before I do something unsupported.

I have a software raid 6 array configured via mdadm. It was sitting at 8 x 3TB until I recently doubled that, grew the array and found that ext4 doesn't want to resize. So, I'm looking to:

1. convert from ext4 to btrfs
2. grow the fs to the full array size

My concerns are:

1. is btrfs-convert on /dev/md0 stable/reliable/tested/not-a-stupid-thing-to-do?
2. based on the reading I've done, resizing btrfs is supported. can you confirm?
3. there aren't any known compatibility or other issues with running btrfs on top of mdadm (raid 6)
4. any other caveats I might want to consider?

I just upgraded from kernel v3.5.1 to v3.5.2 and I have the btrfs-tools (v0.19) compiled straight from git.

Any words of wisdom would be appreciated.

Thanks!

--
Curtis Jones
* Re: btrfs and mdadm raid 6
From: Roman Mamedov @ 2012-08-20 17:06 UTC
To: Curtis Jones; +Cc: linux-btrfs

On Mon, 20 Aug 2012 12:22:31 -0400
Curtis Jones <curtis.jones@gmail.com> wrote:

> 1. is btrfs-convert on /dev/md0 stable/reliable/tested/not-a-stupid-thing-to-do?

btrfs-convert does not care what kind of block device an FS resides on, so it's OK.

> 2. based on the reading I've done, resizing btrfs is supported. can you confirm?

Yes, both growing and shrinking.

> 3. there aren't any known compatibility or other issues with running btrfs on top of mdadm (raid 6)

Not that I know of.

But... if we were a year into the future and there was working btrfs RAID6, then that configuration (btrfs native RAID6 rather than single-device btrfs on top of mdadm) would provide more resilience, as blocks with failed checksums could be automatically reconstructed from 'good' data on other devices in the array.

In the current situation though, btrfs checksums will only tell you that you lost data due to some corruption underneath, in the (unlikely) case that it happens and mdadm lets it through.

> 4. any other caveats I might want to consider?

1) AFAIK the patch [1] is still not in the mainline, so you'll either have to include it in your kernels yourself, or you will end up with a truly enormous metadata allocation size: if I'm counting correctly, on your array with 42 TB of usable space you will have 840 GB * 2 = 1680 GB (roughly 1.7 TB) reserved for metadata.

[1] http://comments.gmane.org/gmane.comp.file-systems.btrfs/19200

2) On a filesystem converted with btrfs-convert the metadata allocation is unnecessarily large due to some other, conversion-related reasons; but this can be fixed with "btrfs filesystem balance -musage=5 /mount/point" (do several runs, increasing the value from 5 to 10, 20 or more, if it fails to free up a sufficient amount of space). This will defragment metadata and free up chunks which end up being completely unused (there will be a lot of them), but only down to the kernel's desired minimum allocation, see point #1.

3) Because of point #1, and in general for performance reasons, considering also that you're already running on top of a parity-protected RAID, you might want to consider switching the metadata profile from DUP to single (i.e. just one copy of metadata on the device, not two).
"btrfs fi balance start -mconvert=single /mnt/point"

Regarding balance, see https://btrfs.wiki.kernel.org/index.php/Balance_Filters

> I just upgraded from kernel v3.5.1 to v3.5.2 and I have the btrfs-tools (v0.19) compiled straight from git.

You're doing great :)

Also, btw, I hope you have a full backup of everything you care about.

--
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."
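[Editor's sketch] The metadata estimate in point 1 above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope check, not anything taken from the kernel: the 2% reservation ratio is inferred from the "840GB on 42 TB" figure rather than stated in the thread, decimal units (1 TB = 1000 GB) are assumed, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the metadata figure quoted in the thread.
# Assumptions (not stated by the posters): decimal units (1 TB = 1000 GB)
# and a 2% reservation ratio, inferred from "840GB" on 42 TB of usable space.


def raid6_usable_tb(drives: int, drive_tb: float) -> float:
    """RAID 6 spends two drives' worth of capacity on parity."""
    if drives < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (drives - 2) * drive_tb


def metadata_reserved_gb(usable_tb: float, ratio: float, copies: int) -> float:
    """Space reserved for metadata, in GB, for a given number of copies."""
    return usable_tb * 1000 * ratio * copies


# 8 x 3 TB, doubled to 16 drives as described in the first message.
usable = raid6_usable_tb(drives=16, drive_tb=3)
dup = metadata_reserved_gb(usable, ratio=0.02, copies=2)      # DUP profile
single = metadata_reserved_gb(usable, ratio=0.02, copies=1)   # single profile

print(f"usable: {usable:.0f} TB, DUP: {dup:.0f} GB, single: {single:.0f} GB")
```

Under these assumptions the DUP figure lands close to the "1700 GB" quoted above, and switching the metadata profile to single (point 3) halves it.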
* Re: btrfs and mdadm raid 6
From: Curtis Jones @ 2012-08-20 22:24 UTC
To: linux-btrfs@vger.kernel.org

Roman,

Thanks a lot for your response. Through no small miracle, I am in a position to start over without risking my data.

You mentioned that btrfs was going to set aside a ton of space for metadata. Is that entirely due to going ext4 -> btrfs? Since I can now create a btrfs file system from scratch, is it a non-issue or is there a parameter I can use to avoid that - without having to recompile my kernel with that patch?

Thanks again.

--
Curtis Jones
curtisjones.us
404.492.6437
* Re: btrfs and mdadm raid 6
From: Chris Samuel @ 2012-08-21 1:21 UTC
To: Curtis Jones; +Cc: linux-btrfs@vger.kernel.org

On 21/08/12 08:24, Curtis Jones wrote:

> You mentioned that btrfs was going to set aside a ton of space
> for metadata. Is that entirely due to going ext4 -> btrfs?

No, I believe that's a regression in btrfs in recent kernels.

--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
* Re: btrfs and mdadm raid 6
From: David Sterba @ 2012-08-21 14:51 UTC
To: Roman Mamedov; +Cc: Curtis Jones, linux-btrfs

On Mon, Aug 20, 2012 at 11:06:03PM +0600, Roman Mamedov wrote:

> 2) On filesystem converted with btrfs-convert the metadata allocation is
> unnecessarily large due to some other, conversion-related reasons; but this
> can be fixed with "btrfs filesystem balance -musage=5 /mount/point" (do
> several runs increasing the value from 5 to 10, 20 or more, if it fails to
> free up a sufficient amount of space). This will defragment metadata and free
> up chunks which end up being completely unused (which will be a lot of them),
> but only down to the kernel's desired minimum allocation, see point #1.

There's one recommended preceding step -- remove the saved ext2_subvol/image. (The general caveat that this makes a later rollback to ext4 impossible does not apply in this case.) Otherwise the data blocks will keep the layout they inherited from ext4, which was (probably and naturally) allocated using different assumptions and needs.

david
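[Editor's sketch] The "several runs with an increasing usage value" procedure discussed above can be written down as a small dry-run helper. It only builds and prints the command lines; it does not execute anything. The helper name `balance_commands` and the mount point `/mnt/array` are hypothetical, and the command form `btrfs balance start -musage=N <path>` is the one accepted by current btrfs-progs, which differs slightly from the 2012 spelling quoted in the thread.

```python
import shlex


def balance_commands(mount_point, thresholds=(5, 10, 20)):
    """Return one `btrfs balance start -musage=N` argv per threshold.

    Each run only rewrites metadata chunks that are at most N% full,
    so starting low keeps the early passes cheap.
    """
    return [
        ["btrfs", "balance", "start", f"-musage={pct}", mount_point]
        for pct in thresholds
    ]


# Dry run: print the commands in order instead of executing them.
# "/mnt/array" is a placeholder mount point.
for argv in balance_commands("/mnt/array"):
    print(shlex.join(argv))
```

In practice one would stop as soon as enough metadata space has been freed rather than always running every threshold, and would run the commands only after the saved conversion image has been removed, as recommended above.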
* Re: btrfs and mdadm raid 6
From: Roman Mamedov @ 2012-08-20 17:09 UTC
To: Curtis Jones; +Cc: linux-btrfs

On Mon, 20 Aug 2012 12:22:31 -0400
Curtis Jones <curtis.jones@gmail.com> wrote:

> 4. any other caveats I might want to consider?

One more thing: if you do not fancy waiting for days/weeks for btrfs-convert to checksum all your existing data, you might want to use btrfs-convert -d so that only newly-written data will be checksummed.

--
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."
* Re: btrfs and mdadm raid 6
From: Jeremy Sanders @ 2012-08-21 19:11 UTC
To: linux-btrfs

Curtis Jones wrote:

> 1. is btrfs-convert on /dev/md0 stable/reliable/tested/not-a-stupid-thing-to-do?
> 2. based on the reading I've done, resizing btrfs is supported. can you confirm?
> 3. there aren't any known compatibility or other issues with running btrfs on top of mdadm (raid 6)
> 4. any other caveats I might want to consider?

We've been running btrfs on mdadm for a year or so with no problems (it was a fresh file system, though).

Jeremy