linux-btrfs.vger.kernel.org archive mirror
From: Alex Lyakas <alex.btrfs@zadarastorage.com>
To: Hugo Mills <hugo@carfax.org.uk>,
	Austin S Hemmelgarn <ahferroin7@gmail.com>,
	Chris Murphy <lists@colorremedies.com>,
	Steve Leung <sjleung@shaw.ca>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: safe/necessary to balance system chunks?
Date: Thu, 19 Jun 2014 14:32:51 +0300
Message-ID: <CAOcd+r0fo55dC2QbPwdwpCB+GD7HR7wrrn3s+srrsNSnrNECjA@mail.gmail.com>
In-Reply-To: <20140425191448.GJ2391@carfax.org.uk>

On Fri, Apr 25, 2014 at 10:14 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> On Fri, Apr 25, 2014 at 02:12:17PM -0400, Austin S Hemmelgarn wrote:
>> On 2014-04-25 13:24, Chris Murphy wrote:
>> >
>> > On Apr 25, 2014, at 8:57 AM, Steve Leung <sjleung@shaw.ca> wrote:
>> >
>> >>
>> >> Hi list,
>> >>
>> >> I've got a 3-device RAID1 btrfs filesystem that started out life as single-device.
>> >>
>> >> btrfs fi df:
>> >>
>> >> Data, RAID1: total=1.31TiB, used=1.07TiB
>> >> System, RAID1: total=32.00MiB, used=224.00KiB
>> >> System, DUP: total=32.00MiB, used=32.00KiB
>> >> System, single: total=4.00MiB, used=0.00
>> >> Metadata, RAID1: total=66.00GiB, used=2.97GiB
>> >>
>> >> This still lists some system chunks as DUP, and not as RAID1.  Does this mean that if one device were to fail, some system chunks would be unrecoverable?  How bad would that be?
>> >
>> > Since it's "system" type, it might mean the whole volume is toast if the drive containing those 32KB dies. I'm not sure what kind of information is in system chunk type, but I'd expect it's important enough that if unavailable that mounting the file system may be difficult or impossible. Perhaps btrfs restore would still work?
>> >
>> > Anyway, it's probably a high penalty for losing only 32KB of data.  I think this could use some testing to try and reproduce conversions where some amount of "system" or "metadata" type chunks are stuck in DUP. This has come up before on the list but I'm not sure how it's happening, as I've never encountered it.
>> >
>> As far as I understand it, the system chunks are THE root chunk tree for
>> the entire system, that is to say, it's the tree of tree roots that is
>> pointed to by the superblock. (I would love to know if this
>> understanding is wrong).  Thus losing that data almost always means
>> losing the whole filesystem.
>
>    From a conversation I had with cmason a while ago, the System
> chunks contain the chunk tree. They're special because *everything* in
> the filesystem -- including the locations of all the trees, including
> the chunk tree and the roots tree -- is positioned in terms of the
> internal virtual address space. Therefore, when starting up the FS,
> you can read the superblock (which is at a known position on each
> device), which tells you the virtual address of the other trees... and
> you still need to find out where that really is.
>
>    The superblock has (I think) a list of physical block addresses at
> the end of it (sys_chunk_array), which allows you to find the blocks
> for the chunk tree and work out this mapping, which allows you to find
> everything else. I'm not 100% certain of the actual format of that
> array -- it's declared as u8 [2048], so I'm guessing there's a load of
> casting to something useful going on in the code somewhere.
The format is just a list of pairs:
struct btrfs_disk_key,  struct btrfs_chunk
struct btrfs_disk_key,  struct btrfs_chunk
...
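
To make the layout concrete, here is a rough Python sketch of walking
those pairs. Treat it as an illustration, not as the kernel's code: the
field layouts (17-byte packed key, 48-byte chunk header, 32-byte stripe,
all little-endian) and the SYSTEM/RAID1 flag values are assumptions from
my reading of the on-disk structs, pack_entry() is a hypothetical helper
that only exists to build test data, and the values it packs are
placeholders. The length of the used part of the array is assumed to
come from sys_chunk_array_size in the superblock.

```python
import struct

# Assumed packed, little-endian on-disk layouts.
DISK_KEY = struct.Struct("<QBQ")          # objectid, type, offset
CHUNK_HEAD = struct.Struct("<QQQQIIIHH")  # length, owner, stripe_len, type,
                                          # io_align, io_width, sector_size,
                                          # num_stripes, sub_stripes
STRIPE = struct.Struct("<QQ16s")          # devid, offset, dev_uuid

SYSTEM = 1 << 1  # assumed block-group flag values
RAID1 = 1 << 4


def parse_sys_chunk_array(buf, array_size):
    """Walk the (disk_key, chunk) pairs in the first array_size bytes."""
    entries = []
    pos = 0
    while pos < array_size:
        _objectid, _key_type, logical = DISK_KEY.unpack_from(buf, pos)
        pos += DISK_KEY.size
        head = CHUNK_HEAD.unpack_from(buf, pos)
        pos += CHUNK_HEAD.size
        length, chunk_type, num_stripes = head[0], head[3], head[7]
        stripes = []
        # The chunk is variable-length: one stripe record per copy/device.
        for _ in range(num_stripes):
            devid, physical, _uuid = STRIPE.unpack_from(buf, pos)
            stripes.append((devid, physical))
            pos += STRIPE.size
        entries.append({"logical": logical, "length": length,
                        "type": chunk_type, "stripes": stripes})
    return entries


def pack_entry(logical, length, chunk_type, stripes):
    """Hypothetical helper: build one pair with placeholder values."""
    out = DISK_KEY.pack(256, 228, logical)
    out += CHUNK_HEAD.pack(length, 2, 65536, chunk_type,
                           65536, 65536, 4096, len(stripes), 0)
    for devid, physical in stripes:
        out += STRIPE.pack(devid, physical, bytes(16))
    return out


MIB = 1024 * 1024
sample = (pack_entry(0, 4 * MIB, SYSTEM, [(1, 0)]) +
          pack_entry(20 * MIB, 32 * MIB, SYSTEM | RAID1,
                     [(1, 20 * MIB), (2, 1 * MIB)]))
array = sample.ljust(2048, b"\0")  # the array itself is u8[2048]
entries = parse_sys_chunk_array(array, len(sample))
for e in entries:
    print(e["logical"], e["length"], len(e["stripes"]))
```

The sketch does no validation; real code would have to check that each
entry fits inside the used part of the array before reading it.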

For each SYSTEM block group (btrfs_chunk), we need one entry in the
sys_chunk_array. During mkfs the first SYSTEM block group is created;
for me it's 4MB. So only if the whole chunk tree grows beyond 4MB do we
need to create an additional SYSTEM block group, and then we need a
second entry in the sys_chunk_array. And so on.
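
As a back-of-the-envelope check on how many such entries the 2048-byte
array can hold, here is a small Python calculation. The struct sizes
(17-byte key, 48-byte chunk header, 32 bytes per stripe) are my
assumptions about the packed on-disk layout, so take the results as an
estimate only.

```python
SYS_CHUNK_ARRAY_SIZE = 2048  # sys_chunk_array is declared as u8[2048]
DISK_KEY_SIZE = 17           # assumed size of struct btrfs_disk_key
CHUNK_HEAD_SIZE = 48         # assumed size of btrfs_chunk without stripes
STRIPE_SIZE = 32             # assumed size of one stripe record


def entry_size(num_stripes):
    """Bytes used by one (disk_key, chunk) pair with num_stripes stripes."""
    return DISK_KEY_SIZE + CHUNK_HEAD_SIZE + num_stripes * STRIPE_SIZE


def max_entries(num_stripes):
    """How many equally sized entries fit into the array."""
    return SYS_CHUNK_ARRAY_SIZE // entry_size(num_stripes)


# single has one stripe; DUP and RAID1 both have two.
for name, n in (("single", 1), ("DUP", 2), ("RAID1", 2)):
    print(name, entry_size(n), max_entries(n))
```

Under those assumptions a one-stripe entry takes 97 bytes and a
two-stripe entry 129 bytes, so the array has room for roughly 15 to 21
SYSTEM block groups.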

Alex.


>
>    Hugo.
>
> --
> === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
>   PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
>     --- Is it still called an affair if I'm sleeping with my wife ---
>                         behind her lover's back?

