From: Miao Xie <miaox@cn.fujitsu.com>
To: "Piotr Pawłow" <pp@siedziba.pl>,
"Chris Murphy" <lists@colorremedies.com>,
linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: device balance times
Date: Thu, 23 Oct 2014 17:19:26 +0800
Message-ID: <5448C81E.4060701@cn.fujitsu.com>
In-Reply-To: <5447A5CF.9060405@siedziba.pl>
On Wed, 22 Oct 2014 14:40:47 +0200, Piotr Pawłow wrote:
> On 22.10.2014 03:43, Chris Murphy wrote:
>> On Oct 21, 2014, at 4:14 PM, Piotr Pawłow<pp@siedziba.pl> wrote:
>>> Looks normal to me. Last time I started a balance after adding 6th device to my FS, it took 4 days to move 25GBs of data.
>> It's long term untenable. At some point it must be fixed. It's way, way slower than md raid.
>> At a certain point it needs to fall back to block level copying, with a ~32KB block. It can't be treating things as if they're 1K files, doing file level copying that takes forever. It's just too risky that another device fails in the meantime.
>
> There's "device replace" for restoring redundancy, which is fast, but not implemented yet for RAID5/6.
My colleague and I are currently implementing scrub/replace for RAID5/6,
and I plan to reimplement balance and split it off from the metadata/file data processing. The main idea is:
- Allocate a new chunk with the same size as the one being relocated, but don't insert it into the block group list, so that
  no free space is allocated from it.
- Set the source chunk read-only.
- Copy the data from the source chunk to the new chunk.
- Replace the extent map of the source chunk with that of the new chunk (the new chunk has
  the same logical address and length as the old one).
- Release the source chunk.
This way, we don't need to process the data extent by extent, and we don't need to do any space reservation,
so it will be very fast even if we have lots of snapshots.
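The steps above can be sketched as a toy model in Python. This is only an illustration of the idea, not btrfs code: the names `Chunk`, `ToyFS`, `chunk_map` and `relocate_chunk` are hypothetical, a chunk is modelled as a plain byte buffer on a named device, and making the new chunk writable after the swap is my assumption.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    logical: int        # logical start address that extents refer to
    length: int         # chunk size in bytes
    device: str         # hypothetical: device holding the chunk
    data: bytearray     # stand-in for the on-disk contents
    read_only: bool = False


class ToyFS:
    def __init__(self):
        # logical address -> Chunk; stands in for the chunk mapping
        self.chunk_map = {}

    def add_chunk(self, chunk):
        self.chunk_map[chunk.logical] = chunk

    def read(self, logical, offset, size):
        chunk = self.chunk_map[logical]
        return bytes(chunk.data[offset:offset + size])

    def write(self, logical, offset, payload):
        chunk = self.chunk_map[logical]
        if chunk.read_only:
            raise PermissionError("chunk is read-only during relocation")
        chunk.data[offset:offset + len(payload)] = payload

    def relocate_chunk(self, logical, target_device):
        src = self.chunk_map[logical]
        # 1. Allocate a new chunk of the same size. It is not registered
        #    anywhere yet, so nothing can allocate space from it.
        dst = Chunk(src.logical, src.length, target_device,
                    bytearray(src.length))
        # 2. Freeze the source so its contents cannot change mid-copy.
        src.read_only = True
        # 3. Copy the whole chunk in bulk, without looking at extents.
        dst.data[:] = src.data
        # 4. Swap the mapping. The logical address and length are
        #    unchanged, so nothing that points into the chunk is touched.
        self.chunk_map[logical] = dst
        # 5. Release the source chunk.
        src.data = bytearray()
        return dst


if __name__ == "__main__":
    fs = ToyFS()
    fs.add_chunk(Chunk(4096, 8, "sdb", bytearray(b"abcdefgh")))
    moved = fs.relocate_chunk(4096, "sdc")
    print(moved.device, fs.read(4096, 0, 8))
```

The point of the model is that `relocate_chunk` never looks inside the chunk: the cost depends on the chunk size only, not on how many extents or snapshots reference it.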
Thanks
Miao
>
> I think the problem is that balance was originally used for balancing data / metadata split - moving stuff out of mostly empty chunks to free them and use for something else. It pretty much has to be done on the extent level.
>
> Then balance was repurposed for things like converting RAID profiles and restoring redundancy and balancing device usage in multi-device configurations. It works, but the approach to do it extent by extent is slow.
>
> I wonder if we could do some of these operations by just copying whole chunks in bulk. Wasn't that the point of introducing logical addresses? - to be able to move chunks around quickly without changing anything except updating chunk pointers?
>
> BTW: I'd love a simple interface to be able to select a chunk and tell it to move somewhere else. I'd like to tell chunks with metadata, or with tons of extents: Hey, chunks! Why don't you move to my SSDs? :)