* BTRFS hot relocation not merged
@ 2015-02-19 11:49 Max Schettler
2015-02-19 21:06 ` Duncan
0 siblings, 1 reply; 3+ messages in thread
From: Max Schettler @ 2015-02-19 11:49 UTC (permalink / raw)
To: Zhi Yong Wu; +Cc: linux-btrfs
Hi,
I recently was looking for the status of hot relocation on btrfs.
There seemed to be some activity on the mailing list around 5/2013
regarding patches that should provide the functionality.
However, they have not been merged yet and there hasn't been
further discussion about them (to my knowledge).
What is the status of hot relocation?
Max
* Re: BTRFS hot relocation not merged
2015-02-19 11:49 BTRFS hot relocation not merged Max Schettler
@ 2015-02-19 21:06 ` Duncan
2015-02-19 23:07 ` Kai Krakow
0 siblings, 1 reply; 3+ messages in thread
From: Duncan @ 2015-02-19 21:06 UTC (permalink / raw)
To: linux-btrfs
Max Schettler posted on Thu, 19 Feb 2015 12:49:37 +0100 as excerpted:
> I recently was looking for the status of hot relocation on btrfs.
> There seemed to be some activity on the mailing list around 5/2013
> regarding patches that should provide the functionality.
> However, they have not been merged yet and there hasn't been further
> discussion about them (to my knowledge).
> What is the status of hot relocation?
The current suggestion is to use something like bcache or dmcache in
tandem with btrfs. I'm not sure of dmcache/btrfs status, but there are
people actually using bcache/btrfs here on this list, with the reports
I've read generally very positive.
Longer term, the feature in various forms remains on the wiki's project
ideas page, here:
https://btrfs.wiki.kernel.org/index.php/Project_ideas
However, as can be seen on that page, btrfs is definitely not lacking in
ideas for future development, rather the reverse, and unfortunately btrfs
in general has a history of wildly optimistic feature ETAs, tho they do
eventually come online, with raid56 mode being the most recent example.
That being the case and with none of the variants of the suggestion
already formally claimed and in-progress, I'd suggest checking back in
3-5 years... unless of course this is a feeler and you're proposing to
claim and implement it yourself. =:^)
There's also this rather vague comment on the wiki, on the main page,
under Features, additional features in development or planned (so closer
to News, then scroll up a bit)...
* Hot data tracking and moving to faster devices (currently being pushed
as a generic feature available through VFS)
https://btrfs.wiki.kernel.org/index.php/Main_Page#News
(and scroll up a bit)
I'm not sure if that refers to bcache and similar, or something else, tho
I didn't check the talk and history pages, which may have a hint...
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: BTRFS hot relocation not merged
2015-02-19 21:06 ` Duncan
@ 2015-02-19 23:07 ` Kai Krakow
0 siblings, 0 replies; 3+ messages in thread
From: Kai Krakow @ 2015-02-19 23:07 UTC (permalink / raw)
To: linux-btrfs
Duncan <1i5t5.duncan@cox.net> wrote:
> Max Schettler posted on Thu, 19 Feb 2015 12:49:37 +0100 as excerpted:
>
>> I recently was looking for the status of hot relocation on btrfs.
>> There seemed to be some activity on the mailing list around 5/2013
>> regarding patches that should provide the functionality.
>> However, they have not been merged yet and there hasn't been further
>> discussion about them (to my knowledge).
>> What is the status of hot relocation?
>
> The current suggestion is to use something like bcache or dmcache in
> tandem with btrfs. I'm not sure of dmcache/btrfs status, but there are
> people actually using bcache/btrfs here on this list, with the reports
> I've read generally very positive.
Yes, here's one! :-)
[...]
> There's also this rather vague comment on the wiki, on the main page,
> under Features, additional features in development or planned (so closer
> to News, then scroll up a bit)...
>
> * Hot data tracking and moving to faster devices (currently being pushed
> as a generic feature available through VFS)
>
> https://btrfs.wiki.kernel.org/index.php/Main_Page#News
>
> (and scroll up a bit)
>
> I'm not sure if that refers to bcache and similar, or something else, tho
> I didn't check the talk and history pages, which may have a hint...
Actually, bcache does not implement hot data tracking. It acts more or less
as a huge I/O scheduler (so it is in the same league as deadline, cfq and
friends), and its primary focus is minimizing seek times. It achieves this
by trying to detect random reads (and optionally writes) and caching them
in a log-structured layout, using access patterns optimized for
non-rotational media. Cached writes, if enabled, are written back lazily in
the background and reordered to minimize seeks and maximize throughput to
the rotational media. Linear access patterns are passed straight through to
the rotational media, which handles that kind of access reasonably well (at
least compared with past-generation SSDs). In that regard, even a good USB
stick or an internal card reader could serve as a cache, tho I'd strongly
recommend against using one.
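
The split between "cache the random I/O" and "pass the linear I/O through"
can be illustrated with a small model. This is only a sketch of the idea,
not bcache's actual code: the class name, the single-stream tracking and the
4 MiB cutoff value are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SequentialDetector:
    """Toy model: bypass the cache once a contiguous run gets long enough."""

    cutoff_bytes: int = 4 * 1024 * 1024  # illustrative value, an assumption
    _next_offset: int = -1               # where a contiguous request would start
    _run_bytes: int = 0                  # length of the current contiguous run

    def should_bypass(self, offset: int, length: int) -> bool:
        """Return True if this request belongs to a long sequential run."""
        if offset == self._next_offset:
            # Request continues exactly where the previous one ended.
            self._run_bytes += length
        else:
            # A seek happened: start counting a new run.
            self._run_bytes = length
        self._next_offset = offset + length
        return self._run_bytes >= self.cutoff_bytes
```

With a small cutoff, two back-to-back requests are classified as sequential
and would go straight to the HDD, while a request after a seek starts a new
run and would be a candidate for the SSD.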
The nice thing is that, this way, bcache can combine fast random access on
the SSD and linear access on the HDD into one stream with summed transfer
rates. So it is by definition faster than plain HDD access on its own.
But it goes even further: the read and write latencies of the cache device
are measured, and if they rise above a certain threshold, bcache falls back
to fetching the data from the slower device, which (and this is a heuristic)
will probably have the data ready faster than the congested caching device.
This is pretty neat, as it adds to the benefit of the summed transfer rates.
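
A minimal sketch of such a latency-based fallback follows. It is a
simplified model and not how bcache measures congestion internally: the
class name, the moving-average smoothing and the threshold used in the
usage note are hypothetical choices for illustration.

```python
class CongestionTracker:
    """Toy model: skip the cache device while its average latency is high."""

    def __init__(self, threshold_us: float, alpha: float = 0.5) -> None:
        self.threshold_us = threshold_us  # latency above this counts as congested
        self.alpha = alpha                # weight given to the newest sample
        self.avg_us = 0.0                 # exponentially weighted moving average

    def record(self, latency_us: float) -> None:
        """Fold one measured cache-device latency into the moving average."""
        self.avg_us = self.alpha * latency_us + (1 - self.alpha) * self.avg_us

    def use_cache(self) -> bool:
        """Return False while the cache device looks congested."""
        return self.avg_us <= self.threshold_us
```

With a threshold of 2000 us, one slow sample of 10000 us pushes the average
over the limit and requests go to the backing device; a few fast samples
bring the average back down and the cache is used again.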
With this, if I can trust ksysguard, I get transfer rates of up to 800 MB/s
in a bcache+3xbtrfs(mraid1,draid0) setup, tho most times it peaks at around
150 MB/s where I had around 80 MB/s usual peaks without bcache. But this is
not the main benefit. My access latencies and IO queue depths have gone down
to virtually zero. And this is probably where most of the speedup comes from.
System boot (on systemd, with services like postfix and mariadb, using
autodefrag and readahead) went down from around 60s to 5s (measured with
the systemd-analyze critical path), with almost no seeking sounds from the
hard disks. KDE starts a lot faster now (maybe another 60-80s down to around
10s) and is instantly responsive, with all panels, backgrounds and icons
loaded when the splash fades out, whereas previously I had a black
background and a lot of ongoing IO after the splash faded out.
The cache hit rate is usually above 80% with an 80 GB bcache partition for a
3x 1TB btrfs volume. My SSD is specified with 550 MB/s reading and 150 MB/s
writing. Measured it's lower (around 480/130) but still faster than HDD even
at linear writing.
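
For anyone who wants to check their own hit rate, here is a hedged sketch
that computes it from hit and miss counters. It assumes the counters are
exposed as plain-text files named cache_hits and cache_misses in a stats
directory (on my understanding something like
/sys/block/bcache0/bcache/stats_total, but treat that path and the file
names as assumptions and verify them on your system). The function name is
hypothetical.

```python
from pathlib import Path


def hit_ratio(stats_dir: str) -> float:
    """Compute hits / (hits + misses) from two counter files in stats_dir."""
    stats = Path(stats_dir)
    # Each file is assumed to hold a single decimal integer.
    hits = int((stats / "cache_hits").read_text().strip())
    misses = int((stats / "cache_misses").read_text().strip())
    total = hits + misses
    # Avoid dividing by zero on a freshly attached cache.
    return hits / total if total else 0.0
```

Calling hit_ratio() on the assumed stats directory returns a fraction
between 0 and 1, so a value of 0.8 corresponds to the 80% mentioned above.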
I'm using writeback mode, and I have had no data loss or inconsistencies
yet, even though I had to hard reboot once or twice. But btrfs without
bcache has also been rock solid for me in the past few months wrt hard
reboots or power loss. Some people actually say that with bcache the
probability of losing data should be lower, as the data reaches stable
storage faster and thus transactions on btrfs can be closed faster. While
bcache will still be in a dirty state, it will write back data later and
replay its log if it didn't finish before rebooting. Well, bcache is always
in a dirty state, by design.
I just wonder what role bcache plays in writeback mode in a btrfs-raid
scenario, where a single bcache cache device covers multiple btrfs devices:
btrfs assumes (and only sees) multiple independent devices, but they all
pass through the one cache first. Write errors may go undetected (because
bcache writes behind) while btrfs still sees good data from the cache. But
btrfs checksums should probably handle this anyway... I'm not sure. Maybe
bcache should not allow reading blocks from the cache which are still going
to be written back, and should then evict written blocks from the cache
before those need to be read again from the backing device. That would
ensure that btrfs really sees what is on the platter instead of what may be
cached. Probably in the end it's the same problem as bit rot: bcache and the
HDD unknowingly don't match, and later bcache evicts the good data from the
cache and leaves the bad data behind.
Ahh, complicated... ;-)
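
The scenario above can be made concrete with a toy model. This is purely
illustrative and makes several assumptions: the WritebackCache class and
verified_read() helper are hypothetical, a silently failed write is
simulated by storing zeroes, and zlib.crc32 stands in for the filesystem's
checksum (btrfs uses a different CRC variant).

```python
import zlib


class WritebackCache:
    """Toy writeback cache in front of a backing device (both plain dicts)."""

    def __init__(self) -> None:
        self.cache = {}    # block number -> bytes held on the cache device
        self.backing = {}  # block number -> bytes on the backing device

    def write(self, block: int, data: bytes) -> None:
        # Writeback mode: the write is acknowledged once it is in the cache.
        self.cache[block] = data

    def flush(self, block: int, corrupt: bool = False) -> None:
        # Lazy writeback; corrupt=True simulates an undetected bad write.
        data = self.cache[block]
        self.backing[block] = b"\x00" * len(data) if corrupt else data

    def evict(self, block: int) -> None:
        self.cache.pop(block, None)

    def read(self, block: int) -> bytes:
        # Served from the cache if present, otherwise from the backing device.
        return self.cache.get(block, self.backing.get(block))


def verified_read(dev: WritebackCache, block: int, expected_crc: int) -> bool:
    """Return True if the data read back matches the stored checksum."""
    return zlib.crc32(dev.read(block)) == expected_crc
```

In this model a read passes verification as long as the block is still
cached, even if the writeback went wrong; only after eviction does the
checksum mismatch show up, which is the delayed detection worried about
above.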
But I trust bcache by now, though I haven't deliberately tried the big
disasters (cutting the power cord during heavy IO or similar funny things).
And nevertheless, I still have my daily backups around. ;-)
Altogether, I wonder if having a real hot data cache would bring that much
additional benefit. Maybe only when it's huge and when it's really fast (I
mean those SSDs capable of doing 500+ MB/s at reading AND writing).
--
Replies to list only preferred.