linux-btrfs.vger.kernel.org archive mirror
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Martin Steigerwald <martin@lichtvoll.de>,
	Kai Krakow <hurikhan77@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Use fast device only for metadata?
Date: Mon, 8 Feb 2016 08:29:29 -0500	[thread overview]
Message-ID: <56B89839.1060709@gmail.com> (raw)
In-Reply-To: <56B8962C.6050302@gmx.com>

On 2016-02-08 08:20, Qu Wenruo wrote:
> On 02/08/2016 08:24 PM, Austin S. Hemmelgarn wrote:
>> On 2016-02-07 15:59, Martin Steigerwald wrote:
>>> On Sunday, 7 February 2016, 21:07:13 CET, Kai Krakow wrote:
>>>> On Sun, 07 Feb 2016 11:06:58 -0800, Nikolaus Rath <Nikolaus@rath.org> wrote:
>>>>> Hello,
>>>>>
>>>>> I have a large home directory on a spinning disk that I regularly
>>>>> synchronize between different computers using unison. That takes ages,
>>>>> even though the number of changed files is typically small. I suspect
>>>>> most of the time is spent walking through the file system and checking
>>>>> mtimes.
>>>>>
>>>>> So I was wondering if I could possibly speed-up this operation by
>>>>> storing all btrfs metadata on a fast, SSD drive. It seems that
>>>>> mkfs.btrfs allows me to put the metadata in raid1 or dup mode, and the
>>>>> file contents in single mode. However, I could not find a way to tell
>>>>> btrfs to use a device *only* for metadata. Is there a way to do that?
>>>>>
>>>>> Also, what is the difference between using "dup" and "raid1" for the
>>>>> metadata?
>>>>
>>>> You may want to try bcache. It will speed up random access, which is
>>>> probably the main cause of your slow sync. Unfortunately it requires
>>>> you to reformat your btrfs partitions to add a bcache superblock. But
>>>> it's worth the effort.
>>>>
>>>> I use a nightly rsync to USB3 disk, and bcache reduced it from 5+ hours
>>>> to typically 1.5-3 depending on how much data changed.
>>>
>>> An alternative is using dm-cache. I think it doesn't require recreating
>>> the filesystem.
>> That's correct, dm-cache can use a regular underlying storage device.
>> This of course has potential implications for a multi-device filesystem
>> (it can seriously confuse BTRFS and cause data corruption), but it works
>> just fine for a single device filesystem.  This makes it a bit easier to
>> test run, but also means you need more devices (internally, it uses 3,
>> one backing device, one cache device, and a metadata device for
>> persistently mapping between the two).  It's really easy to set up
>> though if you have a recent version of LVM built with dm-cache support.
>>
>> In general, bcache takes a bit more setup, but avoids the multi-device
>> issues, and importantly, doesn't require LVM or dmsetup (which are
>> usually pretty big packages on many distros).  The caveat with bcache
>> though is that there have been issues in the past with data integrity
>> when used with BTRFS, but if you're on a recent kernel (at least 4.0 if
>> you're using BTRFS for actual data storage), you should have no issues.
>
> And I just want to add more about using a device *only* for metadata.
>
> The short answer is, unfortunately, NO.
>
> 1) Even with bcache/dm-cache, small data writes may still be cached.
>
> I'm not entirely sure about the details of dm-cache/bcache, but as long
> as the filesystem on top is Btrfs, it isn't possible to restrict data or
> metadata to a specific device.
>
> IIRC, bcache or a similar method may cache most random r/w of metadata,
> but it's still quite possible for it to cache a lot of random r/w of data.
>
> And depending on the sector size (minimal data block size) and leaf size
> (metadata block size), it's even more likely to cache small data rather
> than metadata under specific workloads, since the default sectorsize is
> 4K but the default leafsize is 16K.
The mention of dm-cache/bcache was intended more as an alternative,
since BTRFS currently can't do what Nikolaus was trying to achieve.
Neither will give quite the performance profile that a dedicated
metadata device might, but they should still significantly improve
general performance.  In essence, these function for BTRFS much as an
L2ARC on an SSD does for ZFS.
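
To illustrate why a block-level cache can't be limited to metadata, here
is a minimal sketch, assuming a plain LRU policy (bcache and dm-cache use
their own, more elaborate policies). The class name, block numbers, and
cache size are all made up for the example. The cache sees block numbers
only, so a burst of small data reads can push hot metadata out:

```python
from collections import OrderedDict


class BlockCache:
    """Toy LRU block cache. It sees block numbers only, never their contents."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # least recently used first
        self.hits = 0

    def read(self, block_nr):
        if block_nr in self.blocks:
            self.blocks.move_to_end(block_nr)  # mark as most recently used
            self.hits += 1
            return
        self.blocks[block_nr] = True  # miss: fetch from the slow device, cache it
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block


def simulate():
    # Hypothetical layout: blocks 0-9 hold metadata, 1000-1007 hold file data.
    metadata_blocks = range(10)
    data_blocks = range(1000, 1008)
    cache = BlockCache(capacity_blocks=12)

    for block in metadata_blocks:  # first tree walk warms the cache
        cache.read(block)

    before = cache.hits
    for block in metadata_blocks:  # second walk is served from the cache
        cache.read(block)
    warm_hits = cache.hits - before

    for block in data_blocks:  # small data reads compete for the same space
        cache.read(block)

    before = cache.hits
    for block in metadata_blocks:  # third walk: how much metadata survived?
        cache.read(block)
    hits_after_data = cache.hits - before

    return warm_hits, hits_after_data


warm_hits, hits_after_data = simulate()
print(f"metadata hits on warm cache: {warm_hits}/10")
print(f"metadata hits after data reads: {hits_after_data}/10")
```

In this toy model the second walk hits the cache every time, while the
walk after the data reads misses every time. A real cache with a large
enough SSD should fare much better, which is why these still help in
practice.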
>
> 2) Btrfs doesn't have any special preference in chunk allocation.
>
> Btrfs just allocates chunks in order of unallocated space.
> So even if there is a huge TB- or PB-sized spinning device and a GB-sized
> SSD, btrfs will just treat them according to their unallocated space.
There is at least a suggestion on the project page to provide this
functionality.  In a way, it's essentially equivalent to the external
journal device supported by ext4, XFS, OCFS2 and some other filesystems,
and as such, I'd say it's a feature we should seriously consider looking
at implementing eventually, even if just for feature parity, and even if
we speed up metadata operations in BTRFS.
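
The allocation behaviour described in point 2 can be sketched roughly as
follows. This is a simplified model, not the actual allocator: it assumes
the "single" profile, a fixed 1 GiB chunk size, and that each new chunk
goes to whichever device has the most unallocated space. The device names
and sizes are hypothetical.

```python
def allocate_chunks(unallocated_gib, requests, chunk_gib=1):
    """Place each requested chunk on the device with the most unallocated space.

    The chunk type ("data" or "metadata") plays no part in choosing the device.
    """
    free = dict(unallocated_gib)  # work on a copy of the caller's mapping
    placement = []
    for kind in requests:
        dev = max(free, key=free.get)  # device with the most unallocated space
        if free[dev] < chunk_gib:
            raise RuntimeError("no unallocated space left")
        free[dev] -= chunk_gib
        placement.append((kind, dev))
    return placement


# Hypothetical setup: a 4000 GiB spinning disk and a 120 GiB SSD.
devices = {"hdd": 4000, "ssd": 120}
requests = ["metadata", "data", "data", "metadata", "data"]
placement = allocate_chunks(devices, requests)
for kind, dev in placement:
    print(f"{kind:8s} -> {dev}")
```

In this model every chunk, metadata included, lands on the spinning disk
until its unallocated space drops to the level of the SSD's, which is the
opposite of what a dedicated metadata device would need.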

