linux-btrfs.vger.kernel.org archive mirror
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Henk Slager <eye1tm@gmail.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Hot data tracking / hybrid storage
Date: Mon, 23 May 2016 07:32:19 -0400
Message-ID: <f1fd9821-03e7-079a-3894-2543c26aafd0@gmail.com>
In-Reply-To: <CAPmG0jbjdWf+98wr-NaL622wVAkfMJdHEij7feM9OK=AwZw68A@mail.gmail.com>

On 2016-05-20 18:26, Henk Slager wrote:
> Yes, sorry, I took some shortcut in the discussion and jumped to a
> method for avoiding this 0.5-2% slowdown that you mention. (Or a
> kernel crashing in bcache code due to corrupt SB on a backing device
> or corrupted caching device contents).
> I am actually a bit surprised that there is a measurable slowdown,
> considering that it is basically just one 8KiB offset on a certain
> layer in the kernel stack, but I haven't looked at that code.
There's still a layer of indirection in the kernel code, even in
pass-through mode with no cache, and that's probably where the slowdown
comes from.  My testing was also in a VM with its backing device on an
SSD though, so you may get different results on other hardware.
> I don't know other tables than MBR and GPT, but this bcache SB
> 'insertion' works with both. Indeed, if GRUB is involved, it can get
> complicated, I have avoided that. If there is less than 8KiB slack
> space on a HDD, I would worry about alignment/performance first, then
> there is likely a reason to fully rewrite the HDD with a standard 1M
> alignment.
The 'alignment' thing is mostly bogus these days.  It originated when
1M was a full track on the disk, and you wanted your filesystem to start
at the beginning of a track for performance reasons.  On most modern
disks though, 1M is not a full track, but the convention got kept
because a number of bootloaders (GRUB included) used to use the slack
space this caused to embed themselves before the filesystem.  The only
case where 1M alignment actually makes sense is on SSDs with a 1M erase
block size (which are rare, most consumer devices have a 4M erase
block).  As far as partition tables go, you're not likely to see any
other formats these days.  The only ones I've dealt with other than MBR
and GPT are APM (the old pre-OSX Apple format), RDB (the Amiga format,
which is kind of neat because it can embed drivers), and the old Sun
disk labels (from before SunOS became Solaris).  I had actually
forgotten that a GPT is only 32k, hence my comment about it potentially
being an issue.
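
To make the arithmetic concrete, here is a rough sketch of the two
checks being discussed: whether a partition start lands on a given
boundary, and whether there is at least 8KiB of slack in front of it.
The helper names, the 512-byte sector size, and the example start
sectors are assumptions for illustration only, not anything bcache or
a partitioning tool provides.

```python
KIB = 1024
MIB = 1024 * KIB
SECTOR = 512              # assumed logical sector size in bytes
BCACHE_SB_SPACE = 8 * KIB  # the 8KiB offset discussed in this thread


def is_aligned(start_sector, boundary_bytes, sector_size=SECTOR):
    """True if the partition's byte offset is a multiple of boundary_bytes."""
    return (start_sector * sector_size) % boundary_bytes == 0


def slack_bytes(start_sector, table_end_sector, sector_size=SECTOR):
    """Bytes between the end of the partition table and the partition start.

    table_end_sector is the first sector NOT used by the table; the caller
    supplies it, since it depends on the table format.
    """
    return (start_sector - table_end_sector) * sector_size


def has_room_for_bcache_sb(start_sector, table_end_sector, sector_size=SECTOR):
    """True if the slack in front of the partition is at least 8KiB."""
    return slack_bytes(start_sector, table_end_sector, sector_size) >= BCACHE_SB_SPACE


# A partition starting at sector 2048 sits on a 1M boundary but not a 4M one.
print(is_aligned(2048, MIB), is_aligned(2048, 4 * MIB))
# An old-style MBR layout starting at sector 63 (table occupying sector 0).
print(is_aligned(63, MIB), has_room_for_bcache_sb(63, 1))
```

Under those assumptions, a start at sector 2048 is 1M-aligned but not
4M-aligned, while a start at sector 63 is unaligned yet still has more
than 8KiB of slack in front of it.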
> If there is more partitions and the partition in front of the one you
> would like to be bcached, I personally would shrink it by 8KiB (like
> NTFS or swap or ext4 ) if that saves me TeraBytes of datatransfers.
Definitely, although depending on how the system is set up, this will 
almost certainly need down time.
>
>> This also doesn't change the fact that without careful initial formatting
>> (it is possible on some filesystems to embed the bcache SB at the beginning
>> of the FS itself, many of them have some reserved space at the beginning of
>> the partition for bootloaders, and this space doesn't have to exist when
>> mounting the FS) or manual alteration of the partition, it's not possible to
>> mount the FS on a system without bcache support.
>
> If we consider a non-bootable single HDD btrfs FS, are you then
> suggesting that the bcache SB could be placed in the first 64KiB where
> also GRUB stores its code if the FS would need booting ?
> That would be interesting, it would mean that also for btrfs on raw
> device (and also multi-device) there is no extra exclusive 8KiB space
> needed in front.
> Is there someone who has this working? I think it would lead to issues
> on the blocklayer, but I have currently no clue about that.
I don't think it would work on BTRFS: we expect the SB at a fixed
offset into the device, and it wouldn't be there on the bcache device.
It might work on ext4, but I'm not certain about that.  I do know of at
least one person who got it working with a FAT32 filesystem as a proof
of concept.  Even if it did work on BTRFS, trying it would be _really_
risky, because the kernel would potentially see both devices, and you
would probably have the same issues that you do with block-level
copies.
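
A small in-memory sketch of why the fixed location is the problem.
The primary BTRFS superblock sits 64KiB into the device, with the
magic "_BHRfS_M" at byte 0x40 inside it.  The sketch assumes the
bcache device is simply the backing device with the first 8KiB (the
offset discussed earlier in this thread) hidden; the image and helper
names are made up for illustration and nothing here touches a real
block device.

```python
import io

KIB = 1024
BTRFS_SB_OFFSET = 64 * KIB    # primary BTRFS superblock offset
BTRFS_MAGIC_OFFSET = 0x40     # offset of the magic field inside the superblock
BTRFS_MAGIC = b"_BHRfS_M"
BCACHE_DATA_OFFSET = 8 * KIB  # assumed: the 8KiB offset from this thread


def has_btrfs_magic(dev):
    """Look for the BTRFS magic at its fixed offset on a file-like device."""
    dev.seek(BTRFS_SB_OFFSET + BTRFS_MAGIC_OFFSET)
    return dev.read(len(BTRFS_MAGIC)) == BTRFS_MAGIC


def make_raw_image(size=128 * KIB):
    """Zero-filled fake device with only the BTRFS magic written in place."""
    image = bytearray(size)
    pos = BTRFS_SB_OFFSET + BTRFS_MAGIC_OFFSET
    image[pos:pos + len(BTRFS_MAGIC)] = BTRFS_MAGIC
    return bytes(image)


def bcache_view(raw, data_offset=BCACHE_DATA_OFFSET):
    """Model the bcache device as the backing device minus its first 8KiB."""
    return raw[data_offset:]


raw = make_raw_image()
print(has_btrfs_magic(io.BytesIO(raw)))               # raw backing device
print(has_btrfs_magic(io.BytesIO(bcache_view(raw))))  # as seen through bcache
```

In this model the magic is found on the raw device but not through the
bcache view, because everything appears 8KiB earlier there and the SB
ends up at 56KiB instead of 64KiB.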
