linux-btrfs.vger.kernel.org archive mirror
From: Gordan Bobic <gordan@bobich.net>
To: <linux-btrfs@vger.kernel.org>
Subject: Re: SSD Optimizations
Date: Thu, 11 Mar 2010 16:03:59 +0000
Message-ID: <0592c2cb505638c1110eaef97192eb60@localhost>
In-Reply-To: <20100311163533.0ea09173.skraw@ithnet.com>

On Thu, 11 Mar 2010 16:35:33 +0100, Stephan von Krawczynski
<skraw@ithnet.com> wrote:

>> Besides, why shouldn't we help the drive firmware by 
>> - writing the data only in erase-block sizes
>> - trying to write blocks that are smaller than the erase-block in a way
>> that won't cross the erase-block boundary
> 
> Because if the designing engineer of a good SSD controller wasn't able
> to cope with that he will have no chance to design a second one.

You seem to be confusing quality of implementation with theoretical
possibility.

>> This will not only increase the life of the SSD but also increase its 
>> performance.
> 
> TRIM: maybe yes. Rest: pure handwaving.
> 
>> [...]
>> > > And your guess is that intel engineers had no glue when designing
>> > > the XE including its controller? You think they did not know what
>> > > you and me know and therefore pray every day that some smart fs
>> > > designer falls from heaven and saves their product from dying in
>> > > between? Really?
>> > 
>> > I am saying that there are problems that CANNOT be solved on the disk
>> > firmware level. Some problems HAVE to be addressed higher up the
>> > stack.
>> 
>> Exactly, you can't assume that the SSDs firmware understands any and
>> all file system layouts, especially if they are on fragmented LVM or
>> other logical volume manager partitions.
> 
> Hopefully the firmware understands exactly no fs layout at all. That
> would be braindead. Instead it should understand how to arrange incoming
> and outgoing data in a way that its own technical requirements are met
> as perfect as possible. This is no spinning disk, it is completely
> irrelevant what the data layout looks like as long as the controller
> finds its way through and copes best with read/write/erase cycles. It
> may well use additional RAM for caching and data reordering.
> Do you really believe ascending block numbers are placed in ascending
> addresses inside the disk (as an example)? Why should they? What does
> that mean for fs block ordering? If you don't know anyway what a
> controller does to your data ordering, how do you want to help it with
> its job?
> Please accept that we are _not_ talking about trivial flash mem here or
> pseudo-SSDs consisting of sd cards. The market has already evolved
> better products. The dinosaurs are extincted even if some are still
> looking alive.

I am assuming that you are being deliberately facetious here (the
alternative is less kind). The simple fact is that you cannot come up with
some magical data (re)ordering method that nullifies the problems that
common use-cases pose for flash-based media.

For example - you have a disk that has had all its addressable blocks
tainted (written to at least once). A new write comes in - what do you do
with it? Worse, a write comes in spanning two erase blocks as a
consequence of the data re-alignment in the firmware. You have no choice
but to wipe them both and re-write the data. You'd be better off not doing
the magic and assuming that the FS is sensibly aligned.
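
To put rough numbers on the boundary-crossing case, here is a minimal
sketch. The 128KB erase block size is my assumption purely for
illustration (real devices vary and rarely advertise it), and the helper
name erase_blocks_touched is made up for this example. It only counts how
many erase blocks a single write overlaps; it does not model what any
particular firmware actually does.

```python
def erase_blocks_touched(offset, length, erase_block_size):
    """Return the range of erase-block indices that a write overlaps.

    offset and length are in bytes, relative to the start of the device.
    """
    if length <= 0:
        return range(0)
    first = offset // erase_block_size
    last = (offset + length - 1) // erase_block_size
    return range(first, last + 1)


ERASE_BLOCK = 128 * 1024  # assumed erase block size, for illustration only
FS_BLOCK = 4 * 1024       # the common 4KB FS block size

# A 4KB write that starts on an erase-block boundary stays in one block.
aligned = erase_blocks_touched(0, FS_BLOCK, ERASE_BLOCK)

# The same 4KB write shifted to start 2KB before a boundary straddles two
# erase blocks, so both would have to be wiped and re-written.
straddling = erase_blocks_touched(ERASE_BLOCK - 2048, FS_BLOCK, ERASE_BLOCK)

print(len(aligned), len(straddling))
```

On a fully tainted disk the second write costs two erase cycles where the
first costs one, for the same amount of data.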

Having a large chunk of spare non-addressable space for this doesn't
necessarily help you, either, unless it is about the same size as the
addressable space (worst-case scenario; if you accept that the vast
majority of FS-es use 4KB block sizes, you can cut a corner there by a
factor of 8). All of that adds to cost - flash is still expensive.

The bottom line is that you _cannot_ solve wear-leveling completely just
in firmware. There is no doubt you can get some of the way there, but it is
mathematically impossible to solve completely without intervention from
further up the stack. Since some black-box firmware optimizations may quite
conceivably make the wear problem worse, it makes perfect sense for the
firmware to assume, optimistically, that the FS is trying to help - that is
unlikely to make things worse and may well make things a lot better.

Gordan
