linux-btrfs.vger.kernel.org archive mirror
From: Gordan Bobic <gordan@bobich.net>
To: cwillu <cwillu@cwillu.com>
Cc: sander@humilis.net, jarktasaa@gmail.com, linux-btrfs@vger.kernel.org
Subject: Re: SSD optimizations
Date: Mon, 13 Dec 2010 16:48:19 +0000
Message-ID: <4D064E53.6050207@bobich.net>
In-Reply-To: <AANLkTinpD+CXsHMB10DmXm-jF5a_rCUJ-tkeVthy4FYV@mail.gmail.com>

On 13/12/2010 15:17, cwillu wrote:

>>>>> In a few weeks parts for my new computer will be arriving. The storage
>>>>> will be a 128GB SSD. A few weeks after that I will order three large
>>>>> disks for a RAID array. I understand that BTRFS RAID 5 support will be
>>>>> available shortly. What is the best possible way for me to get the
>>>>> highest performance out of this setup. I know of the option to optimize
>>>>> for SSD's
>>>>
>>>> BTRFS is hardly the best option for SSDs. I typically use ext4
>>>> without a journal on SSDs, or ext2 if that is not available.
>>>> Journalling causes more writes to hit the disk, which wears out
>>>> flash faster. Plus, SSDs typically have much slower writes than
>>>> reads, so avoiding writes is a good thing.
>>>
>>> Gordan, this you wrote is so wrong I don't even know where to begin.
>>>
>>> You'd better google a bit on the subject (ssd, and btrfs on ssd) as much
>>> is written about it already.
>>
>> I suggest you back your opinion up with some hard data before making such
>> statements. Here's a quick test - make an ext2 fs and a btrfs on two similar
>> disk partitions (any disk, for the sake of the experiment it doesn't have to
>> be an ssd), then check vmstat -d to get a base line. Then put the kernel
>> sources on each it, do a full build, then make clean and check vmstat -d
>> again. Check the vmstat -d output again. See how many writes (sectors) hit
>> the disk with ext2 and how many with btrfs. You'll find that there were many
>> more writes with BTRFS. You can't go faster when doing more. Journaling is
>> expensive.
>
> Of course.  But that applies to rotating media as well (where the
> seeks involved hurt much more), and has little if anything to do with
> why you would use btrfs instead of ext2.

Indeed - btrfs is about features, most specifically the checksumming that
allows smart recovery from disk media failure. But on flash, write
volumes are something that shouldn't be ignored.
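
For anyone wanting to repeat the vmstat -d comparison quoted above, here
is a minimal sketch that reads the same per-device counters from
/proc/diskstats. It assumes the standard Linux layout of that file, where
the tenth whitespace-separated field is sectors written, counted in
512-byte units. The function names and the device names in the usage
note are hypothetical.

```python
def read_diskstats(path="/proc/diskstats"):
    """Return the raw text of the kernel's per-device I/O counters."""
    with open(path) as f:
        return f.read()


def sectors_written(diskstats_text, device):
    """Return the cumulative sectors-written counter for one block device."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major, minor, name, then the I/O counters. The seventh
        # counter (index 9 overall) is sectors written, in 512-byte units.
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    raise ValueError(f"device {device!r} not found")


def bytes_written_between(before_text, after_text, device):
    """Bytes written to the device between two diskstats snapshots."""
    delta = sectors_written(after_text, device) - sectors_written(before_text, device)
    return delta * 512
```

Take one snapshot with read_diskstats() before the kernel build and one
after "make clean", then call bytes_written_between() once for the ext2
partition and once for the btrfs partition (say "sdb1" and "sdb2") and
compare the two totals.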

> Good ssd drives (by which I mean anything but consumer flash as it
> exists on sd cards and usb sticks) have very good wear leveling, good
> enough that you could overwrite the same logical sector billions of
> times before you'd experience any failure due to wear.

It comes down to volumes even in the best case scenario. A _very_ good
SSD (e.g. Intel) might get write amplification down to about 1.2:1, but
more typical figures are in the region of 10-20:1. Every write that can
be avoided, should be avoided.
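
To put those ratios in concrete terms, here is a rough sketch of the
arithmetic: the flash absorbs the host's writes multiplied by the write
amplification factor. The 10 GiB daily write volume is an arbitrary
illustrative figure, not a measurement, and the function name is
hypothetical.

```python
GIB = 1024 ** 3


def flash_bytes_written(host_bytes, write_amplification):
    """Bytes physically written to flash for a given volume of host writes."""
    if write_amplification < 1:
        raise ValueError("write amplification below 1:1 is not modelled here")
    return host_bytes * write_amplification


if __name__ == "__main__":
    host_bytes_per_day = 10 * GIB  # assumed workload, for illustration only
    # The three ratios mentioned above: best case, and the typical range.
    for factor in (1.2, 10, 20):
        flash = flash_bytes_written(host_bytes_per_day, factor)
        print(f"{factor:>4}:1 -> {flash / GIB:.1f} GiB written to flash per day")
```

The same host workload costs roughly an order of magnitude more flash
wear at 10-20:1 than at 1.2:1, so any write avoided at the filesystem
level is saved many times over at the flash level.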

> The issues
> with cheaper ssd drives (which I distinguish from things like sd
> cards) are uniformly performance degradation due to crappy garbage
> collection and lack of trim support to compensate.  A journal is _not_
> a problem here.

The journal doesn't help. It can cause more than a 50% overhead on
metadata-heavy operations.

> On crappy flash, yes, you want to avoid a journal, mainly because the
> write leveling for a given sector only occurs over a fixed small
> number of erase blocks, resulting in a filesystem that you can burn
> out quite easily - I have a small pile of sd cards on my desk that I
> sent to such a fate.  Even here there is reason to use btrfs.  The
> journaling performed is much less strenuous than ext3/4:  it's
> basically just a version stamp, as opposed to actually journaling the
> metadata involved.  The actual metadata writes, being copy-on-write,
> provide pretty much the best case for crappy flash, as cow inherently
> wear-levels over the entire device (ssd_spread).  To say nothing of
> checksums and duplicated metadata, allowing you to actually determine
> if you're running into corrupted metadata, and often recover from it
> transparently.  Ext2's behavior in this respect is less than ideal.

I'm not disputing that, but the OP was talking about using the SSD as a
cache for a slower disk subsystem. That is likely to wear out the SSD
pretty quickly purely by volume of writes, regardless of how good the
wear leveling is. That may be fine on a setup where the SSD is treated
as a disposable throw-away cache item that doesn't lose you data when it
goes wrong, but what was being discussed isn't an expensive
enterprise-grade setup that behaves that way.

Gordan

