From: Gordan Bobic <gordan@bobich.net>
To: cwillu <cwillu@cwillu.com>
Cc: sander@humilis.net, jarktasaa@gmail.com, linux-btrfs@vger.kernel.org
Subject: Re: SSD optimizations
Date: Mon, 13 Dec 2010 16:48:19 +0000
Message-ID: <4D064E53.6050207@bobich.net>
In-Reply-To: <AANLkTinpD+CXsHMB10DmXm-jF5a_rCUJ-tkeVthy4FYV@mail.gmail.com>

On 13/12/2010 15:17, cwillu wrote:

>>>>> In a few weeks parts for my new computer will be arriving. The storage
>>>>> will be a 128GB SSD. A few weeks after that I will order three large
>>>>> disks for a RAID array. I understand that BTRFS RAID 5 support will be
>>>>> available shortly. What is the best possible way for me to get the
>>>>> highest performance out of this setup. I know of the option to optimize
>>>>> for SSD's
>>>>
>>>> BTRFS is hardly the best option for SSDs. I typically use ext4
>>>> without a journal on SSDs, or ext2 if that is not available.
>>>> Journalling causes more writes to hit the disk, which wears out
>>>> flash faster. Plus, SSDs typically have much slower writes than
>>>> reads, so avoiding writes is a good thing.
>>>
>>> Gordan, this you wrote is so wrong I don't even know where to begin.
>>>
>>> You'd better google a bit on the subject (ssd, and btrfs on ssd) as much
>>> is written about it already.
>>
>> I suggest you back your opinion up with some hard data before making such
>> statements. Here's a quick test - make an ext2 fs and a btrfs on two similar
>> disk partitions (any disk, for the sake of the experiment it doesn't have to
>> be an ssd), then check vmstat -d to get a baseline. Then put the kernel
>> sources on each of them, do a full build, then make clean and check
>> vmstat -d again. See how many writes (sectors) hit
>> the disk with ext2 and how many with btrfs. You'll find that there were many
>> more writes with BTRFS. You can't go faster when doing more. Journaling is
>> expensive.
>
> Of course.  But that applies to rotating media as well (where the
> seeks involved hurt much more), and has little if anything to do with
> why you would use btrfs instead of ext2.

Indeed - btrfs is about features, most specifically the checksumming that
allows smart recovery from disk media failure. But on flash, write
volumes are something that shouldn't be ignored.
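
For anyone who wants to put numbers on the ext2 vs btrfs test quoted above, here is a minimal sketch of the before/after comparison. It reads the cumulative "sectors written" counter from /proc/diskstats on Linux, which as far as I know is the same source vmstat -d reports from. The function names and the device name in the usage note are my own placeholders, not anything from the thread.

```python
from pathlib import Path


def parse_sectors_written(diskstats_text: str) -> dict[str, int]:
    """Map device name -> cumulative sectors written, from /proc/diskstats text."""
    written = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or unexpectedly short lines
        # Field order: major, minor, name, reads completed, reads merged,
        # sectors read, ms reading, writes completed, writes merged,
        # sectors written, ...
        written[fields[2]] = int(fields[9])
    return written


def sectors_written_delta(before: str, after: str, device: str) -> int:
    """Sectors written to `device` between two /proc/diskstats snapshots."""
    return parse_sectors_written(after)[device] - parse_sectors_written(before)[device]


def snapshot() -> str:
    """Read the kernel's current per-device I/O counters (Linux only)."""
    return Path("/proc/diskstats").read_text()
```

Take one snapshot() before unpacking and building the kernel on a partition and one after "make clean", then call sectors_written_delta(before, after, "sdb1") with whatever the partition is actually called ("sdb1" is only a placeholder). Repeat for the other filesystem and compare the two deltas. The counters are in 512-byte sectors.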

> Good ssd drives (by which I mean anything but consumer flash as it
> exists on sd cards and usb sticks) have very good wear leveling, good
> enough that you could overwrite the same logical sector billions of
> times before you'd experience any failure due to wear.

It comes down to volumes even in the best case scenario. A _very_ good
SSD (e.g. Intel) might get write amplification down to about 1.2:1, but
more typical figures are in the region of 10-20:1. Every write that can
be avoided should be avoided.
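
To make the ratios concrete, a small worked example: write amplification is the ratio of bytes physically written to flash to bytes the host asked to write. The 10 GiB host figure below is an arbitrary illustration, not a measurement, and the function name is mine.

```python
GIB = 1024 ** 3


def flash_bytes_written(host_bytes: float, write_amplification: float) -> float:
    """Bytes physically written to flash for a given volume of host writes."""
    return host_bytes * write_amplification


if __name__ == "__main__":
    host = 10 * GIB  # arbitrary example volume of host writes
    for wa in (1.2, 10.0, 20.0):  # the ratios mentioned above
        flash_gib = flash_bytes_written(host, wa) / GIB
        print(f"WA {wa:>4}:1 -> {flash_gib:.0f} GiB written to flash")
```

The point is only that the same host workload costs roughly an order of magnitude more flash wear at 10-20:1 than at 1.2:1, so any writes the filesystem adds get multiplied by the same factor.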

> The issues
> with cheaper ssd drives (which I distinguish from things like sd
> cards) are uniformly performance degradation due to crappy garbage
> collection and lack of trim support to compensate.  A journal is _not_
> a problem here.

The journal doesn't help. It can cause more than a 50% overhead on
metadata-heavy operations.

> On crappy flash, yes, you want to avoid a journal, mainly because the
> write leveling for a given sector only occurs over a fixed small
> number of erase blocks, resulting in a filesystem that you can burn
> out quite easily - I have a small pile of sd cards on my desk that I
> sent to such a fate.  Even here there is reason to use btrfs.  The
> journaling performed is much less strenuous than ext3/4:  it's
> basically just a version stamp, as opposed to actually journaling the
> metadata involved.  The actual metadata writes, being copy-on-write,
> provide pretty much the best case for crappy flash, as cow inherently
> wear-levels over the entire device (ssd_spread).  To say nothing of
> checksums and duplicated metadata, allowing you to actually determine
> if you're running into corrupted metadata, and often recover from it
> transparently.  Ext2's behavior in this respect is less than ideal.

I'm not disputing that, but the OP was talking about using the SSD as a
cache for a slower disk subsystem. That is likely to wear out the SSD
pretty quickly purely by volume of writes, regardless of how good the
wear leveling is. That may be fine on a setup where the SSD is treated
as a disposable throw-away cache item that doesn't lose you data when it
goes wrong, but what was being discussed isn't an expensive
enterprise-grade setup that behaves that way.
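
A back-of-the-envelope sketch of why volume alone matters for a cache device. It assumes ideal wear leveling (every cell wears evenly) and takes the rated program/erase cycle count as a hard limit. The cycle count and daily write volume in the example are hypothetical placeholders; only the 128GB capacity comes from the OP.

```python
def days_until_worn_out(capacity_gb: float,
                        pe_cycles: float,
                        host_gb_per_day: float,
                        write_amplification: float) -> float:
    """Days until the rated erase cycles are used up, assuming perfectly
    even wear across the whole device."""
    total_flash_writes_gb = capacity_gb * pe_cycles        # endurance budget
    flash_gb_per_day = host_gb_per_day * write_amplification
    return total_flash_writes_gb / flash_gb_per_day


if __name__ == "__main__":
    # Hypothetical inputs: 5000 P/E cycles and 50 GB/day of cache traffic.
    for wa in (1.2, 10.0, 20.0):
        days = days_until_worn_out(128, 5000, 50, wa)
        print(f"WA {wa:>4}:1 -> about {days:.0f} days")
```

Even with perfect wear leveling, the estimate scales inversely with both the daily write volume and the write amplification, which is the reason a busy cache in front of a RAID array is a harsh workload for a consumer SSD.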

Gordan