From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Greg Freemyer <greg.freemyer@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
	Chris Worley <worleys@gmail.com>, "Majed B." <majedb@gmail.com>,
	Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: Intel Updates SSDs, Supports TRIM, Faster Writes
Date: Tue, 10 Nov 2009 17:56:26 -0500
Message-ID: <yq1ocnam1k5.fsf@sermon.lab.mkp.net>
In-Reply-To: <87f94c370911101301v4b71ce74hbd4ebd20e7ce2419@mail.gmail.com> (Greg Freemyer's message of "Tue, 10 Nov 2009 16:01:44 -0500")

>>>>> "Greg" == Greg Freemyer <greg.freemyer@gmail.com> writes:

Greg> I'm not sure where it ended up, but the big SSD / discard
Greg> discussion of a few months ago talked about 3 kinds of solutions,
Greg> and I thought the plan was to support all 3.

We don't design for the past.


Greg> 1) optimization 1 - A white-listed instant discard feature.  In
Greg>    this methodology, the filesystems would immediately send
Greg>    discard calls down to the block layer, which would send them on
Greg>    down the block stack to the physical devices with very minimal
Greg>    buffering.

There's no whitelist.  That's just how it works.

Yes, there were a few crappy devices out there.  Windows 7 issuing TRIM
commands in realtime made them instantly obsolete.  If future devices
suck with Windows 7 nobody will buy them.


Greg> 2) optimization 2 - The block layer would accept those small
Greg>    discards, but accumulate them for a short period.  (less than a
Greg>    second was my impression).  Then coalesce them into larger
Greg>    discards and send them down the block stack and eventually to
Greg>    the physical device.

SSDs are special in that they actually track map state on a
per-logical-block basis.  Other thinly provisioned devices track space
in units ranging from 16, 32 or 64KB up to megabytes.

It's up to each block device to track the map space.  Most arrays
simply ignore the portions of a request that are not aligned to, and a
whole multiple of, their internal allocation unit.

The same applies to MD.  IOW, MD would only unmap the portions of the
discard request that constitute entire stripes.  No state keeping
required.
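
The clipping described above can be sketched in a few lines.  This is a
minimal illustration in Python, not MD or array firmware code: the
function name clip_discard and its parameters are hypothetical, and the
offsets are in whatever unit the device uses (sectors, for instance).

```python
def clip_discard(start, length, granularity, alignment=0):
    """Shrink a discard range to the whole allocation units it covers.

    Hypothetical helper: unit boundaries are assumed to sit at
    alignment + k * granularity.  Returns (start, length) of the part
    that can be unmapped, or None if no whole unit is covered.
    """
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    end = start + length
    # Round the start up to the next unit boundary.
    first = -(-(start - alignment) // granularity) * granularity + alignment
    # Round the end down to the previous unit boundary.
    last = ((end - alignment) // granularity) * granularity + alignment
    if last <= first:
        return None  # request is smaller than one aligned unit
    return first, last - first


if __name__ == "__main__":
    # A request for [10, 110) with 32-unit stripes keeps only [32, 96).
    print(clip_discard(10, 100, 32))
    # A request that never spans a whole stripe is dropped entirely.
    print(clip_discard(33, 20, 32))
```

Because the unaligned head and tail are simply dropped, the device
needs no record of partially discarded units.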

Jens just queued my patch which allows block devices to communicate
their unmap granularity and alignment to the filesystems.  This means we
can potentially use this to influence filesystem allocators.  For SCSI
arrays these values are queried and passed up the stack.  MD can choose
to manually set the granularity to its stripe size.
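
As a rough illustration of the value MD might report, the sketch below
computes a stripe size in Python.  It assumes that "stripe size" means
one full data stripe (chunk size times the number of data disks); that
reading, the function name full_stripe_bytes and the example geometry
are assumptions, not details taken from the patch.

```python
def full_stripe_bytes(chunk_bytes, raid_disks, parity_disks):
    """Bytes of data in one full stripe (hypothetical helper).

    Assumes every member disk holds one chunk per stripe and that
    parity_disks of those chunks carry parity rather than data.
    """
    data_disks = raid_disks - parity_disks
    if chunk_bytes <= 0 or data_disks <= 0:
        raise ValueError("need a positive chunk size and at least one data disk")
    return chunk_bytes * data_disks


if __name__ == "__main__":
    # Example geometry (assumed): 4-disk RAID5 with 64 KiB chunks.
    print(full_stripe_bytes(64 * 1024, raid_disks=4, parity_disks=1))  # 196608
```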


Greg> 3) optimization 3 - a background freespace scanner would run from
Greg> time to time that scanned a filesystem for free blocks and send a
Greg> discard / trim command down to the device.  This is what Mark Lord
Greg> was working on.  His solution was primarily in user space and was
Greg> controlled by cron.

I think that's a fine approach for legacy devices.  But as I said I
think Windows 7 will root out all devices with poor TRIM performance
pretty quickly.

-- 
Martin K. Petersen	Oracle Linux Engineering

