public inbox for linux-kernel@vger.kernel.org
* file system for solid state disks
@ 2007-08-23  5:01 Richard Ballantyne
  2007-08-23  5:52 ` Jan Engelhardt
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Ballantyne @ 2007-08-23  5:01 UTC (permalink / raw)
  To: linux-kernel

What file system that is already in the linux kernel do people recommend
I use for my laptop that now contains a solid state disk?

I appreciate your feedback.

Thank you,
Richard Ballantyne


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: file system for solid state disks
  2007-08-23  5:01 Richard Ballantyne
@ 2007-08-23  5:52 ` Jan Engelhardt
  2007-08-23 10:26   ` Theodore Tso
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Engelhardt @ 2007-08-23  5:52 UTC (permalink / raw)
  To: Richard Ballantyne; +Cc: linux-kernel


On Aug 23 2007 01:01, Richard Ballantyne wrote:
>
>What file system that is already in the linux kernel do people recommend
>I use for my laptop that now contains a solid state disk?

If I had to choose, the list of options seems to be:

- logfs
  [unmerged]

- UBI layer with any fs you like
  [just a guess]

- UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
  [does not support ACLs/quotas]



	Jan
-- 


* Re: file system for solid state disks
@ 2007-08-23  8:55 Daniel J Blueman
  2007-08-23 12:45 ` James Courtier-Dutton
  2007-09-05 12:34 ` Denys Vlasenko
  0 siblings, 2 replies; 17+ messages in thread
From: Daniel J Blueman @ 2007-08-23  8:55 UTC (permalink / raw)
  To: Jan Engelhardt, Richard Ballantyne; +Cc: Linux Kernel

On 23 Aug, 07:00, Jan Engelhardt <jengelh@computergmbh.de> wrote:
> On Aug 23 2007 01:01, Richard Ballantyne wrote:
> >What file system that is already in the linux kernel do people recommend
> >I use for my laptop that now contains a solid state disk?
>
> If I had to choose, the list of options seems to be:
>
> - logfs
>   [unmerged]
>
> - UBI layer with any fs you like
>   [just a guess]
>
> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
>   [does not support ACLs/quotas]

Isn't it the case that with modern rotational wear-levelling, re-writing
hot blocks many times is not an issue, as they are internally moved
around anyway? So a journalled filesystem such as ext3 is still a good
choice (with robustness and maturity in mind). Given the lack of write
buffering, though, perhaps a wandering-log (journal) filesystem would be
more suitable? I use ext3 on my >35MB/s compact flash filesystem.

I can see an advantage in selecting a filesystem of lower complexity,
one that skips the extra spatial-optimisation machinery, but that
machinery does buy other efficiency (eg the Orlov allocator reducing
fragmentation, thus less overhead), right?

Also, it would be natural to employ 'elevator=none', but perhaps there
is a small advantage in holding a group of flash blocks 'ready' (like
SDRAM pages being selected on-chip for lower bus access latency) -
however this no longer holds when logical->physical remapping is
performed, so perhaps it's better without an elevator.
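
For reference, the experiment above can be run per device at runtime
via sysfs; this sketch assumes the flash device shows up as sda, and
note that the scheduler mainline actually provides under this role is
called "noop" (there is no "none" in 2.6-era kernels):

```shell
# Show the available schedulers; the active one is bracketed.
cat /sys/block/sda/queue/scheduler
# Switch this device to the no-sorting elevator.
echo noop > /sys/block/sda/queue/scheduler
# Or globally at boot: append "elevator=noop" to the kernel command line.
```

Being a sysfs/boot-parameter configuration fragment, this needs a real
block device (and root) to try.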

Clearly, benchmarks speak...but perhaps it would make sense to have
libata disable the elevator for the (compact) flash block device?

Daniel
-- 
Daniel J Blueman


* Re: file system for solid state disks
  2007-08-23  5:52 ` Jan Engelhardt
@ 2007-08-23 10:26   ` Theodore Tso
  2007-08-23 11:25     ` Jens Axboe
  0 siblings, 1 reply; 17+ messages in thread
From: Theodore Tso @ 2007-08-23 10:26 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Richard Ballantyne, linux-kernel

On Thu, Aug 23, 2007 at 07:52:46AM +0200, Jan Engelhardt wrote:
> 
> On Aug 23 2007 01:01, Richard Ballantyne wrote:
> >
> >What file system that is already in the linux kernel do people recommend
> >I use for my laptop that now contains a solid state disk?
> 
> If I had to choose, the list of options seems to be:
> 
> - logfs
>   [unmerged]
> 
> - UBI layer with any fs you like
>   [just a guess]

The question is whether the solid state disk gives you access to the
raw flash, or whether you have to go through the flash translation
layer because it's trying to look (exclusively) like a PATA or SATA
drive.  There are some SSDs whose form factor and interfaces make them
a drop-in replacement for a laptop hard drive, and a number of the
newer laptops that support SSDs seem to use these because (a) the
vendors don't have to radically change their designs, (b) they can
stay compatible with Windows, and (c) users can purchase the laptop
with either a traditional hard drive or an SSD as an option, since at
the moment SSDs are far more expensive than disks.

So if you can't get access to the raw flash layer, then what you're
probably going to be looking at is a traditional block-oriented
filesystem, such as ext3, although there are clearly some things that
could be done such as disabling the elevator.   
      
						- Ted


* Re: file system for solid state disks
  2007-08-23 10:26   ` Theodore Tso
@ 2007-08-23 11:25     ` Jens Axboe
  2007-08-29 17:36       ` Bill Davidsen
  0 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2007-08-23 11:25 UTC (permalink / raw)
  To: Theodore Tso, Jan Engelhardt, Richard Ballantyne, linux-kernel

On Thu, Aug 23 2007, Theodore Tso wrote:
> On Thu, Aug 23, 2007 at 07:52:46AM +0200, Jan Engelhardt wrote:
> > 
> > On Aug 23 2007 01:01, Richard Ballantyne wrote:
> > >
> > >What file system that is already in the linux kernel do people recommend
> > >I use for my laptop that now contains a solid state disk?
> > 
> > If I had to choose, the list of options seems to be:
> > 
> > - logfs
> >   [unmerged]
> > 
> > - UBI layer with any fs you like
> >   [just a guess]
> 
> The question is whether the solid state disk gives you access to the
> raw flash, or whether you have to go through the flash translation
> layer because it's trying to look (exclusively) like a PATA or SATA
> drive.  There are some SSD's that have a form factor and interfaces
> that make them a drop-in replacement for a laptop hard drive, and a
> number of the newer laptops that are supporting SSD's seem to be these
> because (a) they don't have to radically change their design, (b) so
> they can be compatible with Windows, and (c) so that users can
> purchase the laptop either with a traditional hard drive or a SSD's as
> an option, since at the moment SSD's are far more expensive than
> disks.
> 
> So if you can't get access to the raw flash layer, then what you're
> probably going to be looking at is a traditional block-oriented
> filesystem, such as ext3, although there are clearly some things that
> could be done such as disabling the elevator.   

It's more complicated than that, I'd say. If the job of the elevator
were purely to sort requests based on sector criteria, then I'd agree
that noop was the best way to go. But the elevator also arbitrates
access to the disk between processes. Even if you don't pay a seek
penalty, you'd still rather get your sync reads in without having to
wait for that huge writer that just queued hundreds of megabytes of io
in front of you (and will have done so behind your read, making you
wait again for a subsequent read).

My plan in this area is to add a simple storage profile and attach it to
the queue. Just start simple, allow a device driver to inform the block
layer that this device has no seek penalty. Then the io scheduler can
make more informed decisions on what to do - eg for ssd, sector
proximity may not have much meaning, so we should not take that into
account.

-- 
Jens Axboe



* Re: file system for solid state disks
  2007-08-23  8:55 file system for solid state disks Daniel J Blueman
@ 2007-08-23 12:45 ` James Courtier-Dutton
  2007-08-23 12:56   ` Daniel J Blueman
  2007-09-05 12:34 ` Denys Vlasenko
  1 sibling, 1 reply; 17+ messages in thread
From: James Courtier-Dutton @ 2007-08-23 12:45 UTC (permalink / raw)
  To: Daniel J Blueman; +Cc: Jan Engelhardt, Richard Ballantyne, Linux Kernel

Daniel J Blueman wrote:
> On 23 Aug, 07:00, Jan Engelhardt <jengelh@computergmbh.de> wrote:
>   
>> On Aug 23 2007 01:01, Richard Ballantyne wrote:
>>     
>>> What file system that is already in the linux kernel do people recommend
>>> I use for my laptop that now contains a solid state disk?
>>>       
>> If I had to choose, the list of options seems to be:
>>
>> - logfs
>>   [unmerged]
>>
>> - UBI layer with any fs you like
>>   [just a guess]
>>
>> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
>>   [does not support ACLs/quotas]
>>     
>
> Isn't it that with modern rotational wear-levelling, re-writing hot
> blocks many times is not an issue, as they are internally moved around
> anyway? So, using a journalled filesystem such as ext3 is still good
> (robustness and maturity in mind). Due to lack of write buffering,
> perhaps a wandering log (journal) filesystem would be more suitable
> though? I use ext3 on my >35MB/s compact flash filesystem.
>
> I can see there being advantage in selecting a filesystem which is
> lower complexity due to no additional spatial optimisation complexity,
> but those advantages do buy other efficiency (eg the Orlov allocator
> reducing fragmentation, thus less overhead), right?
>
> Also, it would be natural to employ 'elevator=none', but perhaps there
> is a small advantage in holding a group of flash blocks 'ready' (like
> SDRAM pages being selected on-chip for lower bus access latency) -
> however this no longer holds when logical->physical remapping is
> performed, so perhaps it's better without an elevator.
>
> Clearly, benchmarks speak...but perhaps it would make sense to have
> libata disable the elevator for the (compact) flash block device?
>
> Daniel
>   

Also, sector read-ahead will actually hurt performance on flash,
instead of speeding things up as it does with a spinning disc.
For example, a request might read 128 sectors instead of the one 
requested at little or no extra cost on a spinning disc.
On flash, reading 128 sectors instead of the one requested has a 
noticeable cost.
Spinning discs have high seek latency, low serial sector read latency 
and roughly equal read and write latency; flash has low seek latency,
high serial sector read latency and longer write than read times.

James



* Re: file system for solid state disks
  2007-08-23 12:45 ` James Courtier-Dutton
@ 2007-08-23 12:56   ` Daniel J Blueman
       [not found]     ` <20070823134359.GB5576@mail.ustc.edu.cn>
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel J Blueman @ 2007-08-23 12:56 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Jan Engelhardt, Richard Ballantyne, Linux Kernel,
	James Courtier-Dutton

Hi Fengguang,

On 23/08/07, James Courtier-Dutton <James@superbug.co.uk> wrote:
> Daniel J Blueman wrote:
> > On 23 Aug, 07:00, Jan Engelhardt <jengelh@computergmbh.de> wrote:
> >> On Aug 23 2007 01:01, Richard Ballantyne wrote:
> >>
> >>> What file system that is already in the linux kernel do people recommend
> >>> I use for my laptop that now contains a solid state disk?
> >>>
> >> If I had to choose, the list of options seems to be:
> >>
> >> - logfs
> >>   [unmerged]
> >>
> >> - UBI layer with any fs you like
> >>   [just a guess]
> >>
> >> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
> >>   [does not support ACLs/quotas]
> >
> > Isn't it that with modern rotational wear-levelling, re-writing hot
> > blocks many times is not an issue, as they are internally moved around
> > anyway? So, using a journalled filesystem such as ext3 is still good
> > (robustness and maturity in mind). Due to lack of write buffering,
> > perhaps a wandering log (journal) filesystem would be more suitable
> > though? I use ext3 on my >35MB/s compact flash filesystem.
> >
> > I can see there being advantage in selecting a filesystem which is
> > lower complexity due to no additional spatial optimisation complexity,
> > but those advantages do buy other efficiency (eg the Orlov allocator
> > reducing fragmentation, thus less overhead), right?
> >
> > Also, it would be natural to employ 'elevator=none', but perhaps there
> > is a small advantage in holding a group of flash blocks 'ready' (like
> > SDRAM pages being selected on-chip for lower bus access latency) -
> > however this no longer holds when logical->physical remapping is
> > performed, so perhaps it's better without an elevator.
> >
> > Clearly, benchmarks speak...but perhaps it would make sense to have
> > libata disable the elevator for the (compact) flash block device?
> >
> > Daniel
>
> Also, sector read ahead will actually have a performance impact on
> Flash, instead of speeding things up with a spinning disc.
> For example, a request might read 128 sectors instead of the one
> requested at little or no extra performance impact for a spinning disc.
> For flash, reading 128 sectors instead of the one requested will have a
> noticeable performance impact.
> Spinning discs have high seek latency, low serial sector read latency
> and equal latency for read/write
> Flash has low seek latency, high serial sector read latency and longer
> write than read times.

I was having problems invoking the readahead logic on my compact flash
rootfs (ext3) while tweaking the RA with 'hdparm -a' from 8 to 1024
blocks and running some benchmarks (I forget which).

Fengguang, what is your favourite benchmark for finding differences in
readahead values (running on eg ext3 on a flashdisk), with the current
RA semantics in mainline kernels (eg 2.6.23-rc3)?

Thanks,
  Daniel
-- 
Daniel J Blueman


* Re: file system for solid state disks
       [not found]     ` <20070823134359.GB5576@mail.ustc.edu.cn>
@ 2007-08-23 13:43       ` Fengguang Wu
  2007-08-23 15:09         ` Daniel J Blueman
  0 siblings, 1 reply; 17+ messages in thread
From: Fengguang Wu @ 2007-08-23 13:43 UTC (permalink / raw)
  To: Daniel J Blueman
  Cc: Jan Engelhardt, Richard Ballantyne, Linux Kernel,
	James Courtier-Dutton

On Thu, Aug 23, 2007 at 01:56:17PM +0100, Daniel J Blueman wrote:
> Hi Fengguang,
> 
> On 23/08/07, James Courtier-Dutton <James@superbug.co.uk> wrote:
> > Daniel J Blueman wrote:
> > > On 23 Aug, 07:00, Jan Engelhardt <jengelh@computergmbh.de> wrote:
> > >> On Aug 23 2007 01:01, Richard Ballantyne wrote:
> > >>
> > >>> What file system that is already in the linux kernel do people recommend
> > >>> I use for my laptop that now contains a solid state disk?
> > >>>
> > >> If I had to choose, the list of options seems to be:
> > >>
> > >> - logfs
> > >>   [unmerged]
> > >>
> > >> - UBI layer with any fs you like
> > >>   [just a guess]
> > >>
> > >> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
> > >>   [does not support ACLs/quotas]
> > >
> > > Isn't it that with modern rotational wear-levelling, re-writing hot
> > > blocks many times is not an issue, as they are internally moved around
> > > anyway? So, using a journalled filesystem such as ext3 is still good
> > > (robustness and maturity in mind). Due to lack of write buffering,
> > > perhaps a wandering log (journal) filesystem would be more suitable
> > > though? I use ext3 on my >35MB/s compact flash filesystem.
> > >
> > > I can see there being advantage in selecting a filesystem which is
> > > lower complexity due to no additional spatial optimisation complexity,
> > > but those advantages do buy other efficiency (eg the Orlov allocator
> > > reducing fragmentation, thus less overhead), right?
> > >
> > > Also, it would be natural to employ 'elevator=none', but perhaps there
> > > is a small advantage in holding a group of flash blocks 'ready' (like
> > > SDRAM pages being selected on-chip for lower bus access latency) -
> > > however this no longer holds when logical->physical remapping is
> > > performed, so perhaps it's better without an elevator.
> > >
> > > Clearly, benchmarks speak...but perhaps it would make sense to have
> > > libata disable the elevator for the (compact) flash block device?
> > >
> > > Daniel
> >
> > Also, sector read ahead will actually have a performance impact on
> > Flash, instead of speeding things up with a spinning disc.
> > For example, a request might read 128 sectors instead of the one
> > requested at little or no extra performance impact for a spinning disc.
> > For flash, reading 128 sectors instead of the one requested will have a
> > noticeable performance impact.
> > Spinning discs have high seek latency, low serial sector read latency
> > and equal latency for read/write
> > Flash has low seek latency, high serial sector read latency and longer
> > write than read times.

A little bit of readahead will be helpful for flash memory.  Its latency
is low, but certainly not zero. Asynchronous readahead will help to hide
that latency.

> I was having problem invoking the readahead logic on my compact flash
> rootfs (ext3) with tweaking the RA with 'hdparm -a' from 8 to 1024
> blocks and some benchmarks (I forget which).
> 
> Fengguang, what is your favourite benchmark for finding differences in
> readahead values (running on eg ext3 on a flashdisk), with the current
> RA semantics in mainline kernels (eg 2.6.23-rc3)?

My favorite test cases are

big file:
        time cp $file /dev/null &>/dev/null
        time dd if=$file of=/dev/null bs=${bs:-4k} &>/dev/null

big file, parallel:
        time diff $file $file.clone

small files:
        time grep -qr 'doruimi' $dir 2>/dev/null

Don't forget to clear the cache before each run:
        echo 2 > /proc/sys/vm/drop_caches
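
A sweep over readahead settings along the lines Daniel describes could
be sketched like this; the device name and file path are placeholders,
hdparm's -a flag sets the block-device read-ahead in sectors, and it
needs root on the actual flash device:

```shell
#!/bin/sh
# Time one sequential read per read-ahead setting.
dev=/dev/sdb              # placeholder: the flash device
file=/mnt/flash/bigfile   # placeholder: a large file stored on it
for ra in 8 64 256 1024; do
    hdparm -a $ra $dev > /dev/null      # set device read-ahead (sectors)
    echo 3 > /proc/sys/vm/drop_caches   # drop page cache + slab between runs
    echo -n "ra=$ra: "
    ( time dd if=$file of=/dev/null bs=4k ) 2>&1 | grep real
done
```

Being a root-only benchmark against real hardware, this is a dry sketch
rather than something to run blindly.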


Cheers,
Fengguang



* Re: file system for solid state disks
  2007-08-23 13:43       ` Fengguang Wu
@ 2007-08-23 15:09         ` Daniel J Blueman
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel J Blueman @ 2007-08-23 15:09 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Jan Engelhardt, Richard Ballantyne, Linux Kernel,
	James Courtier-Dutton

On 23/08/07, Fengguang Wu <wfg@mail.ustc.edu.cn> wrote:
> On Thu, Aug 23, 2007 at 01:56:17PM +0100, Daniel J Blueman wrote:
> > Hi Fengguang,
> >
> > On 23/08/07, James Courtier-Dutton <James@superbug.co.uk> wrote:
> > > Daniel J Blueman wrote:
> > > > On 23 Aug, 07:00, Jan Engelhardt <jengelh@computergmbh.de> wrote:
> > > >> On Aug 23 2007 01:01, Richard Ballantyne wrote:
> > > >>
> > > >>> What file system that is already in the linux kernel do people recommend
> > > >>> I use for my laptop that now contains a solid state disk?
> > > >>>
> > > >> If I had to choose, the list of options seems to be:
> > > >>
> > > >> - logfs
> > > >>   [unmerged]
> > > >>
> > > >> - UBI layer with any fs you like
> > > >>   [just a guess]
> > > >>
> > > >> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
> > > >>   [does not support ACLs/quotas]
> > > >
> > > > Isn't it that with modern rotational wear-levelling, re-writing hot
> > > > blocks many times is not an issue, as they are internally moved around
> > > > anyway? So, using a journalled filesystem such as ext3 is still good
> > > > (robustness and maturity in mind). Due to lack of write buffering,
> > > > perhaps a wandering log (journal) filesystem would be more suitable
> > > > though? I use ext3 on my >35MB/s compact flash filesystem.
> > > >
> > > > I can see there being advantage in selecting a filesystem which is
> > > > lower complexity due to no additional spatial optimisation complexity,
> > > > but those advantages do buy other efficiency (eg the Orlov allocator
> > > > reducing fragmentation, thus less overhead), right?
> > > >
> > > > Also, it would be natural to employ 'elevator=none', but perhaps there
> > > > is a small advantage in holding a group of flash blocks 'ready' (like
> > > > SDRAM pages being selected on-chip for lower bus access latency) -
> > > > however this no longer holds when logical->physical remapping is
> > > > performed, so perhaps it's better without an elevator.
> > > >
> > > > Clearly, benchmarks speak...but perhaps it would make sense to have
> > > > libata disable the elevator for the (compact) flash block device?
> > > >
> > > > Daniel
> > >
> > > Also, sector read ahead will actually have a performance impact on
> > > Flash, instead of speeding things up with a spinning disc.
> > > For example, a request might read 128 sectors instead of the one
> > > requested at little or no extra performance impact for a spinning disc.
> > > For flash, reading 128 sectors instead of the one requested will have a
> > > noticeable performance impact.
> > > Spinning discs have high seek latency, low serial sector read latency
> > > and equal latency for read/write
> > > Flash has low seek latency, high serial sector read latency and longer
> > > write than read times.
>
> A little bit of readahead will be helpful for flash memory.  Its latency is
> low, but sure not zero. Asynchronous readahead will help to hide the latency.
>
> > I was having problem invoking the readahead logic on my compact flash
> > rootfs (ext3) with tweaking the RA with 'hdparm -a' from 8 to 1024
> > blocks and some benchmarks (I forget which).
> >
> > Fengguang, what is your favourite benchmark for finding differences in
> > readahead values (running on eg ext3 on a flashdisk), with the current
> > RA semantics in mainline kernels (eg 2.6.23-rc3)?
>
> My favorite test cases are
>
> big file:
>         time cp $file /dev/null &>/dev/null
>         time dd if=$file of=/dev/null bs=${bs:-4k} &>/dev/null
>
> big file, parallel:
>         time diff $file $file.clone
>
> small files:
>         time grep -qr 'doruimi' $dir 2>/dev/null
>
> Don't forget to clear the cache before each run:
>         echo 2 > /proc/sys/vm/drop_caches

The maximal case we're looking for is where up to 1024-block
read-ahead doesn't pay off, but actually wastes finite bandwidth, thus
time.

We clearly need a database-type workload which reads forward enough to
open the RA window some, but then reads at a different location. We
can then show that we benefit from a smaller read-ahead window, at
negligible cost to your linear case.

To open the RA window, I know we need no competing threads, but how
far do we need to sequentially read? I'll cook a micro-benchmark with
memory-backed files.
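
The access pattern for such a micro-benchmark can be sketched with
plain dd against a memory-backed file; the file path, run lengths, and
jump offsets below are made up for illustration:

```shell
# Sequential run long enough to open the readahead window, then
# far-apart single-block reads that defeat the open window.
file=/tmp/ra-testfile
dd if=/dev/zero of=$file bs=4k count=1024 2>/dev/null   # 4 MiB test file
dd if=$file of=/dev/null bs=4k count=64 2>/dev/null     # opens the RA window
for skip in 517 91 703 260; do                          # then jump around
    dd if=$file of=/dev/null bs=4k count=1 skip=$skip 2>/dev/null
done
```

On tmpfs this measures nothing but exercises the pattern; pointed at a
flash-backed file it becomes the wasted-readahead case described above.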
-- 
Daniel J Blueman


* Re: file system for solid state disks
@ 2007-08-25  8:41 Just Marc
  2007-08-30 18:25 ` Jan Engelhardt
  0 siblings, 1 reply; 17+ messages in thread
From: Just Marc @ 2007-08-25  8:41 UTC (permalink / raw)
  To: linux-kernel

Hi,

It's important to note that disk-replacement type SSDs perform much 
better with very small block operations, generally 512 bytes.  So the 
lower your file system block size, the better -- this will be the single 
most significant performance tweak you can make.  That held in the 
benchmarks I've seen, where the difference between 4KB and 512-byte 
block sizes was almost 100%.  YMMV -- always benchmark.
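
One caveat on the numbers above: the common Linux disk filesystems
cannot actually go down to 512-byte blocks; ext2/ext3's minimum block
size is 1024 bytes. Selecting the smallest supported size looks like
this, demonstrated against a scratch file image (point mke2fs at the
real partition, e.g. /dev/sdb1, instead):

```shell
# Build an 8 MiB file-backed image and format it with 1024-byte blocks.
img=/tmp/blocksize-demo.img
dd if=/dev/zero of=$img bs=1024 count=8192 2>/dev/null
mke2fs -F -q -b 1024 $img       # -b selects the filesystem block size
tune2fs -l $img | grep 'Block size'   # shows a 1024-byte block size
```

The -F flag only forces mke2fs to accept a regular file; on a real
device it is unnecessary.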

On SSDs which contain built in wear leveling, pretty much any file 
system can be used.   For SSDs that lack such low level housekeeping, 
use stuff like JFFS2.

Marc


* Re: file system for solid state disks
  2007-08-23 11:25     ` Jens Axboe
@ 2007-08-29 17:36       ` Bill Davidsen
  2007-08-29 17:57         ` Jens Axboe
  0 siblings, 1 reply; 17+ messages in thread
From: Bill Davidsen @ 2007-08-29 17:36 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Theodore Tso, Jan Engelhardt, Richard Ballantyne, linux-kernel

Jens Axboe wrote:
> On Thu, Aug 23 2007, Theodore Tso wrote:
>> On Thu, Aug 23, 2007 at 07:52:46AM +0200, Jan Engelhardt wrote:
>>> On Aug 23 2007 01:01, Richard Ballantyne wrote:
>>>> What file system that is already in the linux kernel do people recommend
>>>> I use for my laptop that now contains a solid state disk?
>>> If I had to choose, the list of options seems to be:
>>>
>>> - logfs
>>>   [unmerged]
>>>
>>> - UBI layer with any fs you like
>>>   [just a guess]
>> The question is whether the solid state disk gives you access to the
>> raw flash, or whether you have to go through the flash translation
>> layer because it's trying to look (exclusively) like a PATA or SATA
>> drive.  There are some SSD's that have a form factor and interfaces
>> that make them a drop-in replacement for a laptop hard drive, and a
>> number of the newer laptops that are supporting SSD's seem to be these
>> because (a) they don't have to radically change their design, (b) so
>> they can be compatible with Windows, and (c) so that users can
>> purchase the laptop either with a traditional hard drive or a SSD's as
>> an option, since at the moment SSD's are far more expensive than
>> disks.
>>
>> So if you can't get access to the raw flash layer, then what you're
>> probably going to be looking at is a traditional block-oriented
>> filesystem, such as ext3, although there are clearly some things that
>> could be done such as disabling the elevator.   
> 
> It's more complicated than that, I'd say. If the job of the elevator was
> purely to sort requests based on sector criteria, then I'd agree that
> noop was the best way to go. But the elevator also arbitrates access to
> the disk for processes. Even if you don't pay a seek penalty, you still
> would rather like to get your sync reads in without having to wait for
> that huge writer that just queued hundreds of megabytes of io in front
> of you (and will have done so behind your read, making you wait again
> for a subsequent read).

In most cases the time spent in the elevator is minimal compared to the 
benefits, even without your next suggestion.
> 
> My plan in this area is to add a simple storage profile and attach it to
> the queue. Just start simple, allow a device driver to inform the block
> layer that this device has no seek penalty. Then the io scheduler can
> make more informed decisions on what to do - eg for ssd, sector
> proximity may not have much meaning, so we should not take that into
> account.
> 
Eventually the optimal solution may require both bandwidth and seek 
information. If "solid state disk" means flash on a peripheral bus, the 
transfer rate is probably not all that fast. If it means NV memory, 
battery-backed or core, probably nothing changes, again *if* it's on a 
peripheral bus; but if it's on a card plugged into the backplane, the 
transfer rate may be high enough that ordering costs more than waiting. 
This could be extended to nbd and iSCSI devices as well, I think, to 
optimize performance.

Your plan in this area seems a good one, but if you agree that transfer 
rate will be important (if it isn't already), perhaps you can design so 
that this capability is easy to add later.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


* Re: file system for solid state disks
  2007-08-29 17:36       ` Bill Davidsen
@ 2007-08-29 17:57         ` Jens Axboe
  0 siblings, 0 replies; 17+ messages in thread
From: Jens Axboe @ 2007-08-29 17:57 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: Theodore Tso, Jan Engelhardt, Richard Ballantyne, linux-kernel

On Wed, Aug 29 2007, Bill Davidsen wrote:
> Jens Axboe wrote:
> >On Thu, Aug 23 2007, Theodore Tso wrote:
> >>On Thu, Aug 23, 2007 at 07:52:46AM +0200, Jan Engelhardt wrote:
> >>>On Aug 23 2007 01:01, Richard Ballantyne wrote:
> >>>>What file system that is already in the linux kernel do people recommend
> >>>>I use for my laptop that now contains a solid state disk?
> >>>If I had to choose, the list of options seems to be:
> >>>
> >>>- logfs
> >>>  [unmerged]
> >>>
> >>>- UBI layer with any fs you like
> >>>  [just a guess]
> >>The question is whether the solid state disk gives you access to the
> >>raw flash, or whether you have to go through the flash translation
> >>layer because it's trying to look (exclusively) like a PATA or SATA
> >>drive.  There are some SSD's that have a form factor and interfaces
> >>that make them a drop-in replacement for a laptop hard drive, and a
> >>number of the newer laptops that are supporting SSD's seem to be these
> >>because (a) they don't have to radically change their design, (b) so
> >>they can be compatible with Windows, and (c) so that users can
> >>purchase the laptop either with a traditional hard drive or a SSD's as
> >>an option, since at the moment SSD's are far more expensive than
> >>disks.
> >>
> >>So if you can't get access to the raw flash layer, then what you're
> >>probably going to be looking at is a traditional block-oriented
> >>filesystem, such as ext3, although there are clearly some things that
> >>could be done such as disabling the elevator.   
> >
> >It's more complicated than that, I'd say. If the job of the elevator was
> >purely to sort requests based on sector criteria, then I'd agree that
> >noop was the best way to go. But the elevator also arbitrates access to
> >the disk for processes. Even if you don't pay a seek penalty, you still
> >would rather like to get your sync reads in without having to wait for
> >that huge writer that just queued hundreds of megabytes of io in front
> >of you (and will have done so behind your read, making you wait again
> >for a subsequent read).
> 
> In most cases the time spent in the elevator is minimal compared to the 
> benefits, even without your next suggestion.

Runtime overhead, yes. Head optimizations like trying to avoid seeks,
definitely no. A seek can be several milliseconds for a request, and if
you waste that time often, then you are going noticeably slower than you
could be.

> >My plan in this area is to add a simple storage profile and attach it to
> >the queue. Just start simple, allow a device driver to inform the block
> >layer that this device has no seek penalty. Then the io scheduler can
> >make more informed decisions on what to do - eg for ssd, sector
> >proximity may not have much meaning, so we should not take that into
> >account.
> >
> Eventually the optimal solution may require both bandwidth and seek 
> information. If "solid state disk" means flash, it's on a peripheral 
> bus, it's probably not all that fast at transfer rate. If it means NV 
> memory, battery backed or core, probably nothing changes, again *if* 
> it's on a peripheral bus, but if it's on a card plugged to the 
> backplane, the transfer rate may be high enough to make ordering cost 
> more than waiting. This could be extended to nbd and iSCSI devices as 
> well, I think, to optimize performance.

I've yet to see any real runtime overhead problems for any workload, so
the ordering is not an issue imo. It's easy enough to bypass for any io
scheduler, should it become interesting.

-- 
Jens Axboe



* Re: file system for solid state disks
  2007-08-25  8:41 Just Marc
@ 2007-08-30 18:25 ` Jan Engelhardt
  2007-08-30 18:26   ` Just Marc
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Engelhardt @ 2007-08-30 18:25 UTC (permalink / raw)
  To: Just Marc; +Cc: linux-kernel


On Aug 25 2007 09:41, Just Marc wrote:
>
> On SSDs which contain built in wear leveling, pretty much any file 
> system can be used.  For SSDs that lack such low level housekeeping, 
> use stuff like JFFS2.

The question is, how can you find out whether it does automatic 
wear-leveling? (Perhaps when a CF is advertised as "holds 10 years!"?)


	Jan
-- 


* Re: file system for solid state disks
  2007-08-30 18:25 ` Jan Engelhardt
@ 2007-08-30 18:26   ` Just Marc
  0 siblings, 0 replies; 17+ messages in thread
From: Just Marc @ 2007-08-30 18:26 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: linux-kernel

One must consult the documentation of that device. This wear
leveling is low-level, and most devices do not export any
information about it. Recent SSDs have started to export some
values through SMART that let you monitor their state.

Some companies think that hiding is better than exposing...


Jan Engelhardt wrote:
> On Aug 25 2007 09:41, Just Marc wrote:
>   
>> On SSDs which contain built in wear leveling, pretty much any file 
>> system can be used.  For SSDs that lack such low level housekeeping, 
>> use stuff like JFFS2.
>>     
>
> The question is, how can you find out whether it does automatic 
> wear-leveling? (Perhaps when a CF is advertised as "holds 10 years!"?)
>
>
> 	Jan
>   


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: file system for solid state disks
  2007-08-23  8:55 file system for solid state disks Daniel J Blueman
  2007-08-23 12:45 ` James Courtier-Dutton
@ 2007-09-05 12:34 ` Denys Vlasenko
  2007-09-05 12:56   ` linux-os (Dick Johnson)
  1 sibling, 1 reply; 17+ messages in thread
From: Denys Vlasenko @ 2007-09-05 12:34 UTC (permalink / raw)
  To: Daniel J Blueman; +Cc: Jan Engelhardt, Richard Ballantyne, Linux Kernel

On Thursday 23 August 2007 09:55, Daniel J Blueman wrote:
> On 23 Aug, 07:00, Jan Engelhardt <jengelh@computergmbh.de> wrote:
> > On Aug 23 2007 01:01, Richard Ballantyne wrote:
> > >What file system that is already in the linux kernel do people recommend
> > >I use for my laptop that now contains a solid state disk?
> >
> > If I had to choose, the list of options seems to be:
> >
> > - logfs
> >   [unmerged]
> >
> > - UBI layer with any fs you like
> >   [just a guess]
> >
> > - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
> >   [does not support ACLs/quotas]
> 
> Isn't it that with modern rotational wear-levelling, re-writing hot
> blocks many times is not an issue, as they are internally moved around
> anyway? So, using a journalled filesystem such as ext3 is still good
> (robustness and maturity in mind).

Crap hardware (which only _claims_ to do it) is out there,
and is typically cheaper, so users preferentially buy that ;)
--
vda

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: file system for solid state disks
  2007-09-05 12:34 ` Denys Vlasenko
@ 2007-09-05 12:56   ` linux-os (Dick Johnson)
  2007-09-05 13:04     ` Manu Abraham
  0 siblings, 1 reply; 17+ messages in thread
From: linux-os (Dick Johnson) @ 2007-09-05 12:56 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Daniel J Blueman, Jan Engelhardt, Richard Ballantyne,
	Linux Kernel


On Wed, 5 Sep 2007, Denys Vlasenko wrote:

> On Thursday 23 August 2007 09:55, Daniel J Blueman wrote:
>> On 23 Aug, 07:00, Jan Engelhardt <jengelh@computergmbh.de> wrote:
>>> On Aug 23 2007 01:01, Richard Ballantyne wrote:
>>>> What file system that is already in the linux kernel do people recommend
>>>> I use for my laptop that now contains a solid state disk?
>>>
>>> If I had to choose, the list of options seems to be:
>>>
>>> - logfs
>>>   [unmerged]
>>>
>>> - UBI layer with any fs you like
>>>   [just a guess]
>>>
>>> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
>>>   [does not support ACLs/quotas]
>>
>> Isn't it that with modern rotational wear-levelling, re-writing hot
>> blocks many times is not an issue, as they are internally moved around
>> anyway? So, using a journalled filesystem such as ext3 is still good
>> (robustness and maturity in mind).
>
> Crap hardware (which only _claims_ to do it) is out there,
> and is typically cheaper, so users preferentially buy that ;)
> --
> vda

You might want to check and see what is actually being
used for the solid-state disk. Some solid state disks
are SRAM- or DRAM-based. SRAM is fast, requires no refresh,
is now as cheap as flash, and does R/W forever. It retains
its data for 10 years after power is removed, by using an
embedded battery.

http://en.wikipedia.org/wiki/Solid_state_drive

This is exactly what I proposed on this list a long
time ago. It is now a reality.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.30 BogoMips).
My book : http://www.AbominableFirebug.com/
_



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: file system for solid state disks
  2007-09-05 12:56   ` linux-os (Dick Johnson)
@ 2007-09-05 13:04     ` Manu Abraham
  0 siblings, 0 replies; 17+ messages in thread
From: Manu Abraham @ 2007-09-05 13:04 UTC (permalink / raw)
  To: linux-os (Dick Johnson)
  Cc: Denys Vlasenko, Daniel J Blueman, Jan Engelhardt,
	Richard Ballantyne, Linux Kernel

linux-os (Dick Johnson) wrote:


> http://en.wikipedia.org/wiki/Solid_state_drive
> 
> This is exactly what I proposed on this list a long
> time ago. It is now a reality.

It's been around for a couple of years ;-)

http://forum.pcvsconsole.com/viewthread.php?tid=15802
http://www.anandtech.com/tradeshows/showdoc.aspx?i=2431&p=5



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-09-05 13:05 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-23  8:55 file system for solid state disks Daniel J Blueman
2007-08-23 12:45 ` James Courtier-Dutton
2007-08-23 12:56   ` Daniel J Blueman
     [not found]     ` <20070823134359.GB5576@mail.ustc.edu.cn>
2007-08-23 13:43       ` Fengguang Wu
2007-08-23 15:09         ` Daniel J Blueman
2007-09-05 12:34 ` Denys Vlasenko
2007-09-05 12:56   ` linux-os (Dick Johnson)
2007-09-05 13:04     ` Manu Abraham
  -- strict thread matches above, loose matches on Subject: below --
2007-08-25  8:41 Just Marc
2007-08-30 18:25 ` Jan Engelhardt
2007-08-30 18:26   ` Just Marc
2007-08-23  5:01 Richard Ballantyne
2007-08-23  5:52 ` Jan Engelhardt
2007-08-23 10:26   ` Theodore Tso
2007-08-23 11:25     ` Jens Axboe
2007-08-29 17:36       ` Bill Davidsen
2007-08-29 17:57         ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox