public inbox for linux-kernel@vger.kernel.org
* ext4 bug ? "Intel 320 SSD write performance – contd."
@ 2011-10-31 20:38 Vincent Pelletier
  2011-10-31 21:09 ` Ted Ts'o
  0 siblings, 1 reply; 7+ messages in thread
From: Vincent Pelletier @ 2011-10-31 20:38 UTC (permalink / raw)
  To: linux-kernel

Hi.

Reading this blog post[1], I thought the "2nd iteration" results could be
considered a bug in mkfs.ext4 (and possibly any mkfs implementation):
shouldn't mkfs run [FI]TRIM on its target before creating filesystem
structure ?

Disclaimers:
I don't know enough about mkfs or in-kernel fs support to tell which part
should implement this - so I cannot even tell for sure that this isn't
done already.
I have no idea how expensive those new calls would be (in general, this
means trimming a _lot_ of pages...).
I don't know how other filesystems/OSes behave on such a benchmark. But I
don't think this is a problem any SSD could solve at its level.

[1] http://www.mysqlperformanceblog.com/2011/09/28/intel-320-ssd-write-performance-contd/

Regards,
-- 
Vincent Pelletier


* Re: ext4 bug ? "Intel 320 SSD write performance – contd."
  2011-10-31 20:38 ext4 bug ? "Intel 320 SSD write performance – contd." Vincent Pelletier
@ 2011-10-31 21:09 ` Ted Ts'o
  2011-10-31 21:47   ` Vincent Pelletier
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Ted Ts'o @ 2011-10-31 21:09 UTC (permalink / raw)
  To: Vincent Pelletier; +Cc: linux-kernel

On Mon, Oct 31, 2011 at 09:38:37PM +0100, Vincent Pelletier wrote:
> 
> Reading this blog post[1], I thought the "2nd iteration" results could be
> considered a bug in mkfs.ext4 (and possibly any mkfs implementation):
> shouldn't mkfs run [FI]TRIM on its target before creating filesystem
> structure ?

It's not enabled by default, because there are crappy SSD's out there
where use of the TRIM command will turn them into bricks.  (No, it's
not the Intel X-25 drives that I'm worried about.)

So I (and the distributions) don't want to make it the default, since
if you buy crap drives and then mke2fs turns them into bricks, who are
you likely to blame?  The crap SSD manufacturer?  Yourself for trying
to buy SSD's on the cheap?  Or the program that issued the TRIM
command?

You can enable the trim behaviour by default by adding to your
/etc/mke2fs.conf file:

[defaults]
	discard = true

But then it's on your head if anything bad happens.  :-/

        	      	      	      - Ted

P.S.  For a similar reason we don't enable TRIM commands by default in
the kernel, where we have three possible ways of issuing TRIM.  One is
continuously, as files get unlinked (and the file system transaction
is committed).  Another way is via a userspace program run out of cron
which calls the FITRIM ioctl; and the third way is at e2fsck time,
after a full e2fsck run without any file system errors detected.
Depending on how the SSD implements TRIM (and of course on your
workload), some of these methods can be performance disasters, and some
have resulted in SSD's getting bricked, which is again why none of
these are turned on by default.
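
The second method in the list above, a userspace program that calls the FITRIM ioctl, can be sketched roughly as follows. This is only an illustration in Python, not the code of any existing tool. It assumes the Linux request number for FITRIM (_IOWR('X', 121, struct fstrim_range)) and the three-u64 layout of struct fstrim_range from <linux/fs.h>; the function and constant names other than FITRIM are made up for this example. It needs root, and the caveats about questionable drives apply in full.

```python
import fcntl
import os
import struct
import sys

# Linux request number for FITRIM: _IOWR('X', 121, struct fstrim_range).
FITRIM = 0xC0185879

# struct fstrim_range { __u64 start; __u64 len; __u64 minlen; }
RANGE_FORMAT = "=QQQ"
WHOLE_FS = 2**64 - 1  # "len" large enough to cover any file system


def pack_range(start=0, length=WHOLE_FS, minlen=0):
    """Build the mutable argument buffer for FITRIM (illustrative helper)."""
    return bytearray(struct.pack(RANGE_FORMAT, start, length, minlen))


def fitrim(mountpoint):
    """Ask the file system mounted at `mountpoint` to discard its free space.

    Returns the number of bytes the kernel reports as trimmed.
    Raises OSError if the file system or device does not support discard.
    """
    buf = pack_range()
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        # With a mutable buffer, the kernel's updated struct is copied back.
        fcntl.ioctl(fd, FITRIM, buf)
    finally:
        os.close(fd)
    _start, trimmed, _minlen = struct.unpack(RANGE_FORMAT, buf)
    return trimmed


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "/"
    print(f"{target}: {fitrim(target)} bytes trimmed")
```

A script like this could be what cron runs; the fstrim tool from util-linux issues the same ioctl and is the more sensible choice outside of experiments.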



* Re: ext4 bug ? "Intel 320 SSD write performance – contd."
  2011-10-31 21:09 ` Ted Ts'o
@ 2011-10-31 21:47   ` Vincent Pelletier
  2011-11-01 13:34   ` Stephen Clark
  2012-01-15 18:37   ` ext4 bug ? "Intel 320 SSD write performance – contd." Pavel Machek
  2 siblings, 0 replies; 7+ messages in thread
From: Vincent Pelletier @ 2011-10-31 21:47 UTC (permalink / raw)
  To: Ted Ts'o, linux-kernel

On Mon, Oct 31, 2011 at 10:09 PM, Ted Ts'o <tytso@mit.edu> wrote:
> So I (and the distributions) don't want to make it the default, since
> if you buy crap drives and then mke2fs turns them into bricks, who are
> you likely to blame?  The crap SSD manufacturer?  Yourself for trying
> to buy SSD's on the cheap?  Or the program that issued the TRIM
> command?

Thanks for the explanation. I don't know much about SSDs so far: I indeed
heard about bricks, but not because of TRIM.
And I thank you (and distributions) for staying on the safe side.

Regards,
-- 
Vincent Pelletier


* Re: ext4 bug ? "Intel 320 SSD write performance – contd."
  2011-10-31 21:09 ` Ted Ts'o
  2011-10-31 21:47   ` Vincent Pelletier
@ 2011-11-01 13:34   ` Stephen Clark
  2011-11-01 13:41     ` Theodore Tso
  2012-01-15 18:37   ` ext4 bug ? "Intel 320 SSD write performance – contd." Pavel Machek
  2 siblings, 1 reply; 7+ messages in thread
From: Stephen Clark @ 2011-11-01 13:34 UTC (permalink / raw)
  To: Ted Ts'o, Vincent Pelletier, linux-kernel

On 10/31/2011 05:09 PM, Ted Ts'o wrote:
> On Mon, Oct 31, 2011 at 09:38:37PM +0100, Vincent Pelletier wrote:
>    
>> Reading this blog post[1], I thought the "2nd iteration" results could be
>> considered a bug in mkfs.ext4 (and possibly any mkfs implementation):
>> shouldn't mkfs run [FI]TRIM on its target before creating filesystem
>> structure ?
>>      
> It's not enabled by default, because there are crappy SSD's out there
> where use of the TRIM command will turn them into bricks.  (No, it's
> not the Intel X-25 drives that I'm worried about.)
>
> So I (and the distributions) don't want to make it the default, since
> if you buy crap drives and then mke2fs turns them into bricks, who are
> you likely to blame?  The crap SSD manufacturer?  Yourself for trying
> to buy SSD's on the cheap?  Or the program that issued the TRIM
> command?
>
> You can enable the trim behaviour by default by adding to your
> /etc/mke2fs.conf file:
>
> [defaults]
> 	discard = true
>
> But then it's on your head if anything bad happens.  :-/
>
>          	      	      	      - Ted
>
>    
<snip>

What about using discard in fstab like:
LABEL=/ /                      ext4    defaults,discard,noatime,nodiratime        1 1

Thanks,
Steve

-- 

"They that give up essential liberty to obtain temporary safety,
deserve neither liberty nor safety."  (Ben Franklin)

"The course of history shows that as a government grows, liberty
decreases."  (Thomas Jefferson)





* Re: ext4 bug ? "Intel 320 SSD write performance – contd."
  2011-11-01 13:34   ` Stephen Clark
@ 2011-11-01 13:41     ` Theodore Tso
  2011-11-01 14:00       ` Stephen Clark
  0 siblings, 1 reply; 7+ messages in thread
From: Theodore Tso @ 2011-11-01 13:41 UTC (permalink / raw)
  To: sclark46; +Cc: Theodore Tso, Vincent Pelletier, linux-kernel


On Nov 1, 2011, at 9:34 AM, Stephen Clark wrote:

>> 
>> You can enable the trim behaviour by default by adding to your
>> /etc/mke2fs.conf file:
>> 
>> [defaults]
>> 	discard = true
>> 
>> But then it's on your head if anything bad happens.  :-/
>> 
>>         	      	      	      - Ted
>> 
>>   
> <snip>
> 
> What about using discard in fstab like:
> LABEL=/ /                      ext4    defaults,discard,noatime,nodiratime        1 1

That does something different; it does continuous discards as you delete files.   This is probably the best thing to do if you are using thin-provisioning.   However, with SATA devices, a discard requires a queue flush, which can be a performance disaster.   (Some cheaper SSD's also have real performance difficulties if there are frequent trims.)   There have also been reports that high-frequency discards can trigger bugs that cause crappier SSD's to turn themselves into bricks.   The performance problems and the possibility of bricking crappier SSD's are why this isn't turned on by default.

Because of this, another method is to do FITRIM's periodically using cron.   This has less of a performance impact, so it's probably the better approach in many cases.

Basically, with SSD's and thin-provisioning systems, we are very much at the mercy of the competence of firmware authors.   As we know, sometimes firmware authors can be quite competent, or very, extremely incompetent.   So there will be a wide range of outcomes, which is one of the reasons why the best answer is to test to see what works best for you, preferably before you put your systems into production.
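
Before testing, it helps to know which file systems on a machine already use the continuous-discard mount option described above. A minimal sketch, assuming the usual /proc/mounts layout (device, mount point, type, comma-separated options); the function name is invented for this example, and it only matches the bare "discard" option as ext4 reports it, not variants such as "discard=async" used by some other file systems.

```python
def discard_mounts(mounts_text):
    """Return (mountpoint, fstype) pairs whose options include `discard`."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mountpoint, fstype, options = fields[:4]
        # Exact match on the option name, so "nodiscard" is not counted.
        if "discard" in options.split(","):
            found.append((mountpoint, fstype))
    return found


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for mountpoint, fstype in discard_mounts(f.read()):
            print(f"{mountpoint} ({fstype}) is mounted with continuous discard")
```

File systems that do not show up here only get trimmed if something else, such as a periodic FITRIM job, does it for them.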

-- Ted



* Re: ext4 bug ? "Intel 320 SSD write performance – contd."
  2011-11-01 13:41     ` Theodore Tso
@ 2011-11-01 14:00       ` Stephen Clark
  0 siblings, 0 replies; 7+ messages in thread
From: Stephen Clark @ 2011-11-01 14:00 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Vincent Pelletier, linux-kernel

On 11/01/2011 09:41 AM, Theodore Tso wrote:
> On Nov 1, 2011, at 9:34 AM, Stephen Clark wrote:
>
>    
>>> You can enable the trim behaviour by default by adding to your
>>> /etc/mke2fs.conf file:
>>>
>>> [defaults]
>>> 	discard = true
>>>
>>> But then it's on your head if anything bad happens.  :-/
>>>
>>>          	      	      	      - Ted
>>>
>>>
>>>        
>> <snip>
>>
>> What about using discard in fstab like:
>> LABEL=/ /                      ext4    defaults,discard,noatime,nodiratime        1 1
>>      
> That does something different; it does continuous discards as you delete files.   This is probably the best thing to do if you are using thin-provisioning.   However, with SATA devices, a discard requires a queue flush, which can be a performance disaster.   (Some cheaper SSD's also have real performance difficulties if there are frequent trims.)   There have also been reports that high-frequency discards can trigger bugs that cause crappier SSD's to turn themselves into bricks.   The performance problems and the possibility of bricking crappier SSD's are why this isn't turned on by default.
>
> Because of this, another method is to do FITRIM's periodically using cron.   This has less of a performance impact, so it's probably the better approach in many cases.
>
> Basically, with SSD's and thin-provisioning systems, we are very much at the mercy of the competence of firmware authors.   As we know, sometimes firmware authors can be quite competent, or very, extremely incompetent.   So there will be a wide range of outcomes, which is one of the reasons why the best answer is to test to see what works best for you, preferably before you put your systems into production.
>
> -- Ted
>    
So what I hear is that if you are using a SATA interface and not doing
virtualization, you should be using FITRIMs from a cron job. At what
frequency should we be doing this?

Thanks,
Steve

-- 

"They that give up essential liberty to obtain temporary safety,
deserve neither liberty nor safety."  (Ben Franklin)

"The course of history shows that as a government grows, liberty
decreases."  (Thomas Jefferson)





* Re: ext4 bug ? "Intel 320 SSD write performance – contd."
  2011-10-31 21:09 ` Ted Ts'o
  2011-10-31 21:47   ` Vincent Pelletier
  2011-11-01 13:34   ` Stephen Clark
@ 2012-01-15 18:37   ` Pavel Machek
  2 siblings, 0 replies; 7+ messages in thread
From: Pavel Machek @ 2012-01-15 18:37 UTC (permalink / raw)
  To: Ted Ts'o, Vincent Pelletier, linux-kernel

Hi!

> > Reading this blog post[1], I thought the "2nd iteration" results could be
> > considered a bug in mkfs.ext4 (and possibly any mkfs implementation):
> > shouldn't mkfs run [FI]TRIM on its target before creating filesystem
> > structure ?
> 
> It's not enabled by default, because there are crappy SSD's out there
> where use of the TRIM command will turn them into bricks.  (No, it's
> not the Intel X-25 drives that I'm worried about.)
> 
> So I (and the distributions) don't want to make it the default, since
> if you buy crap drives and then mke2fs turns them into bricks, who are
> you likely to blame?  The crap SSD manufacturer?  Yourself for trying
> to buy SSD's on the cheap?  Or the program that issued the TRIM
> command?

The kernel, for allowing userspace to damage the hardware?

If there are known-bad drives, we should blacklist them, and the kernel
should return -EPERM on attempts to trim.

If the problem is too widespread, we should probably create a whitelist...
									Pavel
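
The blacklist idea boils down to matching a drive's model string against a list of known-bad patterns before honouring a trim request. A rough userspace illustration of that check follows; it is not kernel code, the glob-pattern matching is an assumption made for this sketch, and the model names are invented placeholders rather than real drives.

```python
import errno
from fnmatch import fnmatchcase

# Hypothetical patterns; placeholders only, not real drive models.
TRIM_DENYLIST = ["ExampleCorp SSD-1*", "CheapFlash *"]


def trim_allowed(model, denylist=TRIM_DENYLIST):
    """True unless the model string matches a known-bad pattern."""
    return not any(fnmatchcase(model.strip(), pat) for pat in denylist)


def request_trim(model, do_trim):
    """Run `do_trim()` only for drives not on the list, else fail with EPERM."""
    if not trim_allowed(model):
        raise PermissionError(errno.EPERM, f"TRIM refused for {model!r}")
    return do_trim()
```

Turning this into a whitelist would mean inverting the test in trim_allowed so that only models matching a known-good pattern get through.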
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
