From: Marcus Sundman <marcus@hibox.fi>
To: Jan Kara <jack@suse.cz>
Cc: "Theodore Ts'o" <tytso@mit.edu>,
Dave Chinner <david@fromorbit.com>,
linux-kernel@vger.kernel.org
Subject: Re: Debugging system freezes on filesystem writes
Date: Thu, 12 Sep 2013 16:47:43 +0300
Message-ID: <5231C5FF.3060504@hibox.fi>
In-Reply-To: <20130912131051.GA14664@quack.suse.cz>

On 12.09.2013 16:10, Jan Kara wrote:
> On Thu 12-09-13 15:57:32, Marcus Sundman wrote:
>> On 27.02.2013 01:17, Jan Kara wrote:
>>> On Tue 26-02-13 20:41:36, Marcus Sundman wrote:
>>>> On 24.02.2013 03:20, Theodore Ts'o wrote:
>>>>> On Sun, Feb 24, 2013 at 11:12:22AM +1100, Dave Chinner wrote:
>>>>>>>> /dev/sda6 /home ext4 rw,noatime,discard 0 0
>>>>>>                                   ^^^^^^^
>>>>>> I'd say that's your problem....
>>>>> Looks like the Sandisk U100 is a good SSD for me to put on my personal
>>>>> "avoid" list:
>>>>>
>>>>> http://thessdreview.com/our-reviews/asus-zenbook-ssd-review-not-necessarily-sandforce-driven-shows-significant-speed-bump/
>>>>>
>>>>> There are a number of SSD's which do not implement "trim" efficiently,
>>>>> so these days, the recommended way to use trim is to run the "fstrim"
>>>>> command out of crontab.
>>>> OK. Removing 'discard' made it much better (the 60-600 second
>>>> freezes are now 1-50 second freezes), but it's still at least an
>>>> order of magnitude worse than a normal HD. When writing, that is --
>>>> reading is very fast (when there's no writing going on).
>>>>
>>>> So, after reading up a bit on this trimming I'm thinking maybe my
>>>> filesystem's block sizes don't match up with my SSD's blocks (or
>>>> whatever its write unit is called). Then writing a FS block would
>>>> always write to multiple SSD blocks, causing multiple
>>>> read-erase-write sequences, right? So how can I check this, and how
>>>> can I make the FS blocks match the SSD blocks?
>>> As Ted wrote, alignment isn't usually a problem with SSDs. And even if it
>>> was, it would be at most a factor 2 slow down and we don't seem to be at
>>> that fine grained level :)
>>>
>>> At this point you might try mounting the fs with nobarrier mount option (I
>>> know you tried that before but without discard the difference could be more
>>> visible), switching IO scheduler to CFQ (for crappy SSDs it actually isn't
>>> a bad choice), and we'll see how much we can squeeze out of your drive...
>> I repartitioned the drive and reinstalled ubuntu and after that it
>> gladly wrote over 100 MB/s to the SSD without any hangs. However,
>> after a couple of months I noticed it had degraded considerably, and
>> it keeps degrading. Now it's slowly becoming completely unusable
>> again, with write speeds of the magnitude 1 MB/s and dropping.
>>
>> As far as I can tell I have not made any relevant changes. Also, the
>> amount of free space hasn't changed considerably, but it seems that
>> the longer it's been since I reformatted the drive the more free
>> space is required for it to perform well.
>>
>> So, maybe the cause is fragmentation? I tried running e4defrag and
>> then fstrim, but it didn't really help (well, maybe a little bit,
>> but after a couple of days it was back in unusable-land). Also,
>> "e4defrag -c" gives a fragmentation score of less than 5, so...
>>
>> Any ideas?
> So now you run without 'discard' mount option, right? My guess then would
> be that the FTL layer on your SSD is just crappy and as the erase blocks
> get more fragmented as the filesystem is used it cannot keep up. But it's
> easy to put blame on someone else :)
>
> You can check whether this is a problem of Linux or your SSD by writing a
> large file (few GB or more) like 'dd if=/dev/zero of=testfile bs=1M
> count=4096 oflag=direct'. What is the throughput? If it is bad, check output
> of 'filefrag -v testfile'. If the extents are reasonably large (1 MB and
> more), then the problem is in your SSD firmware. Not much we can do about
> it in that case...
>
> If it really is SSD's firmware, maybe you could try f2fs or similar flash
> oriented filesystem which should put lower load on the disk's FTL.
----8<---------------------------
$ grep LABEL /etc/fstab
LABEL=system / ext4 errors=remount-ro,nobarrier,noatime 0 1
LABEL=home /home ext4 defaults,nobarrier,noatime 0 2
$ df -h|grep home
/dev/sda3 104G 98G 5.1G 96% /home
$ sync && time dd if=/dev/zero of=testfile bs=1M count=2048 oflag=direct && time sync
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 404.571 s, 5.3 MB/s

real 6m44.575s
user 0m0.000s
sys 0m1.300s

real 0m0.111s
user 0m0.000s
sys 0m0.004s
$ filefrag -v testfile
Filesystem type is: ef53
File size of testfile is 2147483648 (524288 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0 21339392             512
[... http://sundman.iki.fi/extents.txt ...]
 282  523520  1618176  1568000    768 eof
testfile: 282 extents found
$
----8<---------------------------
Many extents are around 400 blocks(?) -- is this good or bad? (This
partition has a fragmentation score of 0 according to e4defrag.)
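
For reference, a rough sketch of the arithmetic behind the "1 MB and
more" rule of thumb quoted above. It assumes the 4096-byte block size
that filefrag reports for this file and takes "1 MB" to mean 1 MiB; the
function names are made up for illustration and are not part of any
tool.

```python
# Illustrative only: convert an extent length in filesystem blocks to bytes
# and compare it with the "1 MB and more" rule of thumb.
BLOCK_SIZE = 4096          # bytes, as reported by filefrag for this file
THRESHOLD = 1024 * 1024    # assumption: "1 MB" is taken to mean 1 MiB


def extent_bytes(length_blocks, block_size=BLOCK_SIZE):
    """Return the size in bytes of an extent spanning length_blocks blocks."""
    return length_blocks * block_size


def is_reasonably_large(length_blocks, block_size=BLOCK_SIZE):
    """Return True if the extent is at least THRESHOLD bytes long."""
    return extent_bytes(length_blocks, block_size) >= THRESHOLD


if __name__ == "__main__":
    # 400 is the rough "typical" length; 512 and 768 are the first and
    # last extents shown in the filefrag output.
    for blocks in (400, 512, 768):
        size = extent_bytes(blocks)
        print(f"{blocks} blocks = {size} bytes = {size / THRESHOLD:.2f} MiB, "
              f"large enough: {is_reasonably_large(blocks)}")
```

Under those assumptions, an extent of about 400 blocks is roughly
1.6 MB, which would be above the 1 MB mark.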
Regards,
Marcus