* Massive BTRFS performance degradation
@ 2014-03-09 7:48 KC
2014-03-09 8:17 ` Swâmi Petaramesh
0 siblings, 1 reply; 32+ messages in thread
From: KC @ 2014-03-09 7:48 UTC (permalink / raw)
To: linux-btrfs
I am experiencing massive performance degradation on my BTRFS root
partition on SSD. Except for regular daily updates, nothing changed in
the system. The mount options remained the same:
/ btrfs rw,noatime,compress=lzo,ssd,space_cache,autodefrag 0 0
but the performance dropped to less than 8% of normal.
Before:
# dd if=tempfile of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.57307 s, 683 MB/s
Now:
# dd if=tempfile of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 26.4373 s, 40.6 MB/s
I created a new btrfs partition on the SSD with the same mount options
and it is not being affected:
# dd if=/mnt/er/tempfile of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.57634 s, 681 MB/s
I also did
btrfs filesystem balance start /
with no effect.
I tried changing mount options - still no effect.
I'd appreciate some suggestions.
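The two dd results above can be put side by side with a small helper. This is only a sketch: the function name pct_of_baseline is made up for illustration, and the figures are the ones reported in this message.

```shell
# Hypothetical helper: express a measured throughput as a percentage of a baseline.
# Arguments: $1 = current MB/s, $2 = baseline MB/s.
pct_of_baseline() {
    awk -v now="$1" -v base="$2" 'BEGIN { printf "%.1f\n", now / base * 100 }'
}

# Figures from the dd runs in this message: 40.6 MB/s now, 683 MB/s before.
pct_of_baseline 40.6 683   # prints 5.9
```

Since dd reads can be served from the page cache, a fair comparison assumes the cache was dropped before each run (as root: sync, then write 3 to /proc/sys/vm/drop_caches).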
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Massive BTRFS performance degradation
2014-03-09 7:48 Massive BTRFS performance degradation KC
@ 2014-03-09 8:17 ` Swâmi Petaramesh
2014-03-09 10:01 ` Martin Steigerwald
2014-03-09 17:36 ` Massive BTRFS performance degradation Austin S Hemmelgarn
0 siblings, 2 replies; 32+ messages in thread
From: Swâmi Petaramesh @ 2014-03-09 8:17 UTC (permalink / raw)
To: linux-btrfs; +Cc: impactoria
On Sunday, 9 March 2014 08:48:20 KC wrote:
> I am experiencing massive performance degradation on my BTRFS root
> partition on SSD.
BTW, is BTRFS still an SSD-killer ? It had this reputation a while ago, and I'm
not sure if this still is the case, but I don't dare (yet) convert one of my
laptops that has an SSD to BTRFS...
--
Swâmi Petaramesh <swami@petaramesh.org> http://petaramesh.org PGP 9076E32E
A bus station is where buses stop. A train station is where trains stop.
On my desk, there is a workstation...
* Re: Massive BTRFS performance degradation
2014-03-09 8:17 ` Swâmi Petaramesh
@ 2014-03-09 10:01 ` Martin Steigerwald
2014-03-09 10:23 ` Swâmi Petaramesh
2014-03-09 17:36 ` Massive BTRFS performance degradation Austin S Hemmelgarn
1 sibling, 1 reply; 32+ messages in thread
From: Martin Steigerwald @ 2014-03-09 10:01 UTC (permalink / raw)
To: Swâmi Petaramesh; +Cc: linux-btrfs, impactoria
On Sunday, 9 March 2014, 09:17:24 Swâmi Petaramesh wrote:
> On Sunday, 9 March 2014 08:48:20 KC wrote:
> > I am experiencing massive performance degradation on my BTRFS root
> > partition on SSD.
>
> BTW, is BTRFS still a SSD-killer ? It had this reputation a while ago, and
> I'm not sure if this still is the case, but I don't dare (yet) converting
> to BTRFS one of my laptops that has a SSD...
I never heard about this reputation and luckily the Intel SSD 320 didn't
either. It's almost three years old by now:
SMART Attributes Data Structure revision number: 5
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
3 Spin_Up_Time 0x0020 100 100 000 Old_age Offline - 0
4 Start_Stop_Count 0x0030 100 100 000 Old_age Offline - 0
5 Reallocated_Sector_Ct 0x0032 100 100 000 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 9171
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 2603
170 Reserve_Block_Count 0x0033 100 100 010 Pre-fail Always - 0
171 Program_Fail_Count 0x0032 100 100 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 100 100 000 Old_age Always - 169
183 Runtime_Bad_Block 0x0030 100 100 000 Old_age Offline - 1
184 End-to-End_Error 0x0032 100 100 090 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
192 Unsafe_Shutdown_Count 0x0032 100 100 000 Old_age Always - 225
199 UDMA_CRC_Error_Count 0x0030 100 100 000 Old_age Offline - 0
225 Host_Writes_32MiB 0x0032 100 100 000 Old_age Always - 393645
226 Workld_Media_Wear_Indic 0x0032 100 100 000 Old_age Always - 2204244
227 Workld_Host_Reads_Perc 0x0032 100 100 000 Old_age Always - 49
228 Workload_Minutes 0x0032 100 100 000 Old_age Always - 13145477
232 Available_Reservd_Space 0x0033 100 100 010 Pre-fail Always - 0
233 Media_Wearout_Indicator 0x0032 100 100 000 Old_age Always - 0
241 Host_Writes_32MiB 0x0032 100 100 000 Old_age Always - 393645
242 Host_Reads_32MiB 0x0032 100 100 000 Old_age Always - 1002465
Media wearout indicator basically says the SSD considers itself to be
"new". The value is the same 100 as when it was new. The raw value, though,
rose for the first time. On 2013-10-12 it was:
233 Media_Wearout_Indicator 0x0032 100 100 000 Old_age Always - 0
For more about this indicator, read the Intel PDF about it.
There are some erase failures that happened, I think, in the first year of
the SSD's life, but that raw value of 169 has never risen again so far.
There have been 393645 * 32 MiB = 12.01 TiB of writes. The SSD itself is
specified to be usable for at least 5 years with 20 GB of host writes each
day. That is about 7.3 TB or 6.6 TiB per year. I assumed GB in the Intel
specification document; if it's GiB, then it's 7.1 TiB per year.
Anyway, with roughly 6.6 TiB a year, or about 20 TiB in three years, of which
only 12 TiB are used up, I am quite confident that this SSD could last
longer than 5 years.
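The endurance arithmetic above can be rechecked with a few lines of shell. This is a sketch under stated assumptions: the helper names are invented for illustration, the rating is taken to be 20 GB (decimal) of host writes per day, which is what the 7.3 TB per year figure implies, and the raw value is the Host_Writes_32MiB attribute from the SMART output above.

```shell
# Hypothetical helpers for the write-endurance arithmetic.

# Convert a Host_Writes_32MiB raw value (units of 32 MiB) to TiB.
host_writes_tib() {
    awk -v units="$1" 'BEGIN { printf "%.2f\n", units * 32 / 1024 / 1024 }'
}

# Convert a daily write budget in decimal GB to TiB per 365-day year.
yearly_budget_tib() {
    awk -v gb="$1" 'BEGIN { printf "%.2f\n", gb * 365 * 1e9 / (1024 * 1024 * 1024 * 1024) }'
}

host_writes_tib 393645   # raw value of attribute 225/241; prints 12.01
yearly_budget_tib 20     # assumed 20 GB/day rating; prints 6.64
```

Keeping the unit conversion in one place makes it harder to mix up TB and TiB when comparing the drive's counters with the vendor's rating.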
This ThinkPad T520 has been running BTRFS since installation of the Debian
sid system on it with kernel 2.6.39 or even 2.6.38 (where Sandy Bridge
graphics didn't work as well as they do today).
So much for any FUD about BTRFS and SSDs.
Thanks,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
* Re: Massive BTRFS performance degradation
2014-03-09 10:01 ` Martin Steigerwald
@ 2014-03-09 10:23 ` Swâmi Petaramesh
2014-03-09 11:33 ` Hugo Mills
0 siblings, 1 reply; 32+ messages in thread
From: Swâmi Petaramesh @ 2014-03-09 10:23 UTC (permalink / raw)
To: Martin Steigerwald; +Cc: linux-btrfs
On Sunday, 9 March 2014 11:01:17 you wrote:
> This ThinkPad T520 has been with BTRFS since installation of the Debian
> sid system on it with Kernel 2.6.39 or even 2.6.38 (where Sandybridge
> graphics didn´t work so well as today yet).
>
> So that much to any FUD about BTRFS and SSDs.
Wow !
Thanks for this very interesting info. Would you tell me if you use any of the
SSD optimisation mount options: discard, ssd or ssd_spread ?
Myself I've been moving back and forth between BTRFS / ZFS and ext4 over the
past 2-3 years, each time giving a chance to BTRFS, then typically 3-4 months
later switching back to either ext4 or ZFS after having either lost all of my
data, or seen the filesystem slow down to the point it becomes unusable, beyond
defragmentation or removing snapshots or whatever...
So my yo-yo-game is kind of "Is BTRFS now ready for use ?... Let's give it a
chance... OMFG... Lost everything, unusable system... Never want to hear about
BTRFS anymore... Well... Maybe will come back next year... etc"
For 3 years I've been used to thinking that :
- Next kernel release will have a truly excellent and mature BTRFS support.
- Current kernel release has correct BTRFS support - but most mainline distros
don't have it yet, maybe in 6 months ?
- Previous kernel release (the one that all current distros come with) has
completely broken BTRFS support...
#LOL
Well I hope it's not quite the case anymore, for I just set up my neighbour's
(an old lady's) system with Linux Mint 16 (kernel 3.11) on BTRFS with skinny
extents...
But for myself running ArchLinux in kernel 3.13, I still find out that :
- "btrfs send" causes my kernel to BUG :-/ (the wiki says it's working
stuff...)
- btrfs-defrag.sh hangs because of some glitch with "filefrag".
- bedup crashes badly and looks completely unmaintained as far as I can tell
and nobody seems to care.
Soooo weeelllll... Looks like readiness for prime time is still ahead of us...
(But still my 2 main systems are now BTRFS, including my main storage machine
running BTRFS RAID-1, so I hope it can be reliable, at least...)
Kind regards.
--
Swâmi Petaramesh <swami@petaramesh.org> http://petaramesh.org PGP 9076E32E
* Re: Massive BTRFS performance degradation
2014-03-09 10:23 ` Swâmi Petaramesh
@ 2014-03-09 11:33 ` Hugo Mills
2014-03-09 11:54 ` Martin Steigerwald
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Hugo Mills @ 2014-03-09 11:33 UTC (permalink / raw)
To: Swâmi Petaramesh; +Cc: Martin Steigerwald, linux-btrfs
On Sun, Mar 09, 2014 at 11:23:29AM +0100, Swâmi Petaramesh wrote:
> On Sunday, 9 March 2014 11:01:17 you wrote:
> > This ThinkPad T520 has been with BTRFS since installation of the Debian
> > sid system on it with Kernel 2.6.39 or even 2.6.38 (where Sandybridge
> > graphics didn´t work so well as today yet).
> >
> > So that much to any FUD about BTRFS and SSDs.
>
> Wow !
>
> Thanks for this very interesting info. Would you tell me if you use any of the
> SSD optimisation mount options: discard, ssd or ssd_spread ?
I would recommend none of the three. :)
ssd should be activated automatically on any non-rotational device.
ssd_spread is generally slower on modern SSDs than the ssd option.
discard is, except on the very latest hardware, a synchronous command
(it's a limitation of the SATA standard), and therefore results in
very very poor performance.
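A quick way to see which of these options a filesystem is actually mounted with is to test the option string. A minimal sketch: has_mount_opt is a made-up helper name, and the sample string is the option list from the first message in this thread.

```shell
# Hypothetical helper: succeed if the comma-separated option list $1
# contains exactly the option $2 (so "ssd" does not match "ssd_spread").
has_mount_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Option list as posted at the start of the thread.
opts="rw,noatime,compress=lzo,ssd,space_cache,autodefrag"
for opt in discard ssd ssd_spread; do
    if has_mount_opt "$opts" "$opt"; then
        echo "$opt: set"
    else
        echo "$opt: not set"
    fi
done
```

On a live system the string could come from `findmnt -no OPTIONS /` instead of a literal. Whether the kernel sees a device as non-rotational can be read from /sys/block/<device>/queue/rotational, where 0 means non-rotational.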
[snip]
> I've been used to consider for 3 years that :
>
> - Next kernel release will have a truly excellent and mature BTRFS support.
I don't think anyone's claimed that. The next version tends to fix
most of the *known* problems.
> - Current kernel release has correct BTRFS support - but most
> mainline distros don't have it yet, maybe in 6 months ?
This is usually true -- but by the time the current kernels come
round, there's usually been another swathe of bugs uncovered, thus
falling into this problem:
> - Previous kernel release (the one that all current distros come
> with) have a completely broke BTRFS support...
Not completely broken, but with known and identified bugs that have
been fixed in later versions.
> #LOL
>
> Well I hope it's quite not the case anymore for I just installed my neighbour,
> old lady's system with a Linux Mint 16 (kernel 3.11) on BTRFS with skinny
> extents...
There's one known and serious bug in 3.11 before 3.11.6 which
affects balances. Please make sure that you're running 3.11.6 or
later. There may be other bugs in there that have been fixed in later
kernel versions as well, but that's the "headline" one.
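Checking a running kernel against that 3.11.6 threshold could look like the sketch below. The helper name version_ge is invented for illustration, and the comparison relies on the version sort (-V) option of GNU sort.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip any distribution suffix such as "-18-generic" before comparing.
running="$(uname -r)"
running="${running%%-*}"

if version_ge "$running" 3.11.6; then
    echo "kernel $running is at or past 3.11.6"
else
    echo "kernel $running predates 3.11.6"
fi
```

This is only an approximation: distribution kernels often use their own numbering and carry backported fixes, so the stripped version string does not necessarily say which upstream stable patches are included.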
> But for myself running ArchLinux in kernel 3.13, I still find out that :
>
> - "btrfs send" causes my kernel to BUG :-/ (the wiki says it's working
> stuff...)
We don't get many bug reports of kernel oopses in send. This may be
that we don't have many people trying to use it (it is, after all,
fairly deep and poorly explained magic at the moment). It may be that
you have some corruption that's gone undetected otherwise, and the
send code isn't handling it well. Or it may be an actual bug in send.
At least you've reported it. (It might also be worth putting a copy of
the report on bugzilla.kernel.org, because then it doesn't get
forgotten in the email noise here).
> - btrfs-defrag.sh hangs because of some glitch with "filefrag".
Is that a btrfs problem, or a filefrag problem? btrfs-defrag.sh
isn't something I've heard of before, so I'd say it's unlikely to be
maintained by any of the main btrfs developers (and hence is much more
likely to be unmaintained or just plain broken in general).
> - bedup crashes badly and looks completely unmaintained as far as I can tell
> and nobody seems to care.
That's because nobody here is connected to bedup in any way. It was
a third-party piece of software written by someone (I don't even
recall who) who hasn't, as far as I know, engaged with the main btrfs
developers at all.
> Soooo weeelllll... Looks like readiness for prime time is still
> ahead of us...
I think that's fair to say. However, it is noticeably improving
over time. The timescales are just quite long.
Hugo.
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Well, you don't get to be a kernel hacker simply by looking ---
good in Speedos. -- Rusty Russell
* Re: Massive BTRFS performance degradation
2014-03-09 11:33 ` Hugo Mills
@ 2014-03-09 11:54 ` Martin Steigerwald
2014-03-09 12:10 ` Swâmi Petaramesh
2014-03-14 2:11 ` discard synchronous on most SSDs? Marc MERLIN
2 siblings, 0 replies; 32+ messages in thread
From: Martin Steigerwald @ 2014-03-09 11:54 UTC (permalink / raw)
To: Hugo Mills, Swâmi Petaramesh, linux-btrfs
On Sunday, 9 March 2014, 11:33:50 Hugo Mills wrote:
> On Sun, Mar 09, 2014 at 11:23:29AM +0100, Swâmi Petaramesh wrote:
> > On Sunday, 9 March 2014 11:01:17 you wrote:
> > > This ThinkPad T520 has been with BTRFS since installation of the Debian
> > > sid system on it with Kernel 2.6.39 or even 2.6.38 (where Sandybridge
> > > graphics didn´t work so well as today yet).
> > >
> > > So that much to any FUD about BTRFS and SSDs.
> >
> >
> >
> > Wow !
> >
> >
> >
> > Thanks for this very interesting info. Would you tell me if you use any of
> > the SSD optimisation mount options: discard, ssd or ssd_spread ?
>
> I would recommend none of the three.
>
> ssd should be activated automatically on any non-rotational device.
> ssd_spread is generally slower on modern SSDs than the ssd option.
> discard is, except on the very latest hardware, a synchronous command
> (it's a limitation of the SATA standard), and therefore results in
> very very poor performance.
That's exactly how I use it. I just run fstrim on the partitions from time to time.
Thanks,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
* Re: Massive BTRFS performance degradation
2014-03-09 11:33 ` Hugo Mills
2014-03-09 11:54 ` Martin Steigerwald
@ 2014-03-09 12:10 ` Swâmi Petaramesh
2014-03-09 17:14 ` boris
2014-03-14 2:11 ` discard synchronous on most SSDs? Marc MERLIN
2 siblings, 1 reply; 32+ messages in thread
From: Swâmi Petaramesh @ 2014-03-09 12:10 UTC (permalink / raw)
To: Hugo Mills, linux-btrfs
On Sunday, 9 March 2014 11:33:50 Hugo Mills wrote:
>
> ssd should be activated automatically on any non-rotational device.
> ssd_spread is generally slower on modern SSDs than the ssd option.
> discard is, except on the very latest hardware, a synchronous command
> (it's a limitation of the SATA standard), and therefore results in
> very very poor performance.
Thanks for the info Hugo :-)
> There's one known and serious bug in 3.11 before 3.11.6 which
> affects balances. Please make sure that you're running 3.11.6 or
> later. There may be other bugs in there that have been fixed in later
> kernel versions as well, but that's the "headline" one.
Latest Ubuntu / Mint now have 3.11.0-18. Anyway I don't think my "old lady
neighbour" will ever hear about balance or care, and will ever try to run it
on her laptop. She would first have to figure out what a terminal and command
line are ;-)
> We don't get many bug reports of kernel oopses in send. This may be
> that we don't have many people trying to use it (it is, after all,
> fairly deep and poorly explained magic at the moment). It may be that
> you have some corruption that's gone undetected otherwise,
Well, that's a rather "young" BTRFS setup (less than a month) that passes
scrub without detecting any error, and a plain "btrfs send" works, then an
incremental one fails...
> send code isn't handling it well. Or it may be an actual bug in send.
I would tend to believe so ;-)
> At least you've reported it. (It might also be worth putting a copy of
> the report on bugzilla.kernel.org, because then it doesn't get
> forgotten in the email noise here).
> > - btrfs-defrag.sh hangs because of some glitch with "filefrag".
>
> Is that a btrfs problem, or a filefrag problem?
Looks like it's a filefrag problem. Looks like filefrag stalls forever trying to
figure out the fragmentation status of some files...
> btrfs-defrag.sh isn't something I've heard of before, so I'd say it's
> unlikely to be maintained by any of the main btrfs developers (and hence is
> much more likely to be unmaintained or just plain broken in general).
It's a useful script that can be found there
https://gitorious.org/btrfs-defrag
...and it's maintained by Dmitry, who's a nice, responsive and helpful guy.
> > - bedup crashes badly and looks completely unmaintained as far as I can
> > tell and nobody seems to care.
>
> That's because nobody here is connected to bedup in any way. It was
> a third-party piece of software written by someone (I don't even
> recall who) who hasn't, as far as I know, engaged with the main btrfs
> developers at all.
bedup is mentioned on the BTRFS wiki
https://btrfs.wiki.kernel.org/index.php/Deduplication
...as being the only current way to perform BTRFS deduplication. I found it in
the wiki and believed/hoped it was something more "official and maintained" than
what you seem to mean - alas...
Actually deduplication WAS the reason why I recently made the move to BTRFS
again, for deduplication in ZFS is working, but *SO* memory hungry and
performance killer unless you have *lots* of RAM...
So I wanted to give a try at BTRFS offline bedup.
> > Soooo weeelllll... Looks like readiness for prime time is still
> > ahead of us...
>
> I think that's fair to say. However, it is noticeably improving
> over time. The timescales are just quite long.
If the timescales become really too long, people will just end up sticking with
the idea that BTRFS is not ready for production and won't be any time in the
foreseeable future...
Kind regards.
--
Swâmi Petaramesh <swami@petaramesh.org> http://petaramesh.org PGP 9076E32E
* Re: Massive BTRFS performance degradation
2014-03-09 12:10 ` Swâmi Petaramesh
@ 2014-03-09 17:14 ` boris
0 siblings, 0 replies; 32+ messages in thread
From: boris @ 2014-03-09 17:14 UTC (permalink / raw)
To: linux-btrfs
Swâmi Petaramesh <swami <at> petaramesh.org> writes:
> Actually deduplication WAS the reason why I recently made the move to BTRFS
> again, for deduplication in ZFS is working, but *SO* memory hungry and
> performance killer unless you have *lots* of RAM...
>
If you think about what dedup has to do, it's going to be fairly memory
hungry; hopefully there are a few maths (yes, it's maths not math! Think of
the game dominoes :-D ) bods on the team.
<starts to get excited about how one would tackle it then decides he needs
to get out more>
* Re: Massive BTRFS performance degradation
2014-03-09 8:17 ` Swâmi Petaramesh
2014-03-09 10:01 ` Martin Steigerwald
@ 2014-03-09 17:36 ` Austin S Hemmelgarn
2014-03-09 18:55 ` Tobias Holst
1 sibling, 1 reply; 32+ messages in thread
From: Austin S Hemmelgarn @ 2014-03-09 17:36 UTC (permalink / raw)
To: Swâmi Petaramesh, linux-btrfs; +Cc: impactoria
On 03/09/2014 04:17 AM, Swâmi Petaramesh wrote:
> On Sunday, 9 March 2014 08:48:20 KC wrote:
>> I am experiencing massive performance degradation on my BTRFS
>> root partition on SSD.
>
> BTW, is BTRFS still a SSD-killer ? It had this reputation a while
> ago, and I'm not sure if this still is the case, but I don't dare
> (yet) converting to BTRFS one of my laptops that has a SSD...
>
Actually, because of the COW nature of BTRFS, it should be better for
SSD's than stuff like ext4 (which DOES kill SSD's when journaling is
enabled because it ends up doing thousands of read-modify-write cycles
to the same 128k of the disk under just generic usage). Just make
sure that you use the 'ssd' and 'discard' mount options.
* Re: Massive BTRFS performance degradation
2014-03-09 17:36 ` Massive BTRFS performance degradation Austin S Hemmelgarn
@ 2014-03-09 18:55 ` Tobias Holst
0 siblings, 0 replies; 32+ messages in thread
From: Tobias Holst @ 2014-03-09 18:55 UTC (permalink / raw)
To: Austin S Hemmelgarn; +Cc: Swâmi Petaramesh, linux-btrfs, impactoria
2014-03-09 18:36 GMT+01:00 Austin S Hemmelgarn <ahferroin7@gmail.com>:
> On 03/09/2014 04:17 AM, Swâmi Petaramesh wrote:
>> On Sunday, 9 March 2014 08:48:20 KC wrote:
>>> I am experiencing massive performance degradation on my BTRFS
>>> root partition on SSD.
>>
>> BTW, is BTRFS still a SSD-killer ? It had this reputation a while
>> ago, and I'm not sure if this still is the case, but I don't dare
>> (yet) converting to BTRFS one of my laptops that has a SSD...
>>
> Actually, because of the COW nature of BTRFS, it should be better for
> SSD's than stuff like ext4 (which DOES kill SSD's when journaling is
> enabled because it ends up doing thousands of read-modify-write cycles
> to the same 128k of the disk under just generic usage). Just make
> sure that you use the 'ssd' and 'discard' mount options.
Every modern SSD does "Wear Leveling". Doing a read-modify-write cycle
on the same block doesn't mean it writes to the same memory cell. The
SSD-controller distributes the write-cycles over all (empty) cells. So
in best-case every cell in the SSD is used equally, no matter of doing
random writes or writing the same block over and over. This works
better with lots of empty space on the SSD, that's why you should
never use more than 90% of the space on a SSD. Garbage collection and
TRIM also help the SSD-controller to find empty cells.
* Re: discard synchronous on most SSDs?
2014-03-09 11:33 ` Hugo Mills
2014-03-09 11:54 ` Martin Steigerwald
2014-03-09 12:10 ` Swâmi Petaramesh
@ 2014-03-14 2:11 ` Marc MERLIN
2014-03-14 3:39 ` Chris Murphy
2 siblings, 1 reply; 32+ messages in thread
From: Marc MERLIN @ 2014-03-14 2:11 UTC (permalink / raw)
To: Hugo Mills, Swâmi Petaramesh, Martin Steigerwald,
linux-btrfs
On Sun, Mar 09, 2014 at 11:33:50AM +0000, Hugo Mills wrote:
> discard is, except on the very latest hardware, a synchronous command
> (it's a limitation of the SATA standard), and therefore results in
> very very poor performance.
Interesting. How do I know if a given SSD will hang on discard?
Is a Samsung EVO 840 1TB SSD latest hardware enough, or not? :)
Thanks
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
* Re: discard synchronous on most SSDs?
2014-03-14 2:11 ` discard synchronous on most SSDs? Marc MERLIN
@ 2014-03-14 3:39 ` Chris Murphy
2014-03-14 5:17 ` Marc MERLIN
2014-03-14 7:27 ` Chris Samuel
0 siblings, 2 replies; 32+ messages in thread
From: Chris Murphy @ 2014-03-14 3:39 UTC (permalink / raw)
To: Btrfs BTRFS
On Mar 13, 2014, at 8:11 PM, Marc MERLIN <marc@merlins.org> wrote:
> On Sun, Mar 09, 2014 at 11:33:50AM +0000, Hugo Mills wrote:
>> discard is, except on the very latest hardware, a synchronous command
>> (it's a limitation of the SATA standard), and therefore results in
>> very very poor performance.
>
> Interesting. How do I know if a given SSD will hang on discard?
> Is a Samsung EVO 840 1TB SSD latest hardware enough, or not? :)
smartctl -a or -x will tell you what SATA revision is in place. The queued trim support is in SATA Rev 3.1. I'm not certain if this requires only the drive to support that revision level, or both controller and drive.
Chris Murphy
* Re: discard synchronous on most SSDs?
2014-03-14 3:39 ` Chris Murphy
@ 2014-03-14 5:17 ` Marc MERLIN
2014-03-14 7:33 ` Chris Samuel
` (2 more replies)
2014-03-14 7:27 ` Chris Samuel
1 sibling, 3 replies; 32+ messages in thread
From: Marc MERLIN @ 2014-03-14 5:17 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
On Thu, Mar 13, 2014 at 09:39:02PM -0600, Chris Murphy wrote:
>
> On Mar 13, 2014, at 8:11 PM, Marc MERLIN <marc@merlins.org> wrote:
>
> > On Sun, Mar 09, 2014 at 11:33:50AM +0000, Hugo Mills wrote:
> >> discard is, except on the very latest hardware, a synchronous command
> >> (it's a limitation of the SATA standard), and therefore results in
> >> very very poor performance.
> >
> > Interesting. How do I know if a given SSD will hang on discard?
> > Is a Samsung EVO 840 1TB SSD latest hardware enough, or not? :)
>
> smartctl -a or -x will tell you what SATA revision is in place. The queued trim support is in SATA Rev 3.1. I'm not certain if this requires only the drive to support that revision level, or both controller and drive.
I'm not sure I'm seeing this, which field is that?
=== START OF INFORMATION SECTION ===
Device Model: Samsung SSD 840 EVO 1TB
Serial Number: S1D9NEAD934600N
LU WWN Device Id: 5 002538 85009a8ff
Firmware Version: EXT0BB0Q
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Size: 512 bytes logical/physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4c
Local Time is: Thu Mar 13 22:15:14 2014 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (15000) seconds.
Offline data collection
capabilities: (0x53) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 250) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
5 Reallocated_Sector_Ct PO--CK 100 100 010 - 0
9 Power_On_Hours -O--CK 099 099 000 - 2219
12 Power_Cycle_Count -O--CK 099 099 000 - 659
177 Wear_Leveling_Count PO--C- 099 099 000 - 3
179 Used_Rsvd_Blk_Cnt_Tot PO--C- 100 100 010 - 0
181 Program_Fail_Cnt_Total -O--CK 100 100 010 - 0
182 Erase_Fail_Count_Total -O--CK 100 100 010 - 0
183 Runtime_Bad_Block PO--C- 100 100 010 - 0
187 Reported_Uncorrect -O--CK 100 100 000 - 0
190 Airflow_Temperature_Cel -O--CK 054 041 000 - 46
195 Hardware_ECC_Recovered -O-RC- 200 200 000 - 0
199 UDMA_CRC_Error_Count -OSRCK 100 100 000 - 0
235 Unknown_Attribute -O--C- 099 099 000 - 35
241 Total_LBAs_Written -O--CK 099 099 000 - 12186944165
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
* Re: discard synchronous on most SSDs?
2014-03-14 3:39 ` Chris Murphy
2014-03-14 5:17 ` Marc MERLIN
@ 2014-03-14 7:27 ` Chris Samuel
1 sibling, 0 replies; 32+ messages in thread
From: Chris Samuel @ 2014-03-14 7:27 UTC (permalink / raw)
To: linux-btrfs
On Thu, 13 Mar 2014 09:39:02 PM Chris Murphy wrote:
> smartctl -a or -x will tell you what SATA revision is in place. The queued
> trim support is in SATA Rev 3.1. I'm not certain if this requires only the
> drive to support that revision level, or both controller and drive.
Both I'd say as I believe it's the controller that has to issue it to the
drive, and the drive needs to understand it.
cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
* Re: discard synchronous on most SSDs?
2014-03-14 5:17 ` Marc MERLIN
@ 2014-03-14 7:33 ` Chris Samuel
2014-03-14 19:26 ` Marc MERLIN
2014-03-15 4:06 ` Chris Samuel
2014-03-14 12:07 ` Duncan
2014-03-14 21:44 ` Chris Murphy
2 siblings, 2 replies; 32+ messages in thread
From: Chris Samuel @ 2014-03-14 7:33 UTC (permalink / raw)
To: linux-btrfs
Hi Marc,
On Thu, 13 Mar 2014 10:17:50 PM Marc MERLIN wrote:
> I'm not sure I'm seeing this, which field is that?
I *think* you want smartctl -i instead, and look for the field that says
something like:
ATA Version is: ATA8-ACS, ACS-2 T13/2015-D revision 3
So if my understanding is correct that says it's just rev. 3.0 so TRIM for
this is synchronous.
Good luck!
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
* Re: discard synchronous on most SSDs?
2014-03-14 5:17 ` Marc MERLIN
2014-03-14 7:33 ` Chris Samuel
@ 2014-03-14 12:07 ` Duncan
2014-03-14 21:44 ` Chris Murphy
2 siblings, 0 replies; 32+ messages in thread
From: Duncan @ 2014-03-14 12:07 UTC (permalink / raw)
To: linux-btrfs
Marc MERLIN posted on Thu, 13 Mar 2014 22:17:50 -0700 as excerpted:
> On Thu, Mar 13, 2014 at 09:39:02PM -0600, Chris Murphy wrote:
>>
>> On Mar 13, 2014, at 8:11 PM, Marc MERLIN <marc@merlins.org> wrote:
>>
>> > On Sun, Mar 09, 2014 at 11:33:50AM +0000, Hugo Mills wrote:
>> >> discard is, except on the very latest hardware, a synchronous
>> >> command (it's a limitation of the SATA standard), and therefore
>> >> results in very very poor performance.
>> >
>> > Interesting. How do I know if a given SSD will hang on discard?
>> > Is a Samsung EVO 840 1TB SSD latest hardware enough, or not? :)
>>
>> smartctl -a or -x will tell you what SATA revision is in place. The
>> queued trim support is in SATA Rev 3.1. I'm not certain if this
>> requires only the drive to support that revision level, or both
>> controller and drive.
>
> I'm not sure I'm seeing this, which field is that?
> ATA Version is: 8
> ATA Standard is: ATA-8-ACS revision 4c
Your drive didn't report it, but here, I have SATA fields as well, in
addition to the ATA fields:
Here's the fields from my Corsair Neutron SSDs:
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 2.5, 6.0 Gb/s
Here's the fields from my Seagate 500-gig 2.5-inch spinning rust:
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 2.6, 3.0 Gb/s
(More about that below.)
Smartctl version here is 6.2 2013-07-26 r3841, according to the output.
(I'm running gentoo/~amd64 FWIW so it's a local-build). You snipped that
bit of your output so I can't compare.
But it may also depend on whether smartctl auto-detected and used the ATA
or the SCSI (or something else) command set and how your devices are
actually connected, plus BIOS settings, etc. See the manpage
documentation for the -d TYPE (--device=TYPE) option and the ATA/SCSI/SAT
discussion rather further down the manpage for more.
Here I have direct SATA connections with the BIOS set to AHCI mode and am
thus using the kernel's AHCI drivers, since that's the most common SATA
chipset standard these days, thus increasing portability given my
monolithic kernel build.
smartctl's -d test reports an original guess of scsi, changed to sat
after detection.
Of course connection via USB bridge or the like complicates things
considerably.
Meanwhile, SATA 2.5, 6 Gb/s on the SSDs, SATA 2.6, 3 Gb/s on the spinning
rust? WTF? The SSDs have SATA 2.5 but 6 Gb/s while the spinning rust
has a later 2.6 but only 3 Gb/s (tho of course on a mechanical drive the
bus speed won't be the bottleneck)? Now I'm confused.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: discard synchronous on most SSDs?
2014-03-14 7:33 ` Chris Samuel
@ 2014-03-14 19:26 ` Marc MERLIN
2014-03-14 19:57 ` Martin K. Petersen
2014-03-15 4:06 ` Chris Samuel
1 sibling, 1 reply; 32+ messages in thread
From: Marc MERLIN @ 2014-03-14 19:26 UTC (permalink / raw)
To: Chris Samuel, Duncan, Christopher Corsi; +Cc: linux-btrfs
On Fri, Mar 14, 2014 at 12:07:54PM +0000, Duncan wrote:
> Marc MERLIN posted on Thu, 13 Mar 2014 22:17:50 -0700 as excerpted:
>
> > On Thu, Mar 13, 2014 at 09:39:02PM -0600, Chris Murphy wrote:
> >>
> >> On Mar 13, 2014, at 8:11 PM, Marc MERLIN <marc@merlins.org> wrote:
> >>
> >> > On Sun, Mar 09, 2014 at 11:33:50AM +0000, Hugo Mills wrote:
> >> >> discard is, except on the very latest hardware, a synchronous
> >> >> command (it's a limitation of the SATA standard), and therefore
> >> >> results in very very poor performance.
> >> >
> >> > Interesting. How do I know if a given SSD will hang on discard?
> >> > Is a Samsung EVO 840 1TB SSD latest hardware enough, or not? :)
> >>
> >> smartctl -a or -x will tell you what SATA revision is in place. The
> >> queued trim support is in SATA Rev 3.1. I'm not certain if this
> >> requires only the drive to support that revision level, or both
> >> controller and drive.
> >
> > I'm not sure I'm seeing this, which field is that?
>
> > ATA Version is: 8
> > ATA Standard is: ATA-8-ACS revision 4c
>
> Your drive didn't report it, but here, I have SATA fields as well, in
> addition to the ATA fields:
>
> Here's the fields from my Corsair Neutron SSDs:
>
> ATA Version is: ATA8-ACS (minor revision not indicated)
> SATA Version is: SATA 2.5, 6.0 Gb/s
>
> Here's the fields from my Seagate 500-gig 2.5-inch spinning rust:
>
> ATA Version is: ATA8-ACS T13/1699-D revision 4
> SATA Version is: SATA 2.6, 3.0 Gb/s
Ok, my smartmontools was too old. I got a newer one and now have proper
output:
Device Model: Samsung SSD 840 EVO 1TB
Serial Number: S1D9NEAD934600N
LU WWN Device Id: 5 002538 85009a8ff
Firmware Version: EXT0BB0Q
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Mar 14 10:49:39 2014 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
So I have SATA 3.1, that's great news; it means I can keep using discard
without worrying about performance and hangs.
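For anyone scripting this check, the SATA revision can be pulled out of the `smartctl -i` output. This is only a minimal sketch, assuming the `SATA Version is:` line looks like the one above; the function name `sata_version` is made up for illustration. As comes up later in this thread, the reported revision on its own does not tell you whether the drive implements queued TRIM.

```shell
#!/bin/sh
# sata_version (hypothetical helper): read `smartctl -i DEVICE` output on
# stdin and print the SATA revision, e.g. "3.1". Prints nothing when the
# line is absent, which is what older smartmontools builds produce.
sata_version() {
    sed -n 's/^SATA Version is:[[:space:]]*SATA \([0-9][0-9.]*\).*/\1/p'
}

# Usage (needs root and a real device):
#   smartctl -i /dev/sda | sata_version
```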
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
* Re: discard synchronous on most SSDs?
2014-03-14 19:26 ` Marc MERLIN
@ 2014-03-14 19:57 ` Martin K. Petersen
2014-03-14 20:46 ` Holger Hoffstätte
2014-03-15 5:25 ` Chris Samuel
0 siblings, 2 replies; 32+ messages in thread
From: Martin K. Petersen @ 2014-03-14 19:57 UTC (permalink / raw)
To: Marc MERLIN; +Cc: Chris Samuel, Duncan, Christopher Corsi, linux-btrfs
>>>>> "Marc" == Marc MERLIN <marc@merlins.org> writes:
Marc,
Marc> So I have Sata 3.1, that's great news, it means I can keep using
Marc> discard without worrying about performance and hangs
The fact that the drive reports compliance with a certain version of
SATA does not in any way imply that it implements all commands defined
in that specification.
The location where queued TRIM support is reported is somewhat unusual.
And last I looked hdparm -I had no infrastructure in place to report
stuff contained in log pages.
The kernel does look in the right place to determine whether to issue the
queued or the unqueued variant. But the information isn't exported to
userland.
So right now I'm afraid we don't have a good way for a user to determine
whether a device supports queued trims or not.
I guess we could consider either adding an ATA-specific "I don't suck"
flag in sysfs, adding the missing code to hdparm, or both...
--
Martin K. Petersen Oracle Linux Engineering
* Re: discard synchronous on most SSDs?
2014-03-14 19:57 ` Martin K. Petersen
@ 2014-03-14 20:46 ` Holger Hoffstätte
2014-03-15 4:21 ` Marc MERLIN
2014-03-15 5:25 ` Chris Samuel
1 sibling, 1 reply; 32+ messages in thread
From: Holger Hoffstätte @ 2014-03-14 20:46 UTC (permalink / raw)
To: linux-btrfs
On Fri, 14 Mar 2014 15:57:41 -0400, Martin K. Petersen wrote:
> So right now I'm afraid we don't have a good way for a user to determine
> whether a device supports queued trims or not.
Mount with discard, unpack kernel tree, sync, rm -rf tree.
If it takes several seconds, you have sync discard, no?
This changed somewhere around kernel 3.8.x; before that it used to be
acceptably fast. Since then I only do batch trims, daily (server) or
weekly (laptop).
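A rough way to script the test described above, as a sketch only: it uses a pile of small generated files as a stand-in for an unpacked kernel tree, and the function name `time_tree_delete` and the default file count are assumptions for illustration. The timer has one-second resolution, which is enough to tell "about a second" from "over a minute". Run it once on a filesystem mounted with discard and once without, and compare.

```shell
#!/bin/sh
# time_tree_delete DIR [NFILES] (hypothetical helper): create NFILES small
# files in a scratch directory under DIR, flush them to disk, then print
# how many whole seconds `rm -rf` plus a final sync take.
time_tree_delete() {
    dir=$1
    nfiles=${2:-2000}
    scratch=$(mktemp -d "$dir/discard-test.XXXXXX") || return 1
    i=0
    while [ "$i" -lt "$nfiles" ]; do
        echo "filler $i" > "$scratch/f$i"
        i=$((i + 1))
    done
    sync                      # make sure the files really hit the device
    start=$(date +%s)
    rm -rf "$scratch"
    sync                      # include any discard work triggered by the delete
    end=$(date +%s)
    echo "$((end - start))"
}

# Usage: time_tree_delete /mnt/test 50000
```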
-h
* Re: discard synchronous on most SSDs?
2014-03-14 5:17 ` Marc MERLIN
2014-03-14 7:33 ` Chris Samuel
2014-03-14 12:07 ` Duncan
@ 2014-03-14 21:44 ` Chris Murphy
2 siblings, 0 replies; 32+ messages in thread
From: Chris Murphy @ 2014-03-14 21:44 UTC (permalink / raw)
To: Marc MERLIN; +Cc: Btrfs BTRFS
On Mar 13, 2014, at 11:17 PM, Marc MERLIN <marc@merlins.org> wrote:
> On Thu, Mar 13, 2014 at 09:39:02PM -0600, Chris Murphy wrote:
>>
>> On Mar 13, 2014, at 8:11 PM, Marc MERLIN <marc@merlins.org> wrote:
>>
>>> On Sun, Mar 09, 2014 at 11:33:50AM +0000, Hugo Mills wrote:
>>>> discard is, except on the very latest hardware, a synchronous command
>>>> (it's a limitation of the SATA standard), and therefore results in
>>>> very very poor performance.
>>>
>>> Interesting. How do I know if a given SSD will hang on discard?
>>> Is a Samsung EVO 840 1TB SSD latest hardware enough, or not? :)
>>
>> smartctl -a or -x will tell you what SATA revision is in place. The queued trim support is in SATA Rev 3.1. I'm not certain if this requires only the drive to support that revision level, or both controller and drive.
>
> I'm not sure I'm seeing this, which field is that?
>
> === START OF INFORMATION SECTION ===
> Device Model: Samsung SSD 840 EVO 1TB
> Serial Number: S1D9NEAD934600N
> LU WWN Device Id: 5 002538 85009a8ff
> Firmware Version: EXT0BB0Q
> User Capacity: 1,000,204,886,016 bytes [1.00 TB]
> Sector Size: 512 bytes logical/physical
> Device is: Not in smartctl database [for details use: -P showall]
> ATA Version is: 8
> ATA Standard is: ATA-8-ACS revision 4c
> Local Time is: Thu Mar 13 22:15:14 2014 PDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
After ATA Version for me.
$ smartctl -a /dev/disk0
smartctl 6.1 2013-03-16 r3800 [x86_64-apple-darwin12.3.0] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Samsung based SSDs
Device Model: SAMSUNG SSD 830 Series
Serial Number: S0Z4NEAC933856
LU WWN Device Id: 5 002538 043584d30
Firmware Version: CXM03B1Q
User Capacity: 256,060,514,304 bytes [256 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2 T13/2015-D revision 2
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Mar 14 15:37:07 2014 MDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
The Samsung hardware by and large is fairly well behaved with discard in my experience. But it does really depend a lot on the workload. I'd notice occasional random freezes for a couple of seconds when I had it enabled in OS X (totally different animal from the kernel up), nothing severe. But it was annoying enough that I disabled it, and the problem went away. Apple still doesn't enable trim by default on non-Apple SSDs, so the idea that "everyone else" is doing this isn't true. The Windows implementation is rather complex, and also isn't always used, contrary to what's been reported (on the everybody panic or get mad NOW type web sites).
If you want to be conservative about it, I'd say just manually run fstrim when the system is idle. Do that once every week or two. Cron job it if you want.
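A periodic trim could be wrapped up along these lines. This is a sketch under assumptions: the `trim_all` name and the `FSTRIM` override (handy for a dry run) are invented here and are not part of util-linux; only `fstrim -v MOUNTPOINT` is the real command.

```shell
#!/bin/sh
# trim_all MOUNTPOINT... (hypothetical helper): run fstrim on each mount
# point in turn. Setting FSTRIM=echo gives a dry run that only prints the
# arguments; that knob exists only in this sketch.
trim_all() {
    for mnt in "$@"; do
        "${FSTRIM:-fstrim}" -v "$mnt" || echo "fstrim failed on $mnt" >&2
    done
}

# Usage (as root): trim_all / /home
```

Dropping a script that calls `trim_all` into a weekly cron directory, where the distribution provides one, would give the once-a-week schedule.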
Chris Murphy
* Re: discard synchronous on most SSDs?
2014-03-14 7:33 ` Chris Samuel
2014-03-14 19:26 ` Marc MERLIN
@ 2014-03-15 4:06 ` Chris Samuel
2014-03-16 16:07 ` Martin K. Petersen
1 sibling, 1 reply; 32+ messages in thread
From: Chris Samuel @ 2014-03-15 4:06 UTC (permalink / raw)
To: linux-btrfs
On Fri, 14 Mar 2014 06:33:24 PM Chris Samuel wrote:
> I *think* you want smartctl -i instead, and look for the field that says
> something like:
>
> ATA Version is: ATA8-ACS, ACS-2 T13/2015-D revision 3
Late night, cut and pasted the wrong line of output, mine says:
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Of course that's what the drive is reporting it supports, I'm not sure whether
that's the result of what has been negotiated between the controller and drive
or purely what the drive supports.
To get more information from smartctl you can use the --identify=wb option
instead of -i and that should give you a lot more detail about what the
drive claims to (and not to) support. On the version in Kubuntu 13.10
(6.1+svn3812-1) it only reports 3 things regarding TRIM for my drives.
chris@quad:/tmp$ sudo smartctl --identify=wb -d sat /dev/sdb | egrep -i 'trim|discard'
69 14 1 Deterministic data after trim supported
69 5 0 Trimmed LBA range(s) returning zeroed data supported
169 0 1 Trim bit in DATA SET MANAGEMENT command supported
I'm currently doing a git clone of their SVN repo to see if there's any new
functionality that will gather any more information.
cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
* Re: discard synchronous on most SSDs?
2014-03-14 20:46 ` Holger Hoffstätte
@ 2014-03-15 4:21 ` Marc MERLIN
2014-03-15 9:38 ` Holger Hoffstätte
0 siblings, 1 reply; 32+ messages in thread
From: Marc MERLIN @ 2014-03-15 4:21 UTC (permalink / raw)
To: Holger Hoffstätte; +Cc: linux-btrfs
On Fri, Mar 14, 2014 at 08:46:09PM +0000, Holger Hoffstätte wrote:
> On Fri, 14 Mar 2014 15:57:41 -0400, Martin K. Petersen wrote:
>
> > So right now I'm afraid we don't have a good way for a user to determine
> > whether a device supports queued trims or not.
>
> Mount with discard, unpack kernel tree, sync, rm -rf tree.
> If it takes several seconds, you have sync discard, no?
Mmmh, interesting point.
legolas:/usr/src# time rm -rf linux-3.14-rc5
real 0m1.584s
user 0m0.008s
sys 0m1.524s
I remounted my FS with remount,nodiscard, and the time was the same.
> This changed somewhere around kernel 3.8.x; before that it used to be
> acceptably fast. Since then I only do batch trims, daily (server) or
> weekly (laptop).
I've never really timed this before. Is it supposed to be faster than 1.5s on
a fast SSD?
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
* Re: discard synchronous on most SSDs?
2014-03-14 19:57 ` Martin K. Petersen
2014-03-14 20:46 ` Holger Hoffstätte
@ 2014-03-15 5:25 ` Chris Samuel
2014-03-15 6:48 ` Chris Samuel
2014-03-16 16:22 ` Martin K. Petersen
1 sibling, 2 replies; 32+ messages in thread
From: Chris Samuel @ 2014-03-15 5:25 UTC (permalink / raw)
To: linux-btrfs
On Fri, 14 Mar 2014 03:57:41 PM Martin K. Petersen wrote:
> The fact that the drive reports compliance with a certain version of
> SATA does not in any way imply that it implements all commands defined
> in that specification.
It looks like drives that do support it can be detected with the kernel helper
function ata_fpdma_dsm_supported() defined in include/linux/libata.h.
I wonder if it would be possible to use that knowledge to extend the
smartctl's --identify functionality to report this?
Not even all drives that implement it do so correctly. The kernel has a
blacklist of drives that don't, and it currently lists just two:
/* devices that don't properly handle queued TRIM commands */
{ "Micron_M500*",           NULL,  ATA_HORKAGE_NO_NCQ_TRIM, },
{ "Crucial_CT???M500SSD*",  NULL,  ATA_HORKAGE_NO_NCQ_TRIM, },
cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
* Re: discard synchronous on most SSDs?
2014-03-15 5:25 ` Chris Samuel
@ 2014-03-15 6:48 ` Chris Samuel
2014-03-15 11:26 ` Duncan
2014-03-16 16:22 ` Martin K. Petersen
1 sibling, 1 reply; 32+ messages in thread
From: Chris Samuel @ 2014-03-15 6:48 UTC (permalink / raw)
To: linux-btrfs
On Sat, 15 Mar 2014 04:25:05 PM Chris Samuel wrote:
> I wonder if it would be possible to use that knowledge to extend the
> smartctl's --identify functionality to report this?
After reading the SATA 3.1 spec I believe that smartctl *can* indicate if a
drive claims to support SATA 3.1 NCQ TRIM, thus:
$ sudo smartctl --identify /dev/sdb | fgrep 'Trim bit in DATA SET MANAGEMENT'
169 0 1 Trim bit in DATA SET MANAGEMENT command supported
$
If that command returns nothing then it's not reported as supported (and I've
tested that). You can get the same info with hdparm -I.
Of course, as Martin said, that doesn't necessarily mean the kernel is using
that reported ability.
My puzzle now is that I have two SSD drives that report supporting NCQ TRIM
(one confirmed via product info) but report only supporting SATA 3.0 not 3.1.
cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
* Re: discard synchronous on most SSDs?
2014-03-15 4:21 ` Marc MERLIN
@ 2014-03-15 9:38 ` Holger Hoffstätte
0 siblings, 0 replies; 32+ messages in thread
From: Holger Hoffstätte @ 2014-03-15 9:38 UTC (permalink / raw)
To: linux-btrfs
On Fri, 14 Mar 2014 21:21:16 -0700, Marc MERLIN wrote:
> On Fri, Mar 14, 2014 at 08:46:09PM +0000, Holger Hoffstätte wrote:
>> On Fri, 14 Mar 2014 15:57:41 -0400, Martin K. Petersen wrote:
>>
>> > So right now I'm afraid we don't have a good way for a user to
>> > determine whether a device supports queued trims or not.
>>
>> Mount with discard, unpack kernel tree, sync, rm -rf tree.
>> If it takes several seconds, you have sync discard, no?
>
> Mmmh, interesting point.
>
> legolas:/usr/src# time rm -rf linux-3.14-rc5
> real 0m1.584s
> user 0m0.008s
> sys 0m1.524s
>
> I remounted my FS with remount,nodiscard, and the time was the same.
>
>> This changed somewhere around kernel 3.8.x; before that it used to be
>> acceptably fast. Since then I only do batch trims, daily (server) or
>> weekly (laptop).
>
> I've never really timed this before. Is it supposed to be faster than
> 1.5s on a fast SSD?
No, ~1s + noise is OK and seems normal, depending on filesystem and
phase of the moon. To contrast here is the output from my laptop,
which has an old but still-going-strong Intel G2 with ext4:
$smartctl -i /dev/sda | grep ATA
ATA Version is: ATA/ATAPI-7 T13/1532D revision 1
SATA Version is: SATA 2.6, 3.0 Gb/s
without discard:
rm -rf linux-3.12.14 0.05s user 1.28s system 98% cpu 1.364 total
remounted with discard & after an initial manual fstrim:
rm -rf linux-3.12.14 1.90s user 0.02s system 2% cpu 1:07.45 total
I think these numbers speak for themselves. :)
It's really good to know that SATA 3.1 apparently fixed this.
cheers
Holger
* Re: discard synchronous on most SSDs?
2014-03-15 6:48 ` Chris Samuel
@ 2014-03-15 11:26 ` Duncan
2014-03-15 22:48 ` Chris Samuel
2014-03-16 6:06 ` Marc MERLIN
0 siblings, 2 replies; 32+ messages in thread
From: Duncan @ 2014-03-15 11:26 UTC (permalink / raw)
To: linux-btrfs
Chris Samuel posted on Sat, 15 Mar 2014 17:48:56 +1100 as excerpted:
> $ sudo smartctl --identify /dev/sdb | fgrep 'Trim bit in DATA SET MANAGEMENT'
> 169 0 1 Trim bit in DATA SET MANAGEMENT command supported
> $
>
> If that command returns nothing then it's not reported as supported (and
> I've tested that). You can get the same info with hdparm -I.
> My puzzle now is that I have two SSD drives that report supporting NCQ
> TRIM (one confirmed via product info) but report only supporting SATA
> 3.0 not 3.1.
My SATA 2.5 SSDs reported earlier, report support for it too, so it's
apparently not SATA 3.1 limited. (Note that I'm simply grepping word
169, in the command below. Since word 169 is trim support...)
sudo smartctl --identify /dev/sda | grep '^ 169'
169 - 0x0001 Data Set Management support
169 0 1 Trim bit in DATA SET MANAGEMENT command supported
Either that or that feature bit simply indicates trim support, not NCQ
trim support.
But it can be noted that if SATA 3.1 requires trim to be NCQ if its
supported at all (spinning rust would thus get a pass), then claiming 3.1
support as well as trim support should be the equivalent of claiming NCQ
trim support, likely with no indicator of whether that trim support is NCQ
or not, pre-3.1.
... Which would mean that my SATA 2.5 and your SATA 3.0 drives are simply
indicating trim support, not specifically NCQ trim support.
I guess you'd have to check the SATA 2.5 and 3.0 specs to find that out.
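The word-169 check can be made a little stricter than a plain grep by matching the word, bit, and value columns. A sketch only, assuming the three-column layout seen in the `smartctl --identify` output quoted above; `trim_bit_set` is a made-up name. Going by the discussion here, a set bit appears to mean plain TRIM support and says nothing about whether the drive can queue it.

```shell
#!/bin/sh
# trim_bit_set (hypothetical helper): read `smartctl --identify DEVICE`
# output on stdin; exit 0 if word 169, bit 0 is reported with value 1,
# exit 1 otherwise.
trim_bit_set() {
    awk '$1 == 169 && $2 == 0 && $3 == 1 { found = 1 } END { exit !found }'
}

# Usage (as root):
#   smartctl --identify /dev/sda | trim_bit_set && echo "TRIM bit set"
```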
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: discard synchronous on most SSDs?
2014-03-15 11:26 ` Duncan
@ 2014-03-15 22:48 ` Chris Samuel
2014-03-16 6:06 ` Marc MERLIN
1 sibling, 0 replies; 32+ messages in thread
From: Chris Samuel @ 2014-03-15 22:48 UTC (permalink / raw)
To: linux-btrfs
On 15/03/14 22:26, Duncan wrote:
> Either that or that feature bit simply indicates trim support, not NCQ
> trim support.
You're quite right, I outsmarted myself by noticing the fact that the
kernel tests for ATA_LOG_NCQ_SEND_RECV_DSM_TRIM and unsets that for
drives that don't support NCQ DSM TRIM, and then seeing DSM TRIM in the
SATA 3.1 spec and inferring they were the same thing.
Looking closer at the kernel code that uses ATA_LOG_NCQ_SEND_RECV_DSM_TRIM
to decide which trim to issue, I see it falls back to ATA_DSM_TRIM if it
can't do the NCQ version.
Mea culpa!
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
* Re: discard synchronous on most SSDs?
2014-03-15 11:26 ` Duncan
2014-03-15 22:48 ` Chris Samuel
@ 2014-03-16 6:06 ` Marc MERLIN
2014-03-16 17:09 ` Chris Murphy
1 sibling, 1 reply; 32+ messages in thread
From: Marc MERLIN @ 2014-03-16 6:06 UTC (permalink / raw)
To: Duncan; +Cc: linux-btrfs
On Sat, Mar 15, 2014 at 11:26:27AM +0000, Duncan wrote:
> Chris Samuel posted on Sat, 15 Mar 2014 17:48:56 +1100 as excerpted:
>
> > $ sudo smartctl --identify /dev/sdb | fgrep 'Trim bit in DATA SET MANAGEMENT'
> > 169 0 1 Trim bit in DATA SET MANAGEMENT command supported
> > $
> >
> > If that command returns nothing then it's not reported as supported (and
> > I've tested that). You can get the same info with hdparm -I.
>
> > My puzzle now is that I have two SSD drives that report supporting NCQ
> > TRIM (one confirmed via product info) but report only supporting SATA
> > 3.0 not 3.1.
>
> My SATA 2.5 SSDs reported earlier, report support for it too, so it's
> apparently not SATA 3.1 limited. (Note that I'm simply grepping word
> 169, in the command below. Since word 169 is trim support...)
>
> sudo smartctl --identify /dev/sda | grep '^ 169'
> 169 - 0x0001 Data Set Management support
> 169 0 1 Trim bit in DATA SET MANAGEMENT command supported
>
> Either that or that feature bit simply indicates trim support, not NCQ
> trim support.
Mmmh, so now I'm confused.
See this:
=== START OF INFORMATION SECTION ===
Device Model: INTEL SSDSC2BW180A3L
Serial Number: CVCV215200XU180EGN
LU WWN Device Id: 5 001517 bb28c5317
Firmware Version: LE1i
User Capacity: 180,045,766,656 bytes [180 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Sat Mar 15 15:49:06 2014 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
polgara:/usr/src# smartctl --identify /dev/sda | grep '^ 169'
169 - 0x0001 Data Set Management support
169 0 1 Trim bit in DATA SET MANAGEMENT command supported
This is a super old SSD from 3 years ago. Clearly it can't support
asynchronous discard, right?
Yet, deleting a kernel tree also takes 1.5 seconds:
polgara:/usr/src# time rm -rf linux-3.14-rc5/
real 0m1.441s
user 0m0.048s
sys 0m1.352s
So maybe it's not the data level, but just the value of 169?
Either way, this SSD is more than 2 years old, maybe 3 actually.
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
* Re: discard synchronous on most SSDs?
2014-03-15 4:06 ` Chris Samuel
@ 2014-03-16 16:07 ` Martin K. Petersen
0 siblings, 0 replies; 32+ messages in thread
From: Martin K. Petersen @ 2014-03-16 16:07 UTC (permalink / raw)
To: Chris Samuel; +Cc: linux-btrfs
>>>>> "Chris" == Chris Samuel <chris@csamuel.org> writes:
Chris> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Chris> Of course that's what the drive is reporting it supports, I'm not
Chris> sure whether that's the result of what has been negotiated
Chris> between the controller and drive or purely what the drive
Chris> supports.
It's just what the drive reports. Often drives will implement features
before they are ratified in the spec and thus before they can claim
compliance with a specific version.
--
Martin K. Petersen Oracle Linux Engineering
* Re: discard synchronous on most SSDs?
2014-03-15 5:25 ` Chris Samuel
2014-03-15 6:48 ` Chris Samuel
@ 2014-03-16 16:22 ` Martin K. Petersen
2014-03-16 17:50 ` Marc MERLIN
1 sibling, 1 reply; 32+ messages in thread
From: Martin K. Petersen @ 2014-03-16 16:22 UTC (permalink / raw)
To: Chris Samuel; +Cc: linux-btrfs
>>>>> "Chris" == Chris Samuel <chris@csamuel.org> writes:
Chris> It looks like drives that do support it can be detected with the
Chris> kernel helper function ata_fpdma_dsm_supported() defined in
Chris> include/linux/libata.h.
Chris> I wonder if it would be possible to use that knowledge to extend
Chris> the smartctl's --identify functionality to report this?
Queued trim support is indicated in a log page and not the identify
information. However, we can get to the information we want using
smartctl's ability to look at log pages.
I don't have a single drive from any vendor in the lab that supports
queued trim, not even a prototype. I went out and bought a 840 EVO this
morning because the general lazyweb opinion seemed to indicate that this
drive supports queued trim. Well, it doesn't. At least not in the 120GB
version:
# smartctl -l gplog,0x13 /dev/sda
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.14.0-rc6+] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
General Purpose Log 0x13 does not exist (override with '-T permissive' option)
If there's a drive with a working queued trim implementation out there,
I'd like to know about it...
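The log-page check above can be turned into a scriptable test. A sketch, assuming smartctl prints the same "does not exist" message shown above when the log is absent; `ncq_log_missing` is a hypothetical name. Note the asymmetry: a missing log 0x13 means queued TRIM is not advertised, but a present log does not by itself prove it, since the relevant bit inside the log would still need to be examined.

```shell
#!/bin/sh
# ncq_log_missing (hypothetical helper): read `smartctl -l gplog,0x13 DEVICE`
# output on stdin; exit 0 if smartctl reported that General Purpose Log
# 0x13 does not exist, exit 1 otherwise.
ncq_log_missing() {
    grep -q 'General Purpose Log 0x13 does not exist'
}

# Usage (as root):
#   if smartctl -l gplog,0x13 /dev/sda | ncq_log_missing; then
#       echo "log 0x13 absent: queued TRIM not advertised"
#   fi
```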
--
Martin K. Petersen Oracle Linux Engineering
* Re: discard synchronous on most SSDs?
2014-03-16 6:06 ` Marc MERLIN
@ 2014-03-16 17:09 ` Chris Murphy
0 siblings, 0 replies; 32+ messages in thread
From: Chris Murphy @ 2014-03-16 17:09 UTC (permalink / raw)
To: Btrfs
On Mar 16, 2014, at 12:06 AM, Marc MERLIN <marc@merlins.org> wrote:
>
> Mmmh, so now I'm confused.
>
> See this:
>
> === START OF INFORMATION SECTION ===
> Device Model: INTEL SSDSC2BW180A3L
> Serial Number: CVCV215200XU180EGN
> LU WWN Device Id: 5 001517 bb28c5317
> Firmware Version: LE1i
> User Capacity: 180,045,766,656 bytes [180 GB]
> Sector Size: 512 bytes logical/physical
> Rotation Rate: Solid State Device
> Device is: Not in smartctl database [for details use: -P showall]
> ATA Version is: ACS-2 (minor revision not indicated)
> SATA Version is: SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
> Local Time is: Sat Mar 15 15:49:06 2014 PDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> polgara:/usr/src# smartctl --identify /dev/sda | grep '^ 169'
> 169 - 0x0001 Data Set Management support
> 169 0 1 Trim bit in DATA SET MANAGEMENT command supported
>
> This is a super old SSD from 3 years ago. Clearly it can't support
> asynchronous discard, right?
No. The first signs I saw of drives with queued trim appearing in the wild were in the 3rd quarter of 2013. I'm pretty sure SAS SSDs have always had a queued trim command. So in the workloads that demanded it and with a budget, this wouldn't have ever been a problem.
> Yet, deleting a kernel tree also takes 1.5 seconds:
> polgara:/usr/src# time rm -rf linux-3.14-rc5/
> real 0m1.441s
> user 0m0.048s
> sys 0m1.352s
I don't know that this is a good test, for two reasons. First, does rm always call trim before the rm command completes? If trim is batched or delayed it could happen well after. Second, and more of a factor, the queue needs to have pending commands in it that a non-queued trim command will have to wait for. The problem with non-queued trim is that it requires the queue to be empty. So you'd need a test or workload that causes that behavior to be a problem.
And yet another factor with trim is that some SSDs immediately, aggressively start garbage collection and become slow to respond to anything. While others are smarter about doing this when the drive isn't as busy.
Chris Murphy
* Re: discard synchronous on most SSDs?
2014-03-16 16:22 ` Martin K. Petersen
@ 2014-03-16 17:50 ` Marc MERLIN
0 siblings, 0 replies; 32+ messages in thread
From: Marc MERLIN @ 2014-03-16 17:50 UTC (permalink / raw)
To: Martin K. Petersen; +Cc: Chris Samuel, linux-btrfs
On Sun, Mar 16, 2014 at 12:22:05PM -0400, Martin K. Petersen wrote:
> queued trim, not even a prototype. I went out and bought a 840 EVO this
> morning because the general lazyweb opinion seemed to indicate that this
> drive supports queued trim. Well, it doesn't. At least not in the 120GB
> version:
>
> # smartctl -l gplog,0x13 /dev/sda
> smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.14.0-rc6+] (local build)
> Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
>
> General Purpose Log 0x13 does not exist (override with '-T permissive' option)
>
> If there's a drive with a working queued trim implementation out there,
> I'd like to know about it...
I tried that for you on my 840 EVO 1TB and got the same as you:
legolas:/usr/src# smartctl -l gplog,0x13 /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.14.0-rc5-amd64-i915-preempt-20140216c] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
General Purpose Log 0x13 does not exist (override with '-T permissive' option)
Now, back to the fact that it takes 1.5sec to delete a kernel tree with
discard on, and it doesn't seem faster with discard off on either that
drive or my very old Intel SSD: I'm starting to think that this is kind
of a non-problem and/or that something else makes rm of a kernel tree
take around 1.5sec.
Is it much faster on an SSD for someone else?
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Thread overview: 32+ messages
2014-03-09 7:48 Massive BTRFS performance degradation KC
2014-03-09 8:17 ` Swâmi Petaramesh
2014-03-09 10:01 ` Martin Steigerwald
2014-03-09 10:23 ` Swâmi Petaramesh
2014-03-09 11:33 ` Hugo Mills
2014-03-09 11:54 ` Martin Steigerwald
2014-03-09 12:10 ` Swâmi Petaramesh
2014-03-09 17:14 ` boris
2014-03-14 2:11 ` discard synchronous on most SSDs? Marc MERLIN
2014-03-14 3:39 ` Chris Murphy
2014-03-14 5:17 ` Marc MERLIN
2014-03-14 7:33 ` Chris Samuel
2014-03-14 19:26 ` Marc MERLIN
2014-03-14 19:57 ` Martin K. Petersen
2014-03-14 20:46 ` Holger Hoffstätte
2014-03-15 4:21 ` Marc MERLIN
2014-03-15 9:38 ` Holger Hoffstätte
2014-03-15 5:25 ` Chris Samuel
2014-03-15 6:48 ` Chris Samuel
2014-03-15 11:26 ` Duncan
2014-03-15 22:48 ` Chris Samuel
2014-03-16 6:06 ` Marc MERLIN
2014-03-16 17:09 ` Chris Murphy
2014-03-16 16:22 ` Martin K. Petersen
2014-03-16 17:50 ` Marc MERLIN
2014-03-15 4:06 ` Chris Samuel
2014-03-16 16:07 ` Martin K. Petersen
2014-03-14 12:07 ` Duncan
2014-03-14 21:44 ` Chris Murphy
2014-03-14 7:27 ` Chris Samuel
2014-03-09 17:36 ` Massive BTRFS performance degradation Austin S Hemmelgarn
2014-03-09 18:55 ` Tobias Holst