* Poor performance using discard
@ 2012-02-28 22:56 Thomas Lynema
2012-02-28 23:58 ` Peter Grandi
2012-02-29 1:22 ` Dave Chinner
0 siblings, 2 replies; 14+ messages in thread
From: Thomas Lynema @ 2012-02-28 22:56 UTC (permalink / raw)
To: xfs
Please reply to my personal email as well, as I am not subscribed to the
list.
I have a PP120GS25SSDR; it does support TRIM:
cat /sys/block/sdc/queue/discard_max_bytes
2147450880
The entire drive is one partition that is totally used by LVM.
I made a test vg and formatted it with mkfs.xfs. Then mounted it with
discard and got the following result when deleting a kernel source:
/dev/mapper/ssdvg0-testLV on /media/temp type xfs
(rw,noatime,nodiratime,discard)
time rm -rf linux-3.2.6-gentoo/
real 5m7.139s
user 0m0.080s
sys 0m1.580s
There were lockups where the system would pause for about a minute
during the process.
ext4 handles this scenario fine:
/dev/mapper/ssdvg0-testLV on /media/temp type ext4
(rw,noatime,nodiratime,discard)
time rm -rf linux-3.2.6-gentoo/
real 0m0.943s
user 0m0.050s
sys 0m0.830s
xfs mounted without discard seems to handle this fine:
/dev/mapper/ssdvg0-testLV on /media/temp type xfs
(rw,noatime,nodiratime)
time rm -rf linux-3.2.6-gentoo/
real 0m1.634s
user 0m0.040s
sys 0m1.420s
uname -a
Linux core24 3.2.5-gentoo #11 SMP PREEMPT Sat Feb 11 15:46:22 EST 2012
x86_64 Intel(R) Core(TM)2 Quad CPU Q6700 @ 2.66GHz GenuineIntel
GNU/Linux
Any suggestions?
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: Poor performance using discard
2012-02-28 22:56 Poor performance using discard Thomas Lynema
@ 2012-02-28 23:58 ` Peter Grandi
2012-02-29 1:22 ` Dave Chinner
1 sibling, 0 replies; 14+ messages in thread
From: Peter Grandi @ 2012-02-28 23:58 UTC (permalink / raw)
To: Linux fs XFS
[ ... ]
> /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> (rw,noatime,nodiratime,discard) [ ... ] real 5m7.139s [ ... ]
> There were lockups where the system would pause for about a
> minute during the process.
> ext4 handles this scenario fine:
> /dev/mapper/ssdvg0-testLV on /media/temp type ext4
> (rw,noatime,nodiratime,discard) [ ... ] real 0m0.943s [ ... ]
> xfs mounted without discard seems to handle this fine:
> /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> (rw,noatime,nodiratime) [ ... ] real 0m1.634s [ ... ]
[ ... ]
> Any suggestions?
* Look at 'vmstat 1' while 'rm' is running.
* Learn how TRIM is specified, and thus why many people prefer
periodically running 'fstrim' (which uses FITRIM) to mounting
with 'discard'.
* Compare with my Crucial M4 flash SSD with XFS:
# time sh -c 'sysctl vm/drop_caches=3; rm -r linux-2.6.32; sync'
vm.drop_caches = 3
real 0m59.604s
user 0m0.060s
sys 0m3.944s
That's pretty good for ~32k files and ~390MiB. Probably the
TRIM implementation on the M4 is rather faster than that on
the Patriot.
* Re: Poor performance using discard
2012-02-28 22:56 Poor performance using discard Thomas Lynema
2012-02-28 23:58 ` Peter Grandi
@ 2012-02-29 1:22 ` Dave Chinner
2012-02-29 2:00 ` Thomas Lynema
1 sibling, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2012-02-29 1:22 UTC (permalink / raw)
To: Thomas Lynema; +Cc: xfs
On Tue, Feb 28, 2012 at 05:56:18PM -0500, Thomas Lynema wrote:
> Please reply to my personal email as well, as I am not subscribed to the
> list.
>
> I have a PP120GS25SSDR; it does support TRIM:
>
> cat /sys/block/sdc/queue/discard_max_bytes
> 2147450880
>
> The entire drive is one partition that is totally used by LVM.
>
> I made a test vg and formatted it with mkfs.xfs. Then mounted it with
> discard and got the following result when deleting a kernel source:
>
> /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> (rw,noatime,nodiratime,discard)
>
> time rm -rf linux-3.2.6-gentoo/
> real 5m7.139s
> user 0m0.080s
> sys 0m1.580s
>
I'd say your problem is that trim is extremely slow on your
hardware. You've told XFS to execute a discard command for every
single extent that is freed, and that can be very slow if you are
freeing lots of small extents (like a kernel tree contains) and you
have a device that is slow at executing discards.
> There were lockups where the system would pause for about a minute
> during the process.
Yup, that's because it runs as part of the journal commit
completion, and if your SSD is extremely slow the journal will stall
waiting for all the discards to complete.
Basically, online discard is not really a smart thing to use for
consumer SSDs. Indeed, it's just not a smart thing to run for most
workloads and use cases, precisely because discard is a very slow
and non-queueable operation on most hardware that supports it.
If you really need to run discard, just run a background discard
(fstrim) from a cronjob that runs when the system is mostly idle.
You won't have any runtime overhead on every unlink but you'll still
get the benefit of discarding unused blocks regularly.
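A minimal sketch of that cron-driven approach (the mount point and schedule are illustrative; `fstrim` needs root and a kernel with FITRIM support for the filesystem):

```shell
#!/bin/sh
# Sketch of background discard: run FITRIM over the whole filesystem
# from cron (e.g. drop this script in /etc/cron.daily) instead of
# discarding every freed extent at unlink time. Mount point is
# illustrative.
fstrim -v /media/temp   # -v reports how many bytes were trimmed
```

The latency of the trim is then paid once, in one large pass while the machine is idle, rather than on every journal commit that frees extents.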
> ext4 handles this scenario fine:
>
> /dev/mapper/ssdvg0-testLV on /media/temp type ext4
> (rw,noatime,nodiratime,discard)
>
> time rm -rf linux-3.2.6-gentoo/
>
> real 0m0.943s
> user 0m0.050s
> sys 0m0.830s
I very much doubt that a single discard IO was issued during that
workload - ext4 uses the same fine-grained discard method XFS does,
and it does it at journal checkpoint completion just like XFS. So
I'd say that ext4 didn't commit the journal during this workload,
and no discards were issued, unlike XFS.
So, now time how long it takes to run sync to get the discards
issued and completed on ext4. Do the same with XFS and see what
happens. i.e.:
$ time (rm -rf linux-3.2.6-gentoo/ ; sync)
is the only real way to compare performance....
> xfs mounted without discard seems to handle this fine:
>
> /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> (rw,noatime,nodiratime)
>
> time rm -rf linux-3.2.6-gentoo/
> real 0m1.634s
> user 0m0.040s
> sys 0m1.420s
Right, that's how long XFS takes with normal journal checkpoint
IO latency. Add to that the time it takes for all the discards to be
run, and you've got the above number.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Poor performance using discard
2012-02-29 1:22 ` Dave Chinner
@ 2012-02-29 2:00 ` Thomas Lynema
2012-02-29 4:08 ` Dave Chinner
0 siblings, 1 reply; 14+ messages in thread
From: Thomas Lynema @ 2012-02-29 2:00 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Wed, 2012-02-29 at 12:22 +1100, Dave Chinner wrote:
> On Tue, Feb 28, 2012 at 05:56:18PM -0500, Thomas Lynema wrote:
> > Please reply to my personal email as well, as I am not subscribed to the
> > list.
> >
> > I have a PP120GS25SSDR; it does support TRIM:
> >
> > cat /sys/block/sdc/queue/discard_max_bytes
> > 2147450880
> >
> > The entire drive is one partition that is totally used by LVM.
> >
> > I made a test vg and formatted it with mkfs.xfs. Then mounted it with
> > discard and got the following result when deleting a kernel source:
> >
> > /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> > (rw,noatime,nodiratime,discard)
> >
> > time rm -rf linux-3.2.6-gentoo/
> > real 5m7.139s
> > user 0m0.080s
> > sys 0m1.580s
> >
>
> I'd say your problem is that trim is extremely slow on your
> hardware. You've told XFS to execute a discard command for every
> single extent that is freed, and that can be very slow if you are
> freeing lots of small extents (like a kernel tree contains) and you
> have a device that is slow at executing discards.
>
> > There were lockups where the system would pause for about a minute
> > during the process.
>
> Yup, that's because it runs as part of the journal commit
> completion, and if your SSD is extremely slow the journal will stall
> waiting for all the discards to complete.
>
> Basically, online discard is not really a smart thing to use for
> consumer SSDs. Indeed, it's just not a smart thing to run for most
> workloads and use cases, precisely because discard is a very slow
> and non-queueable operation on most hardware that supports it.
>
> If you really need to run discard, just run a background discard
> (fstrim) from a cronjob that runs when the system is mostly idle.
> You won't have any runtime overhead on every unlink but you'll still
> get the benefit of discarding unused blocks regularly.
>
> > ext4 handles this scenario fine:
> >
> > /dev/mapper/ssdvg0-testLV on /media/temp type ext4
> > (rw,noatime,nodiratime,discard)
> >
> > time rm -rf linux-3.2.6-gentoo/
> >
> > real 0m0.943s
> > user 0m0.050s
> > sys 0m0.830s
>
> I very much doubt that a single discard IO was issued during that
> workload - ext4 uses the same fine-grained discard method XFS does,
> and it does it at journal checkpoint completion just like XFS. So
> I'd say that ext4 didn't commit the journal during this workload,
> and no discards were issued, unlike XFS.
>
> So, now time how long it takes to run sync to get the discards
> issued and completed on ext4. Do the same with XFS and see what
> happens. i.e.:
>
> $ time (rm -rf linux-3.2.6-gentoo/ ; sync)
>
> is the only real way to compare performance....
>
> > xfs mounted without discard seems to handle this fine:
> >
> > /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> > (rw,noatime,nodiratime)
> >
> > time rm -rf linux-3.2.6-gentoo/
> > real 0m1.634s
> > user 0m0.040s
> > sys 0m1.420s
>
> Right, that's how long XFS takes with normal journal checkpoint
> IO latency. Add to that the time it takes for all the discards to be
> run, and you've got the above number.
>
> Cheers,
>
> Dave.
Dave and Peter,
Thank you both for the replies. Dave, it is actually your article on
LWN and the presentation that you did recently that led me to use xfs on my
home computer.
Let's try this with the sync as Dave suggested and the command that
Peter used:
mount /dev/ssdvg0/testLV -t xfs -o
noatime,nodiratime,discard /media/temp/
time sh -c 'sysctl vm/drop_caches=3; rm -r linux-3.2.6-gentoo; sync'
vm.drop_caches = 3
real 6m35.768s
user 0m0.110s
sys 0m2.090s
vmstat samples. Not putting 6 minutes' worth in the email unless it is
necessary.
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
0 1 3552 6604412 0 151108 0 0 6675 5982 3109 3477 3 24 55 18
0 1 3552 6594756 0 161032 0 0 9948 0 1655 2006 1 1 74 24
0 1 3552 6587068 0 168672 0 0 7572 8 2799 3130 1 1 74 24
1 0 3552 6580744 0 174852 0 0 6288 0 2880 3215 6 2 74 19
----i/o wait here----
1 0 3552 6580496 0 174972 0 0 0 0 782 1110 22 4 74 0
1 0 3552 6580744 0 174972 0 0 0 0 830 1194 22 4 74 0
1 0 3552 6580744 0 174972 0 0 0 0 771 1117 23 3 74 0
1 0 3552 6580744 0 174972 0 0 0 4 1538 2637 30 5 66 0
1 0 3552 6580744 0 174972 0 0 0 0 1168 1946 26 3 72 0
1 0 3552 6580744 0 174976 0 0 0 0 762 1169 23 4 73 0
1 0 3552 6580528 0 175052 0 0 0 0 785 1138 25 2 73 0
2 0 3552 6580528 0 175052 0 0 0 0 868 1350 24 7 69 0
1 0 3552 6580528 0 175052 0 0 0 0 866 1259 24 5 72 0
1 0 3552 6580528 0 175052 0 0 0 8 901 1364 26 5 69 0
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
2 0 3552 6586348 0 175540 0 0 728 1069 1187 2057 26 7 66 1
2 0 3552 6583344 0 176068 0 0 1812 4 1427 2350 24 8 65 2
1 0 3552 6580920 0 177116 0 0 1964 0 1220 1961 25 8 67 1
1 0 3552 6566616 0 190232 0 0 13376 0 1291 1938 24 7 62 8
1 1 3552 6561780 0 193380 0 0 3344 12 1081 1953 22 4 58 15
1 1 3552 6532148 0 200548 0 0 7236 0 10488 3630 35 11 42 13
1 0 3552 6518508 0 200748 0 0 200 0 1929 4038 35 11 52 1
2 0 3552 6516516 0 200828 0 0 57 0 1308 2019 24 6 69 0
EXT4 sample
mkfs.ext4 /dev/ssdvg0/testLV
mount /dev/ssdvg0/testLV -t ext4 -o
discard,noatime,nodiratime /media/temp/
time sh -c 'sysctl vm/drop_caches=3; rm -r linux-3.2.6-gentoo; sync'
vm.drop_caches = 3
real 0m2.711s
user 0m0.030s
sys 0m1.330s
#because I didn't believe it, I ran the command a second time.
time sync
real 0m0.157s
user 0m0.000s
sys 0m0.000s
vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 3548 5474268 19736 1191868 0 0 0 0 1274 2097 25 3 72 0
1 0 3548 5474268 19736 1191872 0 0 0 0 1027 1614 26 3 71 0
2 1 3548 6649292 4688 154264 0 0 9512 8 2256 3267 11 18 58 12
2 2 3548 6633188 15920 161592 0 0 18788 7732 5137 6274 5 17 49 29
0 1 3548 6623044 19624 167936 0 0 9948 10081 3233 4810 4 7 54 35
0 1 3548 6621556 19624 170068 0 0 2112 2642 1294 2135 4 1 72 23
0 2 3548 6611140 19624 179420 0 0 10260 50 1677 2930 7 2 64 27
0 1 3548 6606660 19624 183828 0 0 4181 32 2192 2707 6 2 67 26
1 0 3548 6604700 19624 185864 0 0 2080 0 961 1451 7 2 74 17
1 0 3548 6604700 19624 185864 0 0 0 0 966 1715 24 3 73 0
2 0 3548 6604700 19624 185864 0 0 8 196 1025 1582 24 4 72 0
1 0 3548 6604700 19624 185864 0 0 0 0 1133 1901 24 3 73 0
This time, I ran a sync. That should mean all of the discard operations
were completed...right?
If it makes a difference, when I get the i/o hang during the xfs
deletes, my entire system seems to hang. It doesn't just hang that
particular mounted volume's i/o.
Please let me know if there is anything obvious that I'm missing from this
equation.
~tom
* Re: Poor performance using discard
2012-02-29 2:00 ` Thomas Lynema
@ 2012-02-29 4:08 ` Dave Chinner
2012-02-29 10:38 ` Peter Grandi
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Dave Chinner @ 2012-02-29 4:08 UTC (permalink / raw)
To: Thomas Lynema; +Cc: xfs
On Tue, Feb 28, 2012 at 09:00:26PM -0500, Thomas Lynema wrote:
>
> On Wed, 2012-02-29 at 12:22 +1100, Dave Chinner wrote:
> > On Tue, Feb 28, 2012 at 05:56:18PM -0500, Thomas Lynema wrote:
> > > /dev/mapper/ssdvg0-testLV on /media/temp type ext4
> > > (rw,noatime,nodiratime,discard)
> > >
> > > time rm -rf linux-3.2.6-gentoo/
> > >
> > > real 0m0.943s
> > > user 0m0.050s
> > > sys 0m0.830s
> >
> > I very much doubt that a single discard IO was issued during that
> > workload - ext4 uses the same fine-grained discard method XFS does,
> > and it does it at journal checkpoint completion just like XFS. So
> > I'd say that ext4 didn't commit the journal during this workload,
> > and no discards were issued, unlike XFS.
> >
> > So, now time how long it takes to run sync to get the discards
> > issued and completed on ext4. Do the same with XFS and see what
> > happens. i.e.:
> >
> > $ time (rm -rf linux-3.2.6-gentoo/ ; sync)
> >
> > is the only real way to compare performance....
> >
> > > xfs mounted without discard seems to handle this fine:
> > >
> > > /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> > > (rw,noatime,nodiratime)
> > >
> > > time rm -rf linux-3.2.6-gentoo/
> > > real 0m1.634s
> > > user 0m0.040s
> > > sys 0m1.420s
> >
> > Right, that's how long XFS takes with normal journal checkpoint
> > IO latency. Add to that the time it takes for all the discards to be
> > run, and you've got the above number.
> >
> > Cheers,
> >
> > Dave.
>
>
> Dave and Peter,
>
> Thank you both for the replies. Dave, it is actually your article on
> LWN and the presentation that you did recently that led me to use xfs on my
> home computer.
>
> Let's try this with the sync as Dave suggested and the command that
> Peter used:
>
> mount /dev/ssdvg0/testLV -t xfs -o
> noatime,nodiratime,discard /media/temp/
>
> time sh -c 'sysctl vm/drop_caches=3; rm -r linux-3.2.6-gentoo; sync'
> vm.drop_caches = 3
>
> real 6m35.768s
> user 0m0.110s
> sys 0m2.090s
>
> vmstat samples. Not putting 6 minutes' worth in the email unless it is
> necessary.
>
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> 0 1 3552 6604412 0 151108 0 0 6675 5982 3109 3477 3 24 55 18
> 0 1 3552 6594756 0 161032 0 0 9948 0 1655 2006 1 1 74 24
> 0 1 3552 6587068 0 168672 0 0 7572 8 2799 3130 1 1 74 24
> 1 0 3552 6580744 0 174852 0 0 6288 0 2880 3215 6 2 74 19
> ----i/o wait here----
> 1 0 3552 6580496 0 174972 0 0 0 0 782 1110 22 4 74 0
> 1 0 3552 6580744 0 174972 0 0 0 0 830 1194 22 4 74 0
> 1 0 3552 6580744 0 174972 0 0 0 0 771 1117 23 3 74 0
> 1 0 3552 6580744 0 174972 0 0 0 4 1538 2637 30 5 66 0
> 1 0 3552 6580744 0 174972 0 0 0 0 1168 1946 26 3 72 0
> 1 0 3552 6580744 0 174976 0 0 0 0 762 1169 23 4 73 0
There's no IO wait time here - it's apparently burning a CPU in
userspace and doing no IO at all. Running discards happens entirely in
kernel threads, so there should be no user time at all if it was stuck
doing discards. What is consuming that CPU time?
....
> EXT4 sample
>
> mkfs.ext4 /dev/ssdvg0/testLV
> mount /dev/ssdvg0/testLV -t ext4 -o
> discard,noatime,nodiratime /media/temp/
>
>
> time sh -c 'sysctl vm/drop_caches=3; rm -r linux-3.2.6-gentoo; sync'
> vm.drop_caches = 3
>
> real 0m2.711s
> user 0m0.030s
> sys 0m1.330s
>
> #because I didn't believe it, I ran the command a second time.
>
> time sync
>
> real 0m0.157s
> user 0m0.000s
> sys 0m0.000s
>
> vmstat 1
>
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 1 0 3548 5474268 19736 1191868 0 0 0 0 1274 2097 25 3 72 0
> 1 0 3548 5474268 19736 1191872 0 0 0 0 1027 1614 26 3 71 0
> 2 1 3548 6649292 4688 154264 0 0 9512 8 2256 3267 11 18 58 12
> 2 2 3548 6633188 15920 161592 0 0 18788 7732 5137 6274 5 17 49 29
> 0 1 3548 6623044 19624 167936 0 0 9948 10081 3233 4810 4 7 54 35
> 0 1 3548 6621556 19624 170068 0 0 2112 2642 1294 2135 4 1 72 23
> 0 2 3548 6611140 19624 179420 0 0 10260 50 1677 2930 7 2 64 27
> 0 1 3548 6606660 19624 183828 0 0 4181 32 2192 2707 6 2 67 26
> 1 0 3548 6604700 19624 185864 0 0 2080 0 961 1451 7 2 74 17
> 1 0 3548 6604700 19624 185864 0 0 0 0 966 1715 24 3 73 0
> 2 0 3548 6604700 19624 185864 0 0 8 196 1025 1582 24 4 72 0
> 1 0 3548 6604700 19624 185864 0 0 0 0 1133 1901 24 3 73 0
Same again - apparently when your system goes idle, it burns a CPU in
user time, but stops doing that when IO is in progress.
> This time, I ran a sync. That should mean all of the discard operations
> were completed...right?
Well, it certainly is the case for XFS. I'm not sure what is
happening with ext4 though.
> If it makes a difference, when I get the i/o hang during the xfs
> deletes, my entire system seems to hang. It doesn't just hang that
> particular mounted volume's i/o.
Any errors in dmesg?
Also, I think you need to provide a block trace (output of
blktrace/blkparse for the rm -rf workloads) for both the XFS and
ext4 cases so we can see what discards are actually being issued and
how long they take to complete....
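Gathering the trace Dave asks for might look like the following (a sketch: the device path matches the OP's LV, the trace file names are illustrative, and blktrace needs root):

```shell
# Start tracing the block device backing the filesystem, run the
# workload, then stop the trace.
blktrace -d /dev/mapper/ssdvg0-testLV -o xfs.trace &
time sh -c 'rm -rf linux-3.2.6-gentoo; sync'
kill %1
wait

# Decode the per-CPU trace files and count discard requests
# ("D" in the RWBS field of each event line).
blkparse xfs.trace > xfs.parsed
grep -w D xfs.parsed | wc -l
```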
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Poor performance using discard
2012-02-29 4:08 ` Dave Chinner
@ 2012-02-29 10:38 ` Peter Grandi
2012-02-29 19:46 ` Eric Sandeen
` (2 subsequent siblings)
3 siblings, 0 replies; 14+ messages in thread
From: Peter Grandi @ 2012-02-29 10:38 UTC (permalink / raw)
To: Linux fs XFS
[ ... ]
> Same again - apparently when your system goes idle, it burns a CPU in
> user time, but stops doing that when IO is in progress.
>> This time, I ran a sync. That should mean all of the discard
>> operations were completed...right?
> Well, it certainly is the case for XFS. I'm not sure what is
> happening with ext4 though.
>> If it makes a difference, when I get the i/o hang during the
>> xfs deletes, my entire system seems to hang. It doesn't just
>> hang that particular mounted volume's i/o.
> Any errors in dmesg?
I had hoped that the OP would take my suggestions and run some
web searches, because I was obviously suggesting that the
behaviour he saw is expected and correct for XFS:
>>> * Learn how TRIM is specified, and thus why many people prefer
>>> periodically running 'fstrim' (which uses FITRIM) to mounting
>>> with 'discard'.
So let's cut it short: the TRIM command is specified to be
synchronous.
Therefore a sequence of TRIM operations will lock up the disk,
and usually as a result the host too. In addition to this, in
some implementations TRIM is faster and in some it is slower,
but again the main problem is that it is synchronous.
Therefore using 'discard' which uses TRIM is subject to exactly
the behaviour reported. But 'ext4' batches TRIMs, issuing them
out of the journal, while XFS probably issues them with every
deletion operation, so 'ext4' should be hit less, but the
difference should not be as large as reported.
I suspect then that recent 'ext4' ignores 'discard' precisely
because of the many reports of freezes they may have received.
An early discussion of the issue with TRIM:
http://www.spinics.net/lists/linux-fsdevel/msg23064.html
From this report it seems that 'ext4' used to be "slow"
on 'discard' too:
https://patrick-nagel.net/blog/archives/337
"I did it three times with and three times without the “discard”
option, and then took the average of those three tries:
Without “discard” option:
Unpack: 1.21s
Sync: 1.66s (= 172 MB/s)
Delete: 0.47s
Sync: 0.17s
With “discard” option:
Unpack: 1.18s
Sync: 1.62s (= 176 MB/s)
Delete: 0.48s
Sync: 40.41s
So, with “discard” on, deleting a big bunch of small files is 64
times slower on my SSD. For those ~40 seconds any I/O is really
slow, so that’s pretty much the time when you get a fresh cup of
coffee, or waste time watching the mass storage activity LED."
Also the 'man' page for 'fstrim':
http://www.vdmeulen.net/cgi-bin/man/man2html?fstrim+8
"-m, --minimum minimum-free-extent
Minimum contiguous free range to discard, in bytes.
(This value is internally rounded up to a multiple of the
filesystem block size). Free ranges smaller than this will be
ignored. By increasing this value, the fstrim operation will
complete more quickly for filesystems with badly fragmented
freespace, although not all blocks will be discarded. Default
value is zero, discard every free block."
Note the "By increasing this value, the fstrim operation will
complete more quickly for filesystems with badly fragmented
freespace", which implies that FITRIM is also synchronous or slow
or both.
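The trade-off described in that man page excerpt can be exercised directly (the mount point is illustrative; fstrim needs root and kernel FITRIM support):

```shell
# Default: discard every free range, however small -- slowest but
# most thorough.
fstrim -v /media/temp

# Only discard free ranges of at least 1 MiB (1048576 bytes): far
# fewer TRIM commands on badly fragmented free space, at the cost
# of leaving the small ranges untrimmed.
fstrim -v -m 1048576 /media/temp
```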
* Re: Poor performance using discard
2012-02-29 4:08 ` Dave Chinner
2012-02-29 10:38 ` Peter Grandi
@ 2012-02-29 19:46 ` Eric Sandeen
2012-03-01 5:59 ` Christoph Hellwig
[not found] ` <1330658311.6438.24.camel@core24>
2012-03-02 15:41 ` Thomas Lynema
3 siblings, 1 reply; 14+ messages in thread
From: Eric Sandeen @ 2012-02-29 19:46 UTC (permalink / raw)
To: Dave Chinner; +Cc: Thomas Lynema, xfs
On 2/28/12 10:08 PM, Dave Chinner wrote:
> Also, I think you need to provide a block trace (output of
> blktrace/blkparse for the rm -rf workloads) for both the XFS and
> ext4 cases so we can see what discards are actually being issued and
> how long they take to complete....
>
I ran a quick test on a loopback device on 3.3.0-rc4. Loopback supports
discards. I made 1G filesystems on loopback on ext4 & xfs, mounted with
-o discard, cloned a git tree to them, and ran rm -rf; sync under blktrace.
XFS took about 11 seconds, ext4 took about 1.7.
(without trim, times were roughly the same - but discard/trim is probably
quite fast on the loopback file)
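Eric's setup can be reproduced along these lines (a sketch; file and mount-point names are illustrative, and the losetup/mkfs/mount steps need root plus a kernel whose loop driver supports discard):

```shell
# 1G sparse backing file attached to a loop device.
truncate -s 1G xfs_fsfile
LOOP=$(losetup --find --show xfs_fsfile)   # prints e.g. /dev/loop0

# XFS with online discard on top of the loop device.
mkfs.xfs "$LOOP"
mkdir -p /mnt/xfs_test
mount -o discard "$LOOP" /mnt/xfs_test

# ...clone a git tree into /mnt/xfs_test, then time the delete:
time sh -c 'rm -rf /mnt/xfs_test/linux; sync'
```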
Both files were reduced in disk usage about the same amount, so online
discard was working for both:
# du -h ext4_fsfile xfs_fsfile
497M ext4_fsfile
491M xfs_fsfile
XFS issued many more discards than ext4:
# blkparse xfs.trace | grep -w D | wc -l
40205
# blkparse ext4.trace | grep -w D | wc -l
123
XFS issued many small discards (4k/8 sectors) and a few larger ones:
[sectors | # discards]
8 20079
16 6762
24 3627
32 2798
40 1439
...
1840 1
7256 1
26720 1
ext4 issued far fewer discards, but in much larger chunks:
8 29
16 9
24 4
32 6
...
35152 1
35248 1
53744 1
192320 1
261624 1
262144 1
So that could certainly explain the relative speed.
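Per-size breakdowns like the ones above can be pulled out of the blkparse output with a small pipeline. This assumes blkparse's default line format, where the action code ("D" for an issued request) is field 6 and the sector count is field 10; adjust the field numbers if a custom format string is in use:

```shell
# Tally discard requests by size in sectors ("count size" per line).
blkparse xfs.trace |
    awk '$6 == "D" { print $10 }' |   # sector count of each issued request
    sort -n | uniq -c                 # count occurrences of each size
```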
-Eric
* Re: Poor performance using discard
2012-02-29 19:46 ` Eric Sandeen
@ 2012-03-01 5:59 ` Christoph Hellwig
2012-03-01 6:27 ` Dave Chinner
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2012-03-01 5:59 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Thomas Lynema, xfs
On Wed, Feb 29, 2012 at 01:46:34PM -0600, Eric Sandeen wrote:
> On 2/28/12 10:08 PM, Dave Chinner wrote:
>
> > Also, I think you need to provide a block trace (output of
> > blktrace/blkparse for the rm -rf workloads) for both the XFS and
> > ext4 cases so we can see what discards are actually being issued and
> > how long they take to complete....
> >
>
> I ran a quick test on a loopback device on 3.3.0-rc4. Loopback supports
> discards. I made 1G filesystems on loopback on ext4 & xfs, mounted with
> -o discard, cloned a git tree to them, and ran rm -rf; sync under blktrace.
>
> XFS took about 11 seconds, ext4 took about 1.7.
>
> (without trim, times were roughly the same - but discard/trim is probably
> quite fast on the loopback file)
>
> Both files were reduced in disk usage about the same amount, so online
> discard was working for both:
>
> # du -h ext4_fsfile xfs_fsfile
> 497M ext4_fsfile
> 491M xfs_fsfile
>
> XFS issued many more discards than ext4:
XFS frees inode blocks, directory blocks and btree blocks. ext4 only
ever frees data blocks and the occasional indirect block on files.
So without either a really fast non-blocking and/or vectored trim
(like actually supported in hardware), a proper discard implementation
on XFS will be fairly slow.
Unfortunately all the required bits are missing in the Linux block
layer, thus you really should use fstrim for now.
* Re: Poor performance using discard
2012-03-01 5:59 ` Christoph Hellwig
@ 2012-03-01 6:27 ` Dave Chinner
2012-03-01 6:31 ` Christoph Hellwig
0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2012-03-01 6:27 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Eric Sandeen, Thomas Lynema, xfs
On Thu, Mar 01, 2012 at 12:59:43AM -0500, Christoph Hellwig wrote:
> On Wed, Feb 29, 2012 at 01:46:34PM -0600, Eric Sandeen wrote:
> > On 2/28/12 10:08 PM, Dave Chinner wrote:
> >
> > > Also, I think you need to provide a block trace (output of
> > > blktrace/blkparse for the rm -rf workloads) for both the XFS and
> > > ext4 cases so we can see what discards are actually being issued and
> > > how long they take to complete....
> > >
> >
> > I ran a quick test on a loopback device on 3.3.0-rc4. Loopback supports
> > discards. I made 1G filesystems on loopback on ext4 & xfs, mounted with
> > -o discard, cloned a git tree to them, and ran rm -rf; sync under blktrace.
> >
> > XFS took about 11 seconds, ext4 took about 1.7.
> >
> > (without trim, times were roughly the same - but discard/trim is probably
> > quite fast on the loopback file)
> >
> > Both files were reduced in disk usage about the same amount, so online
> > discard was working for both:
> >
> > # du -h ext4_fsfile xfs_fsfile
> > 497M ext4_fsfile
> > 491M xfs_fsfile
> >
> > XFS issued many more discards than ext4:
>
> XFS frees inode blocks, directory blocks and btree blocks. ext4 only
> ever frees data blocks and the occasional indirect block on files.
One other thing the ext4 tracking implementation does is merge
adjacent ranges, whereas the XFS implementation does not. XFS has
more tracking complexity than ext4, though, in that it tracks free
extents in multiple concurrent journal commits whereas ext4 only has
to track across a single journal commit. Hence ext4 can merge
without having to care about where the adjacent range is being
committed in the same journal checkpoint.
Further, ext4 doesn't reallocate from the freed extents until after
the journal commit completes, whilst XFS can reallocate freed ranges
before the freeing is journalled and hence can modify ranges in the
free list prior to journal commit.
We could probably implement extent merging in the free extent
tracking similar to ext4, but I'm not sure how much it would gain us
because of the way we do reallocation of freed ranges prior to
journal commit....
> So without either a really fast non-blocking and/or vectored trim
> (like actually supported in hardware), a proper discard implementation
> on XFS will be fairly slow.
>
> Unfortunately all the required bits are missing in the Linux block
> layer, thus you really should use fstrim for now.
Another good reason for using fstrim instead of online discard... :/
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Poor performance using discard
2012-03-01 6:27 ` Dave Chinner
@ 2012-03-01 6:31 ` Christoph Hellwig
0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2012-03-01 6:31 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, Eric Sandeen, Thomas Lynema, xfs
On Thu, Mar 01, 2012 at 05:27:09PM +1100, Dave Chinner wrote:
> One other thing the ext4 tracking implementation does is merge
> adjacent ranges, whereas the XFS implementation does not. XFS has
> more tracking complexity than ext4, though, in that it tracks free
> extents in multiple concurrent journal commits whereas ext4 only has
> to track across a single journal commit. Hence ext4 can merge
> without having to care about where the adjacent range is being
> committed in the same journal checkpoint.
>
> Further, ext4 doesn't reallocate from the freed extents until after
> the journal commit completes, whilst XFS can reallocate freed ranges
> before the freeing is journalled and hence can modify ranges in the
> free list prior to journal commit.
>
> We could probably implement extent merging in the free extent
> tracking similar to ext4, but I'm not sure how much it would gain us
> because of the way we do reallocation of freed ranges prior to
> journal commit....
Also there generally aren't that many merging opportunities. Back when
I implemented the code and looked at block traces we'd get them
occasionally:
(a) for inode buffers due to the inode clusters being smaller than the
inode chunks. Better fixed by increasing the amount of inode
clustering we do.
(b) during rm -rf sometimes when lots of small files were end-to-end,
but this doesn't happen all that often.
* Re: Poor performance using discard
[not found] ` <1330658311.6438.24.camel@core24>
@ 2012-03-02 14:57 ` Thomas Lynema
0 siblings, 0 replies; 14+ messages in thread
From: Thomas Lynema @ 2012-03-02 14:57 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
[-- Attachment #1.1.1: Type: text/plain, Size: 11148 bytes --]
> > Also, I think you need to provide a block trace (output of
> > blktrace/blkparse for the rm -rf workloads) for both the XFS and
> > ext4 cases so we can see what discards are actually being issued and
> > how long they take to complete....
>
Partial XFS results are attached; this is the overall summary for XFS:
CPU0 (xfs.trace):
Reads Queued: 1,274, 9,216KiB Writes Queued: 9,981, 156,710KiB
Read Dispatches: 1,738, 8,656KiB Write Dispatches: 9,855, 157,818KiB
Reads Requeued: 495 Writes Requeued: 72
Reads Completed: 1,844, 12,916KiB Writes Completed: 15,486, 8,360KiB
Read Merges: 0, 0KiB Write Merges: 42, 364KiB
Read depth: 10 Write depth: 11
IO unplugs: 95 Timer unplugs: 3
CPU1 (xfs.trace):
Reads Queued: 153, 1,372KiB Writes Queued: 14,521, 183,608KiB
Read Dispatches: 235, 2,348KiB Write Dispatches: 15,246, 183,627KiB
Reads Requeued: 24 Writes Requeued: 748
Reads Completed: 197, 1,868KiB Writes Completed: 14,885, 11,825KiB
Read Merges: 0, 0KiB Write Merges: 19, 60KiB
Read depth: 10 Write depth: 11
IO unplugs: 39 Timer unplugs: 7
CPU2 (xfs.trace):
Reads Queued: 1,156, 7,520KiB Writes Queued: 10,873, 160,987KiB
Read Dispatches: 1,586, 7,416KiB Write Dispatches: 10,724, 160,687KiB
Reads Requeued: 446 Writes Requeued: 69
Reads Completed: 586, 3,820KiB Writes Completed: 5,310, 4,103KiB
Read Merges: 0, 0KiB Write Merges: 57, 664KiB
Read depth: 10 Write depth: 11
IO unplugs: 54 Timer unplugs: 8
CPU3 (xfs.trace):
Reads Queued: 126, 1,436KiB Writes Queued: 5,197, 47,004KiB
Read Dispatches: 135, 1,124KiB Write Dispatches: 5,388, 46,187KiB
Reads Requeued: 20 Writes Requeued: 333
Reads Completed: 82, 940KiB Writes Completed: 5,037, 1,551KiB
Read Merges: 0, 0KiB Write Merges: 91, 708KiB
Read depth: 10 Write depth: 11
IO unplugs: 50 Timer unplugs: 8
Total (xfs.trace):
Reads Queued: 2,709, 19,544KiB Writes Queued: 40,572, 548,309KiB
Read Dispatches: 3,694, 19,544KiB Write Dispatches: 41,213, 548,319KiB
Reads Requeued: 985 Writes Requeued: 1,222
Reads Completed: 2,709, 19,544KiB Writes Completed: 40,718, 25,839KiB
Read Merges: 0, 0KiB Write Merges: 209, 1,796KiB
IO unplugs: 238 Timer unplugs: 26
Throughput (R/W): 45KiB/s / 60KiB/s
Events (xfs.trace): 309,109 entries
Skips: 0 forward (0 - 0.0%)
>
>
> Thanks for the input and patience.
>
> ~tom
>
>
[-- Attachment #1.1.2: xfs.trace.blktrace.3.bz2 --]
[-- Type: application/x-bzip, Size: 358698 bytes --]
[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
[-- Attachment #2: Type: text/plain, Size: 121 bytes --]
* Re: Poor performance using discard
2012-02-29 4:08 ` Dave Chinner
` (2 preceding siblings ...)
[not found] ` <1330658311.6438.24.camel@core24>
@ 2012-03-02 15:41 ` Thomas Lynema
2012-03-05 3:02 ` Dave Chinner
3 siblings, 1 reply; 14+ messages in thread
From: Thomas Lynema @ 2012-03-02 15:41 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
[-- Attachment #1.1.1: Type: text/plain, Size: 14684 bytes --]
[....]
> Any errors in dmesg?
None
> Also, I think you need to provide a block trace (output of
> blktrace/blkparse for the rm -rf workloads) for both the XFS and
> ext4 cases
Here's the output from running commands similar to Christoph's:
blkparse xfs.trace.blktrace.* | grep -w D | wc -l
1134996
blkparse ext4.trace.blktrace.* | grep -w D | wc -l
27240
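(As an aside, `grep -w D` matches the blkparse *action* field, so it counts all dispatch events, not only discards; a discard shows up as a `D` in the RWBS field as well. A rough sketch that counts only discard dispatches — assuming the default blkparse line layout of device, cpu, sequence, timestamp, pid, action, RWBS — might look like this:

```python
def count_discard_dispatches(lines):
    """Count blkparse dispatch ('D' action) events whose RWBS field
    marks a discard.  Assumes the default blkparse output layout:
    dev cpu seq timestamp pid action rwbs sector + nblocks [comm]
    """
    count = 0
    for line in lines:
        fields = line.split()
        # fields[5] is the action, fields[6] the RWBS flags.
        if len(fields) > 6 and fields[5] == "D" and "D" in fields[6]:
            count += 1
    return count

sample = [
    "8,32  0  101  0.000210  1234  D  D 1024 + 2048 [rm]",    # discard dispatch
    "8,32  0  102  0.000320  1234  D  W 4096 + 8 [kworker]",  # write dispatch
    "8,32  0  103  0.000410  1234  Q  W 4096 + 8 [rm]",       # queue event
]
print(count_discard_dispatches(sample))  # 1
```

Either way, the two orders of magnitude between the XFS and ext4 counts above is the interesting part.)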
Attached are archives of half of the ext4 results.
Summary:
CPU0 (ext4.trace):
Reads Queued: 372, 1,944KiB Writes Queued: 1,932, 7,728KiB
Read Dispatches: 372, 1,944KiB Write Dispatches: 19, 7,756KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 419, 2,104KiB Writes Completed: 13, 5,123KiB
Read Merges: 0, 0KiB Write Merges: 1,914, 7,656KiB
Read depth: 29 Write depth: 32
IO unplugs: 12 Timer unplugs: 0
CPU1 (ext4.trace):
Reads Queued: 3,119, 12,484KiB Writes Queued: 2,591, 420,624KiB
Read Dispatches: 3,168, 12,484KiB Write Dispatches: 356, 423,100KiB
Reads Requeued: 49 Writes Requeued: 115
Reads Completed: 3,271, 13,104KiB Writes Completed: 251, 11,317KiB
Read Merges: 0, 0KiB Write Merges: 2,358, 9,504KiB
Read depth: 29 Write depth: 32
IO unplugs: 10 Timer unplugs: 2
CPU2 (ext4.trace):
Reads Queued: 191, 844KiB Writes Queued: 2, 3KiB
Read Dispatches: 191, 844KiB Write Dispatches: 1, 3KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 144, 684KiB Writes Completed: 10, 2,608KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
Read depth: 29 Write depth: 32
IO unplugs: 6 Timer unplugs: 0
CPU3 (ext4.trace):
Reads Queued: 1,273, 5,116KiB Writes Queued: 852, 112,049KiB
Read Dispatches: 1,315, 5,116KiB Write Dispatches: 72, 109,545KiB
Reads Requeued: 42 Writes Requeued: 11
Reads Completed: 1,121, 4,496KiB Writes Completed: 56, 4,024KiB
Read Merges: 0, 0KiB Write Merges: 779, 3,392KiB
Read depth: 29 Write depth: 32
IO unplugs: 15 Timer unplugs: 11
Total (ext4.trace):
Reads Queued: 4,955, 20,388KiB Writes Queued: 5,377, 540,404KiB
Read Dispatches: 5,046, 20,388KiB Write Dispatches: 448, 540,404KiB
Reads Requeued: 91 Writes Requeued: 126
Reads Completed: 4,955, 20,388KiB Writes Completed: 330, 23,072KiB
Read Merges: 0, 0KiB Write Merges: 5,051, 20,552KiB
IO unplugs: 43 Timer unplugs: 13
Throughput (R/W): 5,011KiB/s / 5,671KiB/s
Events (ext4.trace): 57,930 entries
Skips: 0 forward (0 - 0.0%)
Peter was right, then. Per multiple recommendations, I'm switching to
fstrim; it is quicker than deleting with discard enabled.
fstrim consistently takes about a minute to run on a 40GB
volume.
time fstrim -v /usr
/usr: 9047117824 bytes were trimmed
real 0m56.121s
user 0m0.090s
sys 0m0.000s
I see that the running of fstrim was already discussed
http://oss.sgi.com/archives/xfs/2011-05/msg00338.html.
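For anyone following along, a periodic fstrim can be scheduled from cron instead of mounting with discard. This is only a sketch, not a recommended configuration — the mount points and schedule below are examples and should be adjusted per system:

```shell
# /etc/crontab entry (sketch): trim SSD-backed filesystems weekly,
# early Sunday morning.  Mount points here are examples only.
0 4 * * 0  root  /usr/sbin/fstrim -v /usr && /usr/sbin/fstrim -v /home
```
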
Please add something to the FAQ about using SSDs. It would be great if
people could see the recommended mount options and fstrim crontab entry
(or other option) for running xfs on a SSD.
Thanks for the input and patience.
~tom
[-- Attachment #1.1.2: ext4.trace.blktrace.tar.bz --]
[-- Type: application/x-bzip-compressed-tar, Size: 364717 bytes --]
[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
[-- Attachment #2: Type: text/plain, Size: 121 bytes --]
* Re: Poor performance using discard
2012-03-02 15:41 ` Thomas Lynema
@ 2012-03-05 3:02 ` Dave Chinner
2012-03-05 6:41 ` Jeffrey Hundstad
0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2012-03-05 3:02 UTC (permalink / raw)
To: Thomas Lynema; +Cc: xfs
On Fri, Mar 02, 2012 at 10:41:50AM -0500, Thomas Lynema wrote:
> Peter was right then. Per multiple recommendations
> I'm switching to fstrim, it is quicker than the deletes.
>
> fstrim consistently takes about a minute to run on a 40GB
> volume.
>
> time fstrim -v /usr
> /usr: 9047117824 bytes were trimmed
>
> real 0m56.121s
> user 0m0.090s
> sys 0m0.000s
>
> I see that the running of fstrim was already discussed
> http://oss.sgi.com/archives/xfs/2011-05/msg00338.html.
>
> Please add something to the FAQ about using SSDs. It would be great if
> people could see the recommended mount options and fstrim crontab entry
> (or other option) for running xfs on a SSD.
It's a publicly modifiable wiki. Feel free to add what you've
learned here to a new FAQ entry - others will come by and correct
anything in it that is inaccurate....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Poor performance using discard
2012-03-05 3:02 ` Dave Chinner
@ 2012-03-05 6:41 ` Jeffrey Hundstad
0 siblings, 0 replies; 14+ messages in thread
From: Jeffrey Hundstad @ 2012-03-05 6:41 UTC (permalink / raw)
To: xfs
On 03/04/2012 09:02 PM, Dave Chinner wrote:
> On Fri, Mar 02, 2012 at 10:41:50AM -0500, Thomas Lynema wrote:
>> Peter was right then. Per multiple recommendations
>> I'm switching to fstrim, it is quicker than the deletes.
>>
>> fstrim consistently takes about a minute to run on a 40GB
>> volume.
>>
>> time fstrim -v /usr
>> /usr: 9047117824 bytes were trimmed
>>
>> real 0m56.121s
>> user 0m0.090s
>> sys 0m0.000s
>>
>> I see that the running of fstrim was already discussed
>> http://oss.sgi.com/archives/xfs/2011-05/msg00338.html.
>>
>> Please add something to the FAQ about using SSDs. It would be great if
>> people could see the recommended mount options and fstrim crontab entry
>> (or other option) for running xfs on a SSD.
> It's a publicly modifiable wiki. Feel free to add what you've
> learned here to a new FAQ entry - others will come by and correct
> anything in it that is inaccurate....
And it is already there:
http://xfs.org/index.php/FITRIM/discard
Feel free to expand it.
--
Jeffrey Hundstad
end of thread, other threads: [~2012-03-05 6:41 UTC | newest]
Thread overview: 14+ messages
-- links below jump to the message on this page --
2012-02-28 22:56 Poor performance using discard Thomas Lynema
2012-02-28 23:58 ` Peter Grandi
2012-02-29 1:22 ` Dave Chinner
2012-02-29 2:00 ` Thomas Lynema
2012-02-29 4:08 ` Dave Chinner
2012-02-29 10:38 ` Peter Grandi
2012-02-29 19:46 ` Eric Sandeen
2012-03-01 5:59 ` Christoph Hellwig
2012-03-01 6:27 ` Dave Chinner
2012-03-01 6:31 ` Christoph Hellwig
[not found] ` <1330658311.6438.24.camel@core24>
2012-03-02 14:57 ` Thomas Lynema
2012-03-02 15:41 ` Thomas Lynema
2012-03-05 3:02 ` Dave Chinner
2012-03-05 6:41 ` Jeffrey Hundstad