public inbox for linux-xfs@vger.kernel.org
* Poor performance using discard
@ 2012-02-28 22:56 Thomas Lynema
  2012-02-28 23:58 ` Peter Grandi
  2012-02-29  1:22 ` Dave Chinner
  0 siblings, 2 replies; 14+ messages in thread
From: Thomas Lynema @ 2012-02-28 22:56 UTC (permalink / raw)
  To: xfs



Please reply to my personal email as well as I am not subscribed to the
list.

I have a PP120GS25SSDR; it does support TRIM:

cat /sys/block/sdc/queue/discard_max_bytes 
2147450880

The entire drive is one partition that is totally used by LVM.
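For reference, that kind of check can be wrapped in a small shell helper; this is only an illustrative sketch (the helper name and the optional sysfs-root argument, which exists purely so the function can be exercised against a fake tree, are my own):

```shell
# supports_discard DEV [SYSFS_ROOT]
# Reports (via exit status) whether a block device advertises discard
# support: a non-zero queue/discard_max_bytes means the kernel can issue
# TRIM/UNMAP requests to it. SYSFS_ROOT defaults to /sys and is only
# overridable for testing.
supports_discard() {
    dev=$1
    root=${2:-/sys}
    max=$(cat "$root/block/$dev/queue/discard_max_bytes" 2>/dev/null || echo 0)
    [ "$max" -gt 0 ]
}

# e.g. supports_discard sdc && echo "sdc supports discard"
```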

I made a test vg and formatted it with mkfs.xfs.  Then mounted it with
discard and got the following result when deleting a kernel source:

/dev/mapper/ssdvg0-testLV on /media/temp type xfs
(rw,noatime,nodiratime,discard)

time rm -rf linux-3.2.6-gentoo/
real   5m7.139s
user   0m0.080s
sys   0m1.580s 

There were lockups where the system would pause for about a minute
during the process.

ext4 handles this scenario fine:

/dev/mapper/ssdvg0-testLV on /media/temp type ext4
(rw,noatime,nodiratime,discard)

time rm -rf linux-3.2.6-gentoo/

real   0m0.943s
user   0m0.050s
sys   0m0.830s 

xfs mounted without discard seems to handle this fine:

/dev/mapper/ssdvg0-testLV on /media/temp type xfs
(rw,noatime,nodiratime)

time rm -rf linux-3.2.6-gentoo/
real	0m1.634s
user	0m0.040s
sys	0m1.420s

uname -a
Linux core24 3.2.5-gentoo #11 SMP PREEMPT Sat Feb 11 15:46:22 EST 2012
x86_64 Intel(R) Core(TM)2 Quad CPU Q6700 @ 2.66GHz GenuineIntel
GNU/Linux


Any suggestions?



_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Poor performance using discard
  2012-02-28 22:56 Poor performance using discard Thomas Lynema
@ 2012-02-28 23:58 ` Peter Grandi
  2012-02-29  1:22 ` Dave Chinner
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Grandi @ 2012-02-28 23:58 UTC (permalink / raw)
  To: Linux fs XFS

[ ... ]

> /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> (rw,noatime,nodiratime,discard) [ ... ] real 5m7.139s [ ... ]

> There were lockups where the system would pause for about a
> minute during the process.

> ext4 handles this scenario fine:

> /dev/mapper/ssdvg0-testLV on /media/temp type ext4
> (rw,noatime,nodiratime,discard) [ ... ] real 0m0.943s [ ... ]

> xfs mounted without discard seems to handle this fine:

> /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> (rw,noatime,nodiratime) [ ... ] real 0m1.634s [ ... ]

[ ... ]

> Any suggestions?

* Look at 'vmstat 1' while 'rm' is running.

* Learn how TRIM is specified, and thus why many people prefer
  periodically running 'fstrim' (which uses FITRIM) to mounting
  with 'discard'.

* Compare with my Crucial M4 flash SSD with XFS:

    #  time sh -c 'sysctl vm/drop_caches=3; rm -r linux-2.6.32; sync'
    vm.drop_caches = 3

    real    0m59.604s
    user    0m0.060s
    sys     0m3.944s

  That's pretty good for ~32k files and ~390MiB. Probably the
  TRIM implementation on the M4 is rather faster than that on
  the Patriot.


* Re: Poor performance using discard
  2012-02-28 22:56 Poor performance using discard Thomas Lynema
  2012-02-28 23:58 ` Peter Grandi
@ 2012-02-29  1:22 ` Dave Chinner
  2012-02-29  2:00   ` Thomas Lynema
  1 sibling, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2012-02-29  1:22 UTC (permalink / raw)
  To: Thomas Lynema; +Cc: xfs

On Tue, Feb 28, 2012 at 05:56:18PM -0500, Thomas Lynema wrote:
> Please reply to my personal email as well as I am not subscribed to the
> list.
> 
> I have a PP120GS25SSDR it does support trim 
> 
> cat /sys/block/sdc/queue/discard_max_bytes 
> 2147450880
> 
> The entire drive is one partition that is totally used by LVM.
> 
> I made a test vg and formatted it with mkfs.xfs.  Then mounted it with
> discard and got the following result when deleting a kernel source:
> 
> /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> (rw,noatime,nodiratime,discard)
> 
> time rm -rf linux-3.2.6-gentoo/
> real   5m7.139s
> user   0m0.080s
> sys   0m1.580s 
> 

I'd say your problem is that trim is extremely slow on your
hardware. You've told XFS to execute a discard command for every
single extent that is freed, and that can be very slow if you are
freeing lots of small extents (like a kernel tree contains) and you
have a device that is slow at executing discards.

> There were lockups where the system would pause for about a minute
> during the process.

Yup, that's because it runs as part of the journal commit
completion, and if your SSD is extremely slow the journal will stall
waiting for all the discards to complete.

Basically, online discard is not really a smart thing to use for
consumer SSDs. Indeed, it's just not a smart thing to run for most
workloads and use cases, precisely because discard is a very slow
and non-queueable operation on most hardware that supports it.

If you really need to run discard, just run a background discard
(fstrim) from a cronjob that runs when the system is mostly idle.
You won't have any runtime overhead on every unlink but you'll still
get the benefit of discarding unused blocks regularly.
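A minimal sketch of that kind of setup, assuming a system-wide crontab; the schedule is arbitrary and the mount point is just the one used in this thread ('fstrim -v' reports how many bytes were discarded):

```shell
# Example /etc/crontab entry (schedule, user field and mount point are
# placeholders): trim the filesystem's free space at 03:30 every Sunday,
# when the machine is likely to be idle.
30 3 * * 0  root  /sbin/fstrim -v /media/temp
```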

> ext4 handles this scenario fine:
> 
> /dev/mapper/ssdvg0-testLV on /media/temp type ext4
> (rw,noatime,nodiratime,discard)
> 
> time rm -rf linux-3.2.6-gentoo/
> 
> real   0m0.943s
> user   0m0.050s
> sys   0m0.830s 

I very much doubt that a single discard IO was issued during that
workload - ext4 uses the same fine-grained discard method XFS does,
and it does it at journal checkpoint completion just like XFS. So
I'd say that ext4 didn't commit the journal during this workload,
and no discards were issued, unlike XFS.

So, now time how long it takes to run sync to get the discards
issued and completed on ext4. Do the same with XFS and see what
happens. i.e.:

$ time (rm -rf linux-3.2.6-gentoo/ ; sync)

is the only real way to compare performance....

> xfs mounted without discard seems to handle this fine:
> 
> /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> (rw,noatime,nodiratime)
> 
> time rm -rf linux-3.2.6-gentoo/
> real	0m1.634s
> user	0m0.040s
> sys	0m1.420s

Right, that's how long XFS takes with normal journal checkpoint
IO latency. Add to that the time it takes for all the discards to be
run, and you've got the above number.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Poor performance using discard
  2012-02-29  1:22 ` Dave Chinner
@ 2012-02-29  2:00   ` Thomas Lynema
  2012-02-29  4:08     ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Lynema @ 2012-02-29  2:00 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs


[-- Attachment #1.1: Type: text/plain, Size: 7784 bytes --]




On Wed, 2012-02-29 at 12:22 +1100, Dave Chinner wrote:
> On Tue, Feb 28, 2012 at 05:56:18PM -0500, Thomas Lynema wrote:
> > Please reply to my personal email as well as I am not subscribed to the
> > list.
> > 
> > I have a PP120GS25SSDR it does support trim 
> > 
> > cat /sys/block/sdc/queue/discard_max_bytes 
> > 2147450880
> > 
> > The entire drive is one partition that is totally used by LVM.
> > 
> > I made a test vg and formatted it with mkfs.xfs.  Then mounted it with
> > discard and got the following result when deleting a kernel source:
> > 
> > /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> > (rw,noatime,nodiratime,discard)
> > 
> > time rm -rf linux-3.2.6-gentoo/
> > real   5m7.139s
> > user   0m0.080s
> > sys   0m1.580s 
> > 
> 
> I'd say your problem is that trim is extremely slow on your
> hardware. You've told XFS to execute a discard command for every
> single extent that is freed, and that can be very slow if you are
> freeing lots of small extents (like a kernel tree contains) and you
> have a device that is slow at executing discards.
> 
> > There were lockups where the system would pause for about a minute
> > during the process.
> 
> Yup, that's because it runs as part of the journal commit
> completion, and if your SSD is extremely slow the journal will stall
> waiting for all the discards to complete.
> 
> Basically, online discard is not really a smart thing to use for
> consumer SSDs. Indeed, it's just not a smart thing to run for most
> workloads and use cases, precisely because discard is a very slow
> and non-queueable operation on most hardware that supports it.
> 
> If you really need to run discard, just run a background discard
> (fstrim) from a cronjob that runs when the system is mostly idle.
> You won't have any runtime overhead on every unlink but you'll still
> get the benefit of discarding unused blocks regularly.
> 
> > ext4 handles this scenario fine:
> > 
> > /dev/mapper/ssdvg0-testLV on /media/temp type ext4
> > (rw,noatime,nodiratime,discard)
> > 
> > time rm -rf linux-3.2.6-gentoo/
> > 
> > real   0m0.943s
> > user   0m0.050s
> > sys   0m0.830s 
> 
> I very much doubt that a single discard IO was issued during that
> workload - ext4 uses the same fine-grained discard method XFS does,
> and it does it at journal checkpoint completion just like XFS. So
> I'd say that ext4 didn't commit the journal during this workload,
> and no discards were issued, unlike XFS.
> 
> So, now time how long it takes to run sync to get the discards
> issued and completed on ext4. Do the same with XFS and see what
> happens. i.e.:
> 
> $ time (rm -rf linux-3.2.6-gentoo/ ; sync)
> 
> is the only real way to compare performance....
> 
> > xfs mounted without discard seems to handle this fine:
> > 
> > /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> > (rw,noatime,nodiratime)
> > 
> > time rm -rf linux-3.2.6-gentoo/
> > real	0m1.634s
> > user	0m0.040s
> > sys	0m1.420s
> 
> Right, that's how long XFS takes with normal journal checkpoint
> IO latency. Add to that the time it takes for all the discards to be
> run, and you've got the above number.
> 
> Cheers,
> 
> Dave.


Dave and Peter,

Thank you both for the replies.  Dave, it was actually your article on
LWN and the presentation you did recently that led me to use XFS on my
home computer.

Let's try this with the sync as Dave suggested and the command that
Peter used:

mount /dev/ssdvg0/testLV -t xfs -o noatime,nodiratime,discard /media/temp/

time sh -c 'sysctl vm/drop_caches=3; rm -r linux-3.2.6-gentoo; sync'
vm.drop_caches = 3

real	6m35.768s
user	0m0.110s
sys	0m2.090s

vmstat samples (I'm not putting 6 minutes' worth in the email unless it
is necessary):

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 0  1   3552 6604412      0 151108    0    0  6675  5982 3109 3477  3 24 55 18
 0  1   3552 6594756      0 161032    0    0  9948     0 1655 2006  1  1 74 24
 0  1   3552 6587068      0 168672    0    0  7572     8 2799 3130  1  1 74 24
 1  0   3552 6580744      0 174852    0    0  6288     0 2880 3215  6  2 74 19
----i/o wait here----
 1  0   3552 6580496      0 174972    0    0     0     0  782 1110 22  4 74  0
 1  0   3552 6580744      0 174972    0    0     0     0  830 1194 22  4 74  0
 1  0   3552 6580744      0 174972    0    0     0     0  771 1117 23  3 74  0
 1  0   3552 6580744      0 174972    0    0     0     4 1538 2637 30  5 66  0
 1  0   3552 6580744      0 174972    0    0     0     0 1168 1946 26  3 72  0
 1  0   3552 6580744      0 174976    0    0     0     0  762 1169 23  4 73  0

 1  0   3552 6580528      0 175052    0    0     0     0  785 1138 25  2 73  0
 2  0   3552 6580528      0 175052    0    0     0     0  868 1350 24  7 69  0
 1  0   3552 6580528      0 175052    0    0     0     0  866 1259 24  5 72  0
 1  0   3552 6580528      0 175052    0    0     0     8  901 1364 26  5 69  0
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  0   3552 6586348      0 175540    0    0   728  1069 1187 2057 26  7 66  1
 2  0   3552 6583344      0 176068    0    0  1812     4 1427 2350 24  8 65  2
 1  0   3552 6580920      0 177116    0    0  1964     0 1220 1961 25  8 67  1
 1  0   3552 6566616      0 190232    0    0 13376     0 1291 1938 24  7 62  8
 1  1   3552 6561780      0 193380    0    0  3344    12 1081 1953 22  4 58 15
 1  1   3552 6532148      0 200548    0    0  7236     0 10488 3630 35 11 42 13
 1  0   3552 6518508      0 200748    0    0   200     0 1929 4038 35 11 52  1
 2  0   3552 6516516      0 200828    0    0    57     0 1308 2019 24  6 69  0

EXT4 sample

mkfs.ext4 /dev/ssdvg0/testLV
mount /dev/ssdvg0/testLV -t ext4 -o discard,noatime,nodiratime /media/temp/


time sh -c 'sysctl vm/drop_caches=3; rm -r linux-3.2.6-gentoo; sync'
vm.drop_caches = 3

real	0m2.711s
user	0m0.030s
sys	0m1.330s

#because I didn't believe it, I ran the command a second time.

time sync

real	0m0.157s
user	0m0.000s
sys	0m0.000s

vmstat 1

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0   3548 5474268  19736 1191868    0    0     0     0 1274 2097 25  3 72  0
 1  0   3548 5474268  19736 1191872    0    0     0     0 1027 1614 26  3 71  0
 2  1   3548 6649292   4688 154264    0    0  9512     8 2256 3267 11 18 58 12
 2  2   3548 6633188  15920 161592    0    0 18788  7732 5137 6274  5 17 49 29
 0  1   3548 6623044  19624 167936    0    0  9948 10081 3233 4810  4  7 54 35
 0  1   3548 6621556  19624 170068    0    0  2112  2642 1294 2135  4  1 72 23
 0  2   3548 6611140  19624 179420    0    0 10260    50 1677 2930  7  2 64 27
 0  1   3548 6606660  19624 183828    0    0  4181    32 2192 2707  6  2 67 26
 1  0   3548 6604700  19624 185864    0    0  2080     0  961 1451  7  2 74 17
 1  0   3548 6604700  19624 185864    0    0     0     0  966 1715 24  3 73  0
 2  0   3548 6604700  19624 185864    0    0     8   196 1025 1582 24  4 72  0
 1  0   3548 6604700  19624 185864    0    0     0     0 1133 1901 24  3 73  0


This time, I ran a sync.  That should mean all of the discard operations
were completed...right?

If it makes a difference, when I get the I/O hang during the XFS
deletes, my entire system seems to hang.  It doesn't just hang that
particular mounted volume's I/O.

Please let me know if there is anything obvious that I'm missing from
this equation.

~tom



* Re: Poor performance using discard
  2012-02-29  2:00   ` Thomas Lynema
@ 2012-02-29  4:08     ` Dave Chinner
  2012-02-29 10:38       ` Peter Grandi
                         ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Dave Chinner @ 2012-02-29  4:08 UTC (permalink / raw)
  To: Thomas Lynema; +Cc: xfs

On Tue, Feb 28, 2012 at 09:00:26PM -0500, Thomas Lynema wrote:
> 
> On Wed, 2012-02-29 at 12:22 +1100, Dave Chinner wrote:
> > On Tue, Feb 28, 2012 at 05:56:18PM -0500, Thomas Lynema wrote:
> > > /dev/mapper/ssdvg0-testLV on /media/temp type ext4
> > > (rw,noatime,nodiratime,discard)
> > > 
> > > time rm -rf linux-3.2.6-gentoo/
> > > 
> > > real   0m0.943s
> > > user   0m0.050s
> > > sys   0m0.830s 
> > 
> > I very much doubt that a single discard IO was issued during that
> > workload - ext4 uses the same fine-grained discard method XFS does,
> > and it does it at journal checkpoint completion just like XFS. So
> > I'd say that ext4 didn't commit the journal during this workload,
> > and no discards were issued, unlike XFS.
> > 
> > So, now time how long it takes to run sync to get the discards
> > issued and completed on ext4. Do the same with XFS and see what
> > happens. i.e.:
> > 
> > $ time (rm -rf linux-3.2.6-gentoo/ ; sync)
> > 
> > is the only real way to compare performance....
> > 
> > > xfs mounted without discard seems to handle this fine:
> > > 
> > > /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> > > (rw,noatime,nodiratime)
> > > 
> > > time rm -rf linux-3.2.6-gentoo/
> > > real	0m1.634s
> > > user	0m0.040s
> > > sys	0m1.420s
> > 
> > Right, that's how long XFS takes with normal journal checkpoint
> > IO latency. Add to that the time it takes for all the discards to be
> > run, and you've got the above number.
> > 
> > Cheers,
> > 
> > Dave.
> 
> 
> Dave and Peter,
> 
> Thank you both for the replies.  Dave, it was actually your article on
> LWN and the presentation you did recently that led me to use XFS on my
> home computer.
> 
> Let's try this with the sync as Dave suggested and the command that
> Peter used:
> 
> mount /dev/ssdvg0/testLV -t xfs -o noatime,nodiratime,discard /media/temp/
> 
> time sh -c 'sysctl vm/drop_caches=3; rm -r linux-3.2.6-gentoo; sync'
> vm.drop_caches = 3
> 
> real	6m35.768s
> user	0m0.110s
> sys	0m2.090s
> 
> vmstat samples.  Not putting 6 minutes worth in the email unless it is
> necessary.
> 
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  0  1   3552 6604412      0 151108    0    0  6675  5982 3109 3477  3 24 55 18
>  0  1   3552 6594756      0 161032    0    0  9948     0 1655 2006  1  1 74 24
>  0  1   3552 6587068      0 168672    0    0  7572     8 2799 3130  1  1  74 24
>  1  0   3552 6580744      0 174852    0    0  6288     0 2880 3215  6  2  74 19
> ----i/o wait here----
>  1  0   3552 6580496      0 174972    0    0     0     0  782 1110 22  4 74  0
>  1  0   3552 6580744      0 174972    0    0     0     0  830 1194 22  4 74  0
>  1  0   3552 6580744      0 174972    0    0     0     0  771 1117 23  3 74  0
>  1  0   3552 6580744      0 174972    0    0     0     4 1538 2637 30  5 66  0
>  1  0   3552 6580744      0 174972    0    0     0     0 1168 1946 26  3 72  0
>  1  0   3552 6580744      0 174976    0    0     0     0  762 1169 23  4 73  0

There's no IO wait time here - it's apparently burning a CPU in
userspace and doing no IO at all. Running discards all happens in
kernel threads, so there should be no user time at all if it were
stuck doing discards. What is consuming that CPU time?

....
> EXT4 sample
> 
> mkfs.ext4 /dev/ssdvg0/testLV
> mount /dev/ssdvg0/testLV -t ext4 -o discard,noatime,nodiratime /media/temp/
> 
> 
> time sh -c 'sysctl vm/drop_caches=3; rm -r linux-3.2.6-gentoo; sync'
> vm.drop_caches = 3
> 
> real	0m2.711s
> user	0m0.030s
> sys	0m1.330s
> 
> #because I didn't believe it, I ran the command a second time.
> 
> time sync
> 
> real	0m0.157s
> user	0m0.000s
> sys	0m0.000s
> 
> vmstat 1
> 
> procs -----------memory---------- ---swap-- -----io---- -system--
> ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  1  0   3548 5474268  19736 1191868    0    0     0     0 1274 2097 25 3 72  0
>  1  0   3548 5474268  19736 1191872    0    0     0     0 1027 1614 26 3 71  0
>  2  1   3548 6649292   4688 154264    0    0  9512     8 2256 3267 11 18 58 12
>  2  2   3548 6633188  15920 161592    0    0 18788  7732 5137 6274  5 17 49 29
>  0  1   3548 6623044  19624 167936    0    0  9948 10081 3233 4810  4  7 54 35
>  0  1   3548 6621556  19624 170068    0    0  2112  2642 1294 2135  4  1 72 23
>  0  2   3548 6611140  19624 179420    0    0 10260    50 1677 2930  7  2 64 27
>  0  1   3548 6606660  19624 183828    0    0  4181    32 2192 2707  6  2 67 26
>  1  0   3548 6604700  19624 185864    0    0  2080     0  961 1451  7  2 74 17
>  1  0   3548 6604700  19624 185864    0    0     0     0  966 1715 24  3 73  0
>  2  0   3548 6604700  19624 185864    0    0     8   196 1025 1582 24  4 72  0
>  1  0   3548 6604700  19624 185864    0    0     0     0 1133 1901 24  3 73  0

Same again - apparently when your system goes idle, it burns a CPU in
user time, but stops doing that when IO is in progress.

> This time, I ran a sync.  That should mean all of the discard operations
> were completed...right?

Well, it certainly is the case for XFS. I'm not sure what is
happening with ext4 though.

> If it makes a difference, when I get the i/o hang during the xfs
> deletes, my entire system seems to hang.  It doesn't just hang that
> particular mounted volume's i/o.

Any errors in dmesg?

Also, I think you need to provide a block trace (output of
blktrace/blkparse for the rm -rf workloads) for both the XFS and
ext4 cases so we can see what discards are actually being issued and
how long they take to complete....
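For the summarising step, something like this helper can count the issued discards (the function name is mine, and the field position is an assumption based on blkparse's default output, where column 6 is the event type and "D" marks a request being issued to the driver):

```shell
# count_discards: read blkparse text output on stdin and count the lines
# whose event-type field (column 6 in blkparse's default format) is "D",
# i.e. requests being issued to the driver.
count_discards() {
    awk '$6 == "D" { n++ } END { print n + 0 }'
}

# Typical use (device and output names are placeholders):
#   blktrace -d /dev/mapper/ssdvg0-testLV -o xfs.trace  # while rm -rf; sync runs
#   blkparse xfs.trace | count_discards
```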

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Poor performance using discard
  2012-02-29  4:08     ` Dave Chinner
@ 2012-02-29 10:38       ` Peter Grandi
  2012-02-29 19:46       ` Eric Sandeen
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Peter Grandi @ 2012-02-29 10:38 UTC (permalink / raw)
  To: Linux fs XFS

[ ... ]

> Same again - apparently when your system goes idle, it burns a CPU in
> user time, but stops doing that when IO is in progress.

>> This time, I ran a sync.  That should mean all of the discard
>> operations were completed...right?

> Well, it certainly is the case for XFS. I'm not sure what is
> happening with ext4 though.

>> If it makes a difference, when I get the i/o hang during the
>> xfs deletes, my entire system seems to hang.  It doesn't just
>> hang that particular mounted volume's i/o.

> Any errors in dmesg?

I had hoped that the OP would take my suggestions and run some web
searches, because I was obviously suggesting that the behaviour he saw
is expected and correct for XFS:

>>> * Learn how TRIM is specified, and thus why many people prefer
>>>   running periodically 'fstrim' which uses FITRIM to mounting
>>>   with 'discard'.

So let's cut it short: the TRIM command is specified to be
synchronous.

Therefore a sequence of TRIM operations will lock up the disk,
and usually as a result the host too. In addition to this, in
some implementations TRIM is faster and in some it is slower,
but again the main problem is that it is synchronous.

Therefore using 'discard', which issues TRIMs, is subject to exactly
the behaviour reported. But 'ext4' batches TRIMs, issuing them out
of the journal, while XFS probably issues them with every deletion
operation, so 'ext4' should be hit less, but the difference should
not be as large as reported.

I suspect then that recent 'ext4' ignores 'discard' precisely
because of the many reports of freezes they may have received.

An early discussion of the issue with TRIM:

http://www.spinics.net/lists/linux-fsdevel/msg23064.html

From this report it seems that 'ext4' used to be "slow"
on 'discard' too:

https://patrick-nagel.net/blog/archives/337
 "I did it three times with and three times without the “discard”
  option, and then took the average of those three tries:
  Without “discard” option:
      Unpack: 1.21s
      Sync: 1.66s (= 172 MB/s)
      Delete: 0.47s
      Sync: 0.17s
  With “discard” option:
      Unpack: 1.18s
      Sync: 1.62s (= 176 MB/s)
      Delete: 0.48s
      Sync: 40.41s
  So, with “discard” on, deleting a big bunch of small files is 64
  times slower on my SSD. For those ~40 seconds any I/O is really
  slow, so that’s pretty much the time when you get a fresh cup of
  coffee, or waste time watching the mass storage activity LED."

Also the 'man' page for 'fstrim':

http://www.vdmeulen.net/cgi-bin/man/man2html?fstrim+8
 "-m, --minimum minimum-free-extent
    Minimum contiguous free range to discard, in bytes.
    (This value is internally rounded up to a multiple of the
    filesystem block size). Free ranges smaller than this will be
    ignored. By increasing this value, the fstrim operation will
    complete more quickly for filesystems with badly fragmented
    freespace, although not all blocks will be discarded. Default
    value is zero, discard every free block."

Note the "By increasing this value, the fstrim operation will
complete more quickly for filesystems with badly fragmented
freespace", which implies that FITRIM is also synchronous or slow
or both.


* Re: Poor performance using discard
  2012-02-29  4:08     ` Dave Chinner
  2012-02-29 10:38       ` Peter Grandi
@ 2012-02-29 19:46       ` Eric Sandeen
  2012-03-01  5:59         ` Christoph Hellwig
       [not found]       ` <1330658311.6438.24.camel@core24>
  2012-03-02 15:41       ` Thomas Lynema
  3 siblings, 1 reply; 14+ messages in thread
From: Eric Sandeen @ 2012-02-29 19:46 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Thomas Lynema, xfs

On 2/28/12 10:08 PM, Dave Chinner wrote:

> Also, I think you need to provide a block trace (output of
> blktrace/blkparse for the rm -rf workloads) for both the XFS and
> ext4 cases so we can see what discards are actually being issued and
> how long they take to complete....
> 

I ran a quick test on a loopback device on 3.3.0-rc4.  Loopback supports
discards.  I made 1G filesystems on loopback on ext4 & xfs, mounted with
-o discard, cloned a git tree to them, and ran rm -rf; sync under blktrace.

XFS took about 11 seconds, ext4 took about 1.7.

(without trim, times were roughly the same - but discard/trim is probably
quite fast on the loopback file)

Both files were reduced in disk usage about the same amount, so online
discard was working for both:

# du -h ext4_fsfile xfs_fsfile
497M	ext4_fsfile
491M	xfs_fsfile

XFS issued many more discards than ext4:

# blkparse xfs.trace | grep -w D | wc -l
40205
# blkparse ext4.trace | grep -w D | wc -l
123

XFS issued many small discards (4k/8 sectors) and a few larger ones:

[sectors | # discards]

8 20079
16 6762
24 3627
32 2798
40 1439
...
1840 1
7256 1
26720 1

ext4 issued far fewer discards, but in much larger chunks:

8 29
16 9
24 4
32 6
...
35152 1
35248 1
53744 1
192320 1
261624 1
262144 1

So that could certainly explain the relative speed.
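A size histogram like the ones above can be generated with a small pipeline over the blkparse output; this is my own sketch, and it assumes blkparse's default format (column 6 is the event type, and for issued requests the size in sectors is the field after the "+"):

```shell
# discard_size_histogram: read blkparse output on stdin and print
# "sectors count" pairs for issued ("D") requests, smallest size first.
discard_size_histogram() {
    awk '$6 == "D" { hist[$10]++ } END { for (s in hist) print s, hist[s] }' | sort -n
}

# Typical use: blkparse xfs.trace | discard_size_histogram
```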

-Eric


* Re: Poor performance using discard
  2012-02-29 19:46       ` Eric Sandeen
@ 2012-03-01  5:59         ` Christoph Hellwig
  2012-03-01  6:27           ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2012-03-01  5:59 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Thomas Lynema, xfs

On Wed, Feb 29, 2012 at 01:46:34PM -0600, Eric Sandeen wrote:
> On 2/28/12 10:08 PM, Dave Chinner wrote:
> 
> > Also, I think you need to provide a block trace (output of
> > blktrace/blkparse for the rm -rf workloads) for both the XFS and
> > ext4 cases so we can see what discards are actually being issued and
> > how long they take to complete....
> > 
> 
> I ran a quick test on a loopback device on 3.3.0-rc4.  Loopback supports
> discards.  I made 1G filesystems on loopback on ext4 & xfs, mounted with
> -o discard, cloned a git tree to them, and ran rm -rf; sync under blktrace.
> 
> XFS took about 11 seconds, ext4 took about 1.7.
> 
> (without trim, times were roughly the same - but discard/trim is probably
> quite fast on the loopback file)
> 
> Both files were reduced in disk usage about the same amount, so online
> discard was working for both:
> 
> # du -h ext4_fsfile xfs_fsfile
> 497M	ext4_fsfile
> 491M	xfs_fsfile
> 
> XFS issued many more discards than ext4:

XFS frees inode blocks, directory blocks and btree blocks.  ext4 only
ever frees data blocks and the occasional indirect block on files.

So without a really fast, non-blocking and/or vectored trim (like
that actually supported in hardware), a proper discard implementation
on XFS will be fairly slow.

Unfortunately all the required bits are missing in the Linux block
layer, thus you really should use fstrim for now. 


* Re: Poor performance using discard
  2012-03-01  5:59         ` Christoph Hellwig
@ 2012-03-01  6:27           ` Dave Chinner
  2012-03-01  6:31             ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2012-03-01  6:27 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Eric Sandeen, Thomas Lynema, xfs

On Thu, Mar 01, 2012 at 12:59:43AM -0500, Christoph Hellwig wrote:
> On Wed, Feb 29, 2012 at 01:46:34PM -0600, Eric Sandeen wrote:
> > On 2/28/12 10:08 PM, Dave Chinner wrote:
> > 
> > > Also, I think you need to provide a block trace (output of
> > > blktrace/blkparse for the rm -rf workloads) for both the XFS and
> > > ext4 cases so we can see what discards are actually being issued and
> > > how long they take to complete....
> > > 
> > 
> > I ran a quick test on a loopback device on 3.3.0-rc4.  Loopback supports
> > discards.  I made 1G filesystems on loopback on ext4 & xfs, mounted with
> > -o discard, cloned a git tree to them, and ran rm -rf; sync under blktrace.
> > 
> > XFS took about 11 seconds, ext4 took about 1.7.
> > 
> > (without trim, times were roughly the same - but discard/trim is probably
> > quite fast on the looback file)
> > 
> > Both files were reduced in disk usage about the same amount, so online
> > discard was working for both:
> > 
> > # du -h ext4_fsfile xfs_fsfile
> > 497M	ext4_fsfile
> > 491M	xfs_fsfile
> > 
> > XFS issued many more discards than ext4:
> 
> XFS frees inode blocks, directory blocks and btree blocks.  ext4 only
> ever frees data blocks and the occasional indirect block on files.

One other thing the ext4 tracking implementation does is merge
adjacent ranges, whereas the XFS implementation does not. XFS has
more tracking complexity than ext4, though, in that it tracks free
extents in multiple concurrent journal commits whereas ext4 only has
to track across a single journal commit.  Hence ext4 can merge
without having to care about where the adjacent range is being
committed in the same journal checkpoint.

Further, ext4 doesn't reallocate from the freed extents until after
the journal commit completes, whilst XFS can reallocate freed ranges
before the freeing is journalled and hence can modify ranges in the
free list prior to journal commit.

We could probably implement extent merging in the free extent
tracking similar to ext4, but I'm not sure how much it would gain us
because of the way we do reallocation of freed ranges prior to
journal commit....

> So without a really fast, non-blocking and/or vectored trim (like
> that actually supported in hardware), a proper discard implementation
> on XFS will be fairly slow.
>
> Unfortunately all the required bits are missing in the Linux block
> layer, thus you really should use fstrim for now. 

Another good reason for using fstrim instead of online discard... :/

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Poor performance using discard
  2012-03-01  6:27           ` Dave Chinner
@ 2012-03-01  6:31             ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2012-03-01  6:31 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, Eric Sandeen, Thomas Lynema, xfs

On Thu, Mar 01, 2012 at 05:27:09PM +1100, Dave Chinner wrote:
> One other thing the ext4 tracking implementation does is merge
> adjacent ranges, whereas the XFS implementation does not. XFS has
> more tracking complexity than ext4, though, in that it tracks free
> extents in multiple concurrent journal commits whereas ext4 only has
> to track across a single journal commit.  Hence ext4 can merge
> without having to care about whether the adjacent range is being
> committed in the same journal checkpoint.
> 
> Further, ext4 doesn't reallocate from the freed extents until after
> the journal commit completes, whilst XFS can reallocate freed ranges
> before the freeing is journalled and hence can modify ranges in the
> free list prior to journal commit.
> 
> We could probably implement extent merging in the free extent
> tracking similar to ext4, but I'm not sure how much it would gain us
> because of the way we do reallocation of freed ranges prior to
> journal commit....

Also there generally aren't that many merging opportunities.  Back when
I implemented the code and looked at block traces we'd get them
occasionally:

 (a) for inode buffers due to the inode clusters being smaller than the
     inode chunks.  Better fixed by increasing the amount of inode
     clustering we do.
 (b) during rm -rf sometimes when lots of small files were end-to-end,
     but this doesn't happen all that often.


* Re: Poor performance using discard
       [not found]       ` <1330658311.6438.24.camel@core24>
@ 2012-03-02 14:57         ` Thomas Lynema
  0 siblings, 0 replies; 14+ messages in thread
From: Thomas Lynema @ 2012-03-02 14:57 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs


[-- Attachment #1.1.1: Type: text/plain, Size: 11148 bytes --]

> > Also, I think you need to provide a block trace (output of
> > blktrace/blkparse for the rm -rf workloads) for both the XFS and
> > ext4 cases so we can see what discards are actually being issued and
> > how long they take to complete....
> 

Partial XFS results are attached. The overall blkparse summary for XFS
looks like this:

CPU0 (xfs.trace):
 Reads Queued:       1,274,    9,216KiB	 Writes Queued:       9,981,  156,710KiB
 Read Dispatches:    1,738,    8,656KiB	 Write Dispatches:    9,855,  157,818KiB
 Reads Requeued:       495		 Writes Requeued:        72
 Reads Completed:    1,844,   12,916KiB	 Writes Completed:   15,486,    8,360KiB
 Read Merges:            0,        0KiB	 Write Merges:           42,      364KiB
 Read depth:            10        	 Write depth:            11
 IO unplugs:            95        	 Timer unplugs:           3
CPU1 (xfs.trace):
 Reads Queued:         153,    1,372KiB	 Writes Queued:      14,521,  183,608KiB
 Read Dispatches:      235,    2,348KiB	 Write Dispatches:   15,246,  183,627KiB
 Reads Requeued:        24		 Writes Requeued:       748
 Reads Completed:      197,    1,868KiB	 Writes Completed:   14,885,   11,825KiB
 Read Merges:            0,        0KiB	 Write Merges:           19,       60KiB
 Read depth:            10        	 Write depth:            11
 IO unplugs:            39        	 Timer unplugs:           7
CPU2 (xfs.trace):
 Reads Queued:       1,156,    7,520KiB	 Writes Queued:      10,873,  160,987KiB
 Read Dispatches:    1,586,    7,416KiB	 Write Dispatches:   10,724,  160,687KiB
 Reads Requeued:       446		 Writes Requeued:        69
 Reads Completed:      586,    3,820KiB	 Writes Completed:    5,310,    4,103KiB
 Read Merges:            0,        0KiB	 Write Merges:           57,      664KiB
 Read depth:            10        	 Write depth:            11
 IO unplugs:            54        	 Timer unplugs:           8
CPU3 (xfs.trace):
 Reads Queued:         126,    1,436KiB	 Writes Queued:       5,197,   47,004KiB
 Read Dispatches:      135,    1,124KiB	 Write Dispatches:    5,388,   46,187KiB
 Reads Requeued:        20		 Writes Requeued:       333
 Reads Completed:       82,      940KiB	 Writes Completed:    5,037,    1,551KiB
 Read Merges:            0,        0KiB	 Write Merges:           91,      708KiB
 Read depth:            10        	 Write depth:            11
 IO unplugs:            50        	 Timer unplugs:           8

Total (xfs.trace):
 Reads Queued:       2,709,   19,544KiB	 Writes Queued:      40,572,  548,309KiB
 Read Dispatches:    3,694,   19,544KiB	 Write Dispatches:   41,213,  548,319KiB
 Reads Requeued:       985		 Writes Requeued:     1,222
 Reads Completed:    2,709,   19,544KiB	 Writes Completed:   40,718,   25,839KiB
 Read Merges:            0,        0KiB	 Write Merges:          209,    1,796KiB
 IO unplugs:           238        	 Timer unplugs:          26

Throughput (R/W): 45KiB/s / 60KiB/s
Events (xfs.trace): 309,109 entries
Skips: 0 forward (0 -   0.0%)

>  
> 
> Thanks for the input and patience.  
> 
> ~tom
> 
> 


[-- Attachment #1.1.2: xfs.trace.blktrace.3.bz2 --]
[-- Type: application/x-bzip, Size: 358698 bytes --]


* Re: Poor performance using discard
  2012-02-29  4:08     ` Dave Chinner
                         ` (2 preceding siblings ...)
       [not found]       ` <1330658311.6438.24.camel@core24>
@ 2012-03-02 15:41       ` Thomas Lynema
  2012-03-05  3:02         ` Dave Chinner
  3 siblings, 1 reply; 14+ messages in thread
From: Thomas Lynema @ 2012-03-02 15:41 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs


[-- Attachment #1.1.1: Type: text/plain, Size: 14684 bytes --]

[....]

> Any errors in dmesg?
None
> Also, I think you need to provide a block trace (output of
> blktrace/blkparse for the rm -rf workloads) for both the XFS and
> ext4 cases

Here's the output from running commands
similar to Christoph's:

blkparse xfs.trace.blktrace.* | grep -w D | wc -l 
1134996
blkparse ext4.trace.blktrace.* | grep -w D | wc -l 
27240
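For reference, a capture workflow along these lines (the device path and
file names are taken from earlier in the thread; the exact invocation is
a sketch, not the commands actually run) would look something like:

```shell
# Sketch of the capture workflow (needs root; device path from the thread).
# Trace the block device while the rm -rf runs, then count dispatched
# requests (the "D" action) in the parsed output.
blktrace -d /dev/mapper/ssdvg0-testLV -o xfs.trace &
TRACE_PID=$!

time rm -rf /media/temp/linux-3.2.6-gentoo
sync

kill -INT "$TRACE_PID"
wait "$TRACE_PID"

blkparse xfs.trace.blktrace.* | grep -cw D    # number of dispatched requests
blkparse xfs.trace.blktrace.* > xfs.summary   # full per-CPU summary
```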

Archives of half of the ext4 results are attached.

Summary:

CPU0 (ext4.trace):
 Reads Queued:         372,    1,944KiB	 Writes Queued:       1,932,    7,728KiB
 Read Dispatches:      372,    1,944KiB	 Write Dispatches:       19,    7,756KiB
 Reads Requeued:         0		 Writes Requeued:         0
 Reads Completed:      419,    2,104KiB	 Writes Completed:       13,    5,123KiB
 Read Merges:            0,        0KiB	 Write Merges:        1,914,    7,656KiB
 Read depth:            29        	 Write depth:            32
 IO unplugs:            12        	 Timer unplugs:           0
CPU1 (ext4.trace):
 Reads Queued:       3,119,   12,484KiB	 Writes Queued:       2,591,  420,624KiB
 Read Dispatches:    3,168,   12,484KiB	 Write Dispatches:      356,  423,100KiB
 Reads Requeued:        49		 Writes Requeued:       115
 Reads Completed:    3,271,   13,104KiB	 Writes Completed:      251,   11,317KiB
 Read Merges:            0,        0KiB	 Write Merges:        2,358,    9,504KiB
 Read depth:            29        	 Write depth:            32
 IO unplugs:            10        	 Timer unplugs:           2
CPU2 (ext4.trace):
 Reads Queued:         191,      844KiB	 Writes Queued:           2,        3KiB
 Read Dispatches:      191,      844KiB	 Write Dispatches:        1,        3KiB
 Reads Requeued:         0		 Writes Requeued:         0
 Reads Completed:      144,      684KiB	 Writes Completed:       10,    2,608KiB
 Read Merges:            0,        0KiB	 Write Merges:            0,        0KiB
 Read depth:            29        	 Write depth:            32
 IO unplugs:             6        	 Timer unplugs:           0
CPU3 (ext4.trace):
 Reads Queued:       1,273,    5,116KiB	 Writes Queued:         852,  112,049KiB
 Read Dispatches:    1,315,    5,116KiB	 Write Dispatches:       72,  109,545KiB
 Reads Requeued:        42		 Writes Requeued:        11
 Reads Completed:    1,121,    4,496KiB	 Writes Completed:       56,    4,024KiB
 Read Merges:            0,        0KiB	 Write Merges:          779,    3,392KiB
 Read depth:            29        	 Write depth:            32
 IO unplugs:            15        	 Timer unplugs:          11

Total (ext4.trace):
 Reads Queued:       4,955,   20,388KiB	 Writes Queued:       5,377,  540,404KiB
 Read Dispatches:    5,046,   20,388KiB	 Write Dispatches:      448,  540,404KiB
 Reads Requeued:        91		 Writes Requeued:       126
 Reads Completed:    4,955,   20,388KiB	 Writes Completed:      330,   23,072KiB
 Read Merges:            0,        0KiB	 Write Merges:        5,051,   20,552KiB
 IO unplugs:            43        	 Timer unplugs:          13

Throughput (R/W): 5,011KiB/s / 5,671KiB/s
Events (ext4.trace): 57,930 entries
Skips: 0 forward (0 -   0.0%)

Peter was right, then.  Per multiple recommendations
I'm switching to fstrim; it is quicker than discarding during the deletes.

fstrim consistently takes about a minute to run on a 40GB
volume.

time fstrim -v /usr
/usr: 9047117824 bytes were trimmed

real	0m56.121s
user	0m0.090s
sys	0m0.000s

I see that running fstrim was already discussed at
http://oss.sgi.com/archives/xfs/2011-05/msg00338.html

Please add something to the FAQ about using SSDs.  It would be great if
people could see the recommended mount options and an fstrim crontab
entry (or other option) for running XFS on an SSD.
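As a starting point for such a FAQ entry, a periodic fstrim job might look
like the sketch below; the schedule, script path, and mount points are only
examples, not a recommendation from this thread:

```shell
#!/bin/sh
# Example /etc/cron.weekly/fstrim script (paths and schedule are examples).
# Running fstrim periodically avoids the per-delete discard latency seen
# with the online "discard" mount option.
for mnt in / /usr /home; do
    fstrim -v "$mnt" || echo "fstrim failed on $mnt" >&2
done
```

Anything along these lines would need adjusting to the local set of
SSD-backed mount points.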

Thanks for the input and patience.  

~tom

[-- Attachment #1.1.2: ext4.trace.blktrace.tar.bz --]
[-- Type: application/x-bzip-compressed-tar, Size: 364717 bytes --]


* Re: Poor performance using discard
  2012-03-02 15:41       ` Thomas Lynema
@ 2012-03-05  3:02         ` Dave Chinner
  2012-03-05  6:41           ` Jeffrey Hundstad
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2012-03-05  3:02 UTC (permalink / raw)
  To: Thomas Lynema; +Cc: xfs

On Fri, Mar 02, 2012 at 10:41:50AM -0500, Thomas Lynema wrote:
> Peter was right then.  Per multiple recommendations
> I'm switching to fstrim, it is quicker than the deletes.
> 
> fstrim consistently takes about a minute to run on a 40GB
> volume.
> 
> time fstrim -v /usr
> /usr: 9047117824 bytes were trimmed
> 
> real	0m56.121s
> user	0m0.090s
> sys	0m0.000s
> 
> I see that the running of fstrim was already discussed
> http://oss.sgi.com/archives/xfs/2011-05/msg00338.html.
> 
> Please add something to the FAQ about using SSDs.  It would be great if
> people could see the recommended mount options and fstrim crontab entry
> (or other option) for running xfs on a SSD.  

It's a publicly modifiable wiki. Feel free to add what you've
learned here to a new FAQ entry - others will come by and correct
anything in it that is inaccurate....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Poor performance using discard
  2012-03-05  3:02         ` Dave Chinner
@ 2012-03-05  6:41           ` Jeffrey Hundstad
  0 siblings, 0 replies; 14+ messages in thread
From: Jeffrey Hundstad @ 2012-03-05  6:41 UTC (permalink / raw)
  To: xfs

On 03/04/2012 09:02 PM, Dave Chinner wrote:
> On Fri, Mar 02, 2012 at 10:41:50AM -0500, Thomas Lynema wrote:
>> Peter was right then.  Per multiple recommendations
>> I'm switching to fstrim, it is quicker than the deletes.
>>
>> fstrim consistently takes about a minute to run on a 40GB
>> volume.
>>
>> time fstrim -v /usr
>> /usr: 9047117824 bytes were trimmed
>>
>> real	0m56.121s
>> user	0m0.090s
>> sys	0m0.000s
>>
>> I see that the running of fstrim was already discussed
>> http://oss.sgi.com/archives/xfs/2011-05/msg00338.html.
>>
>> Please add something to the FAQ about using SSDs.  It would be great if
>> people could see the recommended mount options and fstrim crontab entry
>> (or other option) for running xfs on a SSD.
> It's a publicly modifiable wiki. Feel free to add what you've
> learned here to a new FAQ entry - others will come by and correct
> anything in it that is inaccurate....

And it is already there:
http://xfs.org/index.php/FITRIM/discard

Feel free to expand it.

-- 
Jeffrey Hundstad


end of thread, other threads:[~2012-03-05  6:41 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-28 22:56 Poor performance using discard Thomas Lynema
2012-02-28 23:58 ` Peter Grandi
2012-02-29  1:22 ` Dave Chinner
2012-02-29  2:00   ` Thomas Lynema
2012-02-29  4:08     ` Dave Chinner
2012-02-29 10:38       ` Peter Grandi
2012-02-29 19:46       ` Eric Sandeen
2012-03-01  5:59         ` Christoph Hellwig
2012-03-01  6:27           ` Dave Chinner
2012-03-01  6:31             ` Christoph Hellwig
     [not found]       ` <1330658311.6438.24.camel@core24>
2012-03-02 14:57         ` Thomas Lynema
2012-03-02 15:41       ` Thomas Lynema
2012-03-05  3:02         ` Dave Chinner
2012-03-05  6:41           ` Jeffrey Hundstad
