linux-ext4.vger.kernel.org archive mirror
* [RFC PATCH v2 0/4] ext4: extents status tree shrinker improvement
From: Zheng Liu @ 2014-04-16 11:30 UTC
  To: linux-ext4; +Cc: Zheng Liu, Theodore Ts'o, Andreas Dilger, Jan Kara

Hi all,

Here is the second version of the patch set to improve the extent status
tree shrinker.  In this version I do some cleanups, add some statistics,
and implement the two approaches that we discussed at Napa to improve the
shrinker.

One is to improve the current lru algorithm by adding a new list that
tracks all reclaimable objects, so that we do not burn cpu time scanning
delayed extents.  It also makes the lru algorithm more efficient when
applications open a huge number of files.  The other approach is
inspired by Jan Kara.  It drops the lru algorithm and uses a round-robin
algorithm to shrink all reclaimable extent caches.  Every time the
shrinker scans the list, it tries to shrink objects starting from the
position where it stopped last time.  Please see the commit logs in the
patches for more details.
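
To make the round-robin idea concrete, here is a rough user-space sketch.
It is not the code in patch 4.  The names Inode, RoundRobinShrinker and
pos are made up for illustration, and each inode is reduced to a simple
count of reclaimable extents.

```python
class Inode:
    """Hypothetical stand-in for an inode with cached extents."""

    def __init__(self, name, reclaimable):
        self.name = name
        self.reclaimable = reclaimable  # number of reclaimable extents


class RoundRobinShrinker:
    """Illustrative only: scan a list and resume where we stopped."""

    def __init__(self, inodes):
        self.inodes = list(inodes)
        self.pos = 0  # index where the previous scan stopped

    def shrink(self, nr_to_scan):
        """Reclaim up to nr_to_scan extents, starting at self.pos."""
        freed = 0
        visited = 0
        n = len(self.inodes)
        while freed < nr_to_scan and visited < n:
            inode = self.inodes[self.pos]
            take = min(inode.reclaimable, nr_to_scan - freed)
            inode.reclaimable -= take
            freed += take
            if freed == nr_to_scan and inode.reclaimable > 0:
                # Budget used up in the middle of this inode, so the
                # next call resumes here instead of at the list head.
                break
            self.pos = (self.pos + 1) % n
            visited += 1
        return freed
```

With three inodes holding 3, 0 and 5 reclaimable extents, shrink(4)
empties the first inode, skips the second, takes one extent from the
third, and leaves pos pointing at the third inode for the next call.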

From the results, the conclusion is that the round-robin algorithm wins,
especially if the applications open a large number of files.

In this patch set, patch 1 is pretty stable and can be queued in this
cycle.  Patch 2 adds some statistics so that we can collect more details
about the status of the shrinker.  But I am not sure whether we should
enable them by default.  Maybe we need to define a switch to turn them
on/off dynamically.  Patch 3 and patch 4 improve the shrinker as
described above.

There are also some further improvements possible for these approaches,
such as using rcu when the shrinker traverses the list, because now the
shrinker does not need to change the list during this process.  Another
improvement is to make the shrinker numa-aware.  But before that I
believe this patch set should be reviewed as soon as possible.  Now the
key problem is to decide which approach should be applied.

I use two test cases to compare these improvements.  Test case A
simulates applications that generate a very fragmented extents status
tree, and test case B simulates applications that open a large number
of files with a few extent caches.  Each test case is run 3 times.

To get a fragmented extents status tree, I hack the code and let
ext4_es_can_be_merged() always return 0 in order to disable merging in
the extents status tree.  Meanwhile, to increase the memory pressure,
vm.dirty_background_ratio is set to 60 and vm.dirty_ratio is set to 80
in order to keep as many dirty pages in memory as possible.
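
For anyone who wants to reproduce the setup, the two vm settings above
can be applied with something like the sketch below.  The helper name
set_dirty_ratios and its root parameter are made up for illustration;
only the two sysctl names and the values 60 and 80 come from this mail.
Writing to /proc/sys/vm needs root.

```python
from pathlib import Path


def set_dirty_ratios(background=60, dirty=80, root="/proc/sys/vm"):
    """Write the two writeback thresholds (percent of memory).

    root is a hypothetical knob so the helper can be pointed at a
    scratch directory instead of the real /proc/sys/vm.
    """
    vm = Path(root)
    (vm / "dirty_background_ratio").write_text(f"{background}\n")
    (vm / "dirty_ratio").write_text(f"{dirty}\n")
```

Calling set_dirty_ratios() with the defaults as root should give the
settings used for both test cases.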

Environment
===========
$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    2
Core(s) per socket:    4
CPU socket(s):         2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Stepping:              2
CPU MHz:               2400.000
BogoMIPS:              4799.89
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0-3,8-11
NUMA node1 CPU(s):     4-7,12-15

$ cat /proc/meminfo
MemTotal:       24677988 kB

$ df -ah
/dev/sdb1             183G   15G  159G   9% /mnt/sdb1 (HDD)

The Test Case A
===============

Script
------ 
[global]
ioengine=psync
bs=4k
directory=/mnt/sdb1
group_reporting
fallocate=0
direct=0
filesize=100000g
size=600000g
runtime=300
create_on_open=1
create_serialize=0
create_fsync=0
norandommap

[io]
rw=write
numjobs=100
nrfiles=5

Max Scan Time
-------------
x vanilla
+ lru
* rr
    N           Min           Max        Median           Avg        Stddev
x   3         22230         24607         23532     23456.333     1190.3051
+   3           203           364           301     289.33333      81.13158
Difference at 95.0% confidence
        -23167 +/- 1912.16
        -98.7665% +/- 8.15199%
        (Student's t, pooled s = 843.626)
*   3           165           248           172           195     46.032597
Difference at 95.0% confidence
        -23261.3 +/- 1909.16
        -99.1687% +/- 8.1392%
        (Student's t, pooled s = 842.302)

Avg. Scan Time
-------------
x vanilla
+ lru
* rr
    N           Min           Max        Median           Avg        Stddev
x 220           204         15997          3976     5268.6773     4121.2038
+ 220           105           169           126        132.65     14.904881
Difference at 95.0% confidence
        -5136.03 +/- 544.593
        -97.4823% +/- 10.3364%
        (Student's t, pooled s = 2914.15)
* 224            55           144            82     97.834821     27.811093
Difference at 95.0% confidence
        -5170.84 +/- 539.706
        -98.1431% +/- 10.2437%
        (Student's t, pooled s = 2900.98)
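
For readers not familiar with this output format, the "Difference at
95.0% confidence" lines can be recomputed from the summary rows.  The
sketch below does it for vanilla vs. lru in the Max Scan Time table of
test case A.  The function name pooled_difference is made up, and the t
value 2.776 (two-sided 95%, 4 degrees of freedom) is taken from a
standard Student's t table rather than from this mail.

```python
import math

# Two-sided 95% critical value of Student's t for 4 degrees of
# freedom (two samples with N=3 each).  Assumed from a standard table.
T_95_DF4 = 2.776


def pooled_difference(n1, avg1, sd1, n2, avg2, sd2, t_crit):
    """Return (difference, margin, pooled stddev) of two samples."""
    pooled = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    margin = t_crit * pooled * math.sqrt(1.0 / n1 + 1.0 / n2)
    return avg2 - avg1, margin, pooled


# vanilla (x) vs. lru (+), Max Scan Time, test case A
diff, margin, pooled = pooled_difference(
    3, 23456.333, 1190.3051, 3, 289.33333, 81.13158, T_95_DF4
)
print(f"{diff:.0f} +/- {margin:.2f} (pooled s = {pooled:.3f})")
print(f"{100 * diff / 23456.333:.4f}%")
```

To within rounding this should agree with the "-23167 +/- 1912.16" and
"pooled s = 843.626" figures reported for lru in that table.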

The Test Case B
===============

Script
------
[global]
ioengine=psync
bs=4k
directory=/mnt/sdb1
group_reporting
fallocate=0
direct=0
runtime=300
create_on_open=1
create_serialize=0
create_fsync=0
norandommap

[io]
rw=randwrite
numjobs=25
nrfiles=40000

[streamer]
rw=write
numjobs=1
filesize=1000g
size=1000g
nrfiles=1

Max Scan Time
-------------
x vanilla
+ lru
* rr
    N           Min           Max        Median           Avg        Stddev
x   3        390531        481463        393469        421821     51672.373
+   3        106433        170801        130652        135962     32510.874
Difference at 95.0% confidence
        -285859 +/- 97844.9
        -67.7678% +/- 23.1958%
        (Student's t, pooled s = 43168.2)
*   3         72569        156338        113704     114203.67     41886.735
Difference at 95.0% confidence
        -307617 +/- 106609
        -72.926% +/- 25.2734%
        (Student's t, pooled s = 47034.7)

Avg. Scan Time
-------------
x vanilla
+ lru
* rr
    N           Min           Max        Median           Avg        Stddev
x 221           164        155601         19553     24630.968     22736.242
+ 207            44         49210         13633     16167.768     15087.729
Difference at 95.0% confidence
        -8463.2 +/- 3681.22
        -34.36% +/- 14.9455%
        (Student's t, pooled s = 19417.6)
*  78            41         18043           166     808.85897     2605.2387
Difference at 95.0% confidence
        -23822.1 +/- 5062.86
        -96.7161% +/- 20.5548%
        (Student's t, pooled s = 19613.2)

As always, feedback, comments, and ideas are welcome.

Regards,
						- Zheng

Zheng Liu (4):
  ext4: improve extents status tree trace point
  ext4: track extent status tree shrinker delay statictics
  ext4: improve extents status tree shrinker lru algorithm
  ext4: use a round-robin algorithm to shrink extent cache

 fs/ext4/ext4.h              |   11 +-
 fs/ext4/extents.c           |    4 +-
 fs/ext4/extents_status.c    |  310 +++++++++++++++++++++++++++++--------------
 fs/ext4/extents_status.h    |   16 ++-
 fs/ext4/inode.c             |    4 +-
 fs/ext4/ioctl.c             |    4 +-
 fs/ext4/super.c             |   22 ++-
 include/trace/events/ext4.h |   59 ++++++--
 8 files changed, 296 insertions(+), 134 deletions(-)

-- 
1.7.9.7


