From: Minchan Kim <minchan@kernel.org>
To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
<linux-kernel@vger.kernel.org>,
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Subject: Re: [PATCH 2/7] zram: switch to crypto compress API
Date: Fri, 27 May 2016 13:22:07 +0900 [thread overview]
Message-ID: <20160527042207.GB2322@bbox> (raw)
In-Reply-To: <20160525143006.1207-3-sergey.senozhatsky@gmail.com>
On Wed, May 25, 2016 at 11:30:01PM +0900, Sergey Senozhatsky wrote:
> We don't have an idle zstreams list anymore and our write path
> now works absolutely differently, preventing preemption during
> compression. This removes possibilities of read paths preempting
> writes at wrong places (which could badly affect the performance
> of both paths) and at the same time opens the door for a move
> from custom LZO/LZ4 compression backends implementation to a more
> generic one, using crypto compress API.
>
> Joonsoo Kim [1] attempted to do this a while ago, but faced with
> the need of introducing a new crypto API interface. The root cause
> was the fact that crypto API compression algorithms require a
> compression stream structure (in zram terminology) for both
> compression and decompression ops, while in reality only several
> of compression algorithms really need it. This resulted in a
> concept of context-less crypto API compression backends [2]. Both
> write and read paths, though, would have been executed with the
> preemption enabled, which in the worst case could have resulted
> in a decreased worst-case performance, e.g. consider the
> following case:
>
> CPU0
>
> zram_write()
> spin_lock()
> take the last idle stream
> spin_unlock()
>
> << preempted >>
>
> zram_read()
> spin_lock()
> no idle streams
> spin_unlock()
> schedule()
>
> resuming zram_write compression()
>
> but it took me some time to realize that, and it took even longer
> to evolve zram and to make it ready for crypto API. The key turned
> out to be -- drop the idle streams list entirely. With out the
Without
> idle streams list we are free to use compression algorithms that
> require compression stream for decompression (read), because streams
> are now placed in per-cpu data and each write path has to disable
> preemption for compression op, almost completely eliminating the
> aforementioned case (technically, we still have a small chance,
> because write path has a fast and a slow paths and the slow path
> is executed with the preemption enabled; but the frequency of
> failed fast path is too low).
Nice description.
>
> TEST
> ====
>
> - 4 CPUs, x86_64 system
> - 3G zram, lzo
> - fio tests: read, randread, write, randwrite, rw, randrw
>
> test script [3] command:
> ZRAM_SIZE=3G LOG_SUFFIX=XXXX FIO_LOOPS=5 ./zram-fio-test.sh
>
> BASE PATCHED
> jobs1
> READ: 2527.2MB/s 2482.7MB/s
> READ: 2102.7MB/s 2045.0MB/s
> WRITE: 1284.3MB/s 1324.3MB/s
> WRITE: 1080.7MB/s 1101.9MB/s
> READ: 430125KB/s 437498KB/s
> WRITE: 430538KB/s 437919KB/s
> READ: 399593KB/s 403987KB/s
> WRITE: 399910KB/s 404308KB/s
> jobs2
> READ: 8133.5MB/s 7854.8MB/s
> READ: 7086.6MB/s 6912.8MB/s
> WRITE: 3177.2MB/s 3298.3MB/s
> WRITE: 2810.2MB/s 2871.4MB/s
> READ: 1017.6MB/s 1023.4MB/s
> WRITE: 1018.2MB/s 1023.1MB/s
> READ: 977836KB/s 984205KB/s
> WRITE: 979435KB/s 985814KB/s
> jobs3
> READ: 13557MB/s 13391MB/s
> READ: 11876MB/s 11752MB/s
> WRITE: 4641.5MB/s 4682.1MB/s
> WRITE: 4164.9MB/s 4179.3MB/s
> READ: 1453.8MB/s 1455.1MB/s
> WRITE: 1455.1MB/s 1458.2MB/s
> READ: 1387.7MB/s 1395.7MB/s
> WRITE: 1386.1MB/s 1394.9MB/s
> jobs4
> READ: 20271MB/s 20078MB/s
> READ: 18033MB/s 17928MB/s
> WRITE: 6176.8MB/s 6180.5MB/s
> WRITE: 5686.3MB/s 5705.3MB/s
> READ: 2009.4MB/s 2006.7MB/s
> WRITE: 2007.5MB/s 2004.9MB/s
> READ: 1929.7MB/s 1935.6MB/s
> WRITE: 1926.8MB/s 1932.6MB/s
> jobs5
> READ: 18823MB/s 19024MB/s
> READ: 18968MB/s 19071MB/s
> WRITE: 6191.6MB/s 6372.1MB/s
> WRITE: 5818.7MB/s 5787.1MB/s
> READ: 2011.7MB/s 1981.3MB/s
> WRITE: 2011.4MB/s 1980.1MB/s
> READ: 1949.3MB/s 1935.7MB/s
> WRITE: 1940.4MB/s 1926.1MB/s
> jobs6
> READ: 21870MB/s 21715MB/s
> READ: 19957MB/s 19879MB/s
> WRITE: 6528.4MB/s 6537.6MB/s
> WRITE: 6098.9MB/s 6073.6MB/s
> READ: 2048.6MB/s 2049.9MB/s
> WRITE: 2041.7MB/s 2042.9MB/s
> READ: 2013.4MB/s 1990.4MB/s
> WRITE: 2009.4MB/s 1986.5MB/s
> jobs7
> READ: 21359MB/s 21124MB/s
> READ: 19746MB/s 19293MB/s
> WRITE: 6660.4MB/s 6518.8MB/s
> WRITE: 6211.6MB/s 6193.1MB/s
> READ: 2089.7MB/s 2080.6MB/s
> WRITE: 2085.8MB/s 2076.5MB/s
> READ: 2041.2MB/s 2052.5MB/s
> WRITE: 2037.5MB/s 2048.8MB/s
> jobs8
> READ: 20477MB/s 19974MB/s
> READ: 18922MB/s 18576MB/s
> WRITE: 6851.9MB/s 6788.3MB/s
> WRITE: 6407.7MB/s 6347.5MB/s
> READ: 2134.8MB/s 2136.1MB/s
> WRITE: 2132.8MB/s 2134.4MB/s
> READ: 2074.2MB/s 2069.6MB/s
> WRITE: 2087.3MB/s 2082.4MB/s
> jobs9
> READ: 19797MB/s 19994MB/s
> READ: 18806MB/s 18581MB/s
> WRITE: 6878.7MB/s 6822.7MB/s
> WRITE: 6456.8MB/s 6447.2MB/s
> READ: 2141.1MB/s 2154.7MB/s
> WRITE: 2144.4MB/s 2157.3MB/s
> READ: 2084.1MB/s 2085.1MB/s
> WRITE: 2091.5MB/s 2092.5MB/s
> jobs10
> READ: 19794MB/s 19784MB/s
> READ: 18794MB/s 18745MB/s
> WRITE: 6984.4MB/s 6676.3MB/s
> WRITE: 6532.3MB/s 6342.7MB/s
> READ: 2150.6MB/s 2155.4MB/s
> WRITE: 2156.8MB/s 2161.5MB/s
> READ: 2106.4MB/s 2095.6MB/s
> WRITE: 2109.7MB/s 2098.4MB/s
>
> BASE PATCHED
> jobs1 perfstat
> stalled-cycles-frontend 102,480,595,419 ( 41.53%) 114,508,864,804 ( 46.92%)
> stalled-cycles-backend 51,941,417,832 ( 21.05%) 46,836,112,388 ( 19.19%)
> instructions 283,612,054,215 ( 1.15) 283,918,134,959 ( 1.16)
> branches 56,372,560,385 ( 724.923) 56,449,814,753 ( 733.766)
> branch-misses 374,826,000 ( 0.66%) 326,935,859 ( 0.58%)
> jobs2 perfstat
> stalled-cycles-frontend 155,142,745,777 ( 40.99%) 164,170,979,198 ( 43.82%)
> stalled-cycles-backend 70,813,866,387 ( 18.71%) 66,456,858,165 ( 17.74%)
> instructions 463,436,648,173 ( 1.22) 464,221,890,191 ( 1.24)
> branches 91,088,733,902 ( 760.088) 91,278,144,546 ( 769.133)
> branch-misses 504,460,363 ( 0.55%) 394,033,842 ( 0.43%)
> jobs3 perfstat
> stalled-cycles-frontend 201,300,397,212 ( 39.84%) 223,969,902,257 ( 44.44%)
> stalled-cycles-backend 87,712,593,974 ( 17.36%) 81,618,888,712 ( 16.19%)
> instructions 642,869,545,023 ( 1.27) 644,677,354,132 ( 1.28)
> branches 125,724,560,594 ( 690.682) 126,133,159,521 ( 694.542)
> branch-misses 527,941,798 ( 0.42%) 444,782,220 ( 0.35%)
> jobs4 perfstat
> stalled-cycles-frontend 246,701,197,429 ( 38.12%) 280,076,030,886 ( 43.29%)
> stalled-cycles-backend 119,050,341,112 ( 18.40%) 110,955,641,671 ( 17.15%)
> instructions 822,716,962,127 ( 1.27) 825,536,969,320 ( 1.28)
> branches 160,590,028,545 ( 688.614) 161,152,996,915 ( 691.068)
> branch-misses 650,295,287 ( 0.40%) 550,229,113 ( 0.34%)
> jobs5 perfstat
> stalled-cycles-frontend 298,958,462,516 ( 38.30%) 344,852,200,358 ( 44.16%)
> stalled-cycles-backend 137,558,742,122 ( 17.62%) 129,465,067,102 ( 16.58%)
> instructions 1,005,714,688,752 ( 1.29) 1,007,657,999,432 ( 1.29)
> branches 195,988,773,962 ( 697.730) 196,446,873,984 ( 700.319)
> branch-misses 695,818,940 ( 0.36%) 624,823,263 ( 0.32%)
> jobs6 perfstat
> stalled-cycles-frontend 334,497,602,856 ( 36.71%) 387,590,419,779 ( 42.38%)
> stalled-cycles-backend 163,539,365,335 ( 17.95%) 152,640,193,639 ( 16.69%)
> instructions 1,184,738,177,851 ( 1.30) 1,187,396,281,677 ( 1.30)
> branches 230,592,915,640 ( 702.902) 231,253,802,882 ( 702.356)
> branch-misses 747,934,786 ( 0.32%) 643,902,424 ( 0.28%)
> jobs7 perfstat
> stalled-cycles-frontend 396,724,684,187 ( 37.71%) 460,705,858,952 ( 43.84%)
> stalled-cycles-backend 188,096,616,496 ( 17.88%) 175,785,787,036 ( 16.73%)
> instructions 1,364,041,136,608 ( 1.30) 1,366,689,075,112 ( 1.30)
> branches 265,253,096,936 ( 700.078) 265,890,524,883 ( 702.839)
> branch-misses 784,991,589 ( 0.30%) 729,196,689 ( 0.27%)
> jobs8 perfstat
> stalled-cycles-frontend 440,248,299,870 ( 36.92%) 509,554,793,816 ( 42.46%)
> stalled-cycles-backend 222,575,930,616 ( 18.67%) 213,401,248,432 ( 17.78%)
> instructions 1,542,262,045,114 ( 1.29) 1,545,233,932,257 ( 1.29)
> branches 299,775,178,439 ( 697.666) 300,528,458,505 ( 694.769)
> branch-misses 847,496,084 ( 0.28%) 748,794,308 ( 0.25%)
> jobs9 perfstat
> stalled-cycles-frontend 506,269,882,480 ( 37.86%) 592,798,032,820 ( 44.43%)
> stalled-cycles-backend 253,192,498,861 ( 18.93%) 233,727,666,185 ( 17.52%)
> instructions 1,721,985,080,913 ( 1.29) 1,724,666,236,005 ( 1.29)
> branches 334,517,360,255 ( 694.134) 335,199,758,164 ( 697.131)
> branch-misses 873,496,730 ( 0.26%) 815,379,236 ( 0.24%)
> jobs10 perfstat
> stalled-cycles-frontend 549,063,363,749 ( 37.18%) 651,302,376,662 ( 43.61%)
> stalled-cycles-backend 281,680,986,810 ( 19.07%) 277,005,235,582 ( 18.55%)
> instructions 1,901,859,271,180 ( 1.29) 1,906,311,064,230 ( 1.28)
> branches 369,398,536,153 ( 694.004) 370,527,696,358 ( 688.409)
> branch-misses 967,929,335 ( 0.26%) 890,125,056 ( 0.24%)
>
> BASE PATCHED
> seconds elapsed 79.421641008 78.735285546
> seconds elapsed 61.471246133 60.869085949
> seconds elapsed 62.317058173 62.224188495
> seconds elapsed 60.030739363 60.081102518
> seconds elapsed 74.070398362 74.317582865
> seconds elapsed 84.985953007 85.414364176
> seconds elapsed 97.724553255 98.173311344
> seconds elapsed 109.488066758 110.268399318
> seconds elapsed 122.768189405 122.967164498
> seconds elapsed 135.130035105 136.934770801
>
> On my other system (8 x86_64 CPUs, short version of test results):
>
> BASE PATCHED
> seconds elapsed 19.518065994 19.806320662
> seconds elapsed 15.172772749 15.594718291
> seconds elapsed 13.820925970 13.821708564
> seconds elapsed 13.293097816 14.585206405
> seconds elapsed 16.207284118 16.064431606
> seconds elapsed 17.958376158 17.771825767
> seconds elapsed 19.478009164 19.602961508
> seconds elapsed 21.347152811 21.352318709
> seconds elapsed 24.478121126 24.171088735
> seconds elapsed 26.865057442 26.767327618
>
> So performance-wise the numbers are quite similar.
>
> [1] http://marc.info/?l=linux-kernel&m=144480832108927&w=2
> [2] http://marc.info/?l=linux-kernel&m=145379613507518&w=2
> [3] https://github.com/sergey-senozhatsky/zram-perf-test
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Suggested-by: Minchan Kim <minchan@kernel.org>
> Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
> drivers/block/zram/Kconfig | 10 +++----
> drivers/block/zram/zcomp.c | 66 ++++++++++++++++++++++++++++---------------
> drivers/block/zram/zcomp.h | 7 +++--
> drivers/block/zram/zram_drv.c | 14 +++++----
> 4 files changed, 61 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
> index 386ba3d..2252cd7 100644
> --- a/drivers/block/zram/Kconfig
> +++ b/drivers/block/zram/Kconfig
> @@ -1,8 +1,7 @@
> config ZRAM
> tristate "Compressed RAM block device support"
> - depends on BLOCK && SYSFS && ZSMALLOC
> - select LZO_COMPRESS
> - select LZO_DECOMPRESS
> + depends on BLOCK && SYSFS && ZSMALLOC && CRYPTO
> + select CRYPTO_LZO
> default n
> help
> Creates virtual block devices called /dev/zramX (X = 0, 1, ...).
> @@ -18,9 +17,8 @@ config ZRAM
> config ZRAM_LZ4_COMPRESS
> bool "Enable LZ4 algorithm support"
> depends on ZRAM
> - select LZ4_COMPRESS
> - select LZ4_DECOMPRESS
> + select CRYPTO_LZ4
> default n
> help
> This option enables LZ4 compression algorithm support. Compression
> - algorithm can be changed using `comp_algorithm' device attribute.
> \ No newline at end of file
> + algorithm can be changed using `comp_algorithm' device attribute.
> diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
> index 400f826..82b568e 100644
> --- a/drivers/block/zram/zcomp.c
> +++ b/drivers/block/zram/zcomp.c
> @@ -14,26 +14,23 @@
> #include <linux/wait.h>
> #include <linux/sched.h>
> #include <linux/cpu.h>
> +#include <linux/crypto.h>
>
> #include "zcomp.h"
> -#include "zcomp_lzo.h"
> -#ifdef CONFIG_ZRAM_LZ4_COMPRESS
> -#include "zcomp_lz4.h"
> -#endif
>
> -static struct zcomp_backend *backends[] = {
> - &zcomp_lzo,
> +static const char * const backends[] = {
> + "lzo",
> #ifdef CONFIG_ZRAM_LZ4_COMPRESS
> - &zcomp_lz4,
> + "lz4",
> #endif
> NULL
> };
>
> -static struct zcomp_backend *find_backend(const char *compress)
> +static const char *find_backend(const char *compress)
> {
> int i = 0;
> while (backends[i]) {
> - if (sysfs_streq(compress, backends[i]->name))
> + if (sysfs_streq(compress, backends[i]))
> break;
> i++;
> }
> @@ -42,8 +39,8 @@ static struct zcomp_backend *find_backend(const char *compress)
>
> static void zcomp_strm_free(struct zcomp *comp, struct zcomp_strm *zstrm)
> {
> - if (zstrm->private)
> - comp->backend->destroy(zstrm->private);
> + if (!IS_ERR_OR_NULL(zstrm->private))
Let's change private with tfm.
> + crypto_free_comp(zstrm->private);
> free_pages((unsigned long)zstrm->buffer, 1);
> kfree(zstrm);
> }
> @@ -58,13 +55,13 @@ static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp, gfp_t flags)
> if (!zstrm)
> return NULL;
>
> - zstrm->private = comp->backend->create(flags);
> + zstrm->private = crypto_alloc_comp(comp->name, 0, 0);
crypto_alloc_comp uses GPF_KERNEL for allocating tfm and zram uses
GFP_KERNEL for zcomp_strm_alloc now so there is no point to pass
gfp_t so let's clean it up.
Otherwise, looks good to me at af first glance.
next prev parent reply other threads:[~2016-05-27 4:21 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-25 14:29 [PATCH 0/7] zram: switch to crypto api Sergey Senozhatsky
2016-05-25 14:30 ` [PATCH 1/7] zram: rename zstrm find-release functions Sergey Senozhatsky
2016-05-26 0:44 ` Minchan Kim
2016-05-26 1:07 ` Sergey Senozhatsky
2016-05-25 14:30 ` [PATCH 2/7] zram: switch to crypto compress API Sergey Senozhatsky
2016-05-27 4:22 ` Minchan Kim [this message]
2016-05-27 7:59 ` Sergey Senozhatsky
2016-05-25 14:30 ` [PATCH 3/7] zram: drop zcomp param from compress/decompress Sergey Senozhatsky
2016-05-27 4:22 ` Minchan Kim
2016-05-27 7:31 ` Sergey Senozhatsky
2016-05-25 14:30 ` [PATCH 4/7] zram: align zcomp interface to crypto comp API Sergey Senozhatsky
2016-05-27 4:31 ` Minchan Kim
2016-05-27 8:00 ` Sergey Senozhatsky
2016-05-25 14:30 ` [PATCH 5/7] zram: use crypto api to check alg availability Sergey Senozhatsky
2016-05-27 4:43 ` Minchan Kim
2016-05-27 7:50 ` Sergey Senozhatsky
2016-05-27 8:27 ` Minchan Kim
2016-05-27 8:43 ` Herbert Xu
2016-05-27 9:04 ` Minchan Kim
2016-05-29 3:24 ` Sergey Senozhatsky
2016-05-30 4:47 ` Minchan Kim
2016-05-30 4:57 ` Sergey Senozhatsky
2016-05-25 14:30 ` [PATCH 6/7] zram: delete custom lzo/lz4 Sergey Senozhatsky
2016-05-25 14:30 ` [PATCH 7/7] zram: add more compression algorithms Sergey Senozhatsky
2016-05-26 0:43 ` [PATCH 0/7] zram: switch to crypto api Joonsoo Kim
2016-05-26 1:12 ` Sergey Senozhatsky
2016-05-26 1:52 ` Joonsoo Kim
2016-05-26 0:52 ` Minchan Kim
2016-05-26 1:08 ` Sergey Senozhatsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160527042207.GB2322@bbox \
--to=minchan@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=iamjoonsoo.kim@lge.com \
--cc=linux-kernel@vger.kernel.org \
--cc=sergey.senozhatsky.work@gmail.com \
--cc=sergey.senozhatsky@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.