From: Mike Rapoport <rppt@kernel.org>
To: Ananda <a.badmaev@clicknet.pro>
Cc: linux-mm@kvack.org, vitaly.wool@konsulko.com, vbabka@suse.cz,
akpm@linux-foundation.org
Subject: Re: [PATCH v3] mm: add ztree - new allocator for use via zpool API
Date: Thu, 10 Mar 2022 12:27:34 +0200
Message-ID: <YinSlko+wdV3Q6RJ@kernel.org>
In-Reply-To: <20220307142724.14519-1-a.badmaev@clicknet.pro>

On Mon, Mar 07, 2022 at 05:27:24PM +0300, Ananda wrote:
> From: Ananda Badmaev <a.badmaev@clicknet.pro>
>
> Ztree stores integer number of compressed objects per ztree block.
> These blocks consist of several physical pages (from 1 to 8) and are
> arranged in trees.
> The range from 0 to PAGE_SIZE is divided into the number of intervals
> corresponding to the number of trees and each tree only operates objects of
> size from its interval. Thus the block trees are isolated from each other,
> which makes it possible to simultaneously perform actions with several
> objects from different trees.
> Blocks make it possible to densely arrange objects of various sizes
> resulting in low internal fragmentation. Also this allocator tries to fill
> incomplete blocks instead of adding new ones thus in many cases providing a
> compression ratio substantially higher than z3fold and zbud.
> Apart from greater flexibility, ztree is significantly superior to other
> zpool backends with regard to the worst execution times, thus allowing for
> better response time and real-time characteristics of the whole system.
>
> Signed-off-by: Ananda Badmaev <a.badmaev@clicknet.pro>
> ---
>
> v2: fixed compiler warnings
>
> v3: added documentation and const modifier to struct tree_descr
>
> Documentation/vm/ztree.rst | 104 +++++
> MAINTAINERS | 7 +
> mm/Kconfig | 18 +
> mm/Makefile | 1 +
> mm/ztree.c | 754 +++++++++++++++++++++++++++++++++++++
> 5 files changed, 884 insertions(+)
> create mode 100644 Documentation/vm/ztree.rst
> create mode 100644 mm/ztree.c

There are a lot of style issues; please run scripts/checkpatch.pl.

> diff --git a/Documentation/vm/ztree.rst b/Documentation/vm/ztree.rst
> new file mode 100644
> index 000000000000..78cad0a6d616
> --- /dev/null
> +++ b/Documentation/vm/ztree.rst
> @@ -0,0 +1,104 @@
> +.. _ztree:
> +
> +=====
> +ztree
> +=====
> +
> +Ztree stores integer number of compressed objects per ztree block. These
> +blocks consist of several consecutive physical pages (from 1 to 8) and
> +are arranged in trees. The range from 0 to PAGE_SIZE is divided into the
> +number of intervals corresponding to the number of trees and each tree
> +only operates objects of size from its interval. Thus the block trees are
> +isolated from each other, which makes it possible to simultaneously
> +perform actions with several objects from different trees.
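
To make the interval scheme in the paragraph above concrete, here is a
minimal userspace sketch of mapping an object size to a tree. It is not
taken from mm/ztree.c: the names size_to_tree() and tree_max_size(), the
tree count of 8, the 4096-byte page size and the equal-width intervals are
all assumptions for illustration, and the patch may split the range
differently.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */
#define SKETCH_NR_TREES  8u	/* hypothetical number of trees */
#define SKETCH_WIDTH     (SKETCH_PAGE_SIZE / SKETCH_NR_TREES)

/*
 * Map an object size in [1, PAGE_SIZE] to the index of the tree whose
 * interval contains it. Returns -1 for sizes outside that range.
 */
static int size_to_tree(size_t size)
{
	if (size == 0 || size > SKETCH_PAGE_SIZE)
		return -1;
	return (int)((size - 1) / SKETCH_WIDTH);
}

/* Largest object size served by tree idx under the equal-width assumption. */
static size_t tree_max_size(unsigned int idx)
{
	assert(idx < SKETCH_NR_TREES);
	return (size_t)(idx + 1) * SKETCH_WIDTH;
}
```

With these assumed constants each interval is 512 bytes wide, so a
513-byte object lands in tree 1 and a 4096-byte object in tree 7. Since an
object belongs to exactly one tree, operations on objects in different
trees do not touch the same tree.
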
> +
> +Blocks make it possible to densely arrange objects of various sizes
> +resulting in low internal fragmentation. Also this allocator tries to fill
> +incomplete blocks instead of adding new ones thus in many cases providing
> +a compression ratio substantially higher than z3fold and zbud. Apart from
> +greater flexibility, ztree is significantly superior to other zpool
> +backends with regard to the worst execution times, thus allowing for better
> +response time and real-time characteristics of the whole system.
> +
> +Like z3fold and zsmalloc ztree_alloc() does not return a dereferenceable
> +pointer. Instead, it returns an unsigned long handle which encodes actual
> +location of the allocated object.
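
As an illustration of what "a handle which encodes the location" can
mean, here is a small sketch that packs a block identifier and a slot
index into an unsigned long. The bit layout, the 6-bit slot field and the
names encode_handle(), handle_block() and handle_slot() are hypothetical;
the actual encoding in the patch is not shown in this part of the thread
and may differ.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_SLOT_BITS 6u	/* hypothetical: up to 64 slots per block */
#define SKETCH_SLOT_MASK ((1ul << SKETCH_SLOT_BITS) - 1)

/* Pack a block id and a slot index into one opaque handle. */
static unsigned long encode_handle(unsigned long block_id, unsigned int slot)
{
	assert(slot <= SKETCH_SLOT_MASK);
	assert(block_id <= (ULONG_MAX >> SKETCH_SLOT_BITS));
	return (block_id << SKETCH_SLOT_BITS) | slot;
}

/* Recover the block id from a handle. */
static unsigned long handle_block(unsigned long handle)
{
	return handle >> SKETCH_SLOT_BITS;
}

/* Recover the slot index from a handle. */
static unsigned int handle_slot(unsigned long handle)
{
	return (unsigned int)(handle & SKETCH_SLOT_MASK);
}
```

The caller cannot dereference such a handle; it has to go through the
allocator's map operation to obtain a usable pointer, and that step is
where the handle would be decoded.
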
> +
> +Unlike others ztree works well with objects of various sizes - both highly
> +compressed and poorly compressed including cases where both types are present.
> +
> +Tests
> +=====

I don't think the sections below belong in the documentation. IMO they are
better suited to the changelog.

> +
> +Test platform
> +-------------
> +
> +Qemu arm64 virtual board with debian 11.
> +
> +Kernel
> +------
> +
> +Linux 5.17-rc6 with ztree and zram over zpool patch. Additionally, counters and
> +time measurements using ktime_get_ns() have been added to ZPOOL API.
> +
> +Tools
> +-----
> +
> +ZRAM disks of size 1000M/1500M/2G, fio 3.25.
> +
> +Test description
> +----------------
> +
> +Run 2 fio scripts in parallel - one with VALUE=50, other with VALUE=70.
> +This emulates page content heterogeneity.
> +
> +fio --bs=4k --randrepeat=1 --randseed=100 --refill_buffers \
> + --scramble_buffers=1 --buffer_compress_percentage=VALUE \
> + --direct=1 --loops=1 --numjobs=1 --filename=/dev/zram0 \
> + --name=seq-write --rw=write --stonewall --name=seq-read \
> + --rw=read --stonewall --name=seq-readwrite --rw=rw --stonewall \
> + --name=rand-readwrite --rw=randrw --stonewall
> +
> +Results
> +-------
> +
> +ztree
> +~~~~~
> +
> +* average malloc time (us): 3.8
> +* average free time (us): 3.1
> +* average map time (us): 4.5
> +* average unmap time (us): 1.2
> +* worst zpool op time (us): ~2200
> +* total zpool ops exceeding 1000 us: 29
> +
> +
> +zsmalloc
> +~~~~~~~~
> +
> +* average malloc time (us): 10.3
> +* average free time (us): 6.5
> +* average map time (us): 3.2
> +* average unmap time (us): 1.2
> +* worst zpool op time (us): ~6200
> +* total zpool ops exceeding 1000 us: 1031
> +
> +z3fold
> +~~~~~~
> +
> +* average malloc time (us): 20.8
> +* average free time (us): 29.9
> +* average map time (us): 3.4
> +* average unmap time (us): 1.4
> +* worst zpool op time (us): ~4900
> +* total zpool ops exceeding 1000 us: 100
> +
> +zbud
> +~~~~
> +
> +* average malloc time (us): 8.1
> +* average free time (us): 4.0
> +* average map time (us): 0.3
> +* average unmap time (us): 0.3
> +* worst zpool op time (us): ~9400
> +* total zpool ops exceeding 1000 us: 727
--
Sincerely yours,
Mike.