From: Zaslonko Mikhail <zaslonko@linux.ibm.com>
To: Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Cc: linux-s390@vger.kernel.org, Vasily Gorbik <gor@linux.ibm.com>
Subject: Re: [PATCH] btrfs: zlib: do not do unnecessary page copying for compression
Date: Mon, 27 May 2024 18:25:48 +0200 [thread overview]
Message-ID: <08aca5cf-f259-4963-bb2a-356847317d94@linux.ibm.com> (raw)
In-Reply-To: <0a24cc8a48821e8cf3bd01263b453c4cbc22d832.1716801849.git.wqu@suse.com>
Hello Qu,

I implemented the btrfs zlib changes for s390 dfltcc compression support a while ago:
https://lwn.net/Articles/808809/
The workspace buffer was indeed enlarged for performance reasons.
Please see my comments below.
On 27.05.2024 11:24, Qu Wenruo wrote:
> [BUG]
> In function zlib_compress_folios(), we handle the input by:
>
> - If there are multiple pages left
> We copy the page content into workspace->buf, and use workspace->buf
> as input for compression.
>
> But on x86_64 (which doesn't support dfltcc), that buffer size is just
> one page, so we're wasting our CPU time copying the page for no
> benefit.
>
> - If there is only one page left
> We use the mapped page address as input for compression.
>
> The problem is, this means we will copy the whole input range except the
> last page (can be as large as 124K), without much obvious benefit.
>
> Meanwhile the cost is pretty obvious.
Actually, the behavior of kernels without dfltcc support (currently available on s390
only) should not be affected.
We copy input pages into workspace->buf only if the buffer is larger than one page.
At least, that is how it worked after my original btrfs zlib patch:
https://lwn.net/ml/linux-kernel/20200108105103.29028-1-zaslonko@linux.ibm.com/
Has this behavior somehow changed after your page->folio conversion for btrfs?
https://lore.kernel.org/all/cover.1706521511.git.wqu@suse.com/
Am I missing something?
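To illustrate the point, here is a minimal standalone sketch of the pre-patch buffer sizing decision in zlib_alloc_workspace() (simplified and not the actual kernel code; the 4K page size and 4-page dfltcc buffer are the common case):

```c
#include <assert.h>

#define PAGE_SIZE 4096UL
#define ZLIB_DFLTCC_BUF_SIZE (4 * PAGE_SIZE) /* 4-page buffer used with dfltcc */

/*
 * Sketch of the pre-patch sizing decision: only when dfltcc is enabled
 * does the workspace buffer grow beyond a single page, so only then can
 * the multi-page copy path in zlib_compress_folios() ever trigger.
 */
unsigned long workspace_buf_size(int dfltcc_enabled)
{
	return dfltcc_enabled ? ZLIB_DFLTCC_BUF_SIZE : PAGE_SIZE;
}
```

With a single-page buffer, in_buf_folios in zlib_compress_folios() can never exceed 1, so non-dfltcc kernels already skip the copy loop.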
>
> [POSSIBLE REASON]
> The possible reason may be related to the support of S390 hardware zlib
> decompression acceleration.
>
> As we allocate 4 pages (4 * 4K) as workspace input buffer just for s390.
>
> [FIX]
> I checked the dfltcc code, there seems to be no requirement on the
> input buffer size.
> The function dfltcc_can_deflate() only checks:
>
> - If the compression settings are supported
> Only level/w_bits/strategy/level_mask is checked.
>
> - If the hardware supports
>
> No mention at all on the input buffer size, thus I believe there is no
> need to waste time doing the page copying.
>
> Maybe the hardware acceleration is so good for s390 that they can offset
> the page copying cost, but it's definitely a penalty for non-s390
> systems.
>
> So fix the problem by:
>
> - Use the same buffer size
> No matter if dfltcc support is enabled or not
>
> - Always use page address as input
>
> Cc: linux-s390@vger.kernel.org
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> fs/btrfs/zlib.c | 67 +++++++++++--------------------------------------
> 1 file changed, 14 insertions(+), 53 deletions(-)
>
> diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c
> index d9e5c88a0f85..9c88a841a060 100644
> --- a/fs/btrfs/zlib.c
> +++ b/fs/btrfs/zlib.c
> @@ -65,21 +65,8 @@ struct list_head *zlib_alloc_workspace(unsigned int level)
> zlib_inflate_workspacesize());
> workspace->strm.workspace = kvzalloc(workspacesize, GFP_KERNEL | __GFP_NOWARN);
> workspace->level = level;
> - workspace->buf = NULL;
> - /*
> - * In case of s390 zlib hardware support, allocate lager workspace
> - * buffer. If allocator fails, fall back to a single page buffer.
> - */
> - if (zlib_deflate_dfltcc_enabled()) {
> - workspace->buf = kmalloc(ZLIB_DFLTCC_BUF_SIZE,
> - __GFP_NOMEMALLOC | __GFP_NORETRY |
> - __GFP_NOWARN | GFP_NOIO);
> - workspace->buf_size = ZLIB_DFLTCC_BUF_SIZE;
> - }
> - if (!workspace->buf) {
> - workspace->buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
> - workspace->buf_size = PAGE_SIZE;
> - }
> + workspace->buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
> + workspace->buf_size = PAGE_SIZE;
> if (!workspace->strm.workspace || !workspace->buf)
> goto fail;
>
> @@ -103,7 +90,6 @@ int zlib_compress_folios(struct list_head *ws, struct address_space *mapping,
> struct folio *in_folio = NULL;
> struct folio *out_folio = NULL;
> unsigned long bytes_left;
> - unsigned int in_buf_folios;
> unsigned long len = *total_out;
> unsigned long nr_dest_folios = *out_folios;
> const unsigned long max_out = nr_dest_folios * PAGE_SIZE;
> @@ -130,7 +116,6 @@ int zlib_compress_folios(struct list_head *ws, struct address_space *mapping,
> folios[0] = out_folio;
> nr_folios = 1;
>
> - workspace->strm.next_in = workspace->buf;
> workspace->strm.avail_in = 0;
> workspace->strm.next_out = cfolio_out;
> workspace->strm.avail_out = PAGE_SIZE;
> @@ -142,43 +127,19 @@ int zlib_compress_folios(struct list_head *ws, struct address_space *mapping,
> */
> if (workspace->strm.avail_in == 0) {
> bytes_left = len - workspace->strm.total_in;
> - in_buf_folios = min(DIV_ROUND_UP(bytes_left, PAGE_SIZE),
> - workspace->buf_size / PAGE_SIZE);
Doesn't this always set *in_buf_folios* to 1 when there is no dfltcc support (single-page workspace buffer)?
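A quick standalone sketch of that arithmetic (a hypothetical rewrite of the computation quoted above, not the kernel code itself):

```c
#include <assert.h>

#define PAGE_SIZE 4096UL
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Standalone version of the in_buf_folios computation quoted above. */
unsigned long in_buf_folios(unsigned long bytes_left, unsigned long buf_size)
{
	unsigned long want = DIV_ROUND_UP(bytes_left, PAGE_SIZE);
	unsigned long have = buf_size / PAGE_SIZE;

	return want < have ? want : have;
}
```

With buf_size == PAGE_SIZE (no dfltcc), *have* is 1, so the result is 1 for any non-zero bytes_left and the `if (in_buf_folios > 1)` branch is never taken.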
> - if (in_buf_folios > 1) {
> - int i;
> -
> - for (i = 0; i < in_buf_folios; i++) {
> - if (data_in) {
> - kunmap_local(data_in);
> - folio_put(in_folio);
> - data_in = NULL;
> - }
> - ret = btrfs_compress_filemap_get_folio(mapping,
> - start, &in_folio);
> - if (ret < 0)
> - goto out;
> - data_in = kmap_local_folio(in_folio, 0);
> - copy_page(workspace->buf + i * PAGE_SIZE,
> - data_in);
> - start += PAGE_SIZE;
> - }
> - workspace->strm.next_in = workspace->buf;
> - } else {
> - if (data_in) {
> - kunmap_local(data_in);
> - folio_put(in_folio);
> - data_in = NULL;
> - }
> - ret = btrfs_compress_filemap_get_folio(mapping,
> - start, &in_folio);
> - if (ret < 0)
> - goto out;
> - data_in = kmap_local_folio(in_folio, 0);
> - start += PAGE_SIZE;
> - workspace->strm.next_in = data_in;
> + if (data_in) {
> + kunmap_local(data_in);
> + folio_put(in_folio);
> + data_in = NULL;
> }
> - workspace->strm.avail_in = min(bytes_left,
> - (unsigned long) workspace->buf_size);
> + ret = btrfs_compress_filemap_get_folio(mapping,
> + start, &in_folio);
> + if (ret < 0)
> + goto out;
> + data_in = kmap_local_folio(in_folio, 0);
> + start += PAGE_SIZE;
> + workspace->strm.next_in = data_in;
> + workspace->strm.avail_in = min(bytes_left, PAGE_SIZE);
> }
>
> ret = zlib_deflate(&workspace->strm, Z_SYNC_FLUSH);
Thanks,
Mikhail