All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosry.ahmed@linux.dev>
To: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Hillf Danton <hdanton@sina.com>, Kairui Song <ryncsn@gmail.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Minchan Kim <minchan@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9 14/19] zsmalloc: introduce new object mapping API
Date: Thu, 27 Feb 2025 05:48:02 +0000	[thread overview]
Message-ID: <Z7_8koiBRTfQ81bb@google.com> (raw)
In-Reply-To: <20250227043618.88380-15-senozhatsky@chromium.org>

On Thu, Feb 27, 2025 at 01:35:32PM +0900, Sergey Senozhatsky wrote:
> Current object mapping API is a little cumbersome.  First, it's
> inconsistent, sometimes it returns with page-faults disabled and
> sometimes with page-faults enabled.  Second, and most importantly,
> it enforces atomicity restrictions on its users.  zs_map_object()
> has to return a liner object address which is not always possible
> because some objects span multiple physical (non-contiguous)
> pages.  For such objects zsmalloc uses a per-CPU buffer to which
> object's data is copied before a pointer to that per-CPU buffer
> is returned back to the caller.  This leads to another, final,
> issue - extra memcpy().  Since the caller gets a pointer to
> per-CPU buffer it can memcpy() data only to that buffer, and
> during zs_unmap_object() zsmalloc will memcpy() from that per-CPU
> buffer to physical pages that object in question spans across.
> 
> New API splits functions by access mode:
> - zs_obj_read_begin(handle, local_copy)
>   Returns a pointer to handle memory.  For objects that span two
>   physical pages a local_copy buffer is used to store object's
>   data before the address is returned to the caller.  Otherwise
>   the object's page is kmap_local mapped directly.
> 
> - zs_obj_read_end(handle, buf)
>   Unmaps the page if it was kmap_local mapped by zs_obj_read_begin().
> 
> - zs_obj_write(handle, buf, len)
>   Copies len-bytes from compression buffer to handle memory
>   (takes care of objects that span two pages).  This does not
>   need any additional (e.g. per-CPU) buffers and writes the data
>   directly to zsmalloc pool pages.
> 
> In terms of performance, on a synthetic and completely reproducible
> test that allocates fixed number of objects of fixed sizes and
> iterates over those objects, first mapping in RO then in RW mode:
> 
> OLD API
> =======
> 
> 3 first results out of 10
> 
>   369,205,778      instructions        #    0.80  insn per cycle
>    40,467,926      branches            #  113.732 M/sec
> 
>   369,002,122      instructions        #    0.62  insn per cycle
>    40,426,145      branches            #  189.361 M/sec
> 
>   369,036,706      instructions        #    0.63  insn per cycle
>    40,430,860      branches            #  204.105 M/sec
> 
> [..]
> 
> NEW API
> =======
> 
> 3 first results out of 10
> 
>   265,799,293      instructions        #    0.51  insn per cycle
>    29,834,567      branches            #  170.281 M/sec
> 
>   265,765,970      instructions        #    0.55  insn per cycle
>    29,829,019      branches            #  161.602 M/sec
> 
>   265,764,702      instructions        #    0.51  insn per cycle
>    29,828,015      branches            #  189.677 M/sec
> 
> [..]
> 
> T-test on all 10 runs
> =====================
> 
> Difference at 95.0% confidence
>    -1.03219e+08 +/- 55308.7
>    -27.9705% +/- 0.0149878%
>    (Student's t, pooled s = 58864.4)
> 
> The old API will stay around until the remaining users switch
> to the new one.  After that we'll also remove zsmalloc per-CPU
> buffer and CPU hotplug handling.
> 
> The split of map(RO) and map(WO) into read_{begin/end}/write is
> suggested by Yosry Ahmed.
> 
> Suggested-by: Yosry Ahmed <yosry.ahmed@linux.dev>
> Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>

I see my Reviewed-by was removed at some point. Did something change in
this patch (do I need to review it again) or was it just lost?


  reply	other threads:[~2025-02-27  5:48 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-27  4:35 [PATCH v9 00/19] zsmalloc/zram: there be preemption Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 01/19] zram: sleepable entry locking Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 02/19] zram: permit preemption with active compression stream Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 03/19] zram: remove unused crypto include Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 04/19] zram: remove max_comp_streams device attr Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 05/19] zram: remove second stage of handle allocation Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 06/19] zram: no-warn about zsmalloc " Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 07/19] zram: remove writestall zram_stats member Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 08/19] zram: limit max recompress prio to num_active_comps Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 09/19] zram: filter out recomp targets based on priority Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 10/19] zram: rework recompression loop Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 11/19] zram: move post-processing target allocation Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 12/19] zsmalloc: rename pool lock Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 13/19] zsmalloc: make zspage lock preemptible Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 14/19] zsmalloc: introduce new object mapping API Sergey Senozhatsky
2025-02-27  5:48   ` Yosry Ahmed [this message]
2025-02-27  6:54     ` Sergey Senozhatsky
2025-02-27  7:09       ` Yosry Ahmed
2025-03-01  7:22   ` Herbert Xu
2025-03-03  2:42     ` Sergey Senozhatsky
2025-03-03  2:51       ` Herbert Xu
2025-02-27  4:35 ` [PATCH v9 15/19] zram: switch to new zsmalloc " Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 16/19] zram: permit reclaim in zstd custom allocator Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 17/19] zram: do not leak page on recompress_store error path Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 18/19] zram: do not leak page on writeback_store " Sergey Senozhatsky
2025-02-27  4:35 ` [PATCH v9 19/19] zram: add might_sleep to zcomp API Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z7_8koiBRTfQ81bb@google.com \
    --to=yosry.ahmed@linux.dev \
    --cc=akpm@linux-foundation.org \
    --cc=bigeasy@linutronix.de \
    --cc=hdanton@sina.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=ryncsn@gmail.com \
    --cc=senozhatsky@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.