From: Yosry Ahmed <yosry.ahmed@linux.dev>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
Eric Biggers <ebiggers@kernel.org>,
Linux Crypto Mailing List <linux-crypto@vger.kernel.org>,
linux-mm@kvack.org
Subject: Re: [RFC PATCH 7/7] mm: zswap: Use acomp virtual address interface
Date: Wed, 5 Mar 2025 06:17:01 +0000 [thread overview]
Message-ID: <Z8fsXZNgEbVkZrJP@google.com> (raw)
In-Reply-To: <Z8fHyvF3GNKeVw0k@gondor.apana.org.au>
On Wed, Mar 05, 2025 at 11:40:58AM +0800, Herbert Xu wrote:
> On Tue, Mar 04, 2025 at 10:19:51PM +0900, Sergey Senozhatsky wrote:
> >
> > One thing to notice is that these functions don't actually map/unmap.
>
> Fair enough. I'll rename them to pin and unpin.
>
> > And the handling is spread out over different parts of the stack,
> > sg list is set in zsmalloc, but the actual zsmalloc map local page is
> > done in crypto, and then zswap does memcpy() to write to object and so
> > on. The "new" zsmalloc map API, which we plan on landing soon, handles
> > most of the things within zsmalloc. Would it be possible to do something
> > similar with the sg API?
>
> If by mapping you're referring to kmap then it's only being done
> in the Crypto API. zswap is not doing any mappings with my patch,
> even the copy to SG list operation after compression calls Crypto
> API code (the newly introduced memcpy_to_sglist from crypto/scatterwalk.c.
>
> The data is only ever read/written by Crypto API code so it
> would seem to be more natural to map it when and where the
> data is needed.
>
> This also eliminates unnecessary mappings when the data is passed
> to hardware offload, since there is no point in mapping the data
> into CPU address space at all if it's only going to be accessed
> with DMA.
>
> > > @@ -972,9 +973,9 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
> > > if (alloc_ret)
> > > goto unlock;
> > >
> > > - buf = zpool_map_handle(zpool, handle, ZPOOL_MM_WO);
> > > - memcpy(buf, dst, dlen);
> > > - zpool_unmap_handle(zpool, handle);
> > > + zpool_map_sg(zpool, handle, ZPOOL_MM_WO, sg);
> > > + memcpy_to_sglist(sg, 0, dst, dlen);
> > > + zpool_unmap_sg(zpool, handle);
> >
> > You can give zsmalloc a handle and a compressed buffer (u8) and
> > zsmalloc should be able to figure it out. WO direction map()
> > seems, a bit, like an extra step.
>
> Sure, this part can be dropped since your patch-set already provides
> an interface for writing to the buffer. Here is the same patch rebased
> on top of your read_begin series.
Actually I just sent out a series that I had sitting in my local tree
for a bit to complete Sergey's work and completely remove the map/unmap
APIs:
https://lore.kernel.org/lkml/20250305061134.4105762-1-yosry.ahmed@linux.dev/.
The intention was always to have a single API for read/write or
map/unmap in zsmalloc. Sergey's series was already too long so I said I
will send a follow up one to switch zswap and complete the switch.
I am not objecting to switch the API to use SG lists if we intend to
switch multiple compression algorithms to use them and will completely
switch to using SG-based APIs in both zswap and zram. But I don't want
us to have two separate interfaces please.
Also, please take a look at patch 2 in this series for another reason, I
want to make sure if your virtual address series can be used to remove
the !virt_addr_valid() memcpy() case completely.
Thanks.
>
> commit d5891a27df516192e381047b4c79de4e9f7df4cd
> Author: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Thu Feb 27 18:10:32 2025 +0800
>
> mm: zswap: Give non-linear objects to Crypto API
>
> Instead of copying non-linear objects into a buffer, use the
> scatterlist to give them directly to the Crypto API.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> diff --git a/include/linux/zpool.h b/include/linux/zpool.h
> index a67d62b79698..795f8e3ad964 100644
> --- a/include/linux/zpool.h
> +++ b/include/linux/zpool.h
> @@ -12,27 +12,9 @@
> #ifndef _ZPOOL_H_
> #define _ZPOOL_H_
>
> +struct scatterlist;
> struct zpool;
>
> -/*
> - * Control how a handle is mapped. It will be ignored if the
> - * implementation does not support it. Its use is optional.
> - * Note that this does not refer to memory protection, it
> - * refers to how the memory will be copied in/out if copying
> - * is necessary during mapping; read-write is the safest as
> - * it copies the existing memory in on map, and copies the
> - * changed memory back out on unmap. Write-only does not copy
> - * in the memory and should only be used for initialization.
> - * If in doubt, use ZPOOL_MM_DEFAULT which is read-write.
> - */
> -enum zpool_mapmode {
> - ZPOOL_MM_RW, /* normal read-write mapping */
> - ZPOOL_MM_RO, /* read-only (no copy-out at unmap time) */
> - ZPOOL_MM_WO, /* write-only (no copy-in at map time) */
> -
> - ZPOOL_MM_DEFAULT = ZPOOL_MM_RW
> -};
> -
> bool zpool_has_pool(char *type);
>
> struct zpool *zpool_create_pool(const char *type, const char *name, gfp_t gfp);
> @@ -48,10 +30,13 @@ int zpool_malloc(struct zpool *pool, size_t size, gfp_t gfp,
>
> void zpool_free(struct zpool *pool, unsigned long handle);
>
> -void *zpool_map_handle(struct zpool *pool, unsigned long handle,
> - enum zpool_mapmode mm);
> +void zpool_pin_handle(struct zpool *pool, unsigned long handle,
> + struct scatterlist *sg);
>
> -void zpool_unmap_handle(struct zpool *pool, unsigned long handle);
> +void zpool_unpin_handle(struct zpool *pool, unsigned long handle);
> +
> +void zpool_write_handle(struct zpool *pool, unsigned long handle,
> + void *handle_mem, size_t mem_len);
>
> u64 zpool_get_total_pages(struct zpool *pool);
>
> @@ -64,9 +49,9 @@ u64 zpool_get_total_pages(struct zpool *pool);
> * @destroy: destroy a pool.
> * @malloc: allocate mem from a pool.
> * @free: free mem from a pool.
> - * @sleep_mapped: whether zpool driver can sleep during map.
> - * @map: map a handle.
> - * @unmap: unmap a handle.
> + * @pin: pin a handle and write it into a two-entry SG list.
> + * @unpin: unpin a handle.
> + * @write: write buffer to a handle.
> * @total_size: get total size of a pool.
> *
> * This is created by a zpool implementation and registered
> @@ -86,10 +71,10 @@ struct zpool_driver {
> unsigned long *handle);
> void (*free)(void *pool, unsigned long handle);
>
> - bool sleep_mapped;
> - void *(*map)(void *pool, unsigned long handle,
> - enum zpool_mapmode mm);
> - void (*unmap)(void *pool, unsigned long handle);
> + void (*pin)(void *pool, unsigned long handle, struct scatterlist *sg);
> + void (*unpin)(void *pool, unsigned long handle);
> + void (*write)(void *pool, unsigned long handle,
> + void *handle_mem, size_t mem_len);
>
> u64 (*total_pages)(void *pool);
> };
> @@ -98,6 +83,4 @@ void zpool_register_driver(struct zpool_driver *driver);
>
> int zpool_unregister_driver(struct zpool_driver *driver);
>
> -bool zpool_can_sleep_mapped(struct zpool *pool);
> -
> #endif
> diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> index 7d70983cf398..c26baf9fb331 100644
> --- a/include/linux/zsmalloc.h
> +++ b/include/linux/zsmalloc.h
> @@ -16,23 +16,6 @@
>
> #include <linux/types.h>
>
> -/*
> - * zsmalloc mapping modes
> - *
> - * NOTE: These only make a difference when a mapped object spans pages.
> - */
> -enum zs_mapmode {
> - ZS_MM_RW, /* normal read-write mapping */
> - ZS_MM_RO, /* read-only (no copy-out at unmap time) */
> - ZS_MM_WO /* write-only (no copy-in at map time) */
> - /*
> - * NOTE: ZS_MM_WO should only be used for initializing new
> - * (uninitialized) allocations. Partial writes to already
> - * initialized allocations should use ZS_MM_RW to preserve the
> - * existing data.
> - */
> -};
> -
> struct zs_pool_stats {
> /* How many pages were migrated (freed) */
> atomic_long_t pages_compacted;
> @@ -48,10 +31,6 @@ void zs_free(struct zs_pool *pool, unsigned long obj);
>
> size_t zs_huge_class_size(struct zs_pool *pool);
>
> -void *zs_map_object(struct zs_pool *pool, unsigned long handle,
> - enum zs_mapmode mm);
> -void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
> -
> unsigned long zs_get_total_pages(struct zs_pool *pool);
> unsigned long zs_compact(struct zs_pool *pool);
>
> diff --git a/mm/z3fold.c b/mm/z3fold.c
> index 379d24b4fef9..f0dc45cf9138 100644
> --- a/mm/z3fold.c
> +++ b/mm/z3fold.c
> @@ -36,6 +36,7 @@
> #include <linux/percpu.h>
> #include <linux/preempt.h>
> #include <linux/workqueue.h>
> +#include <linux/scatterlist.h>
> #include <linux/slab.h>
> #include <linux/spinlock.h>
> #include <linux/zpool.h>
> @@ -1392,16 +1393,28 @@ static void z3fold_zpool_free(void *pool, unsigned long handle)
> z3fold_free(pool, handle);
> }
>
> -static void *z3fold_zpool_map(void *pool, unsigned long handle,
> - enum zpool_mapmode mm)
> +static void z3fold_zpool_pin(void *pool, unsigned long handle,
> + struct scatterlist sg[2])
> {
> - return z3fold_map(pool, handle);
> + void *buf = z3fold_map(pool, handle);
> +
> + sg_init_one(sg, buf, PAGE_SIZE - offset_in_page(buf));
> }
> -static void z3fold_zpool_unmap(void *pool, unsigned long handle)
> +
> +static void z3fold_zpool_unpin(void *pool, unsigned long handle)
> {
> z3fold_unmap(pool, handle);
> }
>
> +static void z3fold_zpool_write(void *pool, unsigned long handle,
> + void *handle_mem, size_t mem_len)
> +{
> + void *buf = z3fold_map(pool, handle);
> +
> + memcpy(buf, handle_mem, mem_len);
> + z3fold_unmap(pool, handle);
> +}
> +
> static u64 z3fold_zpool_total_pages(void *pool)
> {
> return z3fold_get_pool_pages(pool);
> @@ -1409,14 +1422,14 @@ static u64 z3fold_zpool_total_pages(void *pool)
>
> static struct zpool_driver z3fold_zpool_driver = {
> .type = "z3fold",
> - .sleep_mapped = true,
> .owner = THIS_MODULE,
> .create = z3fold_zpool_create,
> .destroy = z3fold_zpool_destroy,
> .malloc = z3fold_zpool_malloc,
> .free = z3fold_zpool_free,
> - .map = z3fold_zpool_map,
> - .unmap = z3fold_zpool_unmap,
> + .pin = z3fold_zpool_pin,
> + .unpin = z3fold_zpool_unpin,
> + .write = z3fold_zpool_write,
> .total_pages = z3fold_zpool_total_pages,
> };
>
> diff --git a/mm/zbud.c b/mm/zbud.c
> index e9836fff9438..21c0a9c26abe 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -36,10 +36,9 @@
> *
> * The zbud API differs from that of conventional allocators in that the
> * allocation function, zbud_alloc(), returns an opaque handle to the user,
> - * not a dereferenceable pointer. The user must map the handle using
> - * zbud_map() in order to get a usable pointer by which to access the
> - * allocation data and unmap the handle with zbud_unmap() when operations
> - * on the allocation data are complete.
> + * not a dereferenceable pointer. The user must pin the handle using
> + * zbud_pin() in order to access the allocation data and unpin the handle
> + * with zbud_unpin() when operations on the allocation data are complete.
> */
>
> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> @@ -49,6 +48,7 @@
> #include <linux/mm.h>
> #include <linux/module.h>
> #include <linux/preempt.h>
> +#include <linux/scatterlist.h>
> #include <linux/slab.h>
> #include <linux/spinlock.h>
> #include <linux/zpool.h>
> @@ -339,28 +339,30 @@ static void zbud_free(struct zbud_pool *pool, unsigned long handle)
> }
>
> /**
> - * zbud_map() - maps the allocation associated with the given handle
> + * zbud_pin() - pins the allocation associated with the given handle
> * @pool: pool in which the allocation resides
> - * @handle: handle associated with the allocation to be mapped
> + * @handle: handle associated with the allocation to be pinned
> + * @sg: 2-entry scatter list to store the memory pointers
> *
> - * While trivial for zbud, the mapping functions for others allocators
> + * While trivial for zbud, the pinning functions for others allocators
> * implementing this allocation API could have more complex information encoded
> * in the handle and could create temporary mappings to make the data
> * accessible to the user.
> - *
> - * Returns: a pointer to the mapped allocation
> */
> -static void *zbud_map(struct zbud_pool *pool, unsigned long handle)
> +static void zbud_pin(struct zbud_pool *pool, unsigned long handle,
> + struct scatterlist sg[2])
> {
> - return (void *)(handle);
> + void *buf = (void *)handle;
> +
> + sg_init_one(sg, buf, PAGE_SIZE - offset_in_page(buf));
> }
>
> /**
> - * zbud_unmap() - maps the allocation associated with the given handle
> + * zbud_unpin() - unpins the allocation associated with the given handle
> * @pool: pool in which the allocation resides
> - * @handle: handle associated with the allocation to be unmapped
> + * @handle: handle associated with the allocation to be unpinned
> */
> -static void zbud_unmap(struct zbud_pool *pool, unsigned long handle)
> +static void zbud_unpin(struct zbud_pool *pool, unsigned long handle)
> {
> }
>
> @@ -400,14 +402,20 @@ static void zbud_zpool_free(void *pool, unsigned long handle)
> zbud_free(pool, handle);
> }
>
> -static void *zbud_zpool_map(void *pool, unsigned long handle,
> - enum zpool_mapmode mm)
> +static void zbud_zpool_pin(void *pool, unsigned long handle,
> + struct scatterlist sg[2])
> {
> - return zbud_map(pool, handle);
> + zbud_pin(pool, handle, sg);
> }
> -static void zbud_zpool_unmap(void *pool, unsigned long handle)
> +static void zbud_zpool_unpin(void *pool, unsigned long handle)
> {
> - zbud_unmap(pool, handle);
> + zbud_unpin(pool, handle);
> +}
> +
> +static void zbud_zpool_write(void *pool, unsigned long handle,
> + void *handle_mem, size_t mem_len)
> +{
> + memcpy((void *)handle, handle_mem, mem_len);
> }
>
> static u64 zbud_zpool_total_pages(void *pool)
> @@ -417,14 +425,14 @@ static u64 zbud_zpool_total_pages(void *pool)
>
> static struct zpool_driver zbud_zpool_driver = {
> .type = "zbud",
> - .sleep_mapped = true,
> .owner = THIS_MODULE,
> .create = zbud_zpool_create,
> .destroy = zbud_zpool_destroy,
> .malloc = zbud_zpool_malloc,
> .free = zbud_zpool_free,
> - .map = zbud_zpool_map,
> - .unmap = zbud_zpool_unmap,
> + .pin = zbud_zpool_pin,
> + .unpin = zbud_zpool_unpin,
> + .write = zbud_zpool_write,
> .total_pages = zbud_zpool_total_pages,
> };
>
> diff --git a/mm/zpool.c b/mm/zpool.c
> index b9fda1fa857d..304639959b90 100644
> --- a/mm/zpool.c
> +++ b/mm/zpool.c
> @@ -13,6 +13,7 @@
> #include <linux/list.h>
> #include <linux/types.h>
> #include <linux/mm.h>
> +#include <linux/scatterlist.h>
> #include <linux/slab.h>
> #include <linux/spinlock.h>
> #include <linux/module.h>
> @@ -278,46 +279,53 @@ void zpool_free(struct zpool *zpool, unsigned long handle)
> }
>
> /**
> - * zpool_map_handle() - Map a previously allocated handle into memory
> + * zpool_pin_handle() - Pin a previously allocated handle into memory
> * @zpool: The zpool that the handle was allocated from
> - * @handle: The handle to map
> - * @mapmode: How the memory should be mapped
> + * @handle: The handle to pin
> + * @sg: 2-entry scatterlist to store pointers to the memmory
> *
> - * This maps a previously allocated handle into memory. The @mapmode
> - * param indicates to the implementation how the memory will be
> - * used, i.e. read-only, write-only, read-write. If the
> - * implementation does not support it, the memory will be treated
> - * as read-write.
> + * This pins a previously allocated handle into memory.
> *
> * This may hold locks, disable interrupts, and/or preemption,
> - * and the zpool_unmap_handle() must be called to undo those
> - * actions. The code that uses the mapped handle should complete
> - * its operations on the mapped handle memory quickly and unmap
> - * as soon as possible. As the implementation may use per-cpu
> - * data, multiple handles should not be mapped concurrently on
> - * any cpu.
> - *
> - * Returns: A pointer to the handle's mapped memory area.
> + * and the zpool_unpin_handle() must be called to undo those
> + * actions. The code that uses the pinned handle should complete
> + * its operations on the pinned handle memory quickly and unpin
> + * as soon as possible.
> */
> -void *zpool_map_handle(struct zpool *zpool, unsigned long handle,
> - enum zpool_mapmode mapmode)
> +void zpool_pin_handle(struct zpool *zpool, unsigned long handle,
> + struct scatterlist *sg)
> {
> - return zpool->driver->map(zpool->pool, handle, mapmode);
> + zpool->driver->pin(zpool->pool, handle, sg);
> }
>
> /**
> - * zpool_unmap_handle() - Unmap a previously mapped handle
> + * zpool_unpin_handle() - Unpin a previously pinned handle
> * @zpool: The zpool that the handle was allocated from
> - * @handle: The handle to unmap
> + * @handle: The handle to unpin
> *
> - * This unmaps a previously mapped handle. Any locks or other
> - * actions that the implementation took in zpool_map_handle()
> + * This unpins a previously pinned handle. Any locks or other
> + * actions that the implementation took in zpool_pin_handle()
> * will be undone here. The memory area returned from
> - * zpool_map_handle() should no longer be used after this.
> + * zpool_pin_handle() should no longer be used after this.
> */
> -void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
> +void zpool_unpin_handle(struct zpool *zpool, unsigned long handle)
> {
> - zpool->driver->unmap(zpool->pool, handle);
> + zpool->driver->unpin(zpool->pool, handle);
> +}
> +
> +/**
> + * zpool_write_handle() - Write to a previously allocated handle
> + * @zpool: The zpool that the handle was allocated from
> + * @handle: The handle to write
> + * @handle_mem: Data to write from
> + * @mem_len: Length of data to be written
> + *
> + * This writes data to a previously allocated handle.
> + */
> +void zpool_write_handle(struct zpool *zpool, unsigned long handle,
> + void *handle_mem, size_t mem_len)
> +{
> + zpool->driver->write(zpool->pool, handle, handle_mem, mem_len);
> }
>
> /**
> @@ -333,23 +341,5 @@ u64 zpool_get_total_pages(struct zpool *zpool)
> return zpool->driver->total_pages(zpool->pool);
> }
>
> -/**
> - * zpool_can_sleep_mapped - Test if zpool can sleep when do mapped.
> - * @zpool: The zpool to test
> - *
> - * Some allocators enter non-preemptible context in ->map() callback (e.g.
> - * disable pagefaults) and exit that context in ->unmap(), which limits what
> - * we can do with the mapped object. For instance, we cannot wait for
> - * asynchronous crypto API to decompress such an object or take mutexes
> - * since those will call into the scheduler. This function tells us whether
> - * we use such an allocator.
> - *
> - * Returns: true if zpool can sleep; false otherwise.
> - */
> -bool zpool_can_sleep_mapped(struct zpool *zpool)
> -{
> - return zpool->driver->sleep_mapped;
> -}
> -
> MODULE_AUTHOR("Dan Streetman <ddstreet@ieee.org>");
> MODULE_DESCRIPTION("Common API for compressed memory storage");
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 63c99db71dc1..934b3be467e6 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -23,6 +23,7 @@
> * zspage->lock
> */
>
> +#include <crypto/scatterwalk.h>
> #include <linux/module.h>
> #include <linux/kernel.h>
> #include <linux/sched.h>
> @@ -49,6 +50,7 @@
> #include <linux/pagemap.h>
> #include <linux/fs.h>
> #include <linux/local_lock.h>
> +#include <linux/scatterlist.h>
> #include "zpdesc.h"
>
> #define ZSPAGE_MAGIC 0x58
> @@ -281,13 +283,6 @@ struct zspage {
> struct zspage_lock zsl;
> };
>
> -struct mapping_area {
> - local_lock_t lock;
> - char *vm_buf; /* copy buffer for objects that span pages */
> - char *vm_addr; /* address of kmap_local_page()'ed pages */
> - enum zs_mapmode vm_mm; /* mapping mode */
> -};
> -
> static void zspage_lock_init(struct zspage *zspage)
> {
> static struct lock_class_key __key;
> @@ -453,6 +448,10 @@ static void record_obj(unsigned long handle, unsigned long obj)
>
> #ifdef CONFIG_ZPOOL
>
> +static int zs_pin_object(struct zs_pool *pool, unsigned long handle,
> + struct scatterlist sg[2]);
> +static void zs_unpin_object(struct zs_pool *pool, unsigned long handle);
> +
> static void *zs_zpool_create(const char *name, gfp_t gfp)
> {
> /*
> @@ -482,29 +481,21 @@ static void zs_zpool_free(void *pool, unsigned long handle)
> zs_free(pool, handle);
> }
>
> -static void *zs_zpool_map(void *pool, unsigned long handle,
> - enum zpool_mapmode mm)
> +static void zs_zpool_pin(void *pool, unsigned long handle,
> + struct scatterlist sg[2])
> {
> - enum zs_mapmode zs_mm;
> -
> - switch (mm) {
> - case ZPOOL_MM_RO:
> - zs_mm = ZS_MM_RO;
> - break;
> - case ZPOOL_MM_WO:
> - zs_mm = ZS_MM_WO;
> - break;
> - case ZPOOL_MM_RW:
> - default:
> - zs_mm = ZS_MM_RW;
> - break;
> - }
> -
> - return zs_map_object(pool, handle, zs_mm);
> + zs_pin_object(pool, handle, sg);
> }
> -static void zs_zpool_unmap(void *pool, unsigned long handle)
> +
> +static void zs_zpool_unpin(void *pool, unsigned long handle)
> {
> - zs_unmap_object(pool, handle);
> + zs_unpin_object(pool, handle);
> +}
> +
> +static void zs_zpool_write(void *pool, unsigned long handle,
> + void *handle_mem, size_t mem_len)
> +{
> + zs_obj_write(pool, handle, handle_mem, mem_len);
> }
>
> static u64 zs_zpool_total_pages(void *pool)
> @@ -520,19 +511,15 @@ static struct zpool_driver zs_zpool_driver = {
> .malloc_support_movable = true,
> .malloc = zs_zpool_malloc,
> .free = zs_zpool_free,
> - .map = zs_zpool_map,
> - .unmap = zs_zpool_unmap,
> + .pin = zs_zpool_pin,
> + .unpin = zs_zpool_unpin,
> + .write = zs_zpool_write,
> .total_pages = zs_zpool_total_pages,
> };
>
> MODULE_ALIAS("zpool-zsmalloc");
> #endif /* CONFIG_ZPOOL */
>
> -/* per-cpu VM mapping areas for zspage accesses that cross page boundaries */
> -static DEFINE_PER_CPU(struct mapping_area, zs_map_area) = {
> - .lock = INIT_LOCAL_LOCK(lock),
> -};
> -
> static inline bool __maybe_unused is_first_zpdesc(struct zpdesc *zpdesc)
> {
> return PagePrivate(zpdesc_page(zpdesc));
> @@ -1117,93 +1104,6 @@ static struct zspage *find_get_zspage(struct size_class *class)
> return zspage;
> }
>
> -static inline int __zs_cpu_up(struct mapping_area *area)
> -{
> - /*
> - * Make sure we don't leak memory if a cpu UP notification
> - * and zs_init() race and both call zs_cpu_up() on the same cpu
> - */
> - if (area->vm_buf)
> - return 0;
> - area->vm_buf = kmalloc(ZS_MAX_ALLOC_SIZE, GFP_KERNEL);
> - if (!area->vm_buf)
> - return -ENOMEM;
> - return 0;
> -}
> -
> -static inline void __zs_cpu_down(struct mapping_area *area)
> -{
> - kfree(area->vm_buf);
> - area->vm_buf = NULL;
> -}
> -
> -static void *__zs_map_object(struct mapping_area *area,
> - struct zpdesc *zpdescs[2], int off, int size)
> -{
> - size_t sizes[2];
> - char *buf = area->vm_buf;
> -
> - /* disable page faults to match kmap_local_page() return conditions */
> - pagefault_disable();
> -
> - /* no read fastpath */
> - if (area->vm_mm == ZS_MM_WO)
> - goto out;
> -
> - sizes[0] = PAGE_SIZE - off;
> - sizes[1] = size - sizes[0];
> -
> - /* copy object to per-cpu buffer */
> - memcpy_from_page(buf, zpdesc_page(zpdescs[0]), off, sizes[0]);
> - memcpy_from_page(buf + sizes[0], zpdesc_page(zpdescs[1]), 0, sizes[1]);
> -out:
> - return area->vm_buf;
> -}
> -
> -static void __zs_unmap_object(struct mapping_area *area,
> - struct zpdesc *zpdescs[2], int off, int size)
> -{
> - size_t sizes[2];
> - char *buf;
> -
> - /* no write fastpath */
> - if (area->vm_mm == ZS_MM_RO)
> - goto out;
> -
> - buf = area->vm_buf;
> - buf = buf + ZS_HANDLE_SIZE;
> - size -= ZS_HANDLE_SIZE;
> - off += ZS_HANDLE_SIZE;
> -
> - sizes[0] = PAGE_SIZE - off;
> - sizes[1] = size - sizes[0];
> -
> - /* copy per-cpu buffer to object */
> - memcpy_to_page(zpdesc_page(zpdescs[0]), off, buf, sizes[0]);
> - memcpy_to_page(zpdesc_page(zpdescs[1]), 0, buf + sizes[0], sizes[1]);
> -
> -out:
> - /* enable page faults to match kunmap_local() return conditions */
> - pagefault_enable();
> -}
> -
> -static int zs_cpu_prepare(unsigned int cpu)
> -{
> - struct mapping_area *area;
> -
> - area = &per_cpu(zs_map_area, cpu);
> - return __zs_cpu_up(area);
> -}
> -
> -static int zs_cpu_dead(unsigned int cpu)
> -{
> - struct mapping_area *area;
> -
> - area = &per_cpu(zs_map_area, cpu);
> - __zs_cpu_down(area);
> - return 0;
> -}
> -
> static bool can_merge(struct size_class *prev, int pages_per_zspage,
> int objs_per_zspage)
> {
> @@ -1251,126 +1151,15 @@ unsigned long zs_get_total_pages(struct zs_pool *pool)
> }
> EXPORT_SYMBOL_GPL(zs_get_total_pages);
>
> -/**
> - * zs_map_object - get address of allocated object from handle.
> - * @pool: pool from which the object was allocated
> - * @handle: handle returned from zs_malloc
> - * @mm: mapping mode to use
> - *
> - * Before using an object allocated from zs_malloc, it must be mapped using
> - * this function. When done with the object, it must be unmapped using
> - * zs_unmap_object.
> - *
> - * Only one object can be mapped per cpu at a time. There is no protection
> - * against nested mappings.
> - *
> - * This function returns with preemption and page faults disabled.
> - */
> -void *zs_map_object(struct zs_pool *pool, unsigned long handle,
> - enum zs_mapmode mm)
> -{
> - struct zspage *zspage;
> - struct zpdesc *zpdesc;
> - unsigned long obj, off;
> - unsigned int obj_idx;
> -
> - struct size_class *class;
> - struct mapping_area *area;
> - struct zpdesc *zpdescs[2];
> - void *ret;
> -
> - /*
> - * Because we use per-cpu mapping areas shared among the
> - * pools/users, we can't allow mapping in interrupt context
> - * because it can corrupt another users mappings.
> - */
> - BUG_ON(in_interrupt());
> -
> - /* It guarantees it can get zspage from handle safely */
> - read_lock(&pool->lock);
> - obj = handle_to_obj(handle);
> - obj_to_location(obj, &zpdesc, &obj_idx);
> - zspage = get_zspage(zpdesc);
> -
> - /*
> - * migration cannot move any zpages in this zspage. Here, class->lock
> - * is too heavy since callers would take some time until they calls
> - * zs_unmap_object API so delegate the locking from class to zspage
> - * which is smaller granularity.
> - */
> - zspage_read_lock(zspage);
> - read_unlock(&pool->lock);
> -
> - class = zspage_class(pool, zspage);
> - off = offset_in_page(class->size * obj_idx);
> -
> - local_lock(&zs_map_area.lock);
> - area = this_cpu_ptr(&zs_map_area);
> - area->vm_mm = mm;
> - if (off + class->size <= PAGE_SIZE) {
> - /* this object is contained entirely within a page */
> - area->vm_addr = kmap_local_zpdesc(zpdesc);
> - ret = area->vm_addr + off;
> - goto out;
> - }
> -
> - /* this object spans two pages */
> - zpdescs[0] = zpdesc;
> - zpdescs[1] = get_next_zpdesc(zpdesc);
> - BUG_ON(!zpdescs[1]);
> -
> - ret = __zs_map_object(area, zpdescs, off, class->size);
> -out:
> - if (likely(!ZsHugePage(zspage)))
> - ret += ZS_HANDLE_SIZE;
> -
> - return ret;
> -}
> -EXPORT_SYMBOL_GPL(zs_map_object);
> -
> -void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
> -{
> - struct zspage *zspage;
> - struct zpdesc *zpdesc;
> - unsigned long obj, off;
> - unsigned int obj_idx;
> -
> - struct size_class *class;
> - struct mapping_area *area;
> -
> - obj = handle_to_obj(handle);
> - obj_to_location(obj, &zpdesc, &obj_idx);
> - zspage = get_zspage(zpdesc);
> - class = zspage_class(pool, zspage);
> - off = offset_in_page(class->size * obj_idx);
> -
> - area = this_cpu_ptr(&zs_map_area);
> - if (off + class->size <= PAGE_SIZE)
> - kunmap_local(area->vm_addr);
> - else {
> - struct zpdesc *zpdescs[2];
> -
> - zpdescs[0] = zpdesc;
> - zpdescs[1] = get_next_zpdesc(zpdesc);
> - BUG_ON(!zpdescs[1]);
> -
> - __zs_unmap_object(area, zpdescs, off, class->size);
> - }
> - local_unlock(&zs_map_area.lock);
> -
> - zspage_read_unlock(zspage);
> -}
> -EXPORT_SYMBOL_GPL(zs_unmap_object);
> -
> -void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
> - void *local_copy)
> +static int zs_pin_object(struct zs_pool *pool, unsigned long handle,
> + struct scatterlist sg[2])
> {
> + int handle_size = ZS_HANDLE_SIZE;
> struct zspage *zspage;
> struct zpdesc *zpdesc;
> unsigned long obj, off;
> unsigned int obj_idx;
> struct size_class *class;
> - void *addr;
>
> /* Guarantee we can get zspage from handle safely */
> read_lock(&pool->lock);
> @@ -1385,33 +1174,56 @@ void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
> class = zspage_class(pool, zspage);
> off = offset_in_page(class->size * obj_idx);
>
> + if (ZsHugePage(zspage))
> + handle_size = 0;
> +
> if (off + class->size <= PAGE_SIZE) {
> /* this object is contained entirely within a page */
> - addr = kmap_local_zpdesc(zpdesc);
> - addr += off;
> + sg_init_table(sg, 1);
> + sg_set_page(sg, zpdesc_page(zpdesc),
> + class->size - handle_size, off + handle_size);
> } else {
> size_t sizes[2];
>
> /* this object spans two pages */
> sizes[0] = PAGE_SIZE - off;
> sizes[1] = class->size - sizes[0];
> - addr = local_copy;
>
> - memcpy_from_page(addr, zpdesc_page(zpdesc),
> - off, sizes[0]);
> + sg_init_table(sg, 2);
> + sg_set_page(sg, zpdesc_page(zpdesc), sizes[0] - handle_size,
> + off + handle_size);
> zpdesc = get_next_zpdesc(zpdesc);
> - memcpy_from_page(addr + sizes[0],
> - zpdesc_page(zpdesc),
> - 0, sizes[1]);
> + sg_set_page(&sg[1], zpdesc_page(zpdesc), sizes[1], 0);
> }
>
> - if (!ZsHugePage(zspage))
> - addr += ZS_HANDLE_SIZE;
> + return class->size - handle_size;
> +}
> +
> +void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
> + void *local_copy)
> +{
> + struct scatterlist sg[2];
> + void *addr;
> + int len;
> +
> + len = zs_pin_object(pool, handle, sg);
> + if (sg_is_last(sg)) {
> + addr = kmap_local_page(sg_page(sg));
> + addr += sg[0].offset;;
> + } else {
> + addr = local_copy;
> + memcpy_from_sglist(addr, sg, 0, len);
> + }
>
> return addr;
> }
> EXPORT_SYMBOL_GPL(zs_obj_read_begin);
>
> +static void zs_unpin_object(struct zs_pool *pool, unsigned long handle)
> +{
> + zs_obj_read_end(pool, handle, NULL);
> +}
> +
> void zs_obj_read_end(struct zs_pool *pool, unsigned long handle,
> void *handle_mem)
> {
> @@ -1427,7 +1239,7 @@ void zs_obj_read_end(struct zs_pool *pool, unsigned long handle,
> class = zspage_class(pool, zspage);
> off = offset_in_page(class->size * obj_idx);
>
> - if (off + class->size <= PAGE_SIZE) {
> + if (handle_mem && off + class->size <= PAGE_SIZE) {
> if (!ZsHugePage(zspage))
> off += ZS_HANDLE_SIZE;
> handle_mem -= off;
> @@ -1441,49 +1253,11 @@ EXPORT_SYMBOL_GPL(zs_obj_read_end);
> void zs_obj_write(struct zs_pool *pool, unsigned long handle,
> void *handle_mem, size_t mem_len)
> {
> - struct zspage *zspage;
> - struct zpdesc *zpdesc;
> - unsigned long obj, off;
> - unsigned int obj_idx;
> - struct size_class *class;
> + struct scatterlist sg[2];
>
> - /* Guarantee we can get zspage from handle safely */
> - read_lock(&pool->lock);
> - obj = handle_to_obj(handle);
> - obj_to_location(obj, &zpdesc, &obj_idx);
> - zspage = get_zspage(zpdesc);
> -
> - /* Make sure migration doesn't move any pages in this zspage */
> - zspage_read_lock(zspage);
> - read_unlock(&pool->lock);
> -
> - class = zspage_class(pool, zspage);
> - off = offset_in_page(class->size * obj_idx);
> -
> - if (off + class->size <= PAGE_SIZE) {
> - /* this object is contained entirely within a page */
> - void *dst = kmap_local_zpdesc(zpdesc);
> -
> - if (!ZsHugePage(zspage))
> - off += ZS_HANDLE_SIZE;
> - memcpy(dst + off, handle_mem, mem_len);
> - kunmap_local(dst);
> - } else {
> - /* this object spans two pages */
> - size_t sizes[2];
> -
> - off += ZS_HANDLE_SIZE;
> - sizes[0] = PAGE_SIZE - off;
> - sizes[1] = mem_len - sizes[0];
> -
> - memcpy_to_page(zpdesc_page(zpdesc), off,
> - handle_mem, sizes[0]);
> - zpdesc = get_next_zpdesc(zpdesc);
> - memcpy_to_page(zpdesc_page(zpdesc), 0,
> - handle_mem + sizes[0], sizes[1]);
> - }
> -
> - zspage_read_unlock(zspage);
> + zs_pin_object(pool, handle, sg);
> + memcpy_to_sglist(sg, 0, handle_mem, mem_len);
> + zs_unpin_object(pool, handle);
> }
> EXPORT_SYMBOL_GPL(zs_obj_write);
>
> @@ -2465,13 +2239,6 @@ EXPORT_SYMBOL_GPL(zs_destroy_pool);
>
> static int __init zs_init(void)
> {
> - int ret;
> -
> - ret = cpuhp_setup_state(CPUHP_MM_ZS_PREPARE, "mm/zsmalloc:prepare",
> - zs_cpu_prepare, zs_cpu_dead);
> - if (ret)
> - goto out;
> -
> #ifdef CONFIG_ZPOOL
> zpool_register_driver(&zs_zpool_driver);
> #endif
> @@ -2479,9 +2246,6 @@ static int __init zs_init(void)
> zs_stat_init();
>
> return 0;
> -
> -out:
> - return ret;
> }
>
> static void __exit zs_exit(void)
> @@ -2489,7 +2253,6 @@ static void __exit zs_exit(void)
> #ifdef CONFIG_ZPOOL
> zpool_unregister_driver(&zs_zpool_driver);
> #endif
> - cpuhp_remove_state(CPUHP_MM_ZS_PREPARE);
>
> zs_stat_exit();
> }
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 6504174fbc6a..74252187d763 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -147,7 +147,6 @@ struct crypto_acomp_ctx {
> struct crypto_wait wait;
> u8 *buffer;
> struct mutex mutex;
> - bool is_sleepable;
> };
>
> /*
> @@ -865,7 +864,6 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
>
> acomp_ctx->buffer = buffer;
> acomp_ctx->acomp = acomp;
> - acomp_ctx->is_sleepable = acomp_is_async(acomp);
> acomp_ctx->req = req;
> mutex_unlock(&acomp_ctx->mutex);
> return 0;
> @@ -930,7 +928,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
> unsigned int dlen = PAGE_SIZE;
> unsigned long handle;
> struct zpool *zpool;
> - char *buf;
> gfp_t gfp;
> u8 *dst;
>
> @@ -972,9 +969,7 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
> if (alloc_ret)
> goto unlock;
>
> - buf = zpool_map_handle(zpool, handle, ZPOOL_MM_WO);
> - memcpy(buf, dst, dlen);
> - zpool_unmap_handle(zpool, handle);
> + zpool_write_handle(zpool, handle, dst, dlen);
>
> entry->handle = handle;
> entry->length = dlen;
> @@ -994,37 +989,19 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
> static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
> {
> struct zpool *zpool = entry->pool->zpool;
> - struct scatterlist input, output;
> struct crypto_acomp_ctx *acomp_ctx;
> - u8 *src;
> + struct scatterlist input[2];
> + struct scatterlist output;
>
> acomp_ctx = acomp_ctx_get_cpu_lock(entry->pool);
> - src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
> - /*
> - * If zpool_map_handle is atomic, we cannot reliably utilize its mapped buffer
> - * to do crypto_acomp_decompress() which might sleep. In such cases, we must
> - * resort to copying the buffer to a temporary one.
> - * Meanwhile, zpool_map_handle() might return a non-linearly mapped buffer,
> - * such as a kmap address of high memory or even ever a vmap address.
> - * However, sg_init_one is only equipped to handle linearly mapped low memory.
> - * In such cases, we also must copy the buffer to a temporary and lowmem one.
> - */
> - if ((acomp_ctx->is_sleepable && !zpool_can_sleep_mapped(zpool)) ||
> - !virt_addr_valid(src)) {
> - memcpy(acomp_ctx->buffer, src, entry->length);
> - src = acomp_ctx->buffer;
> - zpool_unmap_handle(zpool, entry->handle);
> - }
> -
> - sg_init_one(&input, src, entry->length);
> + zpool_pin_handle(zpool, entry->handle, input);
> sg_init_table(&output, 1);
> sg_set_folio(&output, folio, PAGE_SIZE, 0);
> - acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, PAGE_SIZE);
> + acomp_request_set_params(acomp_ctx->req, input, &output, entry->length, PAGE_SIZE);
> BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
> BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
>
> - if (src != acomp_ctx->buffer)
> - zpool_unmap_handle(zpool, entry->handle);
> + zpool_unpin_handle(zpool, entry->handle);
> acomp_ctx_put_unlock(acomp_ctx);
> }
>
>
> Thanks,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
next prev parent reply other threads:[~2025-03-05 6:17 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-27 10:14 [PATCH 0/7] crypto: acomp - Add request chaining and virtual address support Herbert Xu
2025-02-27 10:14 ` [PATCH 1/7] crypto: iaa - Test the correct request flag Herbert Xu
2025-02-27 10:14 ` [PATCH 2/7] crypto: acomp - Remove acomp request flags Herbert Xu
2025-02-27 10:15 ` [PATCH 3/7] crypto: acomp - Add request chaining and virtual addresses Herbert Xu
2025-02-27 18:33 ` Eric Biggers
2025-02-27 10:15 ` [PATCH 4/7] crypto: testmgr - Remove NULL dst acomp tests Herbert Xu
2025-02-27 10:15 ` [PATCH 5/7] crypto: scomp - Remove support for non-trivial SG lists Herbert Xu
2025-02-27 10:15 ` [PATCH 6/7] crypto: scomp - Add chaining and virtual address support Herbert Xu
2025-02-27 10:15 ` [RFC PATCH 7/7] mm: zswap: Use acomp virtual address interface Herbert Xu
2025-02-27 18:11 ` Yosry Ahmed
2025-02-27 18:38 ` Eric Biggers
2025-02-27 21:43 ` Yosry Ahmed
2025-02-28 8:13 ` Herbert Xu
2025-02-28 9:54 ` Herbert Xu
2025-02-28 15:56 ` Yosry Ahmed
2025-03-01 6:36 ` Herbert Xu
2025-03-01 7:03 ` Herbert Xu
2025-03-03 20:17 ` Yosry Ahmed
2025-03-04 3:29 ` Herbert Xu
2025-03-04 4:30 ` Yosry Ahmed
2025-03-04 6:10 ` Herbert Xu
2025-03-04 8:33 ` Sergey Senozhatsky
2025-03-04 8:42 ` Herbert Xu
2025-03-04 8:45 ` Sergey Senozhatsky
2025-03-04 13:19 ` Sergey Senozhatsky
2025-03-04 20:47 ` Yosry Ahmed
2025-03-05 3:45 ` Herbert Xu
2025-03-05 5:58 ` Sergey Senozhatsky
2025-03-05 6:05 ` Herbert Xu
2025-03-05 6:18 ` Yosry Ahmed
2025-03-05 7:41 ` Herbert Xu
2025-03-05 17:07 ` Nhat Pham
2025-03-06 2:48 ` Sergey Senozhatsky
2025-03-05 3:40 ` Herbert Xu
2025-03-05 6:17 ` Yosry Ahmed [this message]
2025-03-05 7:46 ` Herbert Xu
2025-03-05 14:10 ` Herbert Xu
2025-03-05 16:25 ` Yosry Ahmed
2025-03-06 0:40 ` Herbert Xu
2025-03-06 16:58 ` Yosry Ahmed
2025-03-01 7:34 ` Herbert Xu
2025-03-01 7:38 ` Herbert Xu
2025-02-27 18:11 ` [PATCH 0/7] crypto: acomp - Add request chaining and virtual address support Yosry Ahmed
2025-02-28 9:02 ` Herbert Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z8fsXZNgEbVkZrJP@google.com \
--to=yosry.ahmed@linux.dev \
--cc=ebiggers@kernel.org \
--cc=herbert@gondor.apana.org.au \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=senozhatsky@chromium.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox