* [PATCHv1 0/8] zram: introduce multi-handle entries
@ 2024-11-19 7:20 Sergey Senozhatsky
2024-11-19 7:20 ` [PATCHv1 1/8] zram: cond_resched() in writeback loop Sergey Senozhatsky
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-11-19 7:20 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim; +Cc: linux-kernel, Sergey Senozhatsky
ZRAM_HUGE objects are incompressible and each takes a whole
physical page on the zsmalloc side. The zsmalloc pool naturally has
some internal memory fragmentation (within size-classes), so what
we can do for ZRAM_HUGE objects is split them into several
smaller objects (2 at this point) and store those parts individually
in regular size-classes (hence multi-handle entries). This, basically,
lets us use already allocated (but unused) zspage memory for
ZRAM_HUGE objects, instead of unconditionally allocating a 0-order
page for each ZRAM_HUGE object.
v1:
-- reworked ZRAM_SAME patch (added missing slot lock guard for slot
flags operation, added missing .pages_stored bump, factored out
into a separate function)
-- renamed mhandle defines and added mhandle tail len define
-- fixed some typos
Sergey Senozhatsky (8):
zram: cond_resched() in writeback loop
zram: free slot memory early during write
zram: remove entry element member
zram: factor out ZRAM_SAME write
zram: factor out ZRAM_HUGE write
zram: factor out ZRAM_HUGE read
zsmalloc: move ZS_HANDLE_SIZE to zsmalloc header
zram: introduce multi-handle entries
drivers/block/zram/zram_drv.c | 368 ++++++++++++++++++++++------------
drivers/block/zram/zram_drv.h | 12 +-
include/linux/zsmalloc.h | 2 +
mm/zsmalloc.c | 2 -
4 files changed, 249 insertions(+), 135 deletions(-)
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCHv1 1/8] zram: cond_resched() in writeback loop
2024-11-19 7:20 [PATCHv1 0/8] zram: introduce multi-handle entries Sergey Senozhatsky
@ 2024-11-19 7:20 ` Sergey Senozhatsky
2024-11-19 7:20 ` [PATCHv1 2/8] zram: free slot memory early during write Sergey Senozhatsky
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-11-19 7:20 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim; +Cc: linux-kernel, Sergey Senozhatsky
The writeback loop can run for quite a while (depending on
the writeback device performance, the compression algorithm and
the number of entries we write back), so we need to do
cond_resched() there, similarly to what we do in the
recompression loop.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
drivers/block/zram/zram_drv.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 3dee026988dc..882a32d46a75 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -883,6 +883,8 @@ static ssize_t writeback_store(struct device *dev,
next:
zram_slot_unlock(zram, index);
release_pp_slot(zram, pps);
+
+ cond_resched();
}
if (blk_idx)
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCHv1 2/8] zram: free slot memory early during write
2024-11-19 7:20 [PATCHv1 0/8] zram: introduce multi-handle entries Sergey Senozhatsky
2024-11-19 7:20 ` [PATCHv1 1/8] zram: cond_resched() in writeback loop Sergey Senozhatsky
@ 2024-11-19 7:20 ` Sergey Senozhatsky
2024-11-19 7:20 ` [PATCHv1 3/8] zram: remove entry element member Sergey Senozhatsky
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-11-19 7:20 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim; +Cc: linux-kernel, Sergey Senozhatsky
In the current implementation, the entry's previously allocated
memory is released at the very last moment, when new memory
for the new data has already been allocated. This, basically,
temporarily increases memory usage for no good reason. For
example, consider the case when both the old (stale) and the new
entry data are incompressible: such an entry will temporarily
use two physical pages - one for the stale (old) data and one
for the new data. We can release the old memory as soon as we
get a write request for the entry.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
drivers/block/zram/zram_drv.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 882a32d46a75..987d72f2249c 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1640,6 +1640,11 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
unsigned long element = 0;
enum zram_pageflags flags = 0;
+ /* First, free memory allocated to this slot (if any) */
+ zram_slot_lock(zram, index);
+ zram_free_page(zram, index);
+ zram_slot_unlock(zram, index);
+
mem = kmap_local_page(page);
if (page_same_filled(mem, &element)) {
kunmap_local(mem);
@@ -1728,13 +1733,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
zs_unmap_object(zram->mem_pool, handle);
atomic64_add(comp_len, &zram->stats.compr_data_size);
out:
- /*
- * Free memory associated with this sector
- * before overwriting unused sectors.
- */
zram_slot_lock(zram, index);
- zram_free_page(zram, index);
-
if (comp_len == PAGE_SIZE) {
zram_set_flag(zram, index, ZRAM_HUGE);
atomic64_inc(&zram->stats.huge_pages);
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCHv1 3/8] zram: remove entry element member
2024-11-19 7:20 [PATCHv1 0/8] zram: introduce multi-handle entries Sergey Senozhatsky
2024-11-19 7:20 ` [PATCHv1 1/8] zram: cond_resched() in writeback loop Sergey Senozhatsky
2024-11-19 7:20 ` [PATCHv1 2/8] zram: free slot memory early during write Sergey Senozhatsky
@ 2024-11-19 7:20 ` Sergey Senozhatsky
2024-11-19 7:20 ` [PATCHv1 4/8] zram: factor out ZRAM_SAME write Sergey Senozhatsky
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-11-19 7:20 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim; +Cc: linux-kernel, Sergey Senozhatsky
element is in the same anonymous union as handle and hence
holds the same value, which makes the code below sort of
confusing:

handle = zram_get_handle()
if (!handle)
element = zram_get_element()

element doesn't really simplify the code, so let's just
remove it.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
drivers/block/zram/zram_drv.c | 23 +++++------------------
drivers/block/zram/zram_drv.h | 5 +----
2 files changed, 6 insertions(+), 22 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 987d72f2249c..e80b4d15b74b 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -112,17 +112,6 @@ static void zram_clear_flag(struct zram *zram, u32 index,
zram->table[index].flags &= ~BIT(flag);
}
-static inline void zram_set_element(struct zram *zram, u32 index,
- unsigned long element)
-{
- zram->table[index].element = element;
-}
-
-static unsigned long zram_get_element(struct zram *zram, u32 index)
-{
- return zram->table[index].element;
-}
-
static size_t zram_get_obj_size(struct zram *zram, u32 index)
{
return zram->table[index].flags & (BIT(ZRAM_FLAG_SHIFT) - 1);
@@ -873,7 +862,7 @@ static ssize_t writeback_store(struct device *dev,
zram_free_page(zram, index);
zram_set_flag(zram, index, ZRAM_WB);
- zram_set_element(zram, index, blk_idx);
+ zram_set_handle(zram, index, blk_idx);
blk_idx = 0;
atomic64_inc(&zram->stats.pages_stored);
spin_lock(&zram->wb_limit_lock);
@@ -1496,7 +1485,7 @@ static void zram_free_page(struct zram *zram, size_t index)
if (zram_test_flag(zram, index, ZRAM_WB)) {
zram_clear_flag(zram, index, ZRAM_WB);
- free_block_bdev(zram, zram_get_element(zram, index));
+ free_block_bdev(zram, zram_get_handle(zram, index));
goto out;
}
@@ -1540,12 +1529,10 @@ static int zram_read_from_zspool(struct zram *zram, struct page *page,
handle = zram_get_handle(zram, index);
if (!handle || zram_test_flag(zram, index, ZRAM_SAME)) {
- unsigned long value;
void *mem;
- value = handle ? zram_get_element(zram, index) : 0;
mem = kmap_local_page(page);
- zram_fill_page(mem, PAGE_SIZE, value);
+ zram_fill_page(mem, PAGE_SIZE, handle);
kunmap_local(mem);
return 0;
}
@@ -1591,7 +1578,7 @@ static int zram_read_page(struct zram *zram, struct page *page, u32 index,
*/
zram_slot_unlock(zram, index);
- ret = read_from_bdev(zram, page, zram_get_element(zram, index),
+ ret = read_from_bdev(zram, page, zram_get_handle(zram, index),
parent);
}
@@ -1742,7 +1729,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
if (flags) {
zram_set_flag(zram, index, flags);
- zram_set_element(zram, index, element);
+ zram_set_handle(zram, index, element);
} else {
zram_set_handle(zram, index, handle);
zram_set_obj_size(zram, index, comp_len);
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 134be414e210..db78d7c01b9a 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -62,10 +62,7 @@ enum zram_pageflags {
/* Allocated for each disk page */
struct zram_table_entry {
- union {
- unsigned long handle;
- unsigned long element;
- };
+ unsigned long handle;
unsigned int flags;
spinlock_t lock;
#ifdef CONFIG_ZRAM_TRACK_ENTRY_ACTIME
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCHv1 4/8] zram: factor out ZRAM_SAME write
2024-11-19 7:20 [PATCHv1 0/8] zram: introduce multi-handle entries Sergey Senozhatsky
` (2 preceding siblings ...)
2024-11-19 7:20 ` [PATCHv1 3/8] zram: remove entry element member Sergey Senozhatsky
@ 2024-11-19 7:20 ` Sergey Senozhatsky
2024-11-19 7:20 ` [PATCHv1 5/8] zram: factor out ZRAM_HUGE write Sergey Senozhatsky
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-11-19 7:20 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim; +Cc: linux-kernel, Sergey Senozhatsky
Handling of ZRAM_SAME currently uses a goto to the final stages of
zram_write_page(), plus it introduces a branch and a flags variable,
which does not make the code any simpler. In reality, we can
handle ZRAM_SAME immediately when we detect such a page and
remove the goto and the branch.

Factor out ZRAM_SAME handling into a separate routine to
simplify zram_write_page().
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
drivers/block/zram/zram_drv.c | 37 ++++++++++++++++++++---------------
1 file changed, 21 insertions(+), 16 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index e80b4d15b74b..f89af45237c9 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1616,6 +1616,20 @@ static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
return zram_read_page(zram, bvec->bv_page, index, bio);
}
+static int zram_write_same_filled_page(struct zram *zram, unsigned long fill,
+ u32 index)
+{
+ zram_slot_lock(zram, index);
+ zram_set_flag(zram, index, ZRAM_SAME);
+ zram_set_handle(zram, index, fill);
+ zram_slot_unlock(zram, index);
+
+ atomic64_inc(&zram->stats.same_pages);
+ atomic64_inc(&zram->stats.pages_stored);
+
+ return 0;
+}
+
static int zram_write_page(struct zram *zram, struct page *page, u32 index)
{
int ret = 0;
@@ -1625,7 +1639,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
void *src, *dst, *mem;
struct zcomp_strm *zstrm;
unsigned long element = 0;
- enum zram_pageflags flags = 0;
+ bool same_filled;
/* First, free memory allocated to this slot (if any) */
zram_slot_lock(zram, index);
@@ -1633,14 +1647,10 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
zram_slot_unlock(zram, index);
mem = kmap_local_page(page);
- if (page_same_filled(mem, &element)) {
- kunmap_local(mem);
- /* Free memory associated with this sector now. */
- flags = ZRAM_SAME;
- atomic64_inc(&zram->stats.same_pages);
- goto out;
- }
+ same_filled = page_same_filled(mem, &element);
kunmap_local(mem);
+ if (same_filled)
+ return zram_write_same_filled_page(zram, element, index);
compress_again:
zstrm = zcomp_stream_get(zram->comps[ZRAM_PRIMARY_COMP]);
@@ -1719,7 +1729,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
zs_unmap_object(zram->mem_pool, handle);
atomic64_add(comp_len, &zram->stats.compr_data_size);
-out:
+
zram_slot_lock(zram, index);
if (comp_len == PAGE_SIZE) {
zram_set_flag(zram, index, ZRAM_HUGE);
@@ -1727,13 +1737,8 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
atomic64_inc(&zram->stats.huge_pages_since);
}
- if (flags) {
- zram_set_flag(zram, index, flags);
- zram_set_handle(zram, index, element);
- } else {
- zram_set_handle(zram, index, handle);
- zram_set_obj_size(zram, index, comp_len);
- }
+ zram_set_handle(zram, index, handle);
+ zram_set_obj_size(zram, index, comp_len);
zram_slot_unlock(zram, index);
/* Update stats */
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCHv1 5/8] zram: factor out ZRAM_HUGE write
2024-11-19 7:20 [PATCHv1 0/8] zram: introduce multi-handle entries Sergey Senozhatsky
` (3 preceding siblings ...)
2024-11-19 7:20 ` [PATCHv1 4/8] zram: factor out ZRAM_SAME write Sergey Senozhatsky
@ 2024-11-19 7:20 ` Sergey Senozhatsky
2024-11-19 7:20 ` [PATCHv1 6/8] zram: factor out ZRAM_HUGE read Sergey Senozhatsky
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-11-19 7:20 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim; +Cc: linux-kernel, Sergey Senozhatsky
zram_write_page() handles three different cases: ZRAM_SAME page
writes (already simplified in previous patches), regular page
writes and ZRAM_HUGE page writes.

ZRAM_HUGE handling adds a significant amount of complexity: the
conditional src for copy_page(), a conditional backward jump
for the fallback handle allocation and so on. Instead, we
can handle ZRAM_HUGE in a separate function and remove quite
a bit of that complexity, at the cost of minor code duplication
(basically, only zram stats updates).
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
drivers/block/zram/zram_drv.c | 140 +++++++++++++++++++++-------------
1 file changed, 85 insertions(+), 55 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index f89af45237c9..9101fd0c670f 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -132,6 +132,30 @@ static inline bool zram_allocated(struct zram *zram, u32 index)
zram_test_flag(zram, index, ZRAM_WB);
}
+static inline void update_used_max(struct zram *zram, const unsigned long pages)
+{
+ unsigned long cur_max = atomic_long_read(&zram->stats.max_used_pages);
+
+ do {
+ if (cur_max >= pages)
+ return;
+ } while (!atomic_long_try_cmpxchg(&zram->stats.max_used_pages,
+ &cur_max, pages));
+}
+
+static bool zram_can_store_page(struct zram *zram)
+{
+ unsigned long alloced_pages;
+
+ alloced_pages = zs_get_total_pages(zram->mem_pool);
+ update_used_max(zram, alloced_pages);
+
+ if (!zram->limit_pages)
+ return true;
+
+ return alloced_pages <= zram->limit_pages;
+}
+
#if PAGE_SIZE != 4096
static inline bool is_partial_io(struct bio_vec *bvec)
{
@@ -266,18 +290,6 @@ static struct zram_pp_slot *select_pp_slot(struct zram_pp_ctl *ctl)
}
#endif
-static inline void update_used_max(struct zram *zram,
- const unsigned long pages)
-{
- unsigned long cur_max = atomic_long_read(&zram->stats.max_used_pages);
-
- do {
- if (cur_max >= pages)
- return;
- } while (!atomic_long_try_cmpxchg(&zram->stats.max_used_pages,
- &cur_max, pages));
-}
-
static inline void zram_fill_page(void *ptr, unsigned long len,
unsigned long value)
{
@@ -1630,13 +1642,54 @@ static int zram_write_same_filled_page(struct zram *zram, unsigned long fill,
return 0;
}
+static int zram_write_incompressible_page(struct zram *zram, struct page *page,
+ u32 index)
+{
+ unsigned long handle;
+ void *src, *dst;
+
+ /*
+ * This function is called from preemptible context so we don't need
+ * to do optimistic and fallback to pessimistic handle allocation,
+ * like we do for compressible pages.
+ */
+ handle = zs_malloc(zram->mem_pool, PAGE_SIZE,
+ GFP_NOIO | __GFP_HIGHMEM | __GFP_MOVABLE);
+ if (IS_ERR_VALUE(handle))
+ return PTR_ERR((void *)handle);
+
+ if (!zram_can_store_page(zram)) {
+ zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
+ zs_free(zram->mem_pool, handle);
+ return -ENOMEM;
+ }
+
+ dst = zs_map_object(zram->mem_pool, handle, ZS_MM_WO);
+ src = kmap_local_page(page);
+ memcpy(dst, src, PAGE_SIZE);
+ kunmap_local(src);
+ zs_unmap_object(zram->mem_pool, handle);
+
+ zram_slot_lock(zram, index);
+ zram_set_flag(zram, index, ZRAM_HUGE);
+ zram_set_handle(zram, index, handle);
+ zram_set_obj_size(zram, index, PAGE_SIZE);
+ zram_slot_unlock(zram, index);
+
+ atomic64_add(PAGE_SIZE, &zram->stats.compr_data_size);
+ atomic64_inc(&zram->stats.huge_pages);
+ atomic64_inc(&zram->stats.huge_pages_since);
+ atomic64_inc(&zram->stats.pages_stored);
+
+ return 0;
+}
+
static int zram_write_page(struct zram *zram, struct page *page, u32 index)
{
int ret = 0;
- unsigned long alloced_pages;
unsigned long handle = -ENOMEM;
unsigned int comp_len = 0;
- void *src, *dst, *mem;
+ void *dst, *mem;
struct zcomp_strm *zstrm;
unsigned long element = 0;
bool same_filled;
@@ -1654,10 +1707,10 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
compress_again:
zstrm = zcomp_stream_get(zram->comps[ZRAM_PRIMARY_COMP]);
- src = kmap_local_page(page);
+ mem = kmap_local_page(page);
ret = zcomp_compress(zram->comps[ZRAM_PRIMARY_COMP], zstrm,
- src, &comp_len);
- kunmap_local(src);
+ mem, &comp_len);
+ kunmap_local(mem);
if (unlikely(ret)) {
zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
@@ -1666,8 +1719,11 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
return ret;
}
- if (comp_len >= huge_class_size)
- comp_len = PAGE_SIZE;
+ if (comp_len >= huge_class_size) {
+ zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
+ return zram_write_incompressible_page(zram, page, index);
+ }
+
/*
* handle allocation has 2 paths:
* a) fast path is executed with preemption disabled (for
@@ -1683,66 +1739,40 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
*/
if (IS_ERR_VALUE(handle))
handle = zs_malloc(zram->mem_pool, comp_len,
- __GFP_KSWAPD_RECLAIM |
- __GFP_NOWARN |
- __GFP_HIGHMEM |
- __GFP_MOVABLE);
+ __GFP_KSWAPD_RECLAIM |
+ __GFP_NOWARN |
+ __GFP_HIGHMEM |
+ __GFP_MOVABLE);
if (IS_ERR_VALUE(handle)) {
zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
atomic64_inc(&zram->stats.writestall);
handle = zs_malloc(zram->mem_pool, comp_len,
- GFP_NOIO | __GFP_HIGHMEM |
- __GFP_MOVABLE);
+ GFP_NOIO | __GFP_HIGHMEM | __GFP_MOVABLE);
if (IS_ERR_VALUE(handle))
return PTR_ERR((void *)handle);
- if (comp_len != PAGE_SIZE)
- goto compress_again;
- /*
- * If the page is not compressible, you need to acquire the
- * lock and execute the code below. The zcomp_stream_get()
- * call is needed to disable the cpu hotplug and grab the
- * zstrm buffer back. It is necessary that the dereferencing
- * of the zstrm variable below occurs correctly.
- */
- zstrm = zcomp_stream_get(zram->comps[ZRAM_PRIMARY_COMP]);
+ goto compress_again;
}
- alloced_pages = zs_get_total_pages(zram->mem_pool);
- update_used_max(zram, alloced_pages);
-
- if (zram->limit_pages && alloced_pages > zram->limit_pages) {
+ if (!zram_can_store_page(zram)) {
zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
zs_free(zram->mem_pool, handle);
return -ENOMEM;
}
dst = zs_map_object(zram->mem_pool, handle, ZS_MM_WO);
-
- src = zstrm->buffer;
- if (comp_len == PAGE_SIZE)
- src = kmap_local_page(page);
- memcpy(dst, src, comp_len);
- if (comp_len == PAGE_SIZE)
- kunmap_local(src);
-
+ memcpy(dst, zstrm->buffer, comp_len);
zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
zs_unmap_object(zram->mem_pool, handle);
- atomic64_add(comp_len, &zram->stats.compr_data_size);
zram_slot_lock(zram, index);
- if (comp_len == PAGE_SIZE) {
- zram_set_flag(zram, index, ZRAM_HUGE);
- atomic64_inc(&zram->stats.huge_pages);
- atomic64_inc(&zram->stats.huge_pages_since);
- }
-
zram_set_handle(zram, index, handle);
zram_set_obj_size(zram, index, comp_len);
zram_slot_unlock(zram, index);
- /* Update stats */
atomic64_inc(&zram->stats.pages_stored);
+ atomic64_add(comp_len, &zram->stats.compr_data_size);
+
return ret;
}
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCHv1 6/8] zram: factor out ZRAM_HUGE read
2024-11-19 7:20 [PATCHv1 0/8] zram: introduce multi-handle entries Sergey Senozhatsky
` (4 preceding siblings ...)
2024-11-19 7:20 ` [PATCHv1 5/8] zram: factor out ZRAM_HUGE write Sergey Senozhatsky
@ 2024-11-19 7:20 ` Sergey Senozhatsky
2024-11-19 7:20 ` [PATCHv1 7/8] zsmalloc: move ZS_HANDLE_SIZE to zsmalloc header Sergey Senozhatsky
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-11-19 7:20 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim; +Cc: linux-kernel, Sergey Senozhatsky
Similarly to write, move ZRAM_HUGE read handling to a separate
function. This will make more sense with the introduction of
multi-handle entries later in the series.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
drivers/block/zram/zram_drv.c | 71 ++++++++++++++++++++++-------------
1 file changed, 44 insertions(+), 27 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 9101fd0c670f..2b20afcfbb94 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1525,6 +1525,46 @@ static void zram_free_page(struct zram *zram, size_t index)
zram_set_obj_size(zram, index, 0);
}
+static int read_incompressible_page(struct zram *zram, struct page *page,
+ u32 index)
+{
+ unsigned long handle;
+ void *src, *dst;
+
+ handle = zram_get_handle(zram, index);
+ src = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
+ dst = kmap_local_page(page);
+ copy_page(dst, src);
+ kunmap_local(dst);
+ zs_unmap_object(zram->mem_pool, handle);
+
+ return 0;
+}
+
+static int read_compressible_page(struct zram *zram, struct page *page,
+ u32 index)
+{
+ struct zcomp_strm *zstrm;
+ unsigned long handle;
+ unsigned int size;
+ void *src, *dst;
+ int ret, prio;
+
+ handle = zram_get_handle(zram, index);
+ size = zram_get_obj_size(zram, index);
+ prio = zram_get_priority(zram, index);
+
+ zstrm = zcomp_stream_get(zram->comps[prio]);
+ src = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
+ dst = kmap_local_page(page);
+ ret = zcomp_decompress(zram->comps[prio], zstrm, src, size, dst);
+ kunmap_local(dst);
+ zs_unmap_object(zram->mem_pool, handle);
+ zcomp_stream_put(zram->comps[prio]);
+
+ return ret;
+}
+
/*
* Reads (decompresses if needed) a page from zspool (zsmalloc).
* Corresponding ZRAM slot should be locked.
@@ -1532,12 +1572,7 @@ static void zram_free_page(struct zram *zram, size_t index)
static int zram_read_from_zspool(struct zram *zram, struct page *page,
u32 index)
{
- struct zcomp_strm *zstrm;
unsigned long handle;
- unsigned int size;
- void *src, *dst;
- u32 prio;
- int ret;
handle = zram_get_handle(zram, index);
if (!handle || zram_test_flag(zram, index, ZRAM_SAME)) {
@@ -1549,28 +1584,10 @@ static int zram_read_from_zspool(struct zram *zram, struct page *page,
return 0;
}
- size = zram_get_obj_size(zram, index);
-
- if (size != PAGE_SIZE) {
- prio = zram_get_priority(zram, index);
- zstrm = zcomp_stream_get(zram->comps[prio]);
- }
-
- src = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
- if (size == PAGE_SIZE) {
- dst = kmap_local_page(page);
- copy_page(dst, src);
- kunmap_local(dst);
- ret = 0;
- } else {
- dst = kmap_local_page(page);
- ret = zcomp_decompress(zram->comps[prio], zstrm,
- src, size, dst);
- kunmap_local(dst);
- zcomp_stream_put(zram->comps[prio]);
- }
- zs_unmap_object(zram->mem_pool, handle);
- return ret;
+ if (!zram_test_flag(zram, index, ZRAM_HUGE))
+ return read_compressible_page(zram, page, index);
+ else
+ return read_incompressible_page(zram, page, index);
}
static int zram_read_page(struct zram *zram, struct page *page, u32 index,
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCHv1 7/8] zsmalloc: move ZS_HANDLE_SIZE to zsmalloc header
2024-11-19 7:20 [PATCHv1 0/8] zram: introduce multi-handle entries Sergey Senozhatsky
` (5 preceding siblings ...)
2024-11-19 7:20 ` [PATCHv1 6/8] zram: factor out ZRAM_HUGE read Sergey Senozhatsky
@ 2024-11-19 7:20 ` Sergey Senozhatsky
2024-11-19 7:20 ` [PATCHv1 8/8] zram: introduce multi-handle entries Sergey Senozhatsky
2024-11-19 9:20 ` [PATCHv1 0/8] " Sergey Senozhatsky
8 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-11-19 7:20 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim; +Cc: linux-kernel, Sergey Senozhatsky
It will be used in objects' split-size calculations.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
include/linux/zsmalloc.h | 2 ++
mm/zsmalloc.c | 2 --
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index a48cd0ffe57d..c17803da7f18 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -16,6 +16,8 @@
#include <linux/types.h>
+#define ZS_HANDLE_SIZE (sizeof(unsigned long))
+
/*
* zsmalloc mapping modes
*
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 64b66a4d3e6e..466d5f49eb91 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -78,8 +78,6 @@
*/
#define ZS_ALIGN 8
-#define ZS_HANDLE_SIZE (sizeof(unsigned long))
-
/*
* Object location (<PFN>, <obj_idx>) is encoded as
* a single (unsigned long) handle value.
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCHv1 8/8] zram: introduce multi-handle entries
2024-11-19 7:20 [PATCHv1 0/8] zram: introduce multi-handle entries Sergey Senozhatsky
` (6 preceding siblings ...)
2024-11-19 7:20 ` [PATCHv1 7/8] zsmalloc: move ZS_HANDLE_SIZE to zsmalloc header Sergey Senozhatsky
@ 2024-11-19 7:20 ` Sergey Senozhatsky
2024-11-19 9:20 ` [PATCHv1 0/8] " Sergey Senozhatsky
8 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-11-19 7:20 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim; +Cc: linux-kernel, Sergey Senozhatsky
zsmalloc size-classes store more than one compressed object per
physical page, therefore internal fragmentation is expected and
quite common. Internal fragmentation is completely normal; once
the system gets low on memory, zsmalloc attempts to defragment
its pool and release empty zspages. However, even this does not
guarantee a 100% usage ratio of pool memory, due to the nature of
allocators.

ZRAM_HUGE objects, on the other hand, do not share physical pages
with other objects, because each such object is stored raw
(uncompressed) and occupies a whole physical page.

We can, in fact, take advantage of zsmalloc's internal fragmentation.
Instead of allocating a physical page for each huge object, it is
possible to split such objects into smaller objects and store them
in regular size-classes, possibly using allocated but unused zspage
space. Given that huge objects are stored raw, both write and read of
such objects require only memcpy() and don't need any extra temporary
storage / buffers.

Split ZRAM_HUGE objects into two parts (one stored in the 2048
size-class, the other in the size-class above it) and store those
parts in regular size-classes. This now allocates and tracks
two zsmalloc handles for such entries.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
drivers/block/zram/zram_drv.c | 120 ++++++++++++++++++++++++++--------
drivers/block/zram/zram_drv.h | 15 ++++-
2 files changed, 106 insertions(+), 29 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 2b20afcfbb94..7e29e204fccf 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -37,6 +37,17 @@
#include "zram_drv.h"
+/*
+ * This determines sizes of the ZRAM_HUGE object split. Currently we perform
+ * a 2-way split. One part is stored in 2048 size-class and the other one is
+ * stored in the size-class above 2048.
+ *
+ * To store an object in a target size-class we need to sub zsmalloc
+ * handle size (which is added to every object by zsmalloc internally).
+ */
+#define ZRAM_MHANDLE_HEAD_LEN ((PAGE_SIZE) / 2 - ZS_HANDLE_SIZE)
+#define ZRAM_MHANDLE_TAIL_LEN ((PAGE_SIZE) - ZRAM_MHANDLE_HEAD_LEN)
+
static DEFINE_IDR(zram_index_idr);
/* idr index must be protected */
static DEFINE_MUTEX(zram_index_mutex);
@@ -93,6 +104,18 @@ static void zram_set_handle(struct zram *zram, u32 index, unsigned long handle)
zram->table[index].handle = handle;
}
+static struct zram_multi_handle *zram_get_multi_handle(struct zram *zram,
+ u32 index)
+{
+ return zram->table[index].mhandle;
+}
+
+static void zram_set_multi_handle(struct zram *zram, u32 index,
+ struct zram_multi_handle *mhandle)
+{
+ zram->table[index].mhandle = mhandle;
+}
+
/* flag operations require table entry bit_spin_lock() being held */
static bool zram_test_flag(struct zram *zram, u32 index,
enum zram_pageflags flag)
@@ -1479,8 +1502,6 @@ static bool zram_meta_alloc(struct zram *zram, u64 disksize)
*/
static void zram_free_page(struct zram *zram, size_t index)
{
- unsigned long handle;
-
#ifdef CONFIG_ZRAM_TRACK_ENTRY_ACTIME
zram->table[index].ac_time = 0;
#endif
@@ -1490,11 +1511,6 @@ static void zram_free_page(struct zram *zram, size_t index)
zram_clear_flag(zram, index, ZRAM_PP_SLOT);
zram_set_priority(zram, index, 0);
- if (zram_test_flag(zram, index, ZRAM_HUGE)) {
- zram_clear_flag(zram, index, ZRAM_HUGE);
- atomic64_dec(&zram->stats.huge_pages);
- }
-
if (zram_test_flag(zram, index, ZRAM_WB)) {
zram_clear_flag(zram, index, ZRAM_WB);
free_block_bdev(zram, zram_get_handle(zram, index));
@@ -1511,11 +1527,26 @@ static void zram_free_page(struct zram *zram, size_t index)
goto out;
}
- handle = zram_get_handle(zram, index);
- if (!handle)
- return;
+ if (zram_test_flag(zram, index, ZRAM_HUGE)) {
+ struct zram_multi_handle *handle;
+
+ handle = zram_get_multi_handle(zram, index);
+ if (!handle)
+ return;
- zs_free(zram->mem_pool, handle);
+ zs_free(zram->mem_pool, handle->head);
+ zs_free(zram->mem_pool, handle->tail);
+ kfree(handle);
+
+ zram_clear_flag(zram, index, ZRAM_HUGE);
+ atomic64_dec(&zram->stats.huge_pages);
+ } else {
+ unsigned long handle = zram_get_handle(zram, index);
+
+ if (!handle)
+ return;
+ zs_free(zram->mem_pool, handle);
+ }
atomic64_sub(zram_get_obj_size(zram, index),
&zram->stats.compr_data_size);
@@ -1528,16 +1559,21 @@ static void zram_free_page(struct zram *zram, size_t index)
static int read_incompressible_page(struct zram *zram, struct page *page,
u32 index)
{
- unsigned long handle;
+ struct zram_multi_handle *handle;
void *src, *dst;
- handle = zram_get_handle(zram, index);
- src = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
+ handle = zram_get_multi_handle(zram, index);
dst = kmap_local_page(page);
- copy_page(dst, src);
- kunmap_local(dst);
- zs_unmap_object(zram->mem_pool, handle);
+ src = zs_map_object(zram->mem_pool, handle->head, ZS_MM_RO);
+ memcpy(dst, src, ZRAM_MHANDLE_HEAD_LEN);
+ zs_unmap_object(zram->mem_pool, handle->head);
+
+ src = zs_map_object(zram->mem_pool, handle->tail, ZS_MM_RO);
+ memcpy(dst + ZRAM_MHANDLE_HEAD_LEN, src, ZRAM_MHANDLE_TAIL_LEN);
+ zs_unmap_object(zram->mem_pool, handle->tail);
+
+ kunmap_local(dst);
return 0;
}
@@ -1662,34 +1698,54 @@ static int zram_write_same_filled_page(struct zram *zram, unsigned long fill,
static int zram_write_incompressible_page(struct zram *zram, struct page *page,
u32 index)
{
- unsigned long handle;
+ struct zram_multi_handle *handle;
void *src, *dst;
+ int ret;
/*
* This function is called from preemptible context so we don't need
* to do optimistic and fallback to pessimistic handle allocation,
* like we do for compressible pages.
*/
- handle = zs_malloc(zram->mem_pool, PAGE_SIZE,
- GFP_NOIO | __GFP_HIGHMEM | __GFP_MOVABLE);
- if (IS_ERR_VALUE(handle))
- return PTR_ERR((void *)handle);
+ handle = kzalloc(sizeof(*handle), GFP_KERNEL);
+ if (!handle)
+ return -ENOMEM;
+
+ handle->head = zs_malloc(zram->mem_pool, ZRAM_MHANDLE_HEAD_LEN,
+ GFP_NOIO | __GFP_HIGHMEM | __GFP_MOVABLE);
+ if (IS_ERR_VALUE(handle->head)) {
+ ret = PTR_ERR((void *)handle->head);
+ goto error;
+ }
+
+ handle->tail = zs_malloc(zram->mem_pool, ZRAM_MHANDLE_TAIL_LEN,
+ GFP_NOIO | __GFP_HIGHMEM | __GFP_MOVABLE);
+ if (IS_ERR_VALUE(handle->tail)) {
+ ret = PTR_ERR((void *)handle->tail);
+ goto error;
+ }
if (!zram_can_store_page(zram)) {
zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
- zs_free(zram->mem_pool, handle);
- return -ENOMEM;
+ ret = -ENOMEM;
+ goto error;
}
- dst = zs_map_object(zram->mem_pool, handle, ZS_MM_WO);
src = kmap_local_page(page);
- memcpy(dst, src, PAGE_SIZE);
+
+ dst = zs_map_object(zram->mem_pool, handle->head, ZS_MM_WO);
+ memcpy(dst, src, ZRAM_MHANDLE_HEAD_LEN);
+ zs_unmap_object(zram->mem_pool, handle->head);
+
+ dst = zs_map_object(zram->mem_pool, handle->tail, ZS_MM_WO);
+ memcpy(dst, src + ZRAM_MHANDLE_HEAD_LEN, ZRAM_MHANDLE_TAIL_LEN);
+ zs_unmap_object(zram->mem_pool, handle->tail);
+
kunmap_local(src);
- zs_unmap_object(zram->mem_pool, handle);
zram_slot_lock(zram, index);
zram_set_flag(zram, index, ZRAM_HUGE);
- zram_set_handle(zram, index, handle);
+ zram_set_multi_handle(zram, index, handle);
zram_set_obj_size(zram, index, PAGE_SIZE);
zram_slot_unlock(zram, index);
@@ -1699,6 +1755,14 @@ static int zram_write_incompressible_page(struct zram *zram, struct page *page,
atomic64_inc(&zram->stats.pages_stored);
return 0;
+
+error:
+ if (!IS_ERR_VALUE(handle->head))
+ zs_free(zram->mem_pool, handle->head);
+ if (!IS_ERR_VALUE(handle->tail))
+ zs_free(zram->mem_pool, handle->tail);
+ kfree(handle);
+ return ret;
}
static int zram_write_page(struct zram *zram, struct page *page, u32 index)
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index db78d7c01b9a..7bc7792c2fef 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -60,9 +60,22 @@ enum zram_pageflags {
/*-- Data structures */
+/*
+ * Unlike regular zram table entries, ZRAM_HUGE entries are stored in zsmalloc
+ * as smaller objects in multiple locations (size-classes). This keeps track
+ * of those locations.
+ */
+struct zram_multi_handle {
+ unsigned long head;
+ unsigned long tail;
+};
+
/* Allocated for each disk page */
struct zram_table_entry {
- unsigned long handle;
+ union {
+ unsigned long handle;
+ struct zram_multi_handle *mhandle;
+ };
unsigned int flags;
spinlock_t lock;
#ifdef CONFIG_ZRAM_TRACK_ENTRY_ACTIME
--
2.47.0.371.ga323438b13-goog
* Re: [PATCHv1 0/8] zram: introduce multi-handle entries
2024-11-19 7:20 [PATCHv1 0/8] zram: introduce multi-handle entries Sergey Senozhatsky
` (7 preceding siblings ...)
2024-11-19 7:20 ` [PATCHv1 8/8] zram: introduce multi-handle entries Sergey Senozhatsky
@ 2024-11-19 9:20 ` Sergey Senozhatsky
8 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-11-19 9:20 UTC (permalink / raw)
To: Sergey Senozhatsky; +Cc: Andrew Morton, Minchan Kim, linux-kernel
On (24/11/19 16:20), Sergey Senozhatsky wrote:
> ZRAM_HUGE objects are incompressible and each takes a whole
> physical page on the zsmalloc side. zsmalloc pool, naturally, has
> some internal memory fragmentation (within size-classes), so what
> we can do for ZRAM_HUGE objects is to split them into several
> smaller objects (2 at this point) and store those parts individually
> in regular size-classes (hence multi-handle entries). This, basically,
> lets us use already allocated (but unused) zspage memory for
> ZRAM_HUGE objects, instead of unconditionally allocating a 0-order
> page for each ZRAM_HUGE object.
Forgot to mention, this is still just "RFC".