* [PATCHv2 0/3] zram: optimal post-processing target selection
@ 2024-09-08 11:42 Sergey Senozhatsky
2024-09-08 11:45 ` Sergey Senozhatsky
0 siblings, 1 reply; 8+ messages in thread
From: Sergey Senozhatsky @ 2024-09-08 11:42 UTC (permalink / raw)
To: Andrew Morton; +Cc: Richard Chang, linux-kernel, Sergey Senozhatsky
Problem:
--------
Both recompression and writeback perform a very simple linear scan
of all zram slots in search of post-processing (writeback or
recompression) candidate slots. This often means that we pick the
worst candidate for pp (post-processing), e.g. a 48-byte object for
writeback, which is nearly useless, because it only releases 48
bytes from the zsmalloc pool but consumes an entire 4K slot in the
backing device. Similarly, recompression of a 48-byte object is
unlikely to save more memory than recompression of a 3000-byte
object. Both recompression and writeback consume constrained
resources (CPU time, battery, backing device storage space) and
quite often have a (daily) limit on the number of items they
post-process, so we should utilize those constrained resources in
the most optimal way.
Solution:
---------
This patch set reworks the way we select pp targets. We, quite
clearly, want to sort all the candidates and always pick the largest
one, be it for recompression or writeback. Especially for writeback,
because the larger the object we write back, the more memory we
release. This series introduces the concept of pp buckets and a pp
scan/selection scheme.

The scan step is a simple iteration over all zram->table entries,
just like what we currently do, but we don't post-process a candidate
slot immediately. Instead, we assign it to a PP (post-processing)
bucket. A PP bucket is, basically, a list which holds pp candidate
slots that belong to the same size class. PP buckets are 64 bytes
apart; slots are not strictly sorted within a bucket, so there is a
64-byte size variance within each bucket.

The select step simply iterates over the pp buckets from highest to
lowest and picks all the candidate slots a particular bucket
contains. This gives us sorted candidates (in linear time) and
allows us to select the most optimal (largest) candidates for
post-processing first.
v2..v1:
-- clear PP_SLOT when slot is accessed
-- kmalloc pp_ctl instead of keeping it on the stack
-- increase the number of pp-buckets and rework the way it's defined
-- code reshuffle and refactoring
Sergey Senozhatsky (3):
zram: introduce ZRAM_PP_SLOT flag
zram: rework recompress target selection strategy
zram: rework writeback target selection strategy
drivers/block/zram/zram_drv.c | 279 ++++++++++++++++++++++++++++------
drivers/block/zram/zram_drv.h | 1 +
2 files changed, 235 insertions(+), 45 deletions(-)
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCHv2 0/3] zram: optimal post-processing target selection
2024-09-08 11:42 Sergey Senozhatsky
@ 2024-09-08 11:45 ` Sergey Senozhatsky
2024-09-08 11:47 ` Sergey Senozhatsky
0 siblings, 1 reply; 8+ messages in thread
From: Sergey Senozhatsky @ 2024-09-08 11:45 UTC (permalink / raw)
To: Sergey Senozhatsky; +Cc: Andrew Morton, Richard Chang, linux-kernel
On (24/09/08 20:42), Sergey Senozhatsky wrote:
>
> v2..v1:
> -- clear PP_SLOT when slot is accessed
> -- kmalloc pp_ctl instead of keeping it on the stack
> -- increase the number of pp-buckets and rework the way it's defined
> -- code reshuffle and refactoring
>
D'oh... Let me re-send it properly. It was supposed to be sent to
Minchan and Andrew. Sorry for the noise.
* [PATCHv2 0/3] zram: optimal post-processing target selection
@ 2024-09-08 11:45 Sergey Senozhatsky
2024-09-08 11:45 ` [PATCHv2 1/3] zram: introduce ZRAM_PP_SLOT flag Sergey Senozhatsky
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Sergey Senozhatsky @ 2024-09-08 11:45 UTC (permalink / raw)
To: Minchan Kim, Andrew Morton
Cc: Richard Chang, linux-kernel, Sergey Senozhatsky
Problem:
--------
Both recompression and writeback perform a very simple linear scan
of all zram slots in search of post-processing (writeback or
recompression) candidate slots. This often means that we pick the
worst candidate for pp (post-processing), e.g. a 48-byte object for
writeback, which is nearly useless, because it only releases 48
bytes from the zsmalloc pool but consumes an entire 4K slot in the
backing device. Similarly, recompression of a 48-byte object is
unlikely to save more memory than recompression of a 3000-byte
object. Both recompression and writeback consume constrained
resources (CPU time, battery, backing device storage space) and
quite often have a (daily) limit on the number of items they
post-process, so we should utilize those constrained resources in
the most optimal way.
Solution:
---------
This patch set reworks the way we select pp targets. We, quite
clearly, want to sort all the candidates and always pick the largest
one, be it for recompression or writeback. Especially for writeback,
because the larger the object we write back, the more memory we
release. This series introduces the concept of pp buckets and a pp
scan/selection scheme.

The scan step is a simple iteration over all zram->table entries,
just like what we currently do, but we don't post-process a candidate
slot immediately. Instead, we assign it to a PP (post-processing)
bucket. A PP bucket is, basically, a list which holds pp candidate
slots that belong to the same size class. PP buckets are 64 bytes
apart; slots are not strictly sorted within a bucket, so there is a
64-byte size variance within each bucket.

The select step simply iterates over the pp buckets from highest to
lowest and picks all the candidate slots a particular bucket
contains. This gives us sorted candidates (in linear time) and
allows us to select the most optimal (largest) candidates for
post-processing first.
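The scan/select scheme above can be sketched as a tiny userspace
model (illustrative only: `bucket_for()`, `scan()` and
`select_bucket()` are made-up names, and plain per-bucket counters
stand in for the kernel's per-bucket lists):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace model of pp buckets, assuming PAGE_SIZE 4096. */
#define PAGE_SIZE		4096
#define PP_BUCKET_SIZE_RANGE	64
#define NUM_PP_BUCKETS		((PAGE_SIZE / PP_BUCKET_SIZE_RANGE) + 1)

/* Map an object size (in bytes) to its pp bucket index. */
static size_t bucket_for(size_t obj_size)
{
	return obj_size / PP_BUCKET_SIZE_RANGE;
}

/* Scan step: one linear pass, counting candidates per bucket. */
static void scan(const size_t *sizes, size_t n,
		 size_t counts[NUM_PP_BUCKETS])
{
	for (size_t i = 0; i < n; i++)
		counts[bucket_for(sizes[i])]++;
}

/*
 * Select step: walk buckets from highest to lowest so the largest
 * candidates are consumed first. Returns the bucket index of the
 * next candidate, or -1 when all buckets are empty.
 */
static int select_bucket(size_t counts[NUM_PP_BUCKETS])
{
	for (int idx = NUM_PP_BUCKETS - 1; idx >= 0; idx--) {
		if (counts[idx]) {
			counts[idx]--;
			return idx;
		}
	}
	return -1;
}
```

Because selection drains buckets highest-first, a mix of small and
large slots comes back roughly largest-first, in linear time.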
v2..v1:
-- clear PP_SLOT when slot is accessed
-- kmalloc pp_ctl instead of keeping it on the stack
-- increase the number of pp-buckets and rework the way it's defined
-- code reshuffle and refactoring
Sergey Senozhatsky (3):
zram: introduce ZRAM_PP_SLOT flag
zram: rework recompress target selection strategy
zram: rework writeback target selection strategy
drivers/block/zram/zram_drv.c | 279 ++++++++++++++++++++++++++++------
drivers/block/zram/zram_drv.h | 1 +
2 files changed, 235 insertions(+), 45 deletions(-)
--
2.46.0.469.g59c65b2a67-goog
* [PATCHv2 1/3] zram: introduce ZRAM_PP_SLOT flag
2024-09-08 11:45 [PATCHv2 0/3] zram: optimal post-processing target selection Sergey Senozhatsky
@ 2024-09-08 11:45 ` Sergey Senozhatsky
2024-09-08 11:45 ` [PATCHv2 2/3] zram: rework recompress target selection strategy Sergey Senozhatsky
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Sergey Senozhatsky @ 2024-09-08 11:45 UTC (permalink / raw)
To: Minchan Kim, Andrew Morton
Cc: Richard Chang, linux-kernel, Sergey Senozhatsky
This flag will indicate that the slot was selected as
a candidate slot for post-processing (pp) and was assigned
to a pp bucket.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
drivers/block/zram/zram_drv.c | 10 ++++++++--
drivers/block/zram/zram_drv.h | 1 +
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 1f1bf175a6c3..a14aef6bf634 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -178,6 +178,7 @@ static inline u32 zram_get_priority(struct zram *zram, u32 index)
static void zram_accessed(struct zram *zram, u32 index)
{
zram_clear_flag(zram, index, ZRAM_IDLE);
+ zram_clear_flag(zram, index, ZRAM_PP_SLOT);
#ifdef CONFIG_ZRAM_TRACK_ENTRY_ACTIME
zram->table[index].ac_time = ktime_get_boottime();
#endif
@@ -659,8 +660,9 @@ static ssize_t writeback_store(struct device *dev,
goto next;
if (zram_test_flag(zram, index, ZRAM_WB) ||
- zram_test_flag(zram, index, ZRAM_SAME) ||
- zram_test_flag(zram, index, ZRAM_UNDER_WB))
+ zram_test_flag(zram, index, ZRAM_SAME) ||
+ zram_test_flag(zram, index, ZRAM_PP_SLOT) ||
+ zram_test_flag(zram, index, ZRAM_UNDER_WB))
goto next;
if (mode & IDLE_WRITEBACK &&
@@ -1368,6 +1370,9 @@ static void zram_free_page(struct zram *zram, size_t index)
goto out;
}
+ if (zram_test_flag(zram, index, ZRAM_PP_SLOT))
+ zram_clear_flag(zram, index, ZRAM_PP_SLOT);
+
handle = zram_get_handle(zram, index);
if (!handle)
return;
@@ -1927,6 +1932,7 @@ static ssize_t recompress_store(struct device *dev,
if (zram_test_flag(zram, index, ZRAM_WB) ||
zram_test_flag(zram, index, ZRAM_UNDER_WB) ||
zram_test_flag(zram, index, ZRAM_SAME) ||
+ zram_test_flag(zram, index, ZRAM_PP_SLOT) ||
zram_test_flag(zram, index, ZRAM_INCOMPRESSIBLE))
goto next;
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index b976824ead67..e0578b3542ce 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -50,6 +50,7 @@ enum zram_pageflags {
ZRAM_SAME, /* Page consists the same element */
ZRAM_WB, /* page is stored on backing_device */
ZRAM_UNDER_WB, /* page is under writeback */
+ ZRAM_PP_SLOT, /* Selected for post-processing */
ZRAM_HUGE, /* Incompressible page */
ZRAM_IDLE, /* not accessed page since last idle marking */
ZRAM_INCOMPRESSIBLE, /* none of the algorithms could compress it */
--
2.46.0.469.g59c65b2a67-goog
* [PATCHv2 2/3] zram: rework recompress target selection strategy
2024-09-08 11:45 [PATCHv2 0/3] zram: optimal post-processing target selection Sergey Senozhatsky
2024-09-08 11:45 ` [PATCHv2 1/3] zram: introduce ZRAM_PP_SLOT flag Sergey Senozhatsky
@ 2024-09-08 11:45 ` Sergey Senozhatsky
2024-09-08 11:45 ` [PATCHv2 3/3] zram: rework writeback " Sergey Senozhatsky
2024-09-10 9:37 ` [PATCHv2 0/3] zram: optimal post-processing target selection Sergey Senozhatsky
3 siblings, 0 replies; 8+ messages in thread
From: Sergey Senozhatsky @ 2024-09-08 11:45 UTC (permalink / raw)
To: Minchan Kim, Andrew Morton
Cc: Richard Chang, linux-kernel, Sergey Senozhatsky
Target slot selection for recompression is currently a simple
iteration over zram->table entries (stored pages) from slot 0 to
the max slot. Given that zram->table slots are written in random
order and are not sorted by size, a simple iteration over slots
selects suboptimal targets for recompression. This is not a problem
if we recompress every single zram->table slot, but we never do
that in reality. In reality we limit the number of slots we can
recompress (via the max_pages parameter), and hence proper slot
selection becomes very important. The strategy is quite simple:
suppose we have two candidate slots for recompression, one of 48
bytes and one of 2800 bytes, and we can recompress only one of
them; then it certainly makes more sense to pick the 2800-byte
entry. Even if we manage to compress the 48-byte object further,
the savings are going to be very small, while the potential savings
after a good recompression of a 2800-byte object are much higher.

This patch reworks slot selection and introduces the strategy
described above: among candidate slots, always select the biggest
ones first.

For that, the patch introduces a zram_pp_ctl (post-processing
control) structure which holds NUM_PP_BUCKETS pp buckets of slots.
Slots are assigned to a particular bucket based on their sizes -
the larger the slot, the higher the bucket index. This, basically,
sorts slots by size in linear time (we still perform just one
iteration over zram->table slots). When we select a slot for
recompression, we always first look in the higher pp buckets (those
that hold the largest slots), which achieves the desired behavior.
TEST
====
A very simple demonstration: zram is configured with zstd, and with
zstd-with-dict as a recompression stream. A limited (max 4096 pages)
recompression is then performed, with a log of the sizes of the
slots that were recompressed. You can see that the patched zram
selects slots for recompression in a significantly different manner,
which leads to higher memory savings (see column #2 of the mm_stat
output).
BASE
----
*** initial state of zram device
/sys/block/zram0/mm_stat
1750994944 504491413 514203648 0 514203648 1 0 34204 34204
*** recompress idle max_pages=4096
/sys/block/zram0/mm_stat
1750994944 504262229 514953216 0 514203648 1 0 34204 34204
Sizes of selected objects for recompression:
... 45 58 24 226 91 40 24 24 24 424 2104 93 2078 2078 2078 959 154 ...
PATCHED
-------
*** initial state of zram device
/sys/block/zram0/mm_stat
1750982656 504492801 514170880 0 514170880 1 0 34204 34204
*** recompress idle max_pages=4096
/sys/block/zram0/mm_stat
1750982656 503716710 517586944 0 514170880 1 0 34204 34204
Sizes of selected objects for recompression:
... 3680 3694 3667 3590 3614 3553 3537 3548 3550 3542 3543 3537 ...
Note, pp-slots are not strictly sorted; there is a
PP_BUCKET_SIZE_RANGE variation of sizes within a particular bucket.
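A quick illustration of that variance (the helper name here is
hypothetical, but the mapping itself is the same integer division by
PP_BUCKET_SIZE_RANGE that the patch performs): objects up to 63 bytes
apart land in the same bucket, so within a bucket the selection order
is effectively arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define PP_BUCKET_SIZE_RANGE	64

/* Bucket index for an object size: integer division by the range. */
static size_t pp_bucket_idx(size_t obj_size)
{
	return obj_size / PP_BUCKET_SIZE_RANGE;
}
```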
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
drivers/block/zram/zram_drv.c | 191 +++++++++++++++++++++++++++++-----
1 file changed, 163 insertions(+), 28 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index a14aef6bf634..026e527ab17d 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -184,6 +184,100 @@ static void zram_accessed(struct zram *zram, u32 index)
#endif
}
+#ifdef CONFIG_ZRAM_MULTI_COMP
+struct zram_pp_slot {
+ unsigned long index;
+ struct list_head entry;
+};
+
+/*
+ * A post-processing bucket is, essentially, a size class, this defines
+ * the range (in bytes) of pp-slots sizes in particular bucket.
+ */
+#define PP_BUCKET_SIZE_RANGE 64
+#define NUM_PP_BUCKETS ((PAGE_SIZE / PP_BUCKET_SIZE_RANGE) + 1)
+
+struct zram_pp_ctl {
+ struct list_head pp_buckets[NUM_PP_BUCKETS];
+};
+
+static struct zram_pp_ctl *init_pp_ctl(void)
+{
+ struct zram_pp_ctl *ctl;
+ u32 idx;
+
+ ctl = kmalloc(sizeof(*ctl), GFP_KERNEL);
+ if (!ctl)
+ return NULL;
+
+ for (idx = 0; idx < NUM_PP_BUCKETS; idx++)
+ INIT_LIST_HEAD(&ctl->pp_buckets[idx]);
+ return ctl;
+}
+
+static void release_pp_slot(struct zram *zram, struct zram_pp_slot *pps)
+{
+ zram_slot_lock(zram, pps->index);
+ if (zram_test_flag(zram, pps->index, ZRAM_PP_SLOT))
+ zram_clear_flag(zram, pps->index, ZRAM_PP_SLOT);
+ zram_slot_unlock(zram, pps->index);
+ kfree(pps);
+}
+
+static void release_pp_ctl(struct zram *zram, struct zram_pp_ctl *ctl)
+{
+ u32 idx;
+
+ if (!ctl)
+ return;
+
+ for (idx = 0; idx < NUM_PP_BUCKETS; idx++) {
+ while (!list_empty(&ctl->pp_buckets[idx])) {
+ struct zram_pp_slot *pps;
+
+ pps = list_first_entry(&ctl->pp_buckets[idx],
+ struct zram_pp_slot,
+ entry);
+ list_del_init(&pps->entry);
+ release_pp_slot(zram, pps);
+ }
+ }
+
+ kfree(ctl);
+}
+
+static void place_pp_slot(struct zram *zram, struct zram_pp_ctl *ctl,
+ struct zram_pp_slot *pps)
+{
+ u32 idx;
+
+ idx = zram_get_obj_size(zram, pps->index) / PP_BUCKET_SIZE_RANGE;
+ list_add(&pps->entry, &ctl->pp_buckets[idx]);
+
+ zram_set_flag(zram, pps->index, ZRAM_PP_SLOT);
+}
+
+static struct zram_pp_slot *select_pp_slot(struct zram_pp_ctl *ctl)
+{
+ struct zram_pp_slot *pps = NULL;
+ s32 idx = NUM_PP_BUCKETS - 1;
+
+ /* The higher the bucket id the more optimal slot post-processing is */
+ while (idx > 0) {
+ pps = list_first_entry_or_null(&ctl->pp_buckets[idx],
+ struct zram_pp_slot,
+ entry);
+ if (pps) {
+ list_del_init(&pps->entry);
+ break;
+ }
+
+ idx--;
+ }
+ return pps;
+}
+#endif
+
static inline void update_used_max(struct zram *zram,
const unsigned long pages)
{
@@ -1650,6 +1744,54 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
}
#ifdef CONFIG_ZRAM_MULTI_COMP
+#define RECOMPRESS_IDLE (1 << 0)
+#define RECOMPRESS_HUGE (1 << 1)
+
+static int scan_slots_for_recompress(struct zram *zram, u32 mode,
+ struct zram_pp_ctl *ctl)
+{
+ unsigned long nr_pages = zram->disksize >> PAGE_SHIFT;
+ struct zram_pp_slot *pps = NULL;
+ unsigned long index;
+
+ for (index = 0; index < nr_pages; index++) {
+ if (!pps)
+ pps = kmalloc(sizeof(*pps), GFP_KERNEL);
+ if (!pps)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&pps->entry);
+
+ zram_slot_lock(zram, index);
+ if (!zram_allocated(zram, index))
+ goto next;
+
+ if (mode & RECOMPRESS_IDLE &&
+ !zram_test_flag(zram, index, ZRAM_IDLE))
+ goto next;
+
+ if (mode & RECOMPRESS_HUGE &&
+ !zram_test_flag(zram, index, ZRAM_HUGE))
+ goto next;
+
+ if (zram_test_flag(zram, index, ZRAM_WB) ||
+ zram_test_flag(zram, index, ZRAM_UNDER_WB) ||
+ zram_test_flag(zram, index, ZRAM_PP_SLOT) ||
+ zram_test_flag(zram, index, ZRAM_SAME) ||
+ zram_test_flag(zram, index, ZRAM_INCOMPRESSIBLE))
+ goto next;
+
+ pps->index = index;
+ place_pp_slot(zram, ctl, pps);
+ pps = NULL;
+next:
+ zram_slot_unlock(zram, index);
+ }
+
+ kfree(pps);
+ return 0;
+}
+
/*
* This function will decompress (unless it's ZRAM_HUGE) the page and then
* attempt to compress it using provided compression algorithm priority
@@ -1657,7 +1799,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
*
* Corresponding ZRAM slot should be locked.
*/
-static int zram_recompress(struct zram *zram, u32 index, struct page *page,
+static int recompress_slot(struct zram *zram, u32 index, struct page *page,
u64 *num_recomp_pages, u32 threshold, u32 prio,
u32 prio_max)
{
@@ -1677,6 +1819,7 @@ static int zram_recompress(struct zram *zram, u32 index, struct page *page,
return -EINVAL;
comp_len_old = zram_get_obj_size(zram, index);
+
/*
* Do not recompress objects that are already "small enough".
*/
@@ -1800,20 +1943,17 @@ static int zram_recompress(struct zram *zram, u32 index, struct page *page,
return 0;
}
-#define RECOMPRESS_IDLE (1 << 0)
-#define RECOMPRESS_HUGE (1 << 1)
-
static ssize_t recompress_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t len)
{
u32 prio = ZRAM_SECONDARY_COMP, prio_max = ZRAM_MAX_COMPS;
struct zram *zram = dev_to_zram(dev);
- unsigned long nr_pages = zram->disksize >> PAGE_SHIFT;
char *args, *param, *val, *algo = NULL;
u64 num_recomp_pages = ULLONG_MAX;
+ struct zram_pp_ctl *ctl = NULL;
+ struct zram_pp_slot *pps;
u32 mode = 0, threshold = 0;
- unsigned long index;
struct page *page;
ssize_t ret;
@@ -1909,37 +2049,31 @@ static ssize_t recompress_store(struct device *dev,
goto release_init_lock;
}
+ ctl = init_pp_ctl();
+ if (!ctl) {
+ ret = -ENOMEM;
+ goto release_init_lock;
+ }
+ scan_slots_for_recompress(zram, mode, ctl);
+
ret = len;
- for (index = 0; index < nr_pages; index++) {
+ while ((pps = select_pp_slot(ctl))) {
int err = 0;
if (!num_recomp_pages)
break;
- zram_slot_lock(zram, index);
-
- if (!zram_allocated(zram, index))
- goto next;
-
- if (mode & RECOMPRESS_IDLE &&
- !zram_test_flag(zram, index, ZRAM_IDLE))
- goto next;
-
- if (mode & RECOMPRESS_HUGE &&
- !zram_test_flag(zram, index, ZRAM_HUGE))
+ zram_slot_lock(zram, pps->index);
+ if (!zram_test_flag(zram, pps->index, ZRAM_PP_SLOT))
goto next;
- if (zram_test_flag(zram, index, ZRAM_WB) ||
- zram_test_flag(zram, index, ZRAM_UNDER_WB) ||
- zram_test_flag(zram, index, ZRAM_SAME) ||
- zram_test_flag(zram, index, ZRAM_PP_SLOT) ||
- zram_test_flag(zram, index, ZRAM_INCOMPRESSIBLE))
- goto next;
-
- err = zram_recompress(zram, index, page, &num_recomp_pages,
- threshold, prio, prio_max);
+ err = recompress_slot(zram, pps->index, page,
+ &num_recomp_pages, threshold,
+ prio, prio_max);
next:
- zram_slot_unlock(zram, index);
+ zram_slot_unlock(zram, pps->index);
+ release_pp_slot(zram, pps);
+
if (err) {
ret = err;
break;
@@ -1951,6 +2085,7 @@ static ssize_t recompress_store(struct device *dev,
__free_page(page);
release_init_lock:
+ release_pp_ctl(zram, ctl);
up_read(&zram->init_lock);
return ret;
}
--
2.46.0.469.g59c65b2a67-goog
* [PATCHv2 3/3] zram: rework writeback target selection strategy
2024-09-08 11:45 [PATCHv2 0/3] zram: optimal post-processing target selection Sergey Senozhatsky
2024-09-08 11:45 ` [PATCHv2 1/3] zram: introduce ZRAM_PP_SLOT flag Sergey Senozhatsky
2024-09-08 11:45 ` [PATCHv2 2/3] zram: rework recompress target selection strategy Sergey Senozhatsky
@ 2024-09-08 11:45 ` Sergey Senozhatsky
2024-09-10 9:37 ` [PATCHv2 0/3] zram: optimal post-processing target selection Sergey Senozhatsky
3 siblings, 0 replies; 8+ messages in thread
From: Sergey Senozhatsky @ 2024-09-08 11:45 UTC (permalink / raw)
To: Minchan Kim, Andrew Morton
Cc: Richard Chang, linux-kernel, Sergey Senozhatsky
Writeback suffers from the same problem as recompression did
before: target slot selection for writeback is a simple iteration
over zram->table entries (stored pages), which selects suboptimal
targets for writeback. This is especially problematic for
writeback, because we uncompress objects before writing them back,
so each of them takes a full 4K out of the limited writeback
storage. For example, when we take a 48-byte slot and store it as a
4K object on the writeback device, we only save 48 bytes of memory
(released from the zsmalloc pool). We naturally want to pick the
largest objects for writeback, because then each writeback releases
the largest amount of memory.
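The arithmetic behind this is simple enough to sketch (the helper
names below are illustrative, not from the patch): for a fixed
writeback budget, the pool memory released is just the sum of the
selected objects' sizes, while the backing-device cost is one full
page per object regardless of its size.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE	4096

/*
 * zsmalloc pool bytes released by writing back `n` slots of the
 * given sizes.
 */
static size_t wb_pool_bytes_freed(const size_t *obj_sizes, size_t n)
{
	size_t freed = 0;

	for (size_t i = 0; i < n; i++)
		freed += obj_sizes[i];
	return freed;
}

/* Backing-device bytes consumed: one full page per written-back slot. */
static size_t wb_backing_bytes_used(size_t n)
{
	return n * PAGE_SIZE;
}
```

Three 48-byte slots and three ~3.5K slots cost the same three
backing-device pages, but the latter release roughly 75x more pool
memory, which is exactly why largest-first selection pays off.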
This patch applies the same solution and strategy as for
recompression target selection: a pp (post-processing) control
structure with NUM_PP_BUCKETS buckets of candidate pp slots. Slots
are assigned to pp buckets based on their sizes - the larger the
slot, the higher the bucket index. This gives us lists of candidate
slots sorted by size (in linear time), so that among post-processing
candidate slots we always select the largest ones first and maximize
the memory savings.
TEST
====
A very simple demonstration: zram is configured with a writeback
device. A limited writeback (wb_limit 2500 pages) is then performed,
with a log of the sizes of the slots that were written back. You can
see that the patched zram selects slots for writeback in a
significantly different manner, which leads to higher memory savings
(see column #2 of the mm_stat output).
BASE
----
*** initial state of zram device
/sys/block/zram0/mm_stat
1750327296 619765836 631902208 0 631902208 1 0 34278 34278
*** writeback idle wb_limit 2500
/sys/block/zram0/mm_stat
1750327296 617622333 631578624 0 631902208 1 0 34278 34278
Sizes of selected objects for writeback:
... 193 349 46 46 46 46 852 1002 543 162 107 49 34 34 34 ...
PATCHED
-------
*** initial state of zram device
/sys/block/zram0/mm_stat
1750319104 619760957 631992320 0 631992320 1 0 34278 34278
*** writeback idle wb_limit 2500
/sys/block/zram0/mm_stat
1750319104 612672056 626135040 0 631992320 1 0 34278 34278
Sizes of selected objects for writeback:
... 3667 3580 3581 3580 3581 3581 3581 3231 3211 3203 3231 3246 ...
Note, pp-slots are not strictly sorted; there is a
PP_BUCKET_SIZE_RANGE variation of sizes within a particular bucket.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
drivers/block/zram/zram_drv.c | 88 +++++++++++++++++++++++++++--------
1 file changed, 68 insertions(+), 20 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 026e527ab17d..d0da6bb4be79 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -184,7 +184,7 @@ static void zram_accessed(struct zram *zram, u32 index)
#endif
}
-#ifdef CONFIG_ZRAM_MULTI_COMP
+#if defined CONFIG_ZRAM_WRITEBACK || defined CONFIG_ZRAM_MULTI_COMP
struct zram_pp_slot {
unsigned long index;
struct list_head entry;
@@ -682,11 +682,59 @@ static void read_from_bdev_async(struct zram *zram, struct page *page,
#define IDLE_WRITEBACK (1<<1)
#define INCOMPRESSIBLE_WRITEBACK (1<<2)
+static int scan_slots_for_writeback(struct zram *zram, u32 mode,
+ unsigned long nr_pages,
+ unsigned long index,
+ struct zram_pp_ctl *ctl)
+{
+ struct zram_pp_slot *pps = NULL;
+
+ for (; nr_pages != 0; index++, nr_pages--) {
+ if (!pps)
+ pps = kmalloc(sizeof(*pps), GFP_KERNEL);
+ if (!pps)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&pps->entry);
+
+ zram_slot_lock(zram, index);
+ if (!zram_allocated(zram, index))
+ goto next;
+
+ if (zram_test_flag(zram, index, ZRAM_WB) ||
+ zram_test_flag(zram, index, ZRAM_SAME) ||
+ zram_test_flag(zram, index, ZRAM_PP_SLOT) ||
+ zram_test_flag(zram, index, ZRAM_UNDER_WB))
+ goto next;
+
+ if (mode & IDLE_WRITEBACK &&
+ !zram_test_flag(zram, index, ZRAM_IDLE))
+ goto next;
+ if (mode & HUGE_WRITEBACK &&
+ !zram_test_flag(zram, index, ZRAM_HUGE))
+ goto next;
+ if (mode & INCOMPRESSIBLE_WRITEBACK &&
+ !zram_test_flag(zram, index, ZRAM_INCOMPRESSIBLE))
+ goto next;
+
+ pps->index = index;
+ place_pp_slot(zram, ctl, pps);
+ pps = NULL;
+next:
+ zram_slot_unlock(zram, index);
+ }
+
+ kfree(pps);
+ return 0;
+}
+
static ssize_t writeback_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t len)
{
struct zram *zram = dev_to_zram(dev);
unsigned long nr_pages = zram->disksize >> PAGE_SHIFT;
+ struct zram_pp_ctl *ctl = NULL;
+ struct zram_pp_slot *pps;
unsigned long index = 0;
struct bio bio;
struct bio_vec bio_vec;
@@ -732,11 +780,19 @@ static ssize_t writeback_store(struct device *dev,
goto release_init_lock;
}
- for (; nr_pages != 0; index++, nr_pages--) {
+ ctl = init_pp_ctl();
+ if (!ctl) {
+ ret = -ENOMEM;
+ goto release_init_lock;
+ }
+ scan_slots_for_writeback(zram, mode, nr_pages, index, ctl);
+
+ while ((pps = select_pp_slot(ctl))) {
spin_lock(&zram->wb_limit_lock);
if (zram->wb_limit_enable && !zram->bd_wb_limit) {
spin_unlock(&zram->wb_limit_lock);
ret = -EIO;
+ release_pp_slot(zram, pps);
break;
}
spin_unlock(&zram->wb_limit_lock);
@@ -745,30 +801,15 @@ static ssize_t writeback_store(struct device *dev,
blk_idx = alloc_block_bdev(zram);
if (!blk_idx) {
ret = -ENOSPC;
+ release_pp_slot(zram, pps);
break;
}
}
+ index = pps->index;
zram_slot_lock(zram, index);
- if (!zram_allocated(zram, index))
+ if (!zram_test_flag(zram, index, ZRAM_PP_SLOT))
goto next;
-
- if (zram_test_flag(zram, index, ZRAM_WB) ||
- zram_test_flag(zram, index, ZRAM_SAME) ||
- zram_test_flag(zram, index, ZRAM_PP_SLOT) ||
- zram_test_flag(zram, index, ZRAM_UNDER_WB))
- goto next;
-
- if (mode & IDLE_WRITEBACK &&
- !zram_test_flag(zram, index, ZRAM_IDLE))
- goto next;
- if (mode & HUGE_WRITEBACK &&
- !zram_test_flag(zram, index, ZRAM_HUGE))
- goto next;
- if (mode & INCOMPRESSIBLE_WRITEBACK &&
- !zram_test_flag(zram, index, ZRAM_INCOMPRESSIBLE))
- goto next;
-
/*
* Clearing ZRAM_UNDER_WB is duty of caller.
* IOW, zram_free_page never clear it.
@@ -777,11 +818,14 @@ static ssize_t writeback_store(struct device *dev,
/* Need for hugepage writeback racing */
zram_set_flag(zram, index, ZRAM_IDLE);
zram_slot_unlock(zram, index);
+
if (zram_read_page(zram, page, index, NULL)) {
zram_slot_lock(zram, index);
zram_clear_flag(zram, index, ZRAM_UNDER_WB);
zram_clear_flag(zram, index, ZRAM_IDLE);
zram_slot_unlock(zram, index);
+
+ release_pp_slot(zram, pps);
continue;
}
@@ -800,6 +844,8 @@ static ssize_t writeback_store(struct device *dev,
zram_clear_flag(zram, index, ZRAM_UNDER_WB);
zram_clear_flag(zram, index, ZRAM_IDLE);
zram_slot_unlock(zram, index);
+
+ release_pp_slot(zram, pps);
/*
* BIO errors are not fatal, we continue and simply
* attempt to writeback the remaining objects (pages).
@@ -842,12 +888,14 @@ static ssize_t writeback_store(struct device *dev,
spin_unlock(&zram->wb_limit_lock);
next:
zram_slot_unlock(zram, index);
+ release_pp_slot(zram, pps);
}
if (blk_idx)
free_block_bdev(zram, blk_idx);
__free_page(page);
release_init_lock:
+ release_pp_ctl(zram, ctl);
up_read(&zram->init_lock);
return ret;
--
2.46.0.469.g59c65b2a67-goog
* Re: [PATCHv2 0/3] zram: optimal post-processing target selection
2024-09-08 11:45 ` Sergey Senozhatsky
@ 2024-09-08 11:47 ` Sergey Senozhatsky
0 siblings, 0 replies; 8+ messages in thread
From: Sergey Senozhatsky @ 2024-09-08 11:47 UTC (permalink / raw)
To: Sergey Senozhatsky; +Cc: Andrew Morton, Richard Chang, linux-kernel
On (24/09/08 20:45), Sergey Senozhatsky wrote:
> On (24/09/08 20:42), Sergey Senozhatsky wrote:
> >
> > v2..v1:
> > -- clear PP_SLOT when slot is accessed
> > -- kmalloc pp_ctl instead of keeoing it on the stack
> > -- increase the number of pp-buckets and rework the way it's defined
> > -- code reshuffle and refactoring
> >
>
> D'oh... Let me re-send it properly. It was supposed to be sent to
> Minchan and Andrew. Sorry for the noise.
Re-sent under the same name
https://lore.kernel.org/lkml/20240908114541.3025351-1-senozhatsky@chromium.org/
* Re: [PATCHv2 0/3] zram: optimal post-processing target selection
2024-09-08 11:45 [PATCHv2 0/3] zram: optimal post-processing target selection Sergey Senozhatsky
` (2 preceding siblings ...)
2024-09-08 11:45 ` [PATCHv2 3/3] zram: rework writeback " Sergey Senozhatsky
@ 2024-09-10 9:37 ` Sergey Senozhatsky
3 siblings, 0 replies; 8+ messages in thread
From: Sergey Senozhatsky @ 2024-09-10 9:37 UTC (permalink / raw)
To: Minchan Kim, Andrew Morton
Cc: Richard Chang, linux-kernel, Sergey Senozhatsky
On (24/09/08 20:45), Sergey Senozhatsky wrote:
> v2..v1:
> -- clear PP_SLOT when slot is accessed
> -- kmalloc pp_ctl instead of keeping it on the stack
> -- increase the number of pp-buckets and rework the way it's defined
> -- code reshuffle and refactoring
Folks, ignore this series for now, I'm working on v3.
end of thread, other threads:[~2024-09-10 9:37 UTC | newest]
Thread overview: 8+ messages
2024-09-08 11:45 [PATCHv2 0/3] zram: optimal post-processing target selection Sergey Senozhatsky
2024-09-08 11:45 ` [PATCHv2 1/3] zram: introduce ZRAM_PP_SLOT flag Sergey Senozhatsky
2024-09-08 11:45 ` [PATCHv2 2/3] zram: rework recompress target selection strategy Sergey Senozhatsky
2024-09-08 11:45 ` [PATCHv2 3/3] zram: rework writeback " Sergey Senozhatsky
2024-09-10 9:37 ` [PATCHv2 0/3] zram: optimal post-processing target selection Sergey Senozhatsky