* [PATCH v2 0/4] zram memory control enhance
@ 2014-08-19 7:54 Minchan Kim
2014-08-19 7:54 ` [PATCH v2 1/4] zsmalloc: move pages_allocated to zs_pool Minchan Kim
` (3 more replies)
0 siblings, 4 replies; 19+ messages in thread
From: Minchan Kim @ 2014-08-19 7:54 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta,
Seth Jennings, Dan Streetman, ds2horner, Minchan Kim
Currently, zram has no feature to limit its memory usage, so it can
theoretically deplete system memory.
That also makes it hard to control the platform's memory usage, and
zram users have asked for such a limit several times.
This patchset adds the feature.
Patch 1 makes zs_get_total_size_bytes faster because it will be
used frequently in later patches for the new feature.
Patch 2 changes zs_get_total_size_bytes's return unit from bytes
to pages so that zsmalloc avoids an unnecessary operation (i.e.,
<< PAGE_SHIFT).
Patch 3 adds the new feature. I put it in the zram layer, not in
zsmalloc, because the limit is zram's requirement, not zsmalloc's,
so other zsmalloc users (e.g., zpool) shouldn't be affected by an
unnecessary branch in zsmalloc. In the future, if every zsmalloc
user wants the feature, we could easily move it from the client
side into zsmalloc, but the reverse would be painful.
Patch 4 adds a new facility to report zram's maximum memory usage
so that users can easily learn how much memory zram used at peak,
without polling /sys/block/zram0/mem_used_total at extremely short
intervals.
* From v1
* rebased on next-20140815
* fix up race problem - David, Dan
* reset mem_used_max as current total_bytes, rather than 0 - David
* resetting works with only "0" write for extensibility - David
Minchan Kim (4):
[1] zsmalloc: move pages_allocated to zs_pool
[2] zsmalloc: change return value unit of zs_get_total_size_bytes
[3] zram: zram memory size limitation
[4] zram: report maximum used memory
Documentation/ABI/testing/sysfs-block-zram | 19 ++++++
Documentation/blockdev/zram.txt | 21 ++++--
drivers/block/zram/zram_drv.c | 101 ++++++++++++++++++++++++++++-
drivers/block/zram/zram_drv.h | 6 ++
include/linux/zsmalloc.h | 2 +-
mm/zsmalloc.c | 38 ++++++-----
6 files changed, 162 insertions(+), 25 deletions(-)
--
2.0.0
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 1/4] zsmalloc: move pages_allocated to zs_pool
2014-08-19 7:54 [PATCH v2 0/4] zram memory control enhance Minchan Kim
@ 2014-08-19 7:54 ` Minchan Kim
2014-08-19 7:54 ` [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes Minchan Kim
` (2 subsequent siblings)
3 siblings, 0 replies; 19+ messages in thread
From: Minchan Kim @ 2014-08-19 7:54 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta,
Seth Jennings, Dan Streetman, ds2horner, Minchan Kim
pages_allocated is currently counted per size_class structure, and
when the user wants to see total_size_bytes, the values from every
size_class must be gathered to report the sum.
That's not bad if the user reads the value rarely, but if the value
starts being read frequently, it is not a good deal from a
performance point of view.
This patch moves the variable from size_class to zs_pool, which
reduces the memory footprint (from [255 * 8 bytes] to
[sizeof(atomic_t)]). It adds new locking overhead, but that
shouldn't be severe because it is not on the hot path of zs_malloc
(i.e., the lock is taken only when a new zspage is created, not
per object).
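The change can be modeled as the following toy userspace sketch (a
pthread mutex stands in for the kernel spinlock; names mirror the
kernel code but this is not the kernel code itself):

```c
/* One pool-wide pages_allocated counter under a lock, instead of a
 * counter in each of the 255 size_class structures. */
#include <pthread.h>

struct zs_pool {
	pthread_mutex_t stat_lock;
	unsigned long pages_allocated;	/* pool-wide, was per size_class */
};

/* Called only when a whole zspage is created or torn down, so the
 * extra locking stays off the per-object fast path. */
void pool_account_pages(struct zs_pool *pool, long pages_per_zspage)
{
	pthread_mutex_lock(&pool->stat_lock);
	pool->pages_allocated += pages_per_zspage;
	pthread_mutex_unlock(&pool->stat_lock);
}

/* O(1) read instead of summing every size_class. */
unsigned long pool_total_pages(struct zs_pool *pool)
{
	unsigned long npages;

	pthread_mutex_lock(&pool->stat_lock);
	npages = pool->pages_allocated;
	pthread_mutex_unlock(&pool->stat_lock);
	return npages;
}
```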
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
mm/zsmalloc.c | 30 ++++++++++++++++--------------
1 file changed, 16 insertions(+), 14 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 94f38fac5e81..a65924255763 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -199,9 +199,6 @@ struct size_class {
spinlock_t lock;
- /* stats */
- u64 pages_allocated;
-
struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
};
@@ -217,9 +214,12 @@ struct link_free {
};
struct zs_pool {
+ spinlock_t stat_lock;
+
struct size_class size_class[ZS_SIZE_CLASSES];
gfp_t flags; /* allocation flags used when growing pool */
+ unsigned long pages_allocated;
};
/*
@@ -967,6 +967,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
}
+ spin_lock_init(&pool->stat_lock);
pool->flags = flags;
return pool;
@@ -1028,8 +1029,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
return 0;
set_zspage_mapping(first_page, class->index, ZS_EMPTY);
+ spin_lock(&pool->stat_lock);
+ pool->pages_allocated += class->pages_per_zspage;
+ spin_unlock(&pool->stat_lock);
spin_lock(&class->lock);
- class->pages_allocated += class->pages_per_zspage;
}
obj = (unsigned long)first_page->freelist;
@@ -1082,14 +1085,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
first_page->inuse--;
fullness = fix_fullness_group(pool, first_page);
-
- if (fullness == ZS_EMPTY)
- class->pages_allocated -= class->pages_per_zspage;
-
spin_unlock(&class->lock);
- if (fullness == ZS_EMPTY)
+ if (fullness == ZS_EMPTY) {
+ spin_lock(&pool->stat_lock);
+ pool->pages_allocated -= class->pages_per_zspage;
+ spin_unlock(&pool->stat_lock);
free_zspage(first_page);
+ }
}
EXPORT_SYMBOL_GPL(zs_free);
@@ -1185,12 +1188,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
u64 zs_get_total_size_bytes(struct zs_pool *pool)
{
- int i;
- u64 npages = 0;
-
- for (i = 0; i < ZS_SIZE_CLASSES; i++)
- npages += pool->size_class[i].pages_allocated;
+ u64 npages;
+ spin_lock(&pool->stat_lock);
+ npages = pool->pages_allocated;
+ spin_unlock(&pool->stat_lock);
return npages << PAGE_SHIFT;
}
EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
--
2.0.0
* [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
2014-08-19 7:54 [PATCH v2 0/4] zram memory control enhance Minchan Kim
2014-08-19 7:54 ` [PATCH v2 1/4] zsmalloc: move pages_allocated to zs_pool Minchan Kim
@ 2014-08-19 7:54 ` Minchan Kim
2014-08-19 14:46 ` Seth Jennings
2014-08-19 7:54 ` [PATCH v2 3/4] zram: zram memory size limitation Minchan Kim
2014-08-19 7:54 ` [PATCH v2 4/4] zram: report maximum used memory Minchan Kim
3 siblings, 1 reply; 19+ messages in thread
From: Minchan Kim @ 2014-08-19 7:54 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta,
Seth Jennings, Dan Streetman, ds2horner, Minchan Kim
zs_get_total_size_bytes returns the amount of memory zsmalloc has
consumed in *bytes*, but zsmalloc operates in *pages* rather than
bytes, so let's change the API. The benefit is that we drop the
unnecessary overhead (i.e., converting pages to bytes) inside
zsmalloc.
Now zswap can roll back to zswap_pool_pages.
Over to the zswap guys ;-)
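The shape of the change can be sketched in plain C (struct pool_stub
is a hypothetical stand-in; PAGE_SHIFT is assumed to be 12, i.e.
4 KiB pages, for illustration):

```c
/* The pool-side getter now returns pages; only byte-oriented callers
 * (such as the zpool wrapper) shift at the edge. */
#define PAGE_SHIFT 12

struct pool_stub { unsigned long pages_allocated; };

/* after the patch: page unit, no shift inside zsmalloc */
unsigned long zs_get_total_size(struct pool_stub *pool)
{
	return pool->pages_allocated;
}

/* a byte-oriented caller converts for itself */
unsigned long long zs_zpool_total_size(struct pool_stub *pool)
{
	return (unsigned long long)zs_get_total_size(pool) << PAGE_SHIFT;
}
```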
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
drivers/block/zram/zram_drv.c | 4 ++--
include/linux/zsmalloc.h | 2 +-
mm/zsmalloc.c | 10 +++++-----
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d00831c3d731..302dd37bcea3 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -103,10 +103,10 @@ static ssize_t mem_used_total_show(struct device *dev,
down_read(&zram->init_lock);
if (init_done(zram))
- val = zs_get_total_size_bytes(meta->mem_pool);
+ val = zs_get_total_size(meta->mem_pool);
up_read(&zram->init_lock);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
}
static ssize_t max_comp_streams_show(struct device *dev,
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index e44d634e7fb7..105b56e45d23 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -46,6 +46,6 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
enum zs_mapmode mm);
void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
-u64 zs_get_total_size_bytes(struct zs_pool *pool);
+unsigned long zs_get_total_size(struct zs_pool *pool);
#endif
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index a65924255763..80408a1da03a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -299,7 +299,7 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)
static u64 zs_zpool_total_size(void *pool)
{
- return zs_get_total_size_bytes(pool);
+ return zs_get_total_size(pool) << PAGE_SHIFT;
}
static struct zpool_driver zs_zpool_driver = {
@@ -1186,16 +1186,16 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
}
EXPORT_SYMBOL_GPL(zs_unmap_object);
-u64 zs_get_total_size_bytes(struct zs_pool *pool)
+unsigned long zs_get_total_size(struct zs_pool *pool)
{
- u64 npages;
+ unsigned long npages;
spin_lock(&pool->stat_lock);
npages = pool->pages_allocated;
spin_unlock(&pool->stat_lock);
- return npages << PAGE_SHIFT;
+ return npages;
}
-EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
+EXPORT_SYMBOL_GPL(zs_get_total_size);
module_init(zs_init);
module_exit(zs_exit);
--
2.0.0
* [PATCH v2 3/4] zram: zram memory size limitation
2014-08-19 7:54 [PATCH v2 0/4] zram memory control enhance Minchan Kim
2014-08-19 7:54 ` [PATCH v2 1/4] zsmalloc: move pages_allocated to zs_pool Minchan Kim
2014-08-19 7:54 ` [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes Minchan Kim
@ 2014-08-19 7:54 ` Minchan Kim
2014-08-19 8:06 ` Marc Dietrich
2014-08-19 7:54 ` [PATCH v2 4/4] zram: report maximum used memory Minchan Kim
3 siblings, 1 reply; 19+ messages in thread
From: Minchan Kim @ 2014-08-19 7:54 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta,
Seth Jennings, Dan Streetman, ds2horner, Minchan Kim
Since zram has no feature to limit its memory usage, it is hard
to manage system memory.
This patch adds a new sysfs knob, "mem_limit", to set up a limit
so that zram fails allocations once it reaches the limit.
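A hedged sketch of the write-path check this patch adds: after
allocating, compare the pool's total pages against limit_pages and
back the allocation out on overrun (toy types, not the driver code):

```c
#include <errno.h>

struct pool_stub { unsigned long pages; };

int store_compressed(struct pool_stub *pool, unsigned long limit_pages,
		     unsigned long pages_per_zspage)
{
	pool->pages += pages_per_zspage;	/* zs_malloc() grew the pool */
	if (limit_pages && pool->pages > limit_pages) {
		pool->pages -= pages_per_zspage;	/* zs_free() */
		return -ENOMEM;				/* the write fails */
	}
	return 0;
}
```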
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
Documentation/ABI/testing/sysfs-block-zram | 9 +++++++
Documentation/blockdev/zram.txt | 20 ++++++++++++---
drivers/block/zram/zram_drv.c | 41 ++++++++++++++++++++++++++++++
drivers/block/zram/zram_drv.h | 5 ++++
4 files changed, 71 insertions(+), 4 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
index 70ec992514d0..025331c19045 100644
--- a/Documentation/ABI/testing/sysfs-block-zram
+++ b/Documentation/ABI/testing/sysfs-block-zram
@@ -119,3 +119,12 @@ Description:
efficiency can be calculated using compr_data_size and this
statistic.
Unit: bytes
+
+What: /sys/block/zram<id>/mem_limit
+Date: August 2014
+Contact: Minchan Kim <minchan@kernel.org>
+Description:
+ The mem_limit file is read/write and specifies the maximum
+ amount of memory zram can consume to store
+ compressed data.
+ Unit: bytes
diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index 0595c3f56ccf..9f239ff8c444 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -74,14 +74,26 @@ There is little point creating a zram of greater than twice the size of memory
since we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the
size of the disk when not in use so a huge zram is wasteful.
-5) Activate:
+5) Set memory limit: Optional
+ Set memory limit by writing the value to sysfs node 'mem_limit'.
+ The value can be either in bytes or you can use mem suffixes.
+ Examples:
+ # limit /dev/zram0 with 50MB memory
+ echo $((50*1024*1024)) > /sys/block/zram0/mem_limit
+
+ # Using mem suffixes
+ echo 256K > /sys/block/zram0/mem_limit
+ echo 512M > /sys/block/zram0/mem_limit
+ echo 1G > /sys/block/zram0/mem_limit
+
+6) Activate:
mkswap /dev/zram0
swapon /dev/zram0
mkfs.ext4 /dev/zram1
mount /dev/zram1 /tmp
-6) Stats:
+7) Stats:
Per-device statistics are exported as various nodes under
/sys/block/zram<id>/
disksize
@@ -96,11 +108,11 @@ size of the disk when not in use so a huge zram is wasteful.
compr_data_size
mem_used_total
-7) Deactivate:
+8) Deactivate:
swapoff /dev/zram0
umount /dev/zram1
-8) Reset:
+9) Reset:
Write any positive value to 'reset' sysfs node
echo 1 > /sys/block/zram0/reset
echo 1 > /sys/block/zram1/reset
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 302dd37bcea3..adc91c7ecaef 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -122,6 +122,33 @@ static ssize_t max_comp_streams_show(struct device *dev,
return scnprintf(buf, PAGE_SIZE, "%d\n", val);
}
+static ssize_t mem_limit_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ u64 val;
+ struct zram *zram = dev_to_zram(dev);
+
+ down_read(&zram->init_lock);
+ val = zram->limit_pages;
+ up_read(&zram->init_lock);
+
+ return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
+}
+
+static ssize_t mem_limit_store(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t len)
+{
+ u64 limit;
+ struct zram *zram = dev_to_zram(dev);
+
+ limit = memparse(buf, NULL);
+ down_write(&zram->init_lock);
+ zram->limit_pages = PAGE_ALIGN(limit) >> PAGE_SHIFT;
+ up_write(&zram->init_lock);
+
+ return len;
+}
+
static ssize_t max_comp_streams_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t len)
{
@@ -513,6 +540,14 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
ret = -ENOMEM;
goto out;
}
+
+ if (zram->limit_pages &&
+ zs_get_total_size(meta->mem_pool) > zram->limit_pages) {
+ zs_free(meta->mem_pool, handle);
+ ret = -ENOMEM;
+ goto out;
+ }
+
cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
@@ -617,6 +652,9 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity)
struct zram_meta *meta;
down_write(&zram->init_lock);
+
+ zram->limit_pages = 0;
+
if (!init_done(zram)) {
up_write(&zram->init_lock);
return;
@@ -857,6 +895,8 @@ static DEVICE_ATTR(initstate, S_IRUGO, initstate_show, NULL);
static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
+static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show,
+ mem_limit_store);
static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
max_comp_streams_show, max_comp_streams_store);
static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
@@ -885,6 +925,7 @@ static struct attribute *zram_disk_attrs[] = {
&dev_attr_orig_data_size.attr,
&dev_attr_compr_data_size.attr,
&dev_attr_mem_used_total.attr,
+ &dev_attr_mem_limit.attr,
&dev_attr_max_comp_streams.attr,
&dev_attr_comp_algorithm.attr,
NULL,
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index e0f725c87cc6..b7aa9c21553f 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -112,6 +112,11 @@ struct zram {
u64 disksize; /* bytes */
int max_comp_streams;
struct zram_stats stats;
+ /*
+ * the number of pages zram can consume for storing compressed data
+ */
+ unsigned long limit_pages;
+
char compressor[10];
};
#endif
--
2.0.0
* [PATCH v2 4/4] zram: report maximum used memory
2014-08-19 7:54 [PATCH v2 0/4] zram memory control enhance Minchan Kim
` (2 preceding siblings ...)
2014-08-19 7:54 ` [PATCH v2 3/4] zram: zram memory size limitation Minchan Kim
@ 2014-08-19 7:54 ` Minchan Kim
2014-08-20 6:26 ` David Horner
3 siblings, 1 reply; 19+ messages in thread
From: Minchan Kim @ 2014-08-19 7:54 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta,
Seth Jennings, Dan Streetman, ds2horner, Minchan Kim
Normally, a zram user could obtain zram's maximum memory usage by
polling mem_used_total via sysfs from userspace.
But this has a critical problem: the user can miss the peak memory
usage between polling intervals. To avoid that, the user would have
to poll at an extremely short interval (e.g., 0.0000000001s), with
mlocking to avoid page-fault delay when memory pressure is heavy.
That would be troublesome.
This patch adds a new knob, "mem_used_max", so the user can easily
see the maximum memory usage by reading the knob, and reset it via
"echo 0 > /sys/block/zram0/mem_used_max".
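The lock-free peak tracking the patch implements in check_limit()
can be sketched as a compare-and-swap loop that raises the recorded
maximum so racing writers never lose a peak (C11 atomics stand in
for the kernel's atomic64_cmpxchg):

```c
#include <stdatomic.h>

void update_max_used(atomic_ulong *max_used, unsigned long alloced_pages)
{
	unsigned long cur = atomic_load(max_used);

	while (alloced_pages > cur) {
		/* on failure, cur is refreshed with the latest value */
		if (atomic_compare_exchange_weak(max_used, &cur,
						 alloced_pages))
			break;
	}
}
```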
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
Documentation/ABI/testing/sysfs-block-zram | 10 +++++
Documentation/blockdev/zram.txt | 1 +
drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++-
drivers/block/zram/zram_drv.h | 1 +
4 files changed, 70 insertions(+), 2 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
index 025331c19045..ffd1ea7443dd 100644
--- a/Documentation/ABI/testing/sysfs-block-zram
+++ b/Documentation/ABI/testing/sysfs-block-zram
@@ -120,6 +120,16 @@ Description:
statistic.
Unit: bytes
+What: /sys/block/zram<id>/mem_used_max
+Date: August 2014
+Contact: Minchan Kim <minchan@kernel.org>
+Description:
+ The mem_used_max file is read/write and specifies the maximum
+ amount of memory zram has consumed to store compressed data.
+ To reset the value, write "0"; any other value
+ returns -EINVAL.
+ Unit: bytes
+
What: /sys/block/zram<id>/mem_limit
Date: August 2014
Contact: Minchan Kim <minchan@kernel.org>
diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index 9f239ff8c444..3b2247c2d4cf 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful.
orig_data_size
compr_data_size
mem_used_total
+ mem_used_max
8) Deactivate:
swapoff /dev/zram0
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index adc91c7ecaef..e4d44842a91d 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev,
return len;
}
+static ssize_t mem_used_max_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ u64 val = 0;
+ struct zram *zram = dev_to_zram(dev);
+
+ down_read(&zram->init_lock);
+ if (init_done(zram))
+ val = atomic64_read(&zram->stats.max_used_pages);
+ up_read(&zram->init_lock);
+
+ return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
+}
+
+static ssize_t mem_used_max_store(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t len)
+{
+ u64 limit;
+ struct zram *zram = dev_to_zram(dev);
+ struct zram_meta *meta = zram->meta;
+
+ limit = memparse(buf, NULL);
+ if (0 != limit)
+ return -EINVAL;
+
+ down_read(&zram->init_lock);
+ if (init_done(zram))
+ atomic64_set(&zram->stats.max_used_pages,
+ zs_get_total_size(meta->mem_pool));
+ up_read(&zram->init_lock);
+
+ return len;
+}
+
static ssize_t max_comp_streams_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t len)
{
@@ -461,6 +495,26 @@ out_cleanup:
return ret;
}
+static bool check_limit(struct zram *zram)
+{
+ unsigned long alloced_pages;
+ u64 old_max, cur_max;
+ struct zram_meta *meta = zram->meta;
+
+ do {
+ alloced_pages = zs_get_total_size(meta->mem_pool);
+ if (zram->limit_pages && alloced_pages > zram->limit_pages)
+ return false;
+
+ old_max = cur_max = atomic64_read(&zram->stats.max_used_pages);
+ if (alloced_pages > cur_max)
+ old_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
+ cur_max, alloced_pages);
+ } while (old_max != cur_max);
+
+ return true;
+}
+
static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
int offset)
{
@@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
goto out;
}
- if (zram->limit_pages &&
- zs_get_total_size(meta->mem_pool) > zram->limit_pages) {
+ if (!check_limit(zram)) {
zs_free(meta->mem_pool, handle);
ret = -ENOMEM;
goto out;
@@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show,
mem_limit_store);
+static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show,
+ mem_used_max_store);
static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
max_comp_streams_show, max_comp_streams_store);
static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
@@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = {
&dev_attr_compr_data_size.attr,
&dev_attr_mem_used_total.attr,
&dev_attr_mem_limit.attr,
+ &dev_attr_mem_used_max.attr,
&dev_attr_max_comp_streams.attr,
&dev_attr_comp_algorithm.attr,
NULL,
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index b7aa9c21553f..29383312d543 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -90,6 +90,7 @@ struct zram_stats {
atomic64_t notify_free; /* no. of swap slot free notifications */
atomic64_t zero_pages; /* no. of zero filled pages */
atomic64_t pages_stored; /* no. of pages currently stored */
+ atomic64_t max_used_pages; /* no. of maximum pages stored */
};
struct zram_meta {
--
2.0.0
* Re: [PATCH v2 3/4] zram: zram memory size limitation
2014-08-19 7:54 ` [PATCH v2 3/4] zram: zram memory size limitation Minchan Kim
@ 2014-08-19 8:06 ` Marc Dietrich
2014-08-19 23:32 ` Minchan Kim
0 siblings, 1 reply; 19+ messages in thread
From: Marc Dietrich @ 2014-08-19 8:06 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Seth Jennings, Dan Streetman, ds2horner
Am Dienstag, 19. August 2014, 16:54:46 schrieb Minchan Kim:
> Since zram has no control feature to limit memory usage,
> it makes hard to manage system memrory.
>
> This patch adds new knob "mem_limit" via sysfs to set up the
> a limit so that zram could fail allocation once it reaches
> the limit.
Sorry to jump in late with a probably silly question, but I couldn't find the
answer easily. What's the difference between disksize and mem_limit?
I assume the former is the uncompressed size (virtual size) and the latter is
the compressed size (real memory usage)? Maybe the difference should be made
clearer in the documentation.
If disksize is the uncompressed size, why would we want to set this at all?
Marc
* Re: [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
2014-08-19 7:54 ` [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes Minchan Kim
@ 2014-08-19 14:46 ` Seth Jennings
2014-08-19 15:11 ` Seth Jennings
2014-08-19 23:46 ` Minchan Kim
0 siblings, 2 replies; 19+ messages in thread
From: Seth Jennings @ 2014-08-19 14:46 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Dan Streetman, ds2horner
On Tue, Aug 19, 2014 at 04:54:45PM +0900, Minchan Kim wrote:
> zs_get_total_size_bytes returns a amount of memory zsmalloc
> consumed with *byte unit* but zsmalloc operates *page unit*
> rather than byte unit so let's change the API so benefit
> we could get is that reduce unnecessary overhead
> (ie, change page unit with byte unit) in zsmalloc.
>
> Now, zswap can rollback to zswap_pool_pages.
> Over to zswap guys ;-)
I don't think that's how it is done :-/ Changing the API for a
component that has two users, changing one, and then saying "hope you
guys change your newly broken stuff".
I know you would rather not move zram to the zpool API but doing so
would avoid situations like this.
Anyway, this does break the zpool API and by extension zswap, and that
needs to be addressed in this patch or we create a point in the commit
history where it is broken.
Quick glance:
- zpool_get_total_size() return type is u64 but this patch changes to
unsigned long. Now mismatches between zbud and zsmalloc.
- zbud_zpool_total_size needs to return pages, not bytes
- as you noted s/pool_total_size/pool_pages/g in zswap.c plus
modification to zswap_is_full()
Thanks,
Seth
>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
> drivers/block/zram/zram_drv.c | 4 ++--
> include/linux/zsmalloc.h | 2 +-
> mm/zsmalloc.c | 10 +++++-----
> 3 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index d00831c3d731..302dd37bcea3 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -103,10 +103,10 @@ static ssize_t mem_used_total_show(struct device *dev,
>
> down_read(&zram->init_lock);
> if (init_done(zram))
> - val = zs_get_total_size_bytes(meta->mem_pool);
> + val = zs_get_total_size(meta->mem_pool);
> up_read(&zram->init_lock);
>
> - return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> }
>
> static ssize_t max_comp_streams_show(struct device *dev,
> diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> index e44d634e7fb7..105b56e45d23 100644
> --- a/include/linux/zsmalloc.h
> +++ b/include/linux/zsmalloc.h
> @@ -46,6 +46,6 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
> enum zs_mapmode mm);
> void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
>
> -u64 zs_get_total_size_bytes(struct zs_pool *pool);
> +unsigned long zs_get_total_size(struct zs_pool *pool);
>
> #endif
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index a65924255763..80408a1da03a 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -299,7 +299,7 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)
>
> static u64 zs_zpool_total_size(void *pool)
> {
> - return zs_get_total_size_bytes(pool);
> + return zs_get_total_size(pool) << PAGE_SHIFT;
> }
>
> static struct zpool_driver zs_zpool_driver = {
> @@ -1186,16 +1186,16 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
> }
> EXPORT_SYMBOL_GPL(zs_unmap_object);
>
> -u64 zs_get_total_size_bytes(struct zs_pool *pool)
> +unsigned long zs_get_total_size(struct zs_pool *pool)
> {
> - u64 npages;
> + unsigned long npages;
>
> spin_lock(&pool->stat_lock);
> npages = pool->pages_allocated;
> spin_unlock(&pool->stat_lock);
> - return npages << PAGE_SHIFT;
> + return npages;
> }
> -EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> +EXPORT_SYMBOL_GPL(zs_get_total_size);
>
> module_init(zs_init);
> module_exit(zs_exit);
> --
> 2.0.0
>
* Re: [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
2014-08-19 14:46 ` Seth Jennings
@ 2014-08-19 15:11 ` Seth Jennings
2014-08-19 23:47 ` Minchan Kim
2014-08-19 23:46 ` Minchan Kim
1 sibling, 1 reply; 19+ messages in thread
From: Seth Jennings @ 2014-08-19 15:11 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Dan Streetman, ds2horner
On Tue, Aug 19, 2014 at 09:46:28AM -0500, Seth Jennings wrote:
> On Tue, Aug 19, 2014 at 04:54:45PM +0900, Minchan Kim wrote:
> > zs_get_total_size_bytes returns a amount of memory zsmalloc
> > consumed with *byte unit* but zsmalloc operates *page unit*
> > rather than byte unit so let's change the API so benefit
> > we could get is that reduce unnecessary overhead
> > (ie, change page unit with byte unit) in zsmalloc.
> >
> > Now, zswap can rollback to zswap_pool_pages.
> > Over to zswap guys ;-)
>
> I don't think that's how is it done :-/ Changing the API for a
> component that has two users, changing one, then saying "hope you guys
> change your newly broken stuff".
However, I'll bite on this one :) Just squash this in so that
zpool/zswap aren't broken at any point.
Dan, care to make sure I didn't miss something?
Thanks,
Seth
diff --git a/mm/zbud.c b/mm/zbud.c
index a05790b..27a3701 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -179,7 +179,7 @@ static void zbud_zpool_unmap(void *pool, unsigned long handle)
static u64 zbud_zpool_total_size(void *pool)
{
- return zbud_get_pool_size(pool) * PAGE_SIZE;
+ return zbud_get_pool_size(pool);
}
static struct zpool_driver zbud_zpool_driver = {
diff --git a/mm/zpool.c b/mm/zpool.c
index e40612a..d126ebc 100644
--- a/mm/zpool.c
+++ b/mm/zpool.c
@@ -336,9 +336,9 @@ void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
* zpool_get_total_size() - The total size of the pool
* @pool The zpool to check
*
- * This returns the total size in bytes of the pool.
+ * This returns the total size in pages of the pool.
*
- * Returns: Total size of the zpool in bytes.
+ * Returns: Total size of the zpool in pages.
*/
u64 zpool_get_total_size(struct zpool *zpool)
{
diff --git a/mm/zswap.c b/mm/zswap.c
index ea064c1..124f750 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -45,8 +45,8 @@
/*********************************
* statistics
**********************************/
-/* Total bytes used by the compressed storage */
-static u64 zswap_pool_total_size;
+/* Total pages used by the compressed storage */
+static u64 zswap_pool_pages;
/* The number of compressed pages currently stored in zswap */
static atomic_t zswap_stored_pages = ATOMIC_INIT(0);
@@ -297,7 +297,7 @@ static void zswap_free_entry(struct zswap_entry *entry)
zpool_free(zswap_pool, entry->handle);
zswap_entry_cache_free(entry);
atomic_dec(&zswap_stored_pages);
- zswap_pool_total_size = zpool_get_total_size(zswap_pool);
+ zswap_pool_pages = zpool_get_total_size(zswap_pool);
}
/* caller must hold the tree lock */
@@ -414,7 +414,7 @@ cleanup:
static bool zswap_is_full(void)
{
return totalram_pages * zswap_max_pool_percent / 100 <
- DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
+ zswap_pool_pages;
}
/*********************************
@@ -721,7 +721,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
/* update stats */
atomic_inc(&zswap_stored_pages);
- zswap_pool_total_size = zpool_get_total_size(zswap_pool);
+ zswap_pool_pages = zpool_get_total_size(zswap_pool);
return 0;
@@ -874,8 +874,8 @@ static int __init zswap_debugfs_init(void)
zswap_debugfs_root, &zswap_written_back_pages);
debugfs_create_u64("duplicate_entry", S_IRUGO,
zswap_debugfs_root, &zswap_duplicate_entry);
- debugfs_create_u64("pool_total_size", S_IRUGO,
- zswap_debugfs_root, &zswap_pool_total_size);
+ debugfs_create_u64("pool_pages", S_IRUGO,
+ zswap_debugfs_root, &zswap_pool_pages);
debugfs_create_atomic_t("stored_pages", S_IRUGO,
zswap_debugfs_root, &zswap_stored_pages);
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/4] zram: zram memory size limitation
2014-08-19 8:06 ` Marc Dietrich
@ 2014-08-19 23:32 ` Minchan Kim
2014-08-20 7:58 ` Marc Dietrich
0 siblings, 1 reply; 19+ messages in thread
From: Minchan Kim @ 2014-08-19 23:32 UTC (permalink / raw)
To: Marc Dietrich
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Seth Jennings, Dan Streetman, ds2horner
Hello,
On Tue, Aug 19, 2014 at 10:06:22AM +0200, Marc Dietrich wrote:
> Am Dienstag, 19. August 2014, 16:54:46 schrieb Minchan Kim:
> > Since zram has no control feature to limit memory usage,
> > it is hard to manage system memory.
> >
> > This patch adds a new knob "mem_limit" via sysfs to set up
> > a limit so that zram can fail allocations once it reaches
> > the limit.
>
> Sorry to jump in late with a probably silly question, but I couldn't find the
> answer easily. What's the difference between disksize and mem_limit?
No need to say sorry.
It was totally my fault because the zram documentation is poor.
The disksize is the size as seen by the layers above the block subsystem,
so a filesystem built on zram, or the block device itself (ie, zram0)
as seen by the admin, will report the disksize; but keep in mind that
it's a virtual size, not the compressed one. As you already know, zram
is backed by volatile storage (ie, DRAM) in *compressed form*, not by
permanent storage.
The point of this patchset is that nobody can predict the exact memory
usage of zram in advance. Actually, zram folks have estimated it through
several experiments, assuming some zram compression ratio (ex, 2:1 or
3:1), before releasing a product. But these days, embedded platforms run
various workloads that could not be anticipated when the product was
released, so the compression ratio expectation can sometimes be wrong,
and zram can consume a lot more memory than expected once the real
ratio is low.
That makes it hard for the admin to manage memory on the product,
because there is no way to release the memory zram is using, so one
way out is to limit zram's memory usage from the beginning.
> I assume the former is uncompressed size (virtual size) and the latter is
> compressed size (real memory usage)? Maybe the difference should be made
Right.
> clearer in the documentation.
Okay.
> If disksize is the uncompressed size, why would we want to set this at all?
For example, say we set zram0's disksize to 500M because we assumed a
2:1 compression ratio, so we could guess zram would consume 250M of
physical memory in the end. But our guess could be wrong: if the real
compression ratio is 4:1, we use only 125M of physical memory to store
500M of uncompressed pages.
That's good, but the admin may want to give zram more memory because we
saved twice as much as expected, and we can't, because from the upper
layer's point of view zram is already full at 500M, and if zram is used
for swap, we will encounter OOM. :(
So it would be better to increase the disksize to 1000M, but in that
case, if the compression ratio drops back to 2:1 for some reason (ex, a
workload change), zram can consume 500M of physical memory, which is
more than we expected, and the admin doesn't want zram to use up too
much system memory.
In summary, we cannot control zram's exact memory usage with the
disksize alone, because it depends on the compression ratio.
>
> Marc
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
Kind regards,
Minchan Kim
* Re: [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
2014-08-19 14:46 ` Seth Jennings
2014-08-19 15:11 ` Seth Jennings
@ 2014-08-19 23:46 ` Minchan Kim
2014-08-20 4:44 ` Seth Jennings
1 sibling, 1 reply; 19+ messages in thread
From: Minchan Kim @ 2014-08-19 23:46 UTC (permalink / raw)
To: Seth Jennings
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Dan Streetman, ds2horner
Hey Seth,
On Tue, Aug 19, 2014 at 09:46:28AM -0500, Seth Jennings wrote:
> On Tue, Aug 19, 2014 at 04:54:45PM +0900, Minchan Kim wrote:
> > zs_get_total_size_bytes returns the amount of memory zsmalloc
> > has consumed in *bytes*, but zsmalloc operates in *pages*
> > rather than bytes, so let's change the API; the benefit we
> > get is the removal of unnecessary overhead
> > (ie, converting the page count to bytes) in zsmalloc.
> >
> > Now, zswap can roll back to zswap_pool_pages.
> > Over to zswap guys ;-)
>
> I don't think that's how it is done :-/ Changing the API for a
> component that has two users, changing one, then saying "hope you guys
> change your newly broken stuff".
I don't get it. zsmalloc's zs_zpool_total_size returns a u64 in bytes,
not pages, so what is the broken stuff you mentioned?
static u64 zs_zpool_total_size(void *pool)
{
- return zs_get_total_size_bytes(pool);
+ return zs_get_total_size(pool) << PAGE_SHIFT;
}
The thing I nudged you about is just that you could roll back to a page
count instead of bytes if the zswap folks want it; I don't want to do it
myself, to avoid unnecessary noise given this patchset's purpose.
>
> I know you would rather not move zram to the zpool API but doing so
> would avoid situations like this.
>
> Anyway, this does break the zpool API and by extension zswap, and that
> needs to be addressed in this patch or we create a point in the commit
> history where it is broken.
>
> Quick glance:
> - zpool_get_total_size() return type is u64 but this patch changes to
> unsigned long. Now mismatches between zbud and zsmalloc.
Nope. zs_zpool_total_size returns a u64 in bytes via PAGE_SHIFT.
> - zbud_zpool_total_size needs to return pages, not bytes
It's up to you, not me.
> - as you noted s/pool_total_size/pool_pages/g in zswap.c plus
> modification to zswap_is_full()
I didn't mean that. Sorry for the confusion.
If I missed your point, let me know.
Thanks.
>
> Thanks,
> Seth
>
> >
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> > drivers/block/zram/zram_drv.c | 4 ++--
> > include/linux/zsmalloc.h | 2 +-
> > mm/zsmalloc.c | 10 +++++-----
> > 3 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index d00831c3d731..302dd37bcea3 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -103,10 +103,10 @@ static ssize_t mem_used_total_show(struct device *dev,
> >
> > down_read(&zram->init_lock);
> > if (init_done(zram))
> > - val = zs_get_total_size_bytes(meta->mem_pool);
> > + val = zs_get_total_size(meta->mem_pool);
> > up_read(&zram->init_lock);
> >
> > - return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> > }
> >
> > static ssize_t max_comp_streams_show(struct device *dev,
> > diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> > index e44d634e7fb7..105b56e45d23 100644
> > --- a/include/linux/zsmalloc.h
> > +++ b/include/linux/zsmalloc.h
> > @@ -46,6 +46,6 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
> > enum zs_mapmode mm);
> > void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
> >
> > -u64 zs_get_total_size_bytes(struct zs_pool *pool);
> > +unsigned long zs_get_total_size(struct zs_pool *pool);
> >
> > #endif
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index a65924255763..80408a1da03a 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -299,7 +299,7 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)
> >
> > static u64 zs_zpool_total_size(void *pool)
> > {
> > - return zs_get_total_size_bytes(pool);
> > + return zs_get_total_size(pool) << PAGE_SHIFT;
> > }
> >
> > static struct zpool_driver zs_zpool_driver = {
> > @@ -1186,16 +1186,16 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
> > }
> > EXPORT_SYMBOL_GPL(zs_unmap_object);
> >
> > -u64 zs_get_total_size_bytes(struct zs_pool *pool)
> > +unsigned long zs_get_total_size(struct zs_pool *pool)
> > {
> > - u64 npages;
> > + unsigned long npages;
> >
> > spin_lock(&pool->stat_lock);
> > npages = pool->pages_allocated;
> > spin_unlock(&pool->stat_lock);
> > - return npages << PAGE_SHIFT;
> > + return npages;
> > }
> > -EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> > +EXPORT_SYMBOL_GPL(zs_get_total_size);
> >
> > module_init(zs_init);
> > module_exit(zs_exit);
> > --
> > 2.0.0
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
Kind regards,
Minchan Kim
* Re: [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
2014-08-19 15:11 ` Seth Jennings
@ 2014-08-19 23:47 ` Minchan Kim
0 siblings, 0 replies; 19+ messages in thread
From: Minchan Kim @ 2014-08-19 23:47 UTC (permalink / raw)
To: Seth Jennings
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Dan Streetman, ds2horner
On Tue, Aug 19, 2014 at 10:11:57AM -0500, Seth Jennings wrote:
> On Tue, Aug 19, 2014 at 09:46:28AM -0500, Seth Jennings wrote:
> > On Tue, Aug 19, 2014 at 04:54:45PM +0900, Minchan Kim wrote:
> > > zs_get_total_size_bytes returns the amount of memory zsmalloc
> > > has consumed in *bytes*, but zsmalloc operates in *pages*
> > > rather than bytes, so let's change the API; the benefit we
> > > get is the removal of unnecessary overhead
> > > (ie, converting the page count to bytes) in zsmalloc.
> > >
> > > Now, zswap can roll back to zswap_pool_pages.
> > > Over to zswap guys ;-)
> >
> > I don't think that's how it is done :-/ Changing the API for a
> > component that has two users, changing one, then saying "hope you guys
> > change your newly broken stuff".
>
> However, I'll bite on this one :) Just squash this in so that
Thanks for biting, but it's not what I wanted.
I'd like to leave the current zpool semantics alone, and leave it up to
you to roll back to the old behavior. ;-)
> zpool/zswap aren't broken at any point.
>
> Dan, care to make sure I didn't miss something?
>
> Thanks,
> Seth
>
> diff --git a/mm/zbud.c b/mm/zbud.c
> index a05790b..27a3701 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -179,7 +179,7 @@ static void zbud_zpool_unmap(void *pool, unsigned long handle)
>
> static u64 zbud_zpool_total_size(void *pool)
> {
> - return zbud_get_pool_size(pool) * PAGE_SIZE;
> + return zbud_get_pool_size(pool);
> }
>
> static struct zpool_driver zbud_zpool_driver = {
> diff --git a/mm/zpool.c b/mm/zpool.c
> index e40612a..d126ebc 100644
> --- a/mm/zpool.c
> +++ b/mm/zpool.c
> @@ -336,9 +336,9 @@ void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
> * zpool_get_total_size() - The total size of the pool
> * @pool The zpool to check
> *
> - * This returns the total size in bytes of the pool.
> + * This returns the total size in pages of the pool.
> *
> - * Returns: Total size of the zpool in bytes.
> + * Returns: Total size of the zpool in pages.
> */
> u64 zpool_get_total_size(struct zpool *zpool)
> {
> diff --git a/mm/zswap.c b/mm/zswap.c
> index ea064c1..124f750 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -45,8 +45,8 @@
> /*********************************
> * statistics
> **********************************/
> -/* Total bytes used by the compressed storage */
> -static u64 zswap_pool_total_size;
> +/* Total pages used by the compressed storage */
> +static u64 zswap_pool_pages;
> /* The number of compressed pages currently stored in zswap */
> static atomic_t zswap_stored_pages = ATOMIC_INIT(0);
>
> @@ -297,7 +297,7 @@ static void zswap_free_entry(struct zswap_entry *entry)
> zpool_free(zswap_pool, entry->handle);
> zswap_entry_cache_free(entry);
> atomic_dec(&zswap_stored_pages);
> - zswap_pool_total_size = zpool_get_total_size(zswap_pool);
> + zswap_pool_pages = zpool_get_total_size(zswap_pool);
> }
>
> /* caller must hold the tree lock */
> @@ -414,7 +414,7 @@ cleanup:
> static bool zswap_is_full(void)
> {
> return totalram_pages * zswap_max_pool_percent / 100 <
> - DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
> + zswap_pool_pages;
> }
>
> /*********************************
> @@ -721,7 +721,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
>
> /* update stats */
> atomic_inc(&zswap_stored_pages);
> - zswap_pool_total_size = zpool_get_total_size(zswap_pool);
> + zswap_pool_pages = zpool_get_total_size(zswap_pool);
>
> return 0;
>
> @@ -874,8 +874,8 @@ static int __init zswap_debugfs_init(void)
> zswap_debugfs_root, &zswap_written_back_pages);
> debugfs_create_u64("duplicate_entry", S_IRUGO,
> zswap_debugfs_root, &zswap_duplicate_entry);
> - debugfs_create_u64("pool_total_size", S_IRUGO,
> - zswap_debugfs_root, &zswap_pool_total_size);
> + debugfs_create_u64("pool_pages", S_IRUGO,
> + zswap_debugfs_root, &zswap_pool_pages);
> debugfs_create_atomic_t("stored_pages", S_IRUGO,
> zswap_debugfs_root, &zswap_stored_pages);
>
--
Kind regards,
Minchan Kim
* Re: [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
2014-08-19 23:46 ` Minchan Kim
@ 2014-08-20 4:44 ` Seth Jennings
0 siblings, 0 replies; 19+ messages in thread
From: Seth Jennings @ 2014-08-20 4:44 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Dan Streetman, ds2horner
On Wed, Aug 20, 2014 at 08:46:21AM +0900, Minchan Kim wrote:
> Hey Seth,
>
> On Tue, Aug 19, 2014 at 09:46:28AM -0500, Seth Jennings wrote:
> > On Tue, Aug 19, 2014 at 04:54:45PM +0900, Minchan Kim wrote:
> > > zs_get_total_size_bytes returns the amount of memory zsmalloc
> > > has consumed in *bytes*, but zsmalloc operates in *pages*
> > > rather than bytes, so let's change the API; the benefit we
> > > get is the removal of unnecessary overhead
> > > (ie, converting the page count to bytes) in zsmalloc.
> > >
> > > Now, zswap can roll back to zswap_pool_pages.
> > > Over to zswap guys ;-)
> >
> > I don't think that's how it is done :-/ Changing the API for a
> > component that has two users, changing one, then saying "hope you guys
> > change your newly broken stuff".
>
> I don't get it. The zsmalloc's zs_zpool_total_size returns u64 with bytes,
> not pages so what's broken stuff you mentioned?
>
> static u64 zs_zpool_total_size(void *pool)
> {
> - return zs_get_total_size_bytes(pool);
> + return zs_get_total_size(pool) << PAGE_SHIFT;
> }
Ah, that's my bad. I forgot that there are both zs_get_total_size() and
zs_zpool_total_size(), and that the zpool driver calls
zs_zpool_total_size(), not zs_get_total_size().
Nevermind!
Seth
>
> Thing I nudged to you is just you could roll back to pages count
> instead of bytes if zswap guys want it and I dont' want it myself
> for avoding unnecessary noise for this patchset's purpose.
>
> >
> > I know you would rather not move zram to the zpool API but doing so
> > would avoid situations like this.
> >
> > Anyway, this does break the zpool API and by extension zswap, and that
> > needs to be addressed in this patch or we create a point in the commit
> > history where it is broken.
> >
> > Quick glance:
> > - zpool_get_total_size() return type is u64 but this patch changes to
> > unsigned long. Now mismatches between zbud and zsmalloc.
>
> Nope. zs_zpool_total_size returns u64 with PAGE_SHIFT.
>
> > - zbud_zpool_total_size needs to return pages, not bytes
>
> It's up to you, not me.
>
> > - as you noted s/pool_total_size/pool_pages/g in zswap.c plus
> > modification to zswap_is_full()
>
> I didn't mean it. Sorry for the confusing.
> If I miss your point, let me know it.
>
> Thanks.
>
> >
> > Thanks,
> > Seth
> >
> > >
> > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > ---
> > > drivers/block/zram/zram_drv.c | 4 ++--
> > > include/linux/zsmalloc.h | 2 +-
> > > mm/zsmalloc.c | 10 +++++-----
> > > 3 files changed, 8 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > > index d00831c3d731..302dd37bcea3 100644
> > > --- a/drivers/block/zram/zram_drv.c
> > > +++ b/drivers/block/zram/zram_drv.c
> > > @@ -103,10 +103,10 @@ static ssize_t mem_used_total_show(struct device *dev,
> > >
> > > down_read(&zram->init_lock);
> > > if (init_done(zram))
> > > - val = zs_get_total_size_bytes(meta->mem_pool);
> > > + val = zs_get_total_size(meta->mem_pool);
> > > up_read(&zram->init_lock);
> > >
> > > - return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> > > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> > > }
> > >
> > > static ssize_t max_comp_streams_show(struct device *dev,
> > > diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> > > index e44d634e7fb7..105b56e45d23 100644
> > > --- a/include/linux/zsmalloc.h
> > > +++ b/include/linux/zsmalloc.h
> > > @@ -46,6 +46,6 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
> > > enum zs_mapmode mm);
> > > void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
> > >
> > > -u64 zs_get_total_size_bytes(struct zs_pool *pool);
> > > +unsigned long zs_get_total_size(struct zs_pool *pool);
> > >
> > > #endif
> > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > > index a65924255763..80408a1da03a 100644
> > > --- a/mm/zsmalloc.c
> > > +++ b/mm/zsmalloc.c
> > > @@ -299,7 +299,7 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)
> > >
> > > static u64 zs_zpool_total_size(void *pool)
> > > {
> > > - return zs_get_total_size_bytes(pool);
> > > + return zs_get_total_size(pool) << PAGE_SHIFT;
> > > }
> > >
> > > static struct zpool_driver zs_zpool_driver = {
> > > @@ -1186,16 +1186,16 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
> > > }
> > > EXPORT_SYMBOL_GPL(zs_unmap_object);
> > >
> > > -u64 zs_get_total_size_bytes(struct zs_pool *pool)
> > > +unsigned long zs_get_total_size(struct zs_pool *pool)
> > > {
> > > - u64 npages;
> > > + unsigned long npages;
> > >
> > > spin_lock(&pool->stat_lock);
> > > npages = pool->pages_allocated;
> > > spin_unlock(&pool->stat_lock);
> > > - return npages << PAGE_SHIFT;
> > > + return npages;
> > > }
> > > -EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> > > +EXPORT_SYMBOL_GPL(zs_get_total_size);
> > >
> > > module_init(zs_init);
> > > module_exit(zs_exit);
> > > --
> > > 2.0.0
> > >
>
> --
> Kind regards,
> Minchan Kim
>
* Re: [PATCH v2 4/4] zram: report maximum used memory
2014-08-19 7:54 ` [PATCH v2 4/4] zram: report maximum used memory Minchan Kim
@ 2014-08-20 6:26 ` David Horner
2014-08-20 6:53 ` Minchan Kim
0 siblings, 1 reply; 19+ messages in thread
From: David Horner @ 2014-08-20 6:26 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Seth Jennings, Dan Streetman
On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote:
> Normally, a zram user could get the maximum memory usage zram has
> consumed by polling mem_used_total via sysfs from userspace.
>
> But that has a critical problem: the user can miss the peak memory
> usage during the polling interval. To avoid that, the user would
> have to poll at an extremely short interval (ie, 0.0000000001s),
> with mlocking to avoid page fault delay when memory pressure
> is heavy. It would be troublesome.
>
> This patch adds a new knob "mem_used_max" so the user can see
> the maximum memory usage easily by reading the knob, and can reset
> it via "echo 0 > /sys/block/zram0/mem_used_max".
>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
> Documentation/ABI/testing/sysfs-block-zram | 10 +++++
> Documentation/blockdev/zram.txt | 1 +
> drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++-
> drivers/block/zram/zram_drv.h | 1 +
> 4 files changed, 70 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
> index 025331c19045..ffd1ea7443dd 100644
> --- a/Documentation/ABI/testing/sysfs-block-zram
> +++ b/Documentation/ABI/testing/sysfs-block-zram
> @@ -120,6 +120,16 @@ Description:
> statistic.
> Unit: bytes
>
> +What: /sys/block/zram<id>/mem_used_max
> +Date: August 2014
> +Contact: Minchan Kim <minchan@kernel.org>
> +Description:
> + The mem_used_max file is read/write and specifies the amount
> + of maximum memory zram have consumed to store compressed data.
> + For resetting the value, you should do "echo 0". Otherwise,
> + you could see -EINVAL.
> + Unit: bytes
> +
> What: /sys/block/zram<id>/mem_limit
> Date: August 2014
> Contact: Minchan Kim <minchan@kernel.org>
> diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> index 9f239ff8c444..3b2247c2d4cf 100644
> --- a/Documentation/blockdev/zram.txt
> +++ b/Documentation/blockdev/zram.txt
> @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful.
> orig_data_size
> compr_data_size
> mem_used_total
> + mem_used_max
>
> 8) Deactivate:
> swapoff /dev/zram0
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index adc91c7ecaef..e4d44842a91d 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev,
> return len;
> }
>
> +static ssize_t mem_used_max_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + u64 val = 0;
> + struct zram *zram = dev_to_zram(dev);
> +
> + down_read(&zram->init_lock);
> + if (init_done(zram))
> + val = atomic64_read(&zram->stats.max_used_pages);
> + up_read(&zram->init_lock);
> +
> + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> +}
> +
> +static ssize_t mem_used_max_store(struct device *dev,
> + struct device_attribute *attr, const char *buf, size_t len)
> +{
> + u64 limit;
> + struct zram *zram = dev_to_zram(dev);
> + struct zram_meta *meta = zram->meta;
> +
> - limit = memparse(buf, NULL);
> - if (0 != limit)
we wanted explicit "0" and nothing else for extensibility
if (len != 1 || *buf != '0')
> + return -EINVAL;
> +
> + down_read(&zram->init_lock);
> + if (init_done(zram))
> + atomic64_set(&zram->stats.max_used_pages,
> + zs_get_total_size(meta->mem_pool));
> + up_read(&zram->init_lock);
> +
> + return len;
return 1;
the standard convention is to return the amount of the buffer consumed
> +}
> +
> static ssize_t max_comp_streams_store(struct device *dev,
> struct device_attribute *attr, const char *buf, size_t len)
> {
> @@ -461,6 +495,26 @@ out_cleanup:
> return ret;
> }
>
> +static bool check_limit(struct zram *zram)
> +{
> + unsigned long alloced_pages;
> + u64 old_max, cur_max;
> + struct zram_meta *meta = zram->meta;
> +
> + do {
> + alloced_pages = zs_get_total_size(meta->mem_pool);
> + if (zram->limit_pages && alloced_pages > zram->limit_pages)
> + return false;
> +
> + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages);
> + if (alloced_pages > cur_max)
> + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
> + cur_max, alloced_pages);
> + } while (old_max != cur_max);
> +
> + return true;
> +}
> +
check_limit() does more than check the limit - it has the substantial
side effect of updating the max used.
Basically, if we have already allocated the buffer and our
alloced_pages is less than the limit, then we are good to go.
It is the race to update that requires the cmpxchg.
And maybe a helper function would aid readability - not sure, see the
next point.
I don't believe there is a need for the loop either.
Any other updater will also be including our allocated pages
(which at this point in the code cannot be rolled back),
so if they beat us to it, no problem: their max is better than ours.
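The pattern under discussion can be sketched with plain C11 atomics (a userspace illustration of the atomic64_cmpxchg loop in the patch, not the kernel code itself):

```c
#include <stdatomic.h>

static _Atomic unsigned long max_used_pages;

/* Raise the recorded maximum to 'alloced' unless a concurrent updater
 * has already recorded a value at least as large. */
static void update_max(unsigned long alloced)
{
	unsigned long cur = atomic_load(&max_used_pages);

	while (alloced > cur) {
		/* On failure, 'cur' is refreshed with the current value
		 * and the loop re-checks whether our candidate wins. */
		if (atomic_compare_exchange_weak(&max_used_pages,
						 &cur, alloced))
			break;
	}
}
```

Note the loop only retries while our candidate is still larger, so if a concurrent updater wins with a bigger value we simply stop, which is exactly the argument above that an unconditional retry loop is unnecessary.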
> static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> int offset)
> {
> @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> goto out;
> }
>
> - if (zram->limit_pages &&
> - zs_get_total_size(meta->mem_pool) > zram->limit_pages) {
> + if (!check_limit(zram)) {
> zs_free(meta->mem_pool, handle);
> ret = -ENOMEM;
> goto out;
> @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
> static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
> static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show,
> mem_limit_store);
> +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show,
> + mem_used_max_store);
> static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
> max_comp_streams_show, max_comp_streams_store);
> static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = {
> &dev_attr_compr_data_size.attr,
> &dev_attr_mem_used_total.attr,
> &dev_attr_mem_limit.attr,
> + &dev_attr_mem_used_max.attr,
> &dev_attr_max_comp_streams.attr,
> &dev_attr_comp_algorithm.attr,
> NULL,
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index b7aa9c21553f..29383312d543 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -90,6 +90,7 @@ struct zram_stats {
> atomic64_t notify_free; /* no. of swap slot free notifications */
> atomic64_t zero_pages; /* no. of zero filled pages */
> atomic64_t pages_stored; /* no. of pages currently stored */
> + atomic64_t max_used_pages; /* no. of maximum pages stored */
> };
>
> struct zram_meta {
> --
> 2.0.0
>
* Re: [PATCH v2 4/4] zram: report maximum used memory
2014-08-20 6:26 ` David Horner
@ 2014-08-20 6:53 ` Minchan Kim
2014-08-20 7:38 ` David Horner
2014-08-21 0:06 ` Minchan Kim
0 siblings, 2 replies; 19+ messages in thread
From: Minchan Kim @ 2014-08-20 6:53 UTC (permalink / raw)
To: David Horner
Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Seth Jennings, Dan Streetman
On Wed, Aug 20, 2014 at 02:26:50AM -0400, David Horner wrote:
> On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote:
> > Normally, a zram user could get the maximum memory usage zram has
> > consumed by polling mem_used_total via sysfs from userspace.
> >
> > But that has a critical problem: the user can miss the peak memory
> > usage during the polling interval. To avoid that, the user would
> > have to poll at an extremely short interval (ie, 0.0000000001s),
> > with mlocking to avoid page fault delay when memory pressure
> > is heavy. It would be troublesome.
> >
> > This patch adds a new knob "mem_used_max" so the user can see
> > the maximum memory usage easily by reading the knob, and can reset
> > it via "echo 0 > /sys/block/zram0/mem_used_max".
> >
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> > Documentation/ABI/testing/sysfs-block-zram | 10 +++++
> > Documentation/blockdev/zram.txt | 1 +
> > drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++-
> > drivers/block/zram/zram_drv.h | 1 +
> > 4 files changed, 70 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
> > index 025331c19045..ffd1ea7443dd 100644
> > --- a/Documentation/ABI/testing/sysfs-block-zram
> > +++ b/Documentation/ABI/testing/sysfs-block-zram
> > @@ -120,6 +120,16 @@ Description:
> > statistic.
> > Unit: bytes
> >
> > +What: /sys/block/zram<id>/mem_used_max
> > +Date: August 2014
> > +Contact: Minchan Kim <minchan@kernel.org>
> > +Description:
> > + The mem_used_max file is read/write and specifies the amount
> > + of maximum memory zram have consumed to store compressed data.
> > + For resetting the value, you should do "echo 0". Otherwise,
> > + you could see -EINVAL.
> > + Unit: bytes
> > +
> > What: /sys/block/zram<id>/mem_limit
> > Date: August 2014
> > Contact: Minchan Kim <minchan@kernel.org>
> > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> > index 9f239ff8c444..3b2247c2d4cf 100644
> > --- a/Documentation/blockdev/zram.txt
> > +++ b/Documentation/blockdev/zram.txt
> > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful.
> > orig_data_size
> > compr_data_size
> > mem_used_total
> > + mem_used_max
> >
> > 8) Deactivate:
> > swapoff /dev/zram0
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index adc91c7ecaef..e4d44842a91d 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev,
> > return len;
> > }
> >
> > +static ssize_t mem_used_max_show(struct device *dev,
> > + struct device_attribute *attr, char *buf)
> > +{
> > + u64 val = 0;
> > + struct zram *zram = dev_to_zram(dev);
> > +
> > + down_read(&zram->init_lock);
> > + if (init_done(zram))
> > + val = atomic64_read(&zram->stats.max_used_pages);
> > + up_read(&zram->init_lock);
> > +
> > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> > +}
> > +
> > +static ssize_t mem_used_max_store(struct device *dev,
> > + struct device_attribute *attr, const char *buf, size_t len)
> > +{
> > + u64 limit;
> > + struct zram *zram = dev_to_zram(dev);
> > + struct zram_meta *meta = zram->meta;
> > +
> > - limit = memparse(buf, NULL);
> > - if (0 != limit)
>
> we wanted explicit "0" and nothing else for extensibility
>
> if (len != 1 || *buf != '0')
>
I wanted it to accept "0", "0K", "0M", and "0G", but I agree that's
meaningless at the moment, so your version is better.
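One wrinkle worth noting: a write from `echo 0` normally includes a trailing newline, so a strict `len != 1` check would reject it. A slightly more tolerant sketch of the check (hypothetical helper, not part of the patch):

```c
#include <stdbool.h>
#include <stddef.h>

/* Accept exactly "0", optionally followed by a single newline,
 * since echo(1) appends one. Hypothetical helper for illustration. */
static bool is_reset_request(const char *buf, size_t len)
{
	if (len && buf[len - 1] == '\n')
		len--;
	return len == 1 && buf[0] == '0';
}
```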
> > + return -EINVAL;
> > +
> > + down_read(&zram->init_lock);
> > + if (init_done(zram))
> > + atomic64_set(&zram->stats.max_used_pages,
> > + zs_get_total_size(meta->mem_pool));
> > + up_read(&zram->init_lock);
> > +
> > + return len;
> return 1;
>
> the standard convention is to return the amount of the buffer consumed
If I follow your suggestion, len would be 1 right before returning,
so there's no problem from a functionality POV, but I agree an explicit
"1" is better for readability, so your version is better.
>
>
>
> > +}
> > +
> > static ssize_t max_comp_streams_store(struct device *dev,
> > struct device_attribute *attr, const char *buf, size_t len)
> > {
> > @@ -461,6 +495,26 @@ out_cleanup:
> > return ret;
> > }
> >
> > +static bool check_limit(struct zram *zram)
> > +{
> > + unsigned long alloced_pages;
> > + u64 old_max, cur_max;
> > + struct zram_meta *meta = zram->meta;
> > +
> > + do {
> > + alloced_pages = zs_get_total_size(meta->mem_pool);
> > + if (zram->limit_pages && alloced_pages > zram->limit_pages)
> > + return false;
> > +
> > + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages);
> > + if (alloced_pages > cur_max)
> > + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
> > + cur_max, alloced_pages);
> > + } while (old_max != cur_max);
> > +
> > + return true;
> > +}
> > +
>
> Check_limit does more than check limit - it has a substantial side
> effect of updating max used.
Hmm, normally the limit check is the best place to update the max,
although the function name implies only checking the limit, and I don't
think the max-updating code hurts readability.
If you or another reviewer feels strongly against it, I will be happy to
factor the max updating out into a separate function; I think it's just
a matter of preference for such small logic and I don't want to waste
an argument on it.
If you really want it, please ping me again.
>
> Basically if we already allocated the buffer and our alloced_pages is
> less than the limit then we are good to go.
Yeb.
>
> It is the race to update that we need to have the cmpxchg.
> And maybe a helper function would aid readability - not sure, see next point.
>
> I don't believe there is need for the loop either.
> Any other updater will also be including our allocated pages
> (and at this point in the code eliminated from roll back)
> so if they beat us to it, then no problem, their max is better than ours.
Let's assume we don't have the loop.
CPU A CPU B
alloced_pages = 2001
old_max = cur_max = 2000
alloced_pages = 2005
old_max = cur_max = 2000
cmpxchg(2000, 2000, 2001) -> OK
cmpxchg(2001, 2000, 2005) -> FAIL
So we lose 2005, which is the bigger value.
>
>
>
> > static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> > int offset)
> > {
> > @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> > goto out;
> > }
> >
> > - if (zram->limit_pages &&
> > - zs_get_total_size(meta->mem_pool) > zram->limit_pages) {
> > + if (!check_limit(zram)) {
> > zs_free(meta->mem_pool, handle);
> > ret = -ENOMEM;
> > goto out;
> > @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
> > static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
> > static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show,
> > mem_limit_store);
> > +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show,
> > + mem_used_max_store);
> > static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
> > max_comp_streams_show, max_comp_streams_store);
> > static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> > @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = {
> > &dev_attr_compr_data_size.attr,
> > &dev_attr_mem_used_total.attr,
> > &dev_attr_mem_limit.attr,
> > + &dev_attr_mem_used_max.attr,
> > &dev_attr_max_comp_streams.attr,
> > &dev_attr_comp_algorithm.attr,
> > NULL,
> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> > index b7aa9c21553f..29383312d543 100644
> > --- a/drivers/block/zram/zram_drv.h
> > +++ b/drivers/block/zram/zram_drv.h
> > @@ -90,6 +90,7 @@ struct zram_stats {
> > atomic64_t notify_free; /* no. of swap slot free notifications */
> > atomic64_t zero_pages; /* no. of zero filled pages */
> > atomic64_t pages_stored; /* no. of pages currently stored */
> > + atomic64_t max_used_pages; /* no. of maximum pages stored */
> > };
> >
> > struct zram_meta {
> > --
> > 2.0.0
> >
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <dont@kvack.org>
--
Kind regards,
Minchan Kim
* Re: [PATCH v2 4/4] zram: report maximum used memory
2014-08-20 6:53 ` Minchan Kim
@ 2014-08-20 7:38 ` David Horner
2014-08-20 7:53 ` Minchan Kim
2014-08-21 0:06 ` Minchan Kim
1 sibling, 1 reply; 19+ messages in thread
From: David Horner @ 2014-08-20 7:38 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Seth Jennings, Dan Streetman
On Wed, Aug 20, 2014 at 2:53 AM, Minchan Kim <minchan@kernel.org> wrote:
> On Wed, Aug 20, 2014 at 02:26:50AM -0400, David Horner wrote:
>> On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote:
>> > Normally, a zram user could get the maximum memory usage zram consumed
>> > by polling mem_used_total via sysfs in userspace.
>> >
>> > But it has a critical problem: the user can miss the peak memory
>> > usage during the polling interval. To avoid that, the user would
>> > have to poll with an extremely short interval (ie, 0.0000000001s),
>> > with mlocking to avoid page-fault delay when memory pressure
>> > is heavy. That would be troublesome.
>> >
>> > This patch adds a new knob, "mem_used_max", so the user can see
>> > the maximum memory usage easily by reading the knob, and reset
>> > it via "echo 0 > /sys/block/zram0/mem_used_max".
>> >
>> > Signed-off-by: Minchan Kim <minchan@kernel.org>
>> > ---
>> > Documentation/ABI/testing/sysfs-block-zram | 10 +++++
>> > Documentation/blockdev/zram.txt | 1 +
>> > drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++-
>> > drivers/block/zram/zram_drv.h | 1 +
>> > 4 files changed, 70 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
>> > index 025331c19045..ffd1ea7443dd 100644
>> > --- a/Documentation/ABI/testing/sysfs-block-zram
>> > +++ b/Documentation/ABI/testing/sysfs-block-zram
>> > @@ -120,6 +120,16 @@ Description:
>> > statistic.
>> > Unit: bytes
>> >
>> > +What: /sys/block/zram<id>/mem_used_max
>> > +Date: August 2014
>> > +Contact: Minchan Kim <minchan@kernel.org>
>> > +Description:
>> > + The mem_used_max file is read/write and specifies the maximum
>> > + amount of memory zram has consumed to store compressed data.
>> > + To reset the value, do "echo 0". Otherwise,
>> > + you will see -EINVAL.
>> > + Unit: bytes
>> > +
>> > What: /sys/block/zram<id>/mem_limit
>> > Date: August 2014
>> > Contact: Minchan Kim <minchan@kernel.org>
>> > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
>> > index 9f239ff8c444..3b2247c2d4cf 100644
>> > --- a/Documentation/blockdev/zram.txt
>> > +++ b/Documentation/blockdev/zram.txt
>> > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful.
>> > orig_data_size
>> > compr_data_size
>> > mem_used_total
>> > + mem_used_max
>> >
>> > 8) Deactivate:
>> > swapoff /dev/zram0
>> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>> > index adc91c7ecaef..e4d44842a91d 100644
>> > --- a/drivers/block/zram/zram_drv.c
>> > +++ b/drivers/block/zram/zram_drv.c
>> > @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev,
>> > return len;
>> > }
>> >
>> > +static ssize_t mem_used_max_show(struct device *dev,
>> > + struct device_attribute *attr, char *buf)
>> > +{
>> > + u64 val = 0;
>> > + struct zram *zram = dev_to_zram(dev);
>> > +
>> > + down_read(&zram->init_lock);
>> > + if (init_done(zram))
>> > + val = atomic64_read(&zram->stats.max_used_pages);
>> > + up_read(&zram->init_lock);
>> > +
>> > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
>> > +}
>> > +
>> > +static ssize_t mem_used_max_store(struct device *dev,
>> > + struct device_attribute *attr, const char *buf, size_t len)
>> > +{
>> > + u64 limit;
>> > + struct zram *zram = dev_to_zram(dev);
>> > + struct zram_meta *meta = zram->meta;
>> > +
>> > - limit = memparse(buf, NULL);
>> > - if (0 != limit)
>>
>> we wanted explicit "0" and nothing else for extensibility
>>
>> if (len != 1 || *buf != '0')
>>
>
> I wanted to work with "0", "0K", "0M", "0G" but agree it's meaningless
> at the moment so your version is better.
>
>
>> > + return -EINVAL;
>> > +
>> > + down_read(&zram->init_lock);
>> > + if (init_done(zram))
>> > + atomic64_set(&zram->stats.max_used_pages,
>> > + zs_get_total_size(meta->mem_pool));
>> > + up_read(&zram->init_lock);
>> > +
>> > + return len;
>> return 1;
>>
>> the standard convention is to return used amount of buffer
>
> If I follow your suggestion, len would be 1 right before returning,
> so there's no problem from a functionality POV, but I agree an explicit
> "1" is better for readability, so your version is better.
>
>>
>>
>>
>> > +}
>> > +
>> > static ssize_t max_comp_streams_store(struct device *dev,
>> > struct device_attribute *attr, const char *buf, size_t len)
>> > {
>> > @@ -461,6 +495,26 @@ out_cleanup:
>> > return ret;
>> > }
>> >
>> > +static bool check_limit(struct zram *zram)
>> > +{
>> > + unsigned long alloced_pages;
>> > + u64 old_max, cur_max;
>> > + struct zram_meta *meta = zram->meta;
>> > +
>> > + do {
>> > + alloced_pages = zs_get_total_size(meta->mem_pool);
>> > + if (zram->limit_pages && alloced_pages > zram->limit_pages)
>> > + return false;
>> > +
>> > + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages);
>> > + if (alloced_pages > cur_max)
>> > + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
>> > + cur_max, alloced_pages);
>> > + } while (old_max != cur_max);
>> > +
>> > + return true;
>> > +}
>> > +
>>
>> Check_limit does more than check limit - it has a substantial side
>> effect of updating max used.
>
> Hmm, normally the limit check is the best place to update the max,
> although the function name implies only checking the limit, and I don't
> think the max-updating code hurts readability.
> If you or another reviewer feels strongly against it, I will be happy to
> factor the max updating out into a separate function; I think it's just
> a matter of preference for such small logic and I don't want to waste
> an argument on it.
>
> If you really want it, please ping me again.
>
>>
>> Basically if we already allocated the buffer and our alloced_pages is
>> less than the limit then we are good to go.
>
> Yeb.
>
>>
>> It is the race to update that we need to have the cmpxchg.
>> And maybe a helper function would aid readability - not sure, see next point.
>>
>> I don't believe there is need for the loop either.
>> Any other updater will also be including our allocated pages
>> (and at this point in the code eliminated from roll back)
>> so if they beat us to it, then no problem, their max is better than ours.
>
> Let's assume we don't have the loop.
>
>
> CPU A CPU B
>
> alloced_pages = 2001
> old_max = cur_max = 2000
> alloced_pages = 2005
> old_max = cur_max = 2000
>
> cmpxchg(2000, 2000, 2001) -> OK
>
> cmpxchg(2001, 2000, 2005) -> FAIL
>
> So we lose 2005, which is the bigger value.
>
Yes - you are absolutely correct - I missed that scenario.
But there is no need to redo zs_get_total_size;
we only need to loop while our value is still the max.
So the two parts are not closely coupled, and the inline code for the
limit-exceeded check is simple enough.
And the loop to apply the max would be best in a helper function.
>>
>>
>>
>> > static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>> > int offset)
>> > {
>> > @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>> > goto out;
>> > }
>> >
>> > - if (zram->limit_pages &&
>> > - zs_get_total_size(meta->mem_pool) > zram->limit_pages) {
>> > + if (!check_limit(zram)) {
>> > zs_free(meta->mem_pool, handle);
>> > ret = -ENOMEM;
>> > goto out;
>> > @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
>> > static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
>> > static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show,
>> > mem_limit_store);
>> > +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show,
>> > + mem_used_max_store);
>> > static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
>> > max_comp_streams_show, max_comp_streams_store);
>> > static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
>> > @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = {
>> > &dev_attr_compr_data_size.attr,
>> > &dev_attr_mem_used_total.attr,
>> > &dev_attr_mem_limit.attr,
>> > + &dev_attr_mem_used_max.attr,
>> > &dev_attr_max_comp_streams.attr,
>> > &dev_attr_comp_algorithm.attr,
>> > NULL,
>> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
>> > index b7aa9c21553f..29383312d543 100644
>> > --- a/drivers/block/zram/zram_drv.h
>> > +++ b/drivers/block/zram/zram_drv.h
>> > @@ -90,6 +90,7 @@ struct zram_stats {
>> > atomic64_t notify_free; /* no. of swap slot free notifications */
>> > atomic64_t zero_pages; /* no. of zero filled pages */
>> > atomic64_t pages_stored; /* no. of pages currently stored */
>> > + atomic64_t max_used_pages; /* no. of maximum pages stored */
>> > };
>> >
>> > struct zram_meta {
>> > --
>> > 2.0.0
>> >
>>
>
> --
> Kind regards,
> Minchan Kim
* Re: [PATCH v2 4/4] zram: report maximum used memory
2014-08-20 7:38 ` David Horner
@ 2014-08-20 7:53 ` Minchan Kim
2014-08-20 8:25 ` David Horner
0 siblings, 1 reply; 19+ messages in thread
From: Minchan Kim @ 2014-08-20 7:53 UTC (permalink / raw)
To: David Horner
Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Seth Jennings, Dan Streetman
On Wed, Aug 20, 2014 at 03:38:27AM -0400, David Horner wrote:
> On Wed, Aug 20, 2014 at 2:53 AM, Minchan Kim <minchan@kernel.org> wrote:
> > On Wed, Aug 20, 2014 at 02:26:50AM -0400, David Horner wrote:
> >> On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote:
> >> > Normally, a zram user could get the maximum memory usage zram consumed
> >> > by polling mem_used_total via sysfs in userspace.
> >> >
> >> > But it has a critical problem: the user can miss the peak memory
> >> > usage during the polling interval. To avoid that, the user would
> >> > have to poll with an extremely short interval (ie, 0.0000000001s),
> >> > with mlocking to avoid page-fault delay when memory pressure
> >> > is heavy. That would be troublesome.
> >> >
> >> > This patch adds a new knob, "mem_used_max", so the user can see
> >> > the maximum memory usage easily by reading the knob, and reset
> >> > it via "echo 0 > /sys/block/zram0/mem_used_max".
> >> >
> >> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> >> > ---
> >> > Documentation/ABI/testing/sysfs-block-zram | 10 +++++
> >> > Documentation/blockdev/zram.txt | 1 +
> >> > drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++-
> >> > drivers/block/zram/zram_drv.h | 1 +
> >> > 4 files changed, 70 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
> >> > index 025331c19045..ffd1ea7443dd 100644
> >> > --- a/Documentation/ABI/testing/sysfs-block-zram
> >> > +++ b/Documentation/ABI/testing/sysfs-block-zram
> >> > @@ -120,6 +120,16 @@ Description:
> >> > statistic.
> >> > Unit: bytes
> >> >
> >> > +What: /sys/block/zram<id>/mem_used_max
> >> > +Date: August 2014
> >> > +Contact: Minchan Kim <minchan@kernel.org>
> >> > +Description:
> >> > + The mem_used_max file is read/write and specifies the maximum
> >> > + amount of memory zram has consumed to store compressed data.
> >> > + To reset the value, do "echo 0". Otherwise,
> >> > + you will see -EINVAL.
> >> > + Unit: bytes
> >> > +
> >> > What: /sys/block/zram<id>/mem_limit
> >> > Date: August 2014
> >> > Contact: Minchan Kim <minchan@kernel.org>
> >> > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> >> > index 9f239ff8c444..3b2247c2d4cf 100644
> >> > --- a/Documentation/blockdev/zram.txt
> >> > +++ b/Documentation/blockdev/zram.txt
> >> > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful.
> >> > orig_data_size
> >> > compr_data_size
> >> > mem_used_total
> >> > + mem_used_max
> >> >
> >> > 8) Deactivate:
> >> > swapoff /dev/zram0
> >> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> >> > index adc91c7ecaef..e4d44842a91d 100644
> >> > --- a/drivers/block/zram/zram_drv.c
> >> > +++ b/drivers/block/zram/zram_drv.c
> >> > @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev,
> >> > return len;
> >> > }
> >> >
> >> > +static ssize_t mem_used_max_show(struct device *dev,
> >> > + struct device_attribute *attr, char *buf)
> >> > +{
> >> > + u64 val = 0;
> >> > + struct zram *zram = dev_to_zram(dev);
> >> > +
> >> > + down_read(&zram->init_lock);
> >> > + if (init_done(zram))
> >> > + val = atomic64_read(&zram->stats.max_used_pages);
> >> > + up_read(&zram->init_lock);
> >> > +
> >> > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> >> > +}
> >> > +
> >> > +static ssize_t mem_used_max_store(struct device *dev,
> >> > + struct device_attribute *attr, const char *buf, size_t len)
> >> > +{
> >> > + u64 limit;
> >> > + struct zram *zram = dev_to_zram(dev);
> >> > + struct zram_meta *meta = zram->meta;
> >> > +
> >> > - limit = memparse(buf, NULL);
> >> > - if (0 != limit)
> >>
> >> we wanted explicit "0" and nothing else for extensibility
> >>
> >> if (len != 1 || *buf != '0')
> >>
> >
> > I wanted to work with "0", "0K", "0M", "0G" but agree it's meaningless
> > at the moment so your version is better.
> >
> >
> >> > + return -EINVAL;
> >> > +
> >> > + down_read(&zram->init_lock);
> >> > + if (init_done(zram))
> >> > + atomic64_set(&zram->stats.max_used_pages,
> >> > + zs_get_total_size(meta->mem_pool));
> >> > + up_read(&zram->init_lock);
> >> > +
> >> > + return len;
> >> return 1;
> >>
> >> the standard convention is to return used amount of buffer
> >
> > If I follow your suggestion, len would be 1 right before returning,
> > so there's no problem from a functionality POV, but I agree an explicit
> > "1" is better for readability, so your version is better.
> >
> >>
> >>
> >>
> >> > +}
> >> > +
> >> > static ssize_t max_comp_streams_store(struct device *dev,
> >> > struct device_attribute *attr, const char *buf, size_t len)
> >> > {
> >> > @@ -461,6 +495,26 @@ out_cleanup:
> >> > return ret;
> >> > }
> >> >
> >> > +static bool check_limit(struct zram *zram)
> >> > +{
> >> > + unsigned long alloced_pages;
> >> > + u64 old_max, cur_max;
> >> > + struct zram_meta *meta = zram->meta;
> >> > +
> >> > + do {
> >> > + alloced_pages = zs_get_total_size(meta->mem_pool);
> >> > + if (zram->limit_pages && alloced_pages > zram->limit_pages)
> >> > + return false;
> >> > +
> >> > + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages);
> >> > + if (alloced_pages > cur_max)
> >> > + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
> >> > + cur_max, alloced_pages);
> >> > + } while (old_max != cur_max);
> >> > +
> >> > + return true;
> >> > +}
> >> > +
> >>
> >> Check_limit does more than check limit - it has a substantial side
> >> effect of updating max used.
> >
> > Hmm, normally the limit check is the best place to update the max,
> > although the function name implies only checking the limit, and I don't
> > think the max-updating code hurts readability.
> > If you or another reviewer feels strongly against it, I will be happy to
> > factor the max updating out into a separate function; I think it's just
> > a matter of preference for such small logic and I don't want to waste
> > an argument on it.
> >
> > If you really want it, please ping me again.
> >
> >>
> >> Basically if we already allocated the buffer and our alloced_pages is
> >> less than the limit then we are good to go.
> >
> > Yeb.
> >
> >>
> >> It is the race to update that we need to have the cmpxchg.
> >> And maybe a helper function would aid readability - not sure, see next point.
> >>
> >> I don't believe there is need for the loop either.
> >> Any other updater will also be including our allocated pages
> >> (and at this point in the code eliminated from roll back)
> >> so if they beat us to it, then no problem, their max is better than ours.
> >
> > Let's assume we don't have the loop.
> >
> >
> > CPU A CPU B
> >
> > alloced_pages = 2001
> > old_max = cur_max = 2000
> > alloced_pages = 2005
> > old_max = cur_max = 2000
> >
> > cmpxchg(2000, 2000, 2001) -> OK
> >
> > cmpxchg(2001, 2000, 2005) -> FAIL
> >
> > So we lose 2005, which is the bigger value.
> >
>
> Yes - you are absolutely correct - I missed that scenario.
>
> But there is no need to redo zs_get_total_size.
>
> we only need to loop while our value is still the max.
Yes - you are absolutely right. :)
>
> So the two parts are not closely coupled, and the inline code for the
> limit-exceeded check is simple enough.
> And the loop to apply the max would be best in a helper function.
Okay, you've convinced me: a helper function would be better for readability,
making it clear that the limit check and the max-used update are not coupled.
Thanks for the review, David!
>
> >>
> >>
> >>
> >> > static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> >> > int offset)
> >> > {
> >> > @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> >> > goto out;
> >> > }
> >> >
> >> > - if (zram->limit_pages &&
> >> > - zs_get_total_size(meta->mem_pool) > zram->limit_pages) {
> >> > + if (!check_limit(zram)) {
> >> > zs_free(meta->mem_pool, handle);
> >> > ret = -ENOMEM;
> >> > goto out;
> >> > @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
> >> > static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
> >> > static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show,
> >> > mem_limit_store);
> >> > +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show,
> >> > + mem_used_max_store);
> >> > static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
> >> > max_comp_streams_show, max_comp_streams_store);
> >> > static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> >> > @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = {
> >> > &dev_attr_compr_data_size.attr,
> >> > &dev_attr_mem_used_total.attr,
> >> > &dev_attr_mem_limit.attr,
> >> > + &dev_attr_mem_used_max.attr,
> >> > &dev_attr_max_comp_streams.attr,
> >> > &dev_attr_comp_algorithm.attr,
> >> > NULL,
> >> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> >> > index b7aa9c21553f..29383312d543 100644
> >> > --- a/drivers/block/zram/zram_drv.h
> >> > +++ b/drivers/block/zram/zram_drv.h
> >> > @@ -90,6 +90,7 @@ struct zram_stats {
> >> > atomic64_t notify_free; /* no. of swap slot free notifications */
> >> > atomic64_t zero_pages; /* no. of zero filled pages */
> >> > atomic64_t pages_stored; /* no. of pages currently stored */
> >> > + atomic64_t max_used_pages; /* no. of maximum pages stored */
> >> > };
> >> >
> >> > struct zram_meta {
> >> > --
> >> > 2.0.0
> >> >
> >>
> >> --
> >> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> >> the body to majordomo@kvack.org. For more info on Linux MM,
> >> see: http://www.linux-mm.org/ .
> >> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> >
> > --
> > Kind regards,
> > Minchan Kim
>
--
Kind regards,
Minchan Kim
* Re: [PATCH v2 3/4] zram: zram memory size limitation
2014-08-19 23:32 ` Minchan Kim
@ 2014-08-20 7:58 ` Marc Dietrich
0 siblings, 0 replies; 19+ messages in thread
From: Marc Dietrich @ 2014-08-20 7:58 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Seth Jennings, Dan Streetman, ds2horner
Am Mittwoch, 20. August 2014, 08:32:25 schrieb Minchan Kim:
> Hello,
>
> On Tue, Aug 19, 2014 at 10:06:22AM +0200, Marc Dietrich wrote:
> > Am Dienstag, 19. August 2014, 16:54:46 schrieb Minchan Kim:
> > > Since zram has no feature to limit its memory usage,
> > > it is hard to manage system memory.
> > >
> > > This patch adds a new knob, "mem_limit", via sysfs to set up
> > > a limit so that zram fails allocations once it reaches
> > > the limit.
> >
> > Sorry to jump in late with a probably silly question, but I couldn't find
> > the answer easily. What's the difference between disksize and mem_limit?
> No need to say sorry.
> It was totally my fault because zram documentation sucks.
>
> The disksize is the size from the point of view of the upper layers of
> the block subsystem, so a filesystem based on zram, or the block device
> itself (ie, zram0) as seen by the admin, will have the disksize size.
> But keep in mind that it's a virtual size, not the compressed one. As you
> know already, zram is backed by volatile storage (ie, DRAM) in
> *compressed form*, not by permanent storage.
>
> The point of this patchset is that nobody can predict the exact memory
> usage of zram in advance. Actually, zram folks have estimated it through
> several experiments, assuming a zram compression ratio (ex, 2:1 or 3:1)
> before releasing a product. But these days, embedded platforms have
> various workloads which cannot be foreseen when the product is released,
> so the compression-ratio expectation can sometimes be wrong and zram can
> consume much more memory than expected once the compression ratio is low.
>
> That makes it troublesome for the admin to manage memory on the product,
> because there is no way to release the memory zram is using, so one
> option is to limit zram's memory usage from the beginning.
>
> > I assume the former is uncompressed size (virtual size) and the latter is
> > compressed size (real memory usage)? Maybe the difference should be made
>
> Right.
>
> > clearer in the documentation.
>
> Okay.
>
> > If disksize is the uncompressed size, why would we want to set this at
> > all?
>
> For example, say zram0 has a 500M disksize because we assumed a 2:1
> compression ratio, so we guessed zram would consume 250M of physical
> memory in the end. But our guess could be wrong: if the real compression
> ratio is 4:1, we use only 125M of physical memory to store 500M of
> uncompressed pages. That's good, but the admin may want to give zram more
> memory because we saved twice as much as expected; we can't, because from
> the upper layer's point of view zram is already full at 500M, and if zram
> is used for swap, we will encounter OOM. :(
>
> So it would be better to increase the disksize to 1000M, but in that
> case, if the compression ratio drops back to 2:1 due to something (ex, a
> workload change), zram can consume 500M of physical memory, which is more
> than we expected, and the admin doesn't want zram to use up too much
> system memory.
>
> In summary, we can't control zram's exact memory usage with disksize
> alone, because actual usage depends on the compression ratio.
Thanks for your detailed explanation. It's a bit confusing that you can
specify two limits (for two different layers). I guess a floating disksize is
not possible, because you wouldn't be able to create a filesystem/swapfile on
it, so you need to make a *fixed* assumption.
Regards,
Marc
* Re: [PATCH v2 4/4] zram: report maximum used memory
2014-08-20 7:53 ` Minchan Kim
@ 2014-08-20 8:25 ` David Horner
0 siblings, 0 replies; 19+ messages in thread
From: David Horner @ 2014-08-20 8:25 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Seth Jennings, Dan Streetman
I'm really surprised there isn't an atomic_set_max function -
one that keeps retrying the exchange until the target value is equal to
or greater than the provided value.
(and a complementary atomic_set_min)
I would be surprised if this is the only place in the kernel with
this scenario.
On Wed, Aug 20, 2014 at 3:53 AM, Minchan Kim <minchan@kernel.org> wrote:
> On Wed, Aug 20, 2014 at 03:38:27AM -0400, David Horner wrote:
>> On Wed, Aug 20, 2014 at 2:53 AM, Minchan Kim <minchan@kernel.org> wrote:
>> > On Wed, Aug 20, 2014 at 02:26:50AM -0400, David Horner wrote:
>> >> On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote:
>> >> > Normally, a zram user could get the maximum memory usage zram consumed
>> >> > by polling mem_used_total via sysfs in userspace.
>> >> >
>> >> > But it has a critical problem: the user can miss the peak memory
>> >> > usage during the polling interval. To avoid that, the user would
>> >> > have to poll with an extremely short interval (ie, 0.0000000001s),
>> >> > with mlocking to avoid page-fault delay when memory pressure
>> >> > is heavy. That would be troublesome.
>> >> >
>> >> > This patch adds a new knob, "mem_used_max", so the user can see
>> >> > the maximum memory usage easily by reading the knob, and reset
>> >> > it via "echo 0 > /sys/block/zram0/mem_used_max".
>> >> >
>> >> > Signed-off-by: Minchan Kim <minchan@kernel.org>
>> >> > ---
>> >> > Documentation/ABI/testing/sysfs-block-zram | 10 +++++
>> >> > Documentation/blockdev/zram.txt | 1 +
>> >> > drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++-
>> >> > drivers/block/zram/zram_drv.h | 1 +
>> >> > 4 files changed, 70 insertions(+), 2 deletions(-)
>> >> >
>> >> > diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
>> >> > index 025331c19045..ffd1ea7443dd 100644
>> >> > --- a/Documentation/ABI/testing/sysfs-block-zram
>> >> > +++ b/Documentation/ABI/testing/sysfs-block-zram
>> >> > @@ -120,6 +120,16 @@ Description:
>> >> > statistic.
>> >> > Unit: bytes
>> >> >
>> >> > +What: /sys/block/zram<id>/mem_used_max
>> >> > +Date: August 2014
>> >> > +Contact: Minchan Kim <minchan@kernel.org>
>> >> > +Description:
>> >> > + The mem_used_max file is read/write and specifies the maximum
>> >> > + amount of memory zram has consumed to store compressed data.
>> >> > + To reset the value, do "echo 0". Otherwise,
>> >> > + you will see -EINVAL.
>> >> > + Unit: bytes
>> >> > +
>> >> > What: /sys/block/zram<id>/mem_limit
>> >> > Date: August 2014
>> >> > Contact: Minchan Kim <minchan@kernel.org>
>> >> > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
>> >> > index 9f239ff8c444..3b2247c2d4cf 100644
>> >> > --- a/Documentation/blockdev/zram.txt
>> >> > +++ b/Documentation/blockdev/zram.txt
>> >> > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful.
>> >> > orig_data_size
>> >> > compr_data_size
>> >> > mem_used_total
>> >> > + mem_used_max
>> >> >
>> >> > 8) Deactivate:
>> >> > swapoff /dev/zram0
>> >> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>> >> > index adc91c7ecaef..e4d44842a91d 100644
>> >> > --- a/drivers/block/zram/zram_drv.c
>> >> > +++ b/drivers/block/zram/zram_drv.c
>> >> > @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev,
>> >> > return len;
>> >> > }
>> >> >
>> >> > +static ssize_t mem_used_max_show(struct device *dev,
>> >> > + struct device_attribute *attr, char *buf)
>> >> > +{
>> >> > + u64 val = 0;
>> >> > + struct zram *zram = dev_to_zram(dev);
>> >> > +
>> >> > + down_read(&zram->init_lock);
>> >> > + if (init_done(zram))
>> >> > + val = atomic64_read(&zram->stats.max_used_pages);
>> >> > + up_read(&zram->init_lock);
>> >> > +
>> >> > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
>> >> > +}
>> >> > +
>> >> > +static ssize_t mem_used_max_store(struct device *dev,
>> >> > + struct device_attribute *attr, const char *buf, size_t len)
>> >> > +{
>> >> > + u64 limit;
>> >> > + struct zram *zram = dev_to_zram(dev);
>> >> > + struct zram_meta *meta = zram->meta;
>> >> > +
>> >> > - limit = memparse(buf, NULL);
>> >> > - if (0 != limit)
>> >>
>> >> we wanted explicit "0" and nothing else for extensibility
>> >>
>> >> if (len != 1 || *buf != "0")
>> >>
>> >
>> > I wanted it to work with "0", "0K", "0M", "0G", but I agree that's
>> > meaningless at the moment, so your version is better.
>> >
>> >
>> >> > + return -EINVAL;
>> >> > +
>> >> > + down_read(&zram->init_lock);
>> >> > + if (init_done(zram))
>> >> > + atomic64_set(&zram->stats.max_used_pages,
>> >> > + zs_get_total_size(meta->mem_pool));
>> >> > + up_read(&zram->init_lock);
>> >> > +
>> >> > + return len;
>> >> return 1;
>> >>
>> >> the standard convention is to return used amount of buffer
>> >
>> > If I follow your suggestion, len should be 1 right before returning,
>> > so there's no problem from a functionality POV, but I agree an
>> > explicit "1" is better for readability, so your version is better.
>> >
>> >>
>> >>
>> >>
>> >> > +}
>> >> > +
>> >> > static ssize_t max_comp_streams_store(struct device *dev,
>> >> > struct device_attribute *attr, const char *buf, size_t len)
>> >> > {
>> >> > @@ -461,6 +495,26 @@ out_cleanup:
>> >> > return ret;
>> >> > }
>> >> >
>> >> > +static bool check_limit(struct zram *zram)
>> >> > +{
>> >> > + unsigned long alloced_pages;
>> >> > + u64 old_max, cur_max;
>> >> > + struct zram_meta *meta = zram->meta;
>> >> > +
>> >> > + do {
>> >> > + alloced_pages = zs_get_total_size(meta->mem_pool);
>> >> > + if (zram->limit_pages && alloced_pages > zram->limit_pages)
>> >> > + return false;
>> >> > +
>> >> > + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages);
>> >> > + if (alloced_pages > cur_max)
>> >> > + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
>> >> > + cur_max, alloced_pages);
>> >> > + } while (old_max != cur_max);
>> >> > +
>> >> > + return true;
>> >> > +}
>> >> > +
>> >>
>> >> Check_limit does more than check limit - it has a substantial side
>> >> effect of updating max used.
>> >
>> > Hmm, normally the limit check is the best place to update the max,
>> > although the function name implies just checking the limit, and I
>> > don't think the max-updating code hurts readability.
>> > If you or another reviewer feels strongly against it, I will be happy
>> > to factor the max updating out into another function, because I think
>> > it's just a matter of preference for such small logic and I don't
>> > want to waste an argument on it.
>> >
>> > If you really want it, please ping me again.
>> >
>> >>
>> >> Basically if we already allocated the buffer and our alloced_pages is
>> >> less than the limit then we are good to go.
>> >
>> > Yeb.
>> >
>> >>
>> >> It is the race to update that we need to have the cmpxchg.
>> >> And maybe a helper function would aid readability - not sure, see next point.
>> >>
>> >> I don't believe there is need for the loop either.
>> >> Any other updater will also be including our allocated pages
>> >> (and at this point in the code eliminated from roll back)
>> >> so if they beat us to it, then no problem, their max is better than ours.
>> >
>> > Let's assume we don't have the loop.
>> >
>> >
>> > CPU A CPU B
>> >
>> > alloced_pages = 2001
>> > old_max = cur_max = 2000
>> > alloced_pages = 2005
>> > old_max = cur_max = 2000
>> >
>> > cmpxchg(2000, 2000, 2001) -> OK
>> >
>> > cmpxchg(2001, 2000, 2005) -> FAIL
>> >
>> > So we lose 2005, which is the bigger value.
>> >
>>
>> Yes - you are absolutely correct - I missed that scenario.
>>
>> but there's no need to redo zs_get_total_size.
>>
>> we only need to loop while our value is still the max.
>
> Yes - you are absolutely right. :)
>
>>
>> So the two parts are not closely coupled and the inline code for the
>> exceeded check is simple enough.
>> And the loop to apply max would be best in helper function.
>
> Okay, you've convinced me a helper function would be better for
> readability, to show that the limit check and the max-used update
> are not coupled.
>
> Thanks for the review, David!
>
>>
>> >>
>> >>
>> >>
>> >> > static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>> >> > int offset)
>> >> > {
>> >> > @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>> >> > goto out;
>> >> > }
>> >> >
>> >> > - if (zram->limit_pages &&
>> >> > - zs_get_total_size(meta->mem_pool) > zram->limit_pages) {
>> >> > + if (!check_limit(zram)) {
>> >> > zs_free(meta->mem_pool, handle);
>> >> > ret = -ENOMEM;
>> >> > goto out;
>> >> > @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
>> >> > static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
>> >> > static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show,
>> >> > mem_limit_store);
>> >> > +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show,
>> >> > + mem_used_max_store);
>> >> > static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
>> >> > max_comp_streams_show, max_comp_streams_store);
>> >> > static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
>> >> > @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = {
>> >> > &dev_attr_compr_data_size.attr,
>> >> > &dev_attr_mem_used_total.attr,
>> >> > &dev_attr_mem_limit.attr,
>> >> > + &dev_attr_mem_used_max.attr,
>> >> > &dev_attr_max_comp_streams.attr,
>> >> > &dev_attr_comp_algorithm.attr,
>> >> > NULL,
>> >> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
>> >> > index b7aa9c21553f..29383312d543 100644
>> >> > --- a/drivers/block/zram/zram_drv.h
>> >> > +++ b/drivers/block/zram/zram_drv.h
>> >> > @@ -90,6 +90,7 @@ struct zram_stats {
>> >> > atomic64_t notify_free; /* no. of swap slot free notifications */
>> >> > atomic64_t zero_pages; /* no. of zero filled pages */
>> >> > atomic64_t pages_stored; /* no. of pages currently stored */
>> >> > + atomic64_t max_used_pages; /* no. of maximum pages stored */
>> >> > };
>> >> >
>> >> > struct zram_meta {
>> >> > --
>> >> > 2.0.0
>> >> >
>> >>
>> >> --
>> >> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> >> the body to majordomo@kvack.org. For more info on Linux MM,
>> >> see: http://www.linux-mm.org/ .
>> >> Don't email: email@kvack.org
>> >
>> > --
>> > Kind regards,
>> > Minchan Kim
>>
>
> --
> Kind regards,
> Minchan Kim
* Re: [PATCH v2 4/4] zram: report maximum used memory
2014-08-20 6:53 ` Minchan Kim
2014-08-20 7:38 ` David Horner
@ 2014-08-21 0:06 ` Minchan Kim
1 sibling, 0 replies; 19+ messages in thread
From: Minchan Kim @ 2014-08-21 0:06 UTC (permalink / raw)
To: David Horner
Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky,
Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
Nitin Gupta, Seth Jennings, Dan Streetman
On Wed, Aug 20, 2014 at 03:53:18PM +0900, Minchan Kim wrote:
> On Wed, Aug 20, 2014 at 02:26:50AM -0400, David Horner wrote:
> > On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote:
> > > Normally, a zram user can get the maximum memory usage zram
> > > consumed by polling mem_used_total via sysfs from userspace.
> > >
> > > But this has a critical problem: the user can miss the peak memory
> > > usage during the polling update interval. To avoid that, the user
> > > would have to poll at an extremely short interval (e.g., 0.0000000001s)
> > > with mlock to avoid page-fault delay when memory pressure is
> > > heavy. That would be troublesome.
> > >
> > > This patch adds a new knob, "mem_used_max", so the user can see
> > > the maximum memory usage easily by reading the knob, and reset
> > > it via "echo 0 > /sys/block/zram0/mem_used_max".
> > >
> > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > ---
> > > Documentation/ABI/testing/sysfs-block-zram | 10 +++++
> > > Documentation/blockdev/zram.txt | 1 +
> > > drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++-
> > > drivers/block/zram/zram_drv.h | 1 +
> > > 4 files changed, 70 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
> > > index 025331c19045..ffd1ea7443dd 100644
> > > --- a/Documentation/ABI/testing/sysfs-block-zram
> > > +++ b/Documentation/ABI/testing/sysfs-block-zram
> > > @@ -120,6 +120,16 @@ Description:
> > > statistic.
> > > Unit: bytes
> > >
> > > +What: /sys/block/zram<id>/mem_used_max
> > > +Date: August 2014
> > > +Contact: Minchan Kim <minchan@kernel.org>
> > > +Description:
> > > + The mem_used_max file is read/write and specifies the amount
> > > + of maximum memory zram have consumed to store compressed data.
> > > + For resetting the value, you should do "echo 0". Otherwise,
> > > + you could see -EINVAL.
> > > + Unit: bytes
> > > +
> > > What: /sys/block/zram<id>/mem_limit
> > > Date: August 2014
> > > Contact: Minchan Kim <minchan@kernel.org>
> > > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> > > index 9f239ff8c444..3b2247c2d4cf 100644
> > > --- a/Documentation/blockdev/zram.txt
> > > +++ b/Documentation/blockdev/zram.txt
> > > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful.
> > > orig_data_size
> > > compr_data_size
> > > mem_used_total
> > > + mem_used_max
> > >
> > > 8) Deactivate:
> > > swapoff /dev/zram0
> > > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > > index adc91c7ecaef..e4d44842a91d 100644
> > > --- a/drivers/block/zram/zram_drv.c
> > > +++ b/drivers/block/zram/zram_drv.c
> > > @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev,
> > > return len;
> > > }
> > >
> > > +static ssize_t mem_used_max_show(struct device *dev,
> > > + struct device_attribute *attr, char *buf)
> > > +{
> > > + u64 val = 0;
> > > + struct zram *zram = dev_to_zram(dev);
> > > +
> > > + down_read(&zram->init_lock);
> > > + if (init_done(zram))
> > > + val = atomic64_read(&zram->stats.max_used_pages);
> > > + up_read(&zram->init_lock);
> > > +
> > > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> > > +}
> > > +
> > > +static ssize_t mem_used_max_store(struct device *dev,
> > > + struct device_attribute *attr, const char *buf, size_t len)
> > > +{
> > > + u64 limit;
> > > + struct zram *zram = dev_to_zram(dev);
> > > + struct zram_meta *meta = zram->meta;
> > > +
> > > - limit = memparse(buf, NULL);
> > > - if (0 != limit)
> >
> > we wanted explicit "0" and nothing else for extensibility
> >
> > if (len != 1 || *buf != "0")
> >
>
> I wanted it to work with "0", "0K", "0M", "0G", but I agree that's
> meaningless at the moment, so your version is better.
When I tested your way, it caused trouble for users who want "echo 0 >
/sys/block/zram0/mem_used_max", because echo normally appends a newline.
Although we could guide them to use "echo -n", that is not handy for
users and is not consistent with the other knobs.
IMO, we shouldn't force an uncomfortable interface on users for
uncertain future extensibility. So, I will use kstrtoul.
>
>
> > > + return -EINVAL;
> > > +
> > > + down_read(&zram->init_lock);
> > > + if (init_done(zram))
> > > + atomic64_set(&zram->stats.max_used_pages,
> > > + zs_get_total_size(meta->mem_pool));
> > > + up_read(&zram->init_lock);
> > > +
> > > + return len;
> > return 1;
> >
> > the standard convention is to return used amount of buffer
>
> If I follow your suggestion, len should be 1 right before returning,
> so there's no problem from a functionality POV, but I agree an
> explicit "1" is better for readability, so your version is better.
>
> >
> >
> >
> > > +}
> > > +
> > > static ssize_t max_comp_streams_store(struct device *dev,
> > > struct device_attribute *attr, const char *buf, size_t len)
> > > {
> > > @@ -461,6 +495,26 @@ out_cleanup:
> > > return ret;
> > > }
> > >
> > > +static bool check_limit(struct zram *zram)
> > > +{
> > > + unsigned long alloced_pages;
> > > + u64 old_max, cur_max;
> > > + struct zram_meta *meta = zram->meta;
> > > +
> > > + do {
> > > + alloced_pages = zs_get_total_size(meta->mem_pool);
> > > + if (zram->limit_pages && alloced_pages > zram->limit_pages)
> > > + return false;
> > > +
> > > + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages);
> > > + if (alloced_pages > cur_max)
> > > + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
> > > + cur_max, alloced_pages);
> > > + } while (old_max != cur_max);
> > > +
> > > + return true;
> > > +}
> > > +
> >
> > Check_limit does more than check limit - it has a substantial side
> > effect of updating max used.
>
> Hmm, normally the limit check is the best place to update the max,
> although the function name implies just checking the limit, and I
> don't think the max-updating code hurts readability.
> If you or another reviewer feels strongly against it, I will be happy
> to factor the max updating out into another function, because I think
> it's just a matter of preference for such small logic and I don't
> want to waste an argument on it.
>
> If you really want it, please ping me again.
>
> >
> > Basically if we already allocated the buffer and our alloced_pages is
> > less than the limit then we are good to go.
>
> Yeb.
>
> >
> > It is the race to update that we need to have the cmpxchg.
> > And maybe a helper function would aid readability - not sure, see next point.
> >
> > I don't believe there is need for the loop either.
> > Any other updater will also be including our allocated pages
> > (and at this point in the code eliminated from roll back)
> > so if they beat us to it, then no problem, their max is better than ours.
>
> Let's assume we don't have the loop.
>
>
> CPU A CPU B
>
> alloced_pages = 2001
> old_max = cur_max = 2000
> alloced_pages = 2005
> old_max = cur_max = 2000
>
> cmpxchg(2000, 2000, 2001) -> OK
>
> cmpxchg(2001, 2000, 2005) -> FAIL
>
> So we lose 2005, which is the bigger value.
>
> >
> >
> >
> > > static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> > > int offset)
> > > {
> > > @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> > > goto out;
> > > }
> > >
> > > - if (zram->limit_pages &&
> > > - zs_get_total_size(meta->mem_pool) > zram->limit_pages) {
> > > + if (!check_limit(zram)) {
> > > zs_free(meta->mem_pool, handle);
> > > ret = -ENOMEM;
> > > goto out;
> > > @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
> > > static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
> > > static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show,
> > > mem_limit_store);
> > > +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show,
> > > + mem_used_max_store);
> > > static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
> > > max_comp_streams_show, max_comp_streams_store);
> > > static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> > > @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = {
> > > &dev_attr_compr_data_size.attr,
> > > &dev_attr_mem_used_total.attr,
> > > &dev_attr_mem_limit.attr,
> > > + &dev_attr_mem_used_max.attr,
> > > &dev_attr_max_comp_streams.attr,
> > > &dev_attr_comp_algorithm.attr,
> > > NULL,
> > > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> > > index b7aa9c21553f..29383312d543 100644
> > > --- a/drivers/block/zram/zram_drv.h
> > > +++ b/drivers/block/zram/zram_drv.h
> > > @@ -90,6 +90,7 @@ struct zram_stats {
> > > atomic64_t notify_free; /* no. of swap slot free notifications */
> > > atomic64_t zero_pages; /* no. of zero filled pages */
> > > atomic64_t pages_stored; /* no. of pages currently stored */
> > > + atomic64_t max_used_pages; /* no. of maximum pages stored */
> > > };
> > >
> > > struct zram_meta {
> > > --
> > > 2.0.0
> > >
> >
>
> --
> Kind regards,
> Minchan Kim
>
--
Kind regards,
Minchan Kim
end of thread, other threads:[~2014-08-21 0:05 UTC | newest]
Thread overview: 19+ messages
2014-08-19 7:54 [PATCH v2 0/4] zram memory control enhance Minchan Kim
2014-08-19 7:54 ` [PATCH v2 1/4] zsmalloc: move pages_allocated to zs_pool Minchan Kim
2014-08-19 7:54 ` [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes Minchan Kim
2014-08-19 14:46 ` Seth Jennings
2014-08-19 15:11 ` Seth Jennings
2014-08-19 23:47 ` Minchan Kim
2014-08-19 23:46 ` Minchan Kim
2014-08-20 4:44 ` Seth Jennings
2014-08-19 7:54 ` [PATCH v2 3/4] zram: zram memory size limitation Minchan Kim
2014-08-19 8:06 ` Marc Dietrich
2014-08-19 23:32 ` Minchan Kim
2014-08-20 7:58 ` Marc Dietrich
2014-08-19 7:54 ` [PATCH v2 4/4] zram: report maximum used memory Minchan Kim
2014-08-20 6:26 ` David Horner
2014-08-20 6:53 ` Minchan Kim
2014-08-20 7:38 ` David Horner
2014-08-20 7:53 ` Minchan Kim
2014-08-20 8:25 ` David Horner
2014-08-21 0:06 ` Minchan Kim