* [PATCH v2 0/4] zram memory control enhance
From: Minchan Kim @ 2014-08-19 7:54 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi,
  seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman,
  ds2horner, Minchan Kim

Currently zram has no way to limit its memory usage, so in theory zram
can deplete system memory. That also makes it hard to control the
memory usage of the platform, and zram users have asked for such a
control several times. This patchset adds the feature.

Patch 1 makes zs_get_total_size_bytes faster, because it will be called
frequently by later patches for the new feature.

Patch 2 changes the return unit of zs_get_total_size_bytes from bytes
to pages so that zsmalloc doesn't need an unnecessary operation
(ie, << PAGE_SHIFT).

Patch 3 adds the new feature. I added it in the zram layer, not in
zsmalloc, because the limitation is zram's requirement, not zsmalloc's,
so other zsmalloc users (ie, zpool) shouldn't be affected by an
unnecessary branch in zsmalloc. In future, if every user of zsmalloc
wants the feature, we could easily move it from the client side into
zsmalloc, but the reverse would be painful.

Patch 4 adds a new facility to report the maximum memory usage of zram,
so that the user can easily find zram's peak memory usage without
polling /sys/block/zram0/mem_used_total at extremely short intervals.
* From v1
  * rebased on next-20140815
  * fix up race problem - David, Dan
  * reset mem_used_max as current total_bytes, rather than 0 - David
  * resetting works with only "0" write for extensibility - David

Minchan Kim (4):
  zsmalloc: move pages_allocated to zs_pool
  zsmalloc: change return value unit of zs_get_total_size_bytes
  zram: zram memory size limitation
  zram: report maximum used memory

 Documentation/ABI/testing/sysfs-block-zram |  19 ++++++
 Documentation/blockdev/zram.txt            |  21 ++++--
 drivers/block/zram/zram_drv.c              | 101 ++++++++++++++++++++++++++++-
 drivers/block/zram/zram_drv.h              |   6 ++
 include/linux/zsmalloc.h                   |   2 +-
 mm/zsmalloc.c                              |  38 ++++++-----
 6 files changed, 162 insertions(+), 25 deletions(-)

--
2.0.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
* [PATCH v2 1/4] zsmalloc: move pages_allocated to zs_pool
From: Minchan Kim @ 2014-08-19 7:54 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi,
  seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman,
  ds2horner, Minchan Kim

pages_allocated has been counted per size_class structure, and when a
user wants to see total_size_bytes, the values from every size_class
are gathered to report the sum. That is not bad if the user reads the
value rarely, but if the value starts being read frequently, it is not
a good deal from a performance POV.

This patch moves the variable from size_class to zs_pool. It reduces
the memory footprint (from [255 * 8byte] to [sizeof(atomic_t)]) but
adds new locking overhead. The overhead shouldn't be severe, though,
because it is not on the hot path of zs_malloc (ie, the lock is taken
only when a new zspage is created, not for every object).
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/zsmalloc.c | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 94f38fac5e81..a65924255763 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -199,9 +199,6 @@ struct size_class {

 	spinlock_t lock;

-	/* stats */
-	u64 pages_allocated;
-
 	struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
 };

@@ -217,9 +214,12 @@ struct link_free {
 };

 struct zs_pool {
+	spinlock_t stat_lock;
+
 	struct size_class size_class[ZS_SIZE_CLASSES];

 	gfp_t flags;	/* allocation flags used when growing pool */
+	unsigned long pages_allocated;
 };

@@ -967,6 +967,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
 	}

+	spin_lock_init(&pool->stat_lock);
 	pool->flags = flags;

 	return pool;

@@ -1028,8 +1029,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
 			return 0;

 		set_zspage_mapping(first_page, class->index, ZS_EMPTY);
+		spin_lock(&pool->stat_lock);
+		pool->pages_allocated += class->pages_per_zspage;
+		spin_unlock(&pool->stat_lock);
 		spin_lock(&class->lock);
-		class->pages_allocated += class->pages_per_zspage;
 	}

 	obj = (unsigned long)first_page->freelist;

@@ -1082,14 +1085,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
 	first_page->inuse--;
 	fullness = fix_fullness_group(pool, first_page);
-
-	if (fullness == ZS_EMPTY)
-		class->pages_allocated -= class->pages_per_zspage;
-
 	spin_unlock(&class->lock);

-	if (fullness == ZS_EMPTY)
+	if (fullness == ZS_EMPTY) {
+		spin_lock(&pool->stat_lock);
+		pool->pages_allocated -= class->pages_per_zspage;
+		spin_unlock(&pool->stat_lock);
 		free_zspage(first_page);
+	}
 }
 EXPORT_SYMBOL_GPL(zs_free);

@@ -1185,12 +1188,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);

 u64 zs_get_total_size_bytes(struct zs_pool *pool)
 {
-	int i;
-	u64 npages = 0;
-
-	for (i = 0; i < ZS_SIZE_CLASSES; i++)
-		npages += pool->size_class[i].pages_allocated;
+	u64 npages;

+	spin_lock(&pool->stat_lock);
+	npages = pool->pages_allocated;
+	spin_unlock(&pool->stat_lock);
 	return npages << PAGE_SHIFT;
 }
 EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
--
2.0.0
* [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
From: Minchan Kim @ 2014-08-19 7:54 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi,
  seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman,
  ds2horner, Minchan Kim

zs_get_total_size_bytes returns the amount of memory zsmalloc has
consumed in *byte units*, but zsmalloc itself operates in *page units*
rather than bytes, so let's change the API. The benefit is that we
remove unnecessary overhead (ie, converting page units to byte units)
from zsmalloc.

Now zswap can roll back to zswap_pool_pages.
Over to zswap guys ;-)

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/block/zram/zram_drv.c |  4 ++--
 include/linux/zsmalloc.h      |  2 +-
 mm/zsmalloc.c                 | 10 +++++-----
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d00831c3d731..302dd37bcea3 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -103,10 +103,10 @@ static ssize_t mem_used_total_show(struct device *dev,

 	down_read(&zram->init_lock);
 	if (init_done(zram))
-		val = zs_get_total_size_bytes(meta->mem_pool);
+		val = zs_get_total_size(meta->mem_pool);
 	up_read(&zram->init_lock);

-	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+	return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
 }

 static ssize_t max_comp_streams_show(struct device *dev,
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index e44d634e7fb7..105b56e45d23 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -46,6 +46,6 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
 			enum zs_mapmode mm);
 void zs_unmap_object(struct zs_pool *pool, unsigned long handle);

-u64 zs_get_total_size_bytes(struct zs_pool *pool);
+unsigned long zs_get_total_size(struct zs_pool *pool);

 #endif
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index a65924255763..80408a1da03a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -299,7 +299,7 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)

 static u64 zs_zpool_total_size(void *pool)
 {
-	return zs_get_total_size_bytes(pool);
+	return zs_get_total_size(pool) << PAGE_SHIFT;
 }

 static struct zpool_driver zs_zpool_driver = {
@@ -1186,16 +1186,16 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
 }
 EXPORT_SYMBOL_GPL(zs_unmap_object);

-u64 zs_get_total_size_bytes(struct zs_pool *pool)
+unsigned long zs_get_total_size(struct zs_pool *pool)
 {
-	u64 npages;
+	unsigned long npages;

 	spin_lock(&pool->stat_lock);
 	npages = pool->pages_allocated;
 	spin_unlock(&pool->stat_lock);
-	return npages << PAGE_SHIFT;
+	return npages;
 }
-EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
+EXPORT_SYMBOL_GPL(zs_get_total_size);

 module_init(zs_init);
 module_exit(zs_exit);
--
2.0.0
* Re: [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
From: Seth Jennings @ 2014-08-19 14:46 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
  juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Dan Streetman,
  ds2horner

On Tue, Aug 19, 2014 at 04:54:45PM +0900, Minchan Kim wrote:
> zs_get_total_size_bytes returns a amount of memory zsmalloc
> consumed with *byte unit* but zsmalloc operates *page unit*
> rather than byte unit so let's change the API so benefit
> we could get is that reduce unnecessary overhead
> (ie, change page unit with byte unit) in zsmalloc.
>
> Now, zswap can rollback to zswap_pool_pages.
> Over to zswap guys ;-)

I don't think that's how it is done :-/ Changing the API for a
component that has two users, changing one, then saying "hope you guys
change your newly broken stuff".

I know you would rather not move zram to the zpool API, but doing so
would avoid situations like this.

Anyway, this does break the zpool API and by extension zswap, and that
needs to be addressed in this patch or we create a point in the commit
history where it is broken.

Quick glance:
- zpool_get_total_size() return type is u64 but this patch changes to
  unsigned long. Now mismatches between zbud and zsmalloc.
- zbud_zpool_total_size needs to return pages, not bytes
- as you noted, s/pool_total_size/pool_pages/g in zswap.c plus
  modification to zswap_is_full()

Thanks,
Seth
* Re: [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
From: Seth Jennings @ 2014-08-19 15:11 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
  juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Dan Streetman,
  ds2horner

On Tue, Aug 19, 2014 at 09:46:28AM -0500, Seth Jennings wrote:
> On Tue, Aug 19, 2014 at 04:54:45PM +0900, Minchan Kim wrote:
> > zs_get_total_size_bytes returns a amount of memory zsmalloc
> > consumed with *byte unit* but zsmalloc operates *page unit*
> > rather than byte unit so let's change the API so benefit
> > we could get is that reduce unnecessary overhead
> > (ie, change page unit with byte unit) in zsmalloc.
> >
> > Now, zswap can rollback to zswap_pool_pages.
> > Over to zswap guys ;-)
>
> I don't think that's how it is done :-/ Changing the API for a
> component that has two users, changing one, then saying "hope you guys
> change your newly broken stuff".

However, I'll bite on this one :) Just squash this in so that
zpool/zswap aren't broken at any point.

Dan, care to make sure I didn't miss something?

Thanks,
Seth

diff --git a/mm/zbud.c b/mm/zbud.c
index a05790b..27a3701 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -179,7 +179,7 @@ static void zbud_zpool_unmap(void *pool, unsigned long handle)

 static u64 zbud_zpool_total_size(void *pool)
 {
-	return zbud_get_pool_size(pool) * PAGE_SIZE;
+	return zbud_get_pool_size(pool);
 }

 static struct zpool_driver zbud_zpool_driver = {
diff --git a/mm/zpool.c b/mm/zpool.c
index e40612a..d126ebc 100644
--- a/mm/zpool.c
+++ b/mm/zpool.c
@@ -336,9 +336,9 @@ void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
  * zpool_get_total_size() - The total size of the pool
  * @pool The zpool to check
  *
- * This returns the total size in bytes of the pool.
+ * This returns the total size in pages of the pool.
  *
- * Returns: Total size of the zpool in bytes.
+ * Returns: Total size of the zpool in pages.
  */
 u64 zpool_get_total_size(struct zpool *zpool)
 {
diff --git a/mm/zswap.c b/mm/zswap.c
index ea064c1..124f750 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -45,8 +45,8 @@
 /*********************************
 * statistics
 **********************************/
-/* Total bytes used by the compressed storage */
-static u64 zswap_pool_total_size;
+/* Total pages used by the compressed storage */
+static u64 zswap_pool_pages;
 /* The number of compressed pages currently stored in zswap */
 static atomic_t zswap_stored_pages = ATOMIC_INIT(0);

@@ -297,7 +297,7 @@ static void zswap_free_entry(struct zswap_entry *entry)
 	zpool_free(zswap_pool, entry->handle);
 	zswap_entry_cache_free(entry);
 	atomic_dec(&zswap_stored_pages);
-	zswap_pool_total_size = zpool_get_total_size(zswap_pool);
+	zswap_pool_pages = zpool_get_total_size(zswap_pool);
 }

 /* caller must hold the tree lock */
@@ -414,7 +414,7 @@ cleanup:
 static bool zswap_is_full(void)
 {
 	return totalram_pages * zswap_max_pool_percent / 100 <
-			DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
+			zswap_pool_pages;
 }

 /*********************************
@@ -721,7 +721,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,

 	/* update stats */
 	atomic_inc(&zswap_stored_pages);
-	zswap_pool_total_size = zpool_get_total_size(zswap_pool);
+	zswap_pool_pages = zpool_get_total_size(zswap_pool);

 	return 0;

@@ -874,8 +874,8 @@ static int __init zswap_debugfs_init(void)
 			zswap_debugfs_root, &zswap_written_back_pages);
 	debugfs_create_u64("duplicate_entry", S_IRUGO,
 			zswap_debugfs_root, &zswap_duplicate_entry);
-	debugfs_create_u64("pool_total_size", S_IRUGO,
-			zswap_debugfs_root, &zswap_pool_total_size);
+	debugfs_create_u64("pool_pages", S_IRUGO,
+			zswap_debugfs_root, &zswap_pool_pages);
 	debugfs_create_atomic_t("stored_pages", S_IRUGO,
 			zswap_debugfs_root, &zswap_stored_pages);
* Re: [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
From: Minchan Kim @ 2014-08-19 23:47 UTC (permalink / raw)
To: Seth Jennings
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
  juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Dan Streetman,
  ds2horner

On Tue, Aug 19, 2014 at 10:11:57AM -0500, Seth Jennings wrote:
> On Tue, Aug 19, 2014 at 09:46:28AM -0500, Seth Jennings wrote:
> > On Tue, Aug 19, 2014 at 04:54:45PM +0900, Minchan Kim wrote:
> > > zs_get_total_size_bytes returns a amount of memory zsmalloc
> > > consumed with *byte unit* but zsmalloc operates *page unit*
> > > rather than byte unit so let's change the API so benefit
> > > we could get is that reduce unnecessary overhead
> > > (ie, change page unit with byte unit) in zsmalloc.
> > >
> > > Now, zswap can rollback to zswap_pool_pages.
> > > Over to zswap guys ;-)
> >
> > I don't think that's how it is done :-/ Changing the API for a
> > component that has two users, changing one, then saying "hope you guys
> > change your newly broken stuff".
>
> However, I'll bite on this one :) Just squash this in so that

Thanks for biting, but it's not what I wanted. I'd like to keep the
current zpool semantics, and leave it up to you whether to roll back
to the old behavior. ;-)

> zpool/zswap aren't broken at any point.
>
> Dan, care to make sure I didn't miss something?

--
Kind regards,
Minchan Kim
* Re: [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
From: Minchan Kim @ 2014-08-19 23:46 UTC (permalink / raw)
To: Seth Jennings
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
  juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Dan Streetman,
  ds2horner

Hey Seth,

On Tue, Aug 19, 2014 at 09:46:28AM -0500, Seth Jennings wrote:
> On Tue, Aug 19, 2014 at 04:54:45PM +0900, Minchan Kim wrote:
> > zs_get_total_size_bytes returns a amount of memory zsmalloc
> > consumed with *byte unit* but zsmalloc operates *page unit*
> > rather than byte unit so let's change the API so benefit
> > we could get is that reduce unnecessary overhead
> > (ie, change page unit with byte unit) in zsmalloc.
> >
> > Now, zswap can rollback to zswap_pool_pages.
> > Over to zswap guys ;-)
>
> I don't think that's how it is done :-/ Changing the API for a
> component that has two users, changing one, then saying "hope you guys
> change your newly broken stuff".

I don't get it. zsmalloc's zs_zpool_total_size returns u64 in bytes,
not pages, so what is the broken stuff you mentioned?

static u64 zs_zpool_total_size(void *pool)
{
-	return zs_get_total_size_bytes(pool);
+	return zs_get_total_size(pool) << PAGE_SHIFT;
}

What I nudged you about is just that you could roll back to a page
count instead of bytes if the zswap guys want it; I didn't want to do
it myself, to avoid unnecessary noise for this patchset's purpose.

> I know you would rather not move zram to the zpool API, but doing so
> would avoid situations like this.
>
> Anyway, this does break the zpool API and by extension zswap, and that
> needs to be addressed in this patch or we create a point in the commit
> history where it is broken.
>
> Quick glance:
> - zpool_get_total_size() return type is u64 but this patch changes to
>   unsigned long. Now mismatches between zbud and zsmalloc.

Nope. zs_zpool_total_size returns u64 with PAGE_SHIFT.

> - zbud_zpool_total_size needs to return pages, not bytes

It's up to you, not me.

> - as you noted, s/pool_total_size/pool_pages/g in zswap.c plus
>   modification to zswap_is_full()

I didn't mean that. Sorry for the confusion.
If I missed your point, let me know.

Thanks.

--
Kind regards,
Minchan Kim
* Re: [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes
From: Seth Jennings @ 2014-08-20 4:44 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
  juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Dan Streetman,
  ds2horner

On Wed, Aug 20, 2014 at 08:46:21AM +0900, Minchan Kim wrote:
> Hey Seth,
>
> On Tue, Aug 19, 2014 at 09:46:28AM -0500, Seth Jennings wrote:
> > On Tue, Aug 19, 2014 at 04:54:45PM +0900, Minchan Kim wrote:
> > > zs_get_total_size_bytes returns a amount of memory zsmalloc
> > > consumed with *byte unit* but zsmalloc operates *page unit*
> > > rather than byte unit so let's change the API so benefit
> > > we could get is that reduce unnecessary overhead
> > > (ie, change page unit with byte unit) in zsmalloc.
> > >
> > > Now, zswap can rollback to zswap_pool_pages.
> > > Over to zswap guys ;-)
> >
> > I don't think that's how it is done :-/ Changing the API for a
> > component that has two users, changing one, then saying "hope you guys
> > change your newly broken stuff".
>
> I don't get it. zsmalloc's zs_zpool_total_size returns u64 in bytes,
> not pages, so what is the broken stuff you mentioned?
>
> static u64 zs_zpool_total_size(void *pool)
> {
> -	return zs_get_total_size_bytes(pool);
> +	return zs_get_total_size(pool) << PAGE_SHIFT;
> }

Ah, that's my bad. I forgot that there is zs_get_total_size() and
zs_zpool_total_size(), and that the zpool driver calls
zs_zpool_total_size(), not zs_get_total_size(). Nevermind!

Seth

> Thing I nudged to you is just you could roll back to pages count
> instead of bytes if zswap guys want it, and I don't want it myself
> for avoiding unnecessary noise for this patchset's purpose.
* [PATCH v2 3/4] zram: zram memory size limitation 2014-08-19 7:54 [PATCH v2 0/4] zram memory control enhance Minchan Kim 2014-08-19 7:54 ` [PATCH v2 1/4] zsmalloc: move pages_allocated to zs_pool Minchan Kim 2014-08-19 7:54 ` [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes Minchan Kim @ 2014-08-19 7:54 ` Minchan Kim 2014-08-19 8:06 ` Marc Dietrich 2014-08-19 7:54 ` [PATCH v2 4/4] zram: report maximum used memory Minchan Kim 3 siblings, 1 reply; 19+ messages in thread From: Minchan Kim @ 2014-08-19 7:54 UTC (permalink / raw) To: Andrew Morton Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman, ds2horner, Minchan Kim Since zram has no control feature to limit memory usage, it is hard to manage system memory. This patch adds a new knob "mem_limit" via sysfs to set up a limit so that zram could fail allocation once it reaches the limit. Signed-off-by: Minchan Kim <minchan@kernel.org> --- Documentation/ABI/testing/sysfs-block-zram | 9 +++++ Documentation/blockdev/zram.txt | 20 ++++++++++++--- drivers/block/zram/zram_drv.c | 41 ++++++++++++++++++++++++++++++ drivers/block/zram/zram_drv.h | 5 ++++ 4 files changed, 71 insertions(+), 4 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram index 70ec992514d0..025331c19045 100644 --- a/Documentation/ABI/testing/sysfs-block-zram +++ b/Documentation/ABI/testing/sysfs-block-zram @@ -119,3 +119,12 @@ Description: efficiency can be calculated using compr_data_size and this statistic. Unit: bytes + +What: /sys/block/zram<id>/mem_limit +Date: August 2014 +Contact: Minchan Kim <minchan@kernel.org> +Description: + The mem_limit file is read/write and specifies the maximum + amount of memory zram can consume to store
+ Unit: bytes diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt index 0595c3f56ccf..9f239ff8c444 100644 --- a/Documentation/blockdev/zram.txt +++ b/Documentation/blockdev/zram.txt @@ -74,14 +74,26 @@ There is little point creating a zram of greater than twice the size of memory since we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the size of the disk when not in use so a huge zram is wasteful. -5) Activate: +5) Set memory limit: Optional + Set memory limit by writing the value to sysfs node 'mem_limit'. + The value can be either in bytes or you can use mem suffixes. + Examples: + # limit /dev/zram0 with 50MB memory + echo $((50*1024*1024)) > /sys/block/zram0/mem_limit + + # Using mem suffixes + echo 256K > /sys/block/zram0/mem_limit + echo 512M > /sys/block/zram0/mem_limit + echo 1G > /sys/block/zram0/mem_limit + +6) Activate: mkswap /dev/zram0 swapon /dev/zram0 mkfs.ext4 /dev/zram1 mount /dev/zram1 /tmp -6) Stats: +7) Stats: Per-device statistics are exported as various nodes under /sys/block/zram<id>/ disksize @@ -96,11 +108,11 @@ size of the disk when not in use so a huge zram is wasteful. 
compr_data_size mem_used_total -7) Deactivate: +8) Deactivate: swapoff /dev/zram0 umount /dev/zram1 -8) Reset: +9) Reset: Write any positive value to 'reset' sysfs node echo 1 > /sys/block/zram0/reset echo 1 > /sys/block/zram1/reset diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index 302dd37bcea3..adc91c7ecaef 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -122,6 +122,33 @@ static ssize_t max_comp_streams_show(struct device *dev, return scnprintf(buf, PAGE_SIZE, "%d\n", val); } +static ssize_t mem_limit_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + u64 val; + struct zram *zram = dev_to_zram(dev); + + down_read(&zram->init_lock); + val = zram->limit_pages; + up_read(&zram->init_lock); + + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT); +} + +static ssize_t mem_limit_store(struct device *dev, + struct device_attribute *attr, const char *buf, size_t len) +{ + u64 limit; + struct zram *zram = dev_to_zram(dev); + + limit = memparse(buf, NULL); + down_write(&zram->init_lock); + zram->limit_pages = PAGE_ALIGN(limit) >> PAGE_SHIFT; + up_write(&zram->init_lock); + + return len; +} + static ssize_t max_comp_streams_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t len) { @@ -513,6 +540,14 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, ret = -ENOMEM; goto out; } + + if (zram->limit_pages && + zs_get_total_size(meta->mem_pool) > zram->limit_pages) { + zs_free(meta->mem_pool, handle); + ret = -ENOMEM; + goto out; + } + cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO); if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) { @@ -617,6 +652,9 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity) struct zram_meta *meta; down_write(&zram->init_lock); + + zram->limit_pages = 0; + if (!init_done(zram)) { up_write(&zram->init_lock); return; @@ -857,6 +895,8 @@ static DEVICE_ATTR(initstate, 
S_IRUGO, initstate_show, NULL); static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store); static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL); static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL); +static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show, + mem_limit_store); static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR, max_comp_streams_show, max_comp_streams_store); static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR, @@ -885,6 +925,7 @@ static struct attribute *zram_disk_attrs[] = { &dev_attr_orig_data_size.attr, &dev_attr_compr_data_size.attr, &dev_attr_mem_used_total.attr, + &dev_attr_mem_limit.attr, &dev_attr_max_comp_streams.attr, &dev_attr_comp_algorithm.attr, NULL, diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h index e0f725c87cc6..b7aa9c21553f 100644 --- a/drivers/block/zram/zram_drv.h +++ b/drivers/block/zram/zram_drv.h @@ -112,6 +112,11 @@ struct zram { u64 disksize; /* bytes */ int max_comp_streams; struct zram_stats stats; + /* + * the number of pages zram can consume for storing compressed data + */ + unsigned long limit_pages; + char compressor[10]; }; #endif -- 2.0.0 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/4] zram: zram memory size limitation 2014-08-19 7:54 ` [PATCH v2 3/4] zram: zram memory size limitation Minchan Kim @ 2014-08-19 8:06 ` Marc Dietrich 2014-08-19 23:32 ` Minchan Kim 0 siblings, 1 reply; 19+ messages in thread From: Marc Dietrich @ 2014-08-19 8:06 UTC (permalink / raw) To: Minchan Kim Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman, ds2horner On Tuesday, 19 August 2014, 16:54:46, Minchan Kim wrote: > Since zram has no control feature to limit memory usage, > it is hard to manage system memory. > > This patch adds a new knob "mem_limit" via sysfs to set up > a limit so that zram could fail allocation once it reaches > the limit. Sorry to jump in late with a probably silly question, but I couldn't find the answer easily. What's the difference between disksize and mem_limit? I assume the former is uncompressed size (virtual size) and the latter is compressed size (real memory usage)? Maybe the difference should be made clearer in the documentation. If disksize is the uncompressed size, why would we want to set this at all? Marc ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/4] zram: zram memory size limitation 2014-08-19 8:06 ` Marc Dietrich @ 2014-08-19 23:32 ` Minchan Kim 2014-08-20 7:58 ` Marc Dietrich 0 siblings, 1 reply; 19+ messages in thread From: Minchan Kim @ 2014-08-19 23:32 UTC (permalink / raw) To: Marc Dietrich Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman, ds2horner Hello, On Tue, Aug 19, 2014 at 10:06:22AM +0200, Marc Dietrich wrote: > On Tuesday, 19 August 2014, 16:54:46, Minchan Kim wrote: > > Since zram has no control feature to limit memory usage, > > it is hard to manage system memory. > > > > This patch adds a new knob "mem_limit" via sysfs to set up > > a limit so that zram could fail allocation once it reaches > > the limit. > > Sorry to jump in late with a probably silly question, but I couldn't find the > answer easily. What's the difference between disksize and mem_limit? No need to say sorry. It was totally my fault because zram documentation sucks. The disksize is the size from the point of view of the upper layers of the block subsystem, so a filesystem based on zram, or the block device itself (ie, zram0) as seen by the admin, will have the disksize size; but keep in mind that it's a virtual size, not the compressed one. As you already know, zram is backed by volatile storage (ie, DRAM) in *compressed form*, not permanent storage. The point of this patchset is that nobody can predict the exact memory usage of zram in advance. Actually, zram folks have estimated it by several experiments, assuming a zram compression ratio (ex, 2:1 or 3:1), before releasing a product. But these days, embedded platforms have various workloads which cannot be anticipated when the product is released, so the compression ratio expectation can sometimes be wrong and zram can consume more memory than expected once the compression ratio is low.
It makes it troublesome for the admin to manage memory on the product because there is no way to release the memory zram is using, so one of the ways is to limit zram's memory usage from the beginning. > I assume the former is uncompressed size (virtual size) and the latter is > compressed size (real memory usage)? Maybe the difference should be made Right. > clearer in the documentation. Okay. > If disksize is the uncompressed size, why would we want to set this at all? For example, we have a 500M disksize for zram0 because we assumed a 2:1 compression ratio, so we could guess zram will consume 250M of physical memory in the end. But our guess could be wrong: if the real compression ratio is 4:1, we use up only 125M of physical memory to store 500M of uncompressed pages. That's good, but the admin may want to use more memory for zram since we used only half of what we expected; we can't, because from the upper layer's point of view zram is already full at 500M, and if zram is used for swap, we will encounter OOM. :( So it would be better to increase the disksize to 1000M, but in this case, if the compression ratio drops back to 2:1 by something (ex, a workload change), zram can consume 500M of physical memory, which is above what we expected, and the admin doesn't want zram to use up too much system memory. In summary, we couldn't control zram's exact memory usage with only disksize, because of the varying compression ratio. > > Marc -- Kind regards, Minchan Kim ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/4] zram: zram memory size limitation 2014-08-19 23:32 ` Minchan Kim @ 2014-08-20 7:58 ` Marc Dietrich 0 siblings, 0 replies; 19+ messages in thread From: Marc Dietrich @ 2014-08-20 7:58 UTC (permalink / raw) To: Minchan Kim Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman, ds2horner On Wednesday, 20 August 2014, 08:32:25, Minchan Kim wrote: > Hello, > > On Tue, Aug 19, 2014 at 10:06:22AM +0200, Marc Dietrich wrote: > > On Tuesday, 19 August 2014, 16:54:46, Minchan Kim wrote: > > > Since zram has no control feature to limit memory usage, > > > it is hard to manage system memory. > > > > > > This patch adds a new knob "mem_limit" via sysfs to set up > > > a limit so that zram could fail allocation once it reaches > > > the limit. > > > > Sorry to jump in late with a probably silly question, but I couldn't find > > the answer easily. What's the difference between disksize and mem_limit? > No need to say sorry. > It was totally my fault because zram documentation sucks. > > The disksize is the size from the point of view of the upper layers of the > block subsystem, so a filesystem based on zram, or the block device itself > (ie, zram0) as seen by the admin, will have the disksize size; but keep in > mind that it's a virtual size, not the compressed one. As you already know, > zram is backed by volatile storage (ie, DRAM) in *compressed form*, > not permanent storage. > > The point of this patchset is that nobody can predict the exact memory > usage of zram in advance. Actually, zram folks have estimated it by several > experiments, assuming a zram compression ratio (ex, 2:1 or 3:1), before > releasing a product.
But these days, embedded platforms have various workloads > which cannot be anticipated when the product is released, so the compression > ratio expectation can sometimes be wrong and zram can consume more > memory than expected once the compression ratio is low. > > It makes it troublesome for the admin to manage memory on the product because there > is no way to release the memory zram is using, so one of the ways is > to limit zram's memory usage from the beginning. > > > I assume the former is uncompressed size (virtual size) and the latter is > > compressed size (real memory usage)? Maybe the difference should be made > > Right. > > > clearer in the documentation. > > Okay. > > > If disksize is the uncompressed size, why would we want to set this at > > all? > > For example, we have a 500M disksize for zram0 because we assumed a 2:1 > compression ratio, so we could guess zram will consume 250M of physical > memory in the end. But our guess could be wrong: if the real compression > ratio is 4:1, we use up only 125M of physical memory to store 500M of > uncompressed pages. That's good, but the admin may want to use more memory > for zram since we used only half of what we expected; we can't, because from > the upper layer's point of view zram is already full at 500M, and if zram is used > for swap, we will encounter OOM. :( > > So it would be better to increase the disksize to 1000M, but in this case, > if the compression ratio drops back to 2:1 by something (ex, a workload change), > zram can consume 500M of physical memory, which is above what we expected, > and the admin doesn't want zram to use up too much system memory. > > In summary, we couldn't control zram's exact memory usage with only disksize, > because of the varying compression ratio. Thanks for your detailed explanation. It's a bit confusing that you can specify two limits (for two different layers). I guess a floating disksize is not possible, because you wouldn't be able to create a filesystem/swapfile on it, so you need to make a *fixed* assumption.
Regards, Marc ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 4/4] zram: report maximum used memory 2014-08-19 7:54 [PATCH v2 0/4] zram memory control enhance Minchan Kim ` (2 preceding siblings ...) 2014-08-19 7:54 ` [PATCH v2 3/4] zram: zram memory size limitation Minchan Kim @ 2014-08-19 7:54 ` Minchan Kim 2014-08-20 6:26 ` David Horner 3 siblings, 1 reply; 19+ messages in thread From: Minchan Kim @ 2014-08-19 7:54 UTC (permalink / raw) To: Andrew Morton Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman, ds2horner, Minchan Kim Normally, a zram user could get the maximum memory usage zram consumed via polling mem_used_total with sysfs in userspace. But it has a critical problem because the user can miss the peak memory usage during the polling update interval. To avoid that, the user should poll it with a shorter interval (ie, 0.0000000001s) with mlocking to avoid page fault delay when memory pressure is heavy. It would be troublesome. This patch adds a new knob "mem_used_max" so the user could see the maximum memory usage easily via reading the knob and reset it via "echo 0 > /sys/block/zram0/mem_used_max". Signed-off-by: Minchan Kim <minchan@kernel.org> --- Documentation/ABI/testing/sysfs-block-zram | 10 +++++ Documentation/blockdev/zram.txt | 1 + drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++- drivers/block/zram/zram_drv.h | 1 + 4 files changed, 70 insertions(+), 2 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram index 025331c19045..ffd1ea7443dd 100644 --- a/Documentation/ABI/testing/sysfs-block-zram +++ b/Documentation/ABI/testing/sysfs-block-zram @@ -120,6 +120,16 @@ Description: statistic. Unit: bytes +What: /sys/block/zram<id>/mem_used_max +Date: August 2014 +Contact: Minchan Kim <minchan@kernel.org> +Description: + The mem_used_max file is read/write and specifies the maximum + amount of memory zram has consumed to store compressed data.
+ For resetting the value, you should do "echo 0". Otherwise, + you could see -EINVAL. + Unit: bytes + What: /sys/block/zram<id>/mem_limit Date: August 2014 Contact: Minchan Kim <minchan@kernel.org> diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt index 9f239ff8c444..3b2247c2d4cf 100644 --- a/Documentation/blockdev/zram.txt +++ b/Documentation/blockdev/zram.txt @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful. orig_data_size compr_data_size mem_used_total + mem_used_max 8) Deactivate: swapoff /dev/zram0 diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index adc91c7ecaef..e4d44842a91d 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev, return len; } +static ssize_t mem_used_max_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + u64 val = 0; + struct zram *zram = dev_to_zram(dev); + + down_read(&zram->init_lock); + if (init_done(zram)) + val = atomic64_read(&zram->stats.max_used_pages); + up_read(&zram->init_lock); + + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT); +} + +static ssize_t mem_used_max_store(struct device *dev, + struct device_attribute *attr, const char *buf, size_t len) +{ + u64 limit; + struct zram *zram = dev_to_zram(dev); + struct zram_meta *meta = zram->meta; + + limit = memparse(buf, NULL); + if (0 != limit) + return -EINVAL; + + down_read(&zram->init_lock); + if (init_done(zram)) + atomic64_set(&zram->stats.max_used_pages, + zs_get_total_size(meta->mem_pool)); + up_read(&zram->init_lock); + + return len; +} + static ssize_t max_comp_streams_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t len) { @@ -461,6 +495,26 @@ out_cleanup: return ret; } +static bool check_limit(struct zram *zram) +{ + unsigned long alloced_pages; + u64 old_max, cur_max; + struct zram_meta *meta = zram->meta; + + do { + 
alloced_pages = zs_get_total_size(meta->mem_pool); + if (zram->limit_pages && alloced_pages > zram->limit_pages) + return false; + + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages); + if (alloced_pages > cur_max) + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages, + cur_max, alloced_pages); + } while (old_max != cur_max); + + return true; +} + static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, int offset) { @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, goto out; } - if (zram->limit_pages && - zs_get_total_size(meta->mem_pool) > zram->limit_pages) { + if (!check_limit(zram)) { zs_free(meta->mem_pool, handle); ret = -ENOMEM; goto out; @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL); static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL); static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show, mem_limit_store); +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show, + mem_used_max_store); static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR, max_comp_streams_show, max_comp_streams_store); static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR, @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = { &dev_attr_compr_data_size.attr, &dev_attr_mem_used_total.attr, &dev_attr_mem_limit.attr, + &dev_attr_mem_used_max.attr, &dev_attr_max_comp_streams.attr, &dev_attr_comp_algorithm.attr, NULL, diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h index b7aa9c21553f..29383312d543 100644 --- a/drivers/block/zram/zram_drv.h +++ b/drivers/block/zram/zram_drv.h @@ -90,6 +90,7 @@ struct zram_stats { atomic64_t notify_free; /* no. of swap slot free notifications */ atomic64_t zero_pages; /* no. of zero filled pages */ atomic64_t pages_stored; /* no. of pages currently stored */ + atomic64_t max_used_pages; /* no. 
of maximum pages stored */ }; struct zram_meta { -- 2.0.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2 4/4] zram: report maximum used memory 2014-08-19 7:54 ` [PATCH v2 4/4] zram: report maximum used memory Minchan Kim @ 2014-08-20 6:26 ` David Horner 2014-08-20 6:53 ` Minchan Kim 0 siblings, 1 reply; 19+ messages in thread From: David Horner @ 2014-08-20 6:26 UTC (permalink / raw) To: Minchan Kim Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote: > Normally, a zram user could get the maximum memory usage zram consumed > via polling mem_used_total with sysfs in userspace. > > But it has a critical problem because the user can miss the peak memory > usage during the polling update interval. To avoid that, > the user should poll it with a shorter interval (ie, 0.0000000001s) > with mlocking to avoid page fault delay when memory pressure > is heavy. It would be troublesome. > > This patch adds a new knob "mem_used_max" so the user could see > the maximum memory usage easily via reading the knob and reset > it via "echo 0 > /sys/block/zram0/mem_used_max". > > Signed-off-by: Minchan Kim <minchan@kernel.org> > --- > Documentation/ABI/testing/sysfs-block-zram | 10 +++++ > Documentation/blockdev/zram.txt | 1 + > drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++- > drivers/block/zram/zram_drv.h | 1 + > 4 files changed, 70 insertions(+), 2 deletions(-) > > diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram > index 025331c19045..ffd1ea7443dd 100644 > --- a/Documentation/ABI/testing/sysfs-block-zram > +++ b/Documentation/ABI/testing/sysfs-block-zram > @@ -120,6 +120,16 @@ Description: > statistic.
> Unit: bytes > > +What: /sys/block/zram<id>/mem_used_max > +Date: August 2014 > +Contact: Minchan Kim <minchan@kernel.org> > +Description: > + The mem_used_max file is read/write and specifies the maximum > + amount of memory zram has consumed to store compressed data. > + For resetting the value, you should do "echo 0". Otherwise, > + you could see -EINVAL. > + Unit: bytes > + > What: /sys/block/zram<id>/mem_limit > Date: August 2014 > Contact: Minchan Kim <minchan@kernel.org> > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt > index 9f239ff8c444..3b2247c2d4cf 100644 > --- a/Documentation/blockdev/zram.txt > +++ b/Documentation/blockdev/zram.txt > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful. > orig_data_size > compr_data_size > mem_used_total > + mem_used_max > > 8) Deactivate: > swapoff /dev/zram0 > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c > index adc91c7ecaef..e4d44842a91d 100644 > --- a/drivers/block/zram/zram_drv.c > +++ b/drivers/block/zram/zram_drv.c > @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev, > return len; > } > > +static ssize_t mem_used_max_show(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + u64 val = 0; > + struct zram *zram = dev_to_zram(dev); > + > + down_read(&zram->init_lock); > + if (init_done(zram)) > + val = atomic64_read(&zram->stats.max_used_pages); > + up_read(&zram->init_lock); > + > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT); > +} > + > +static ssize_t mem_used_max_store(struct device *dev, > + struct device_attribute *attr, const char *buf, size_t len) > +{ > + u64 limit; > + struct zram *zram = dev_to_zram(dev); > + struct zram_meta *meta = zram->meta; > + > - limit = memparse(buf, NULL); > - if (0 != limit) we wanted explicit "0" and nothing else for extensibility if (len != 1 || *buf != '0') > + return -EINVAL; > + > + down_read(&zram->init_lock); > + if
(init_done(zram)) > + atomic64_set(&zram->stats.max_used_pages, > + zs_get_total_size(meta->mem_pool)); > + up_read(&zram->init_lock); > + > + return len; return 1; the standard convention is to return the used amount of the buffer > +} > + > static ssize_t max_comp_streams_store(struct device *dev, > struct device_attribute *attr, const char *buf, size_t len) > { > @@ -461,6 +495,26 @@ out_cleanup: > return ret; > } > > +static bool check_limit(struct zram *zram) > +{ > + unsigned long alloced_pages; > + u64 old_max, cur_max; > + struct zram_meta *meta = zram->meta; > + > + do { > + alloced_pages = zs_get_total_size(meta->mem_pool); > + if (zram->limit_pages && alloced_pages > zram->limit_pages) > + return false; > + > + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages); > + if (alloced_pages > cur_max) > + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages, > + cur_max, alloced_pages); > + } while (old_max != cur_max); > + > + return true; > +} > + Check_limit does more than check limit - it has a substantial side effect of updating max used. Basically if we already allocated the buffer and our alloced_pages is less than the limit then we are good to go. It is the race to update that we need to have the cmpxchg. And maybe a helper function would aid readability - not sure, see next point. I don't believe there is a need for the loop either. Any other updater will also be including our allocated pages (and at this point in the code eliminated from roll back) so if they beat us to it, then no problem, their max is better than ours.
> static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > int offset) > { > @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > goto out; > } > > - if (zram->limit_pages && > - zs_get_total_size(meta->mem_pool) > zram->limit_pages) { > + if (!check_limit(zram)) { > zs_free(meta->mem_pool, handle); > ret = -ENOMEM; > goto out; > @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL); > static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL); > static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show, > mem_limit_store); > +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show, > + mem_used_max_store); > static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR, > max_comp_streams_show, max_comp_streams_store); > static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR, > @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = { > &dev_attr_compr_data_size.attr, > &dev_attr_mem_used_total.attr, > &dev_attr_mem_limit.attr, > + &dev_attr_mem_used_max.attr, > &dev_attr_max_comp_streams.attr, > &dev_attr_comp_algorithm.attr, > NULL, > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h > index b7aa9c21553f..29383312d543 100644 > --- a/drivers/block/zram/zram_drv.h > +++ b/drivers/block/zram/zram_drv.h > @@ -90,6 +90,7 @@ struct zram_stats { > atomic64_t notify_free; /* no. of swap slot free notifications */ > atomic64_t zero_pages; /* no. of zero filled pages */ > atomic64_t pages_stored; /* no. of pages currently stored */ > + atomic64_t max_used_pages; /* no. of maximum pages stored */ > }; > > struct zram_meta { > -- > 2.0.0 > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 4/4] zram: report maximum used memory 2014-08-20 6:26 ` David Horner @ 2014-08-20 6:53 ` Minchan Kim 2014-08-20 7:38 ` David Horner 2014-08-21 0:06 ` Minchan Kim 0 siblings, 2 replies; 19+ messages in thread From: Minchan Kim @ 2014-08-20 6:53 UTC (permalink / raw) To: David Horner Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman On Wed, Aug 20, 2014 at 02:26:50AM -0400, David Horner wrote: > On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote: > > Normally, a zram user could get the maximum memory usage zram consumed > > via polling mem_used_total with sysfs in userspace. > > > > But it has a critical problem because the user can miss the peak memory > > usage during the polling update interval. To avoid that, > > the user should poll it with a shorter interval (ie, 0.0000000001s) > > with mlocking to avoid page fault delay when memory pressure > > is heavy. It would be troublesome. > > > > This patch adds a new knob "mem_used_max" so the user could see > > the maximum memory usage easily via reading the knob and reset > > it via "echo 0 > /sys/block/zram0/mem_used_max". > > > > Signed-off-by: Minchan Kim <minchan@kernel.org> > > --- > > Documentation/ABI/testing/sysfs-block-zram | 10 +++++ > > Documentation/blockdev/zram.txt | 1 + > > drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++- > > drivers/block/zram/zram_drv.h | 1 + > > 4 files changed, 70 insertions(+), 2 deletions(-) > > > > diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram > > index 025331c19045..ffd1ea7443dd 100644 > > --- a/Documentation/ABI/testing/sysfs-block-zram > > +++ b/Documentation/ABI/testing/sysfs-block-zram > > @@ -120,6 +120,16 @@ Description: > > statistic.
> > Unit: bytes > > > > +What: /sys/block/zram<id>/mem_used_max > > +Date: August 2014 > > +Contact: Minchan Kim <minchan@kernel.org> > > +Description: > > + The mem_used_max file is read/write and specifies the amount > > + of maximum memory zram have consumed to store compressed data. > > + For resetting the value, you should do "echo 0". Otherwise, > > + you could see -EINVAL. > > + Unit: bytes > > + > > What: /sys/block/zram<id>/mem_limit > > Date: August 2014 > > Contact: Minchan Kim <minchan@kernel.org> > > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt > > index 9f239ff8c444..3b2247c2d4cf 100644 > > --- a/Documentation/blockdev/zram.txt > > +++ b/Documentation/blockdev/zram.txt > > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful. > > orig_data_size > > compr_data_size > > mem_used_total > > + mem_used_max > > > > 8) Deactivate: > > swapoff /dev/zram0 > > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c > > index adc91c7ecaef..e4d44842a91d 100644 > > --- a/drivers/block/zram/zram_drv.c > > +++ b/drivers/block/zram/zram_drv.c > > @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev, > > return len; > > } > > > > +static ssize_t mem_used_max_show(struct device *dev, > > + struct device_attribute *attr, char *buf) > > +{ > > + u64 val = 0; > > + struct zram *zram = dev_to_zram(dev); > > + > > + down_read(&zram->init_lock); > > + if (init_done(zram)) > > + val = atomic64_read(&zram->stats.max_used_pages); > > + up_read(&zram->init_lock); > > + > > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT); > > +} > > + > > +static ssize_t mem_used_max_store(struct device *dev, > > + struct device_attribute *attr, const char *buf, size_t len) > > +{ > > + u64 limit; > > + struct zram *zram = dev_to_zram(dev); > > + struct zram_meta *meta = zram->meta; > > + > > - limit = memparse(buf, NULL); > > - if (0 != limit) > > we wanted explicit "0" and nothing 
else for extensibility > > if (len != 1 || *buf != '0') > I wanted to work with "0", "0K", "0M", "0G" but agree it's meaningless at the moment so your version is better. > > + return -EINVAL; > > + > > + down_read(&zram->init_lock); > > + if (init_done(zram)) > > + atomic64_set(&zram->stats.max_used_pages, > > + zs_get_total_size(meta->mem_pool)); > > + up_read(&zram->init_lock); > > + > > + return len; > return 1; > > the standard convention is to return used amount of buffer If I follow your suggestion, len should be 1 right before returning, so there is no problem from a functionality POV, but I agree an explicit "1" is better for readability, so your version is better. > > > > > +} > > + > > static ssize_t max_comp_streams_store(struct device *dev, > > struct device_attribute *attr, const char *buf, size_t len) > > { > > @@ -461,6 +495,26 @@ out_cleanup: > > return ret; > > } > > > > +static bool check_limit(struct zram *zram) > > +{ > > + unsigned long alloced_pages; > > + u64 old_max, cur_max; > > + struct zram_meta *meta = zram->meta; > > + > > + do { > > + alloced_pages = zs_get_total_size(meta->mem_pool); > > + if (zram->limit_pages && alloced_pages > zram->limit_pages) > > + return false; > > + > > + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages); > > + if (alloced_pages > cur_max) > > + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages, > > + cur_max, alloced_pages); > > + } while (old_max != cur_max); > > + > > + return true; > > +} > > + > > check_limit does more than check the limit - it has a substantial side > effect of updating max used. Hmm, normally the limit check is the best place to update the max, and although the function name implies just checking the limit, I don't think the small piece of code updating the max hurts readability.
If you really want it, please ping me again. > > Basically if we already allocated the buffer and our alloced_pages is > less than the limit then we are good to go. Yeb. > > It is the race to update that we need to have the cmpxchg. > And maybe a helper function would aid readability - not sure, see next point. > > I don't believe there is a need for the loop either. > Any other updater will also be including our allocated pages > (and at this point in the code eliminated from roll back) > so if they beat us to it, then no problem, their max is better than ours. Let's assume we don't have the loop. CPU A CPU B alloced_pages = 2001 old_max = cur_max = 2000 alloced_pages = 2005 old_max = cur_max = 2000 cmpxchg(2000, 2000, 2001) -> OK cmpxchg(2001, 2000, 2005) -> FAIL So we lose 2005, which is the bigger value. > > > > > > static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > > int offset) > > { > > @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > > goto out; > > } > > > > - if (zram->limit_pages && > > - zs_get_total_size(meta->mem_pool) > zram->limit_pages) { > > + if (!check_limit(zram)) { > > zs_free(meta->mem_pool, handle); > > ret = -ENOMEM; > > goto out; > > @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL); > > static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL); > > static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show, > > mem_limit_store); > > +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show, > > + mem_used_max_store); > > static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR, > > max_comp_streams_show, max_comp_streams_store); > > static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR, > > @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = { > > &dev_attr_compr_data_size.attr, > > &dev_attr_mem_used_total.attr, > > &dev_attr_mem_limit.attr, > > + &dev_attr_mem_used_max.attr, > >
&dev_attr_max_comp_streams.attr, > > &dev_attr_comp_algorithm.attr, > > NULL, > > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h > > index b7aa9c21553f..29383312d543 100644 > > --- a/drivers/block/zram/zram_drv.h > > +++ b/drivers/block/zram/zram_drv.h > > @@ -90,6 +90,7 @@ struct zram_stats { > > atomic64_t notify_free; /* no. of swap slot free notifications */ > > atomic64_t zero_pages; /* no. of zero filled pages */ > > atomic64_t pages_stored; /* no. of pages currently stored */ > > + atomic64_t max_used_pages; /* no. of maximum pages stored */ > > }; > > > > struct zram_meta { > > -- > > 2.0.0 -- Kind regards, Minchan Kim
* Re: [PATCH v2 4/4] zram: report maximum used memory 2014-08-20 6:53 ` Minchan Kim @ 2014-08-20 7:38 ` David Horner 2014-08-20 7:53 ` Minchan Kim 2014-08-21 0:06 ` Minchan Kim 1 sibling, 1 reply; 19+ messages in thread From: David Horner @ 2014-08-20 7:38 UTC (permalink / raw) To: Minchan Kim Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman On Wed, Aug 20, 2014 at 2:53 AM, Minchan Kim <minchan@kernel.org> wrote: > On Wed, Aug 20, 2014 at 02:26:50AM -0400, David Horner wrote: >> On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote: >> > Normally, zram user could get maximum memory usage zram consumed >> > via polling mem_used_total with sysfs in userspace. >> > >> > But it has a critical problem because user can miss peak memory >> > usage during update inverval of polling. For avoiding that, >> > user should poll it with shorter interval(ie, 0.0000000001s) >> > with mlocking to avoid page fault delay when memory pressure >> > is heavy. It would be troublesome. >> > >> > This patch adds new knob "mem_used_max" so user could see >> > the maximum memory usage easily via reading the knob and reset >> > it via "echo 0 > /sys/block/zram0/mem_used_max". >> > >> > Signed-off-by: Minchan Kim <minchan@kernel.org> >> > --- >> > Documentation/ABI/testing/sysfs-block-zram | 10 +++++ >> > Documentation/blockdev/zram.txt | 1 + >> > drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++- >> > drivers/block/zram/zram_drv.h | 1 + >> > 4 files changed, 70 insertions(+), 2 deletions(-) >> > >> > diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram >> > index 025331c19045..ffd1ea7443dd 100644 >> > --- a/Documentation/ABI/testing/sysfs-block-zram >> > +++ b/Documentation/ABI/testing/sysfs-block-zram >> > @@ -120,6 +120,16 @@ Description: >> > statistic. 
>> > Unit: bytes >> > >> > +What: /sys/block/zram<id>/mem_used_max >> > +Date: August 2014 >> > +Contact: Minchan Kim <minchan@kernel.org> >> > +Description: >> > + The mem_used_max file is read/write and specifies the amount >> > + of maximum memory zram have consumed to store compressed data. >> > + For resetting the value, you should do "echo 0". Otherwise, >> > + you could see -EINVAL. >> > + Unit: bytes >> > + >> > What: /sys/block/zram<id>/mem_limit >> > Date: August 2014 >> > Contact: Minchan Kim <minchan@kernel.org> >> > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt >> > index 9f239ff8c444..3b2247c2d4cf 100644 >> > --- a/Documentation/blockdev/zram.txt >> > +++ b/Documentation/blockdev/zram.txt >> > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful. >> > orig_data_size >> > compr_data_size >> > mem_used_total >> > + mem_used_max >> > >> > 8) Deactivate: >> > swapoff /dev/zram0 >> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c >> > index adc91c7ecaef..e4d44842a91d 100644 >> > --- a/drivers/block/zram/zram_drv.c >> > +++ b/drivers/block/zram/zram_drv.c >> > @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev, >> > return len; >> > } >> > >> > +static ssize_t mem_used_max_show(struct device *dev, >> > + struct device_attribute *attr, char *buf) >> > +{ >> > + u64 val = 0; >> > + struct zram *zram = dev_to_zram(dev); >> > + >> > + down_read(&zram->init_lock); >> > + if (init_done(zram)) >> > + val = atomic64_read(&zram->stats.max_used_pages); >> > + up_read(&zram->init_lock); >> > + >> > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT); >> > +} >> > + >> > +static ssize_t mem_used_max_store(struct device *dev, >> > + struct device_attribute *attr, const char *buf, size_t len) >> > +{ >> > + u64 limit; >> > + struct zram *zram = dev_to_zram(dev); >> > + struct zram_meta *meta = zram->meta; >> > + >> > - limit = memparse(buf, NULL); >> > 
- if (0 != limit) >> >> we wanted explicit "0" and nothing else for extensibility >> >> if (len != 1 || *buf != "0") >> > > I wanted to work with "0", "0K", "0M", "0G" but agree it's meaningless > at the moment so your version is better. > > >> > + return -EINVAL; >> > + >> > + down_read(&zram->init_lock); >> > + if (init_done(zram)) >> > + atomic64_set(&zram->stats.max_used_pages, >> > + zs_get_total_size(meta->mem_pool)); >> > + up_read(&zram->init_lock); >> > + >> > + return len; >> return 1; >> >> the standard convention is to return used amount of buffer > > If I follow your suggestion, len should be 1 right before returning > so no problem for functionality POV but I agree explicit "1" is better > for readability so your version is better, better. > >> >> >> >> > +} >> > + >> > static ssize_t max_comp_streams_store(struct device *dev, >> > struct device_attribute *attr, const char *buf, size_t len) >> > { >> > @@ -461,6 +495,26 @@ out_cleanup: >> > return ret; >> > } >> > >> > +static bool check_limit(struct zram *zram) >> > +{ >> > + unsigned long alloced_pages; >> > + u64 old_max, cur_max; >> > + struct zram_meta *meta = zram->meta; >> > + >> > + do { >> > + alloced_pages = zs_get_total_size(meta->mem_pool); >> > + if (zram->limit_pages && alloced_pages > zram->limit_pages) >> > + return false; >> > + >> > + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages); >> > + if (alloced_pages > cur_max) >> > + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages, >> > + cur_max, alloced_pages); >> > + } while (old_max != cur_max); >> > + >> > + return true; >> > +} >> > + >> >> Check_limit does more than check limit - it has a substantial side >> effect of updating max used. > > Hmm, Normally, limit check is best place to update the max although > function name imply just checking the limit and I don't think > code piece for max updating doesn't hurt readbilty. 
> If you or other reviewer is strong against, I will be happy to > factor out part of max updating into another function because > I think it's just preference problem for small logic and don't want > to waste argue for that. > > If you really want it, pz, ping me again. > >> >> Basically if we already allocated the buffer and our alloced_pages is >> less than the limit then we are good to go. > > Yeb. > >> >> It is the race to update that we need to have the cmpxchg. >> And maybe a helper function would aid readability - not sure, see next point. >> >> I don't believe there is need for the loop either. >> Any other updater will also be including our allocated pages >> (and at this point in the code eliminated from roll back) >> so if they beat us to it, then no problem, their max is better than ours. > > Let's assume we don't have the loop. > > > CPU A CPU B > > alloced_pages = 2001 > old_max = cur_max = 2000 > alloced_pages = 2005 > old_max = cur_max = 2000 > > cmpxchg(2000, 2000, 2001) -> OK > > cmpxchg(2001, 2000, 2005) -> FAIL > > So, we lose 2005 which is bigger vaule. > Yes - you are absolutely correct - I missed that scenario. but there isn't the need to redo zs_get_total_size. we only need to loop while our value is still the max. So the two parts are not closely coupled and the inline code for the exceeded check is simple enough. And the loop to apply max would be best in helper function. 
>> >> >> >> > static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, >> > int offset) >> > { >> > @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, >> > goto out; >> > } >> > >> > - if (zram->limit_pages && >> > - zs_get_total_size(meta->mem_pool) > zram->limit_pages) { >> > + if (!check_limit(zram)) { >> > zs_free(meta->mem_pool, handle); >> > ret = -ENOMEM; >> > goto out; >> > @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL); >> > static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL); >> > static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show, >> > mem_limit_store); >> > +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show, >> > + mem_used_max_store); >> > static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR, >> > max_comp_streams_show, max_comp_streams_store); >> > static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR, >> > @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = { >> > &dev_attr_compr_data_size.attr, >> > &dev_attr_mem_used_total.attr, >> > &dev_attr_mem_limit.attr, >> > + &dev_attr_mem_used_max.attr, >> > &dev_attr_max_comp_streams.attr, >> > &dev_attr_comp_algorithm.attr, >> > NULL, >> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h >> > index b7aa9c21553f..29383312d543 100644 >> > --- a/drivers/block/zram/zram_drv.h >> > +++ b/drivers/block/zram/zram_drv.h >> > @@ -90,6 +90,7 @@ struct zram_stats { >> > atomic64_t notify_free; /* no. of swap slot free notifications */ >> > atomic64_t zero_pages; /* no. of zero filled pages */ >> > atomic64_t pages_stored; /* no. of pages currently stored */ >> > + atomic64_t max_used_pages; /* no. of maximum pages stored */ >> > }; >> > >> > struct zram_meta { >> > -- >> > 2.0.0 >> > >> >> -- >> To unsubscribe, send a message with 'unsubscribe linux-mm' in >> the body to majordomo@kvack.org. 
> -- > Kind regards, > Minchan Kim
* Re: [PATCH v2 4/4] zram: report maximum used memory 2014-08-20 7:38 ` David Horner @ 2014-08-20 7:53 ` Minchan Kim 2014-08-20 8:25 ` David Horner 0 siblings, 1 reply; 19+ messages in thread From: Minchan Kim @ 2014-08-20 7:53 UTC (permalink / raw) To: David Horner Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman On Wed, Aug 20, 2014 at 03:38:27AM -0400, David Horner wrote: > On Wed, Aug 20, 2014 at 2:53 AM, Minchan Kim <minchan@kernel.org> wrote: > > On Wed, Aug 20, 2014 at 02:26:50AM -0400, David Horner wrote: > >> On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote: > >> > Normally, zram user could get maximum memory usage zram consumed > >> > via polling mem_used_total with sysfs in userspace. > >> > > >> > But it has a critical problem because user can miss peak memory > >> > usage during update inverval of polling. For avoiding that, > >> > user should poll it with shorter interval(ie, 0.0000000001s) > >> > with mlocking to avoid page fault delay when memory pressure > >> > is heavy. It would be troublesome. > >> > > >> > This patch adds new knob "mem_used_max" so user could see > >> > the maximum memory usage easily via reading the knob and reset > >> > it via "echo 0 > /sys/block/zram0/mem_used_max". 
> >> > > >> > Signed-off-by: Minchan Kim <minchan@kernel.org> > >> > --- > >> > Documentation/ABI/testing/sysfs-block-zram | 10 +++++ > >> > Documentation/blockdev/zram.txt | 1 + > >> > drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++- > >> > drivers/block/zram/zram_drv.h | 1 + > >> > 4 files changed, 70 insertions(+), 2 deletions(-) > >> > > >> > diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram > >> > index 025331c19045..ffd1ea7443dd 100644 > >> > --- a/Documentation/ABI/testing/sysfs-block-zram > >> > +++ b/Documentation/ABI/testing/sysfs-block-zram > >> > @@ -120,6 +120,16 @@ Description: > >> > statistic. > >> > Unit: bytes > >> > > >> > +What: /sys/block/zram<id>/mem_used_max > >> > +Date: August 2014 > >> > +Contact: Minchan Kim <minchan@kernel.org> > >> > +Description: > >> > + The mem_used_max file is read/write and specifies the amount > >> > + of maximum memory zram have consumed to store compressed data. > >> > + For resetting the value, you should do "echo 0". Otherwise, > >> > + you could see -EINVAL. > >> > + Unit: bytes > >> > + > >> > What: /sys/block/zram<id>/mem_limit > >> > Date: August 2014 > >> > Contact: Minchan Kim <minchan@kernel.org> > >> > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt > >> > index 9f239ff8c444..3b2247c2d4cf 100644 > >> > --- a/Documentation/blockdev/zram.txt > >> > +++ b/Documentation/blockdev/zram.txt > >> > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful. 
> >> > orig_data_size > >> > compr_data_size > >> > mem_used_total > >> > + mem_used_max > >> > > >> > 8) Deactivate: > >> > swapoff /dev/zram0 > >> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c > >> > index adc91c7ecaef..e4d44842a91d 100644 > >> > --- a/drivers/block/zram/zram_drv.c > >> > +++ b/drivers/block/zram/zram_drv.c > >> > @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev, > >> > return len; > >> > } > >> > > >> > +static ssize_t mem_used_max_show(struct device *dev, > >> > + struct device_attribute *attr, char *buf) > >> > +{ > >> > + u64 val = 0; > >> > + struct zram *zram = dev_to_zram(dev); > >> > + > >> > + down_read(&zram->init_lock); > >> > + if (init_done(zram)) > >> > + val = atomic64_read(&zram->stats.max_used_pages); > >> > + up_read(&zram->init_lock); > >> > + > >> > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT); > >> > +} > >> > + > >> > +static ssize_t mem_used_max_store(struct device *dev, > >> > + struct device_attribute *attr, const char *buf, size_t len) > >> > +{ > >> > + u64 limit; > >> > + struct zram *zram = dev_to_zram(dev); > >> > + struct zram_meta *meta = zram->meta; > >> > + > >> > - limit = memparse(buf, NULL); > >> > - if (0 != limit) > >> > >> we wanted explicit "0" and nothing else for extensibility > >> > >> if (len != 1 || *buf != "0") > >> > > > > I wanted to work with "0", "0K", "0M", "0G" but agree it's meaningless > > at the moment so your version is better. 
> > > > > >> > + return -EINVAL; > >> > + > >> > + down_read(&zram->init_lock); > >> > + if (init_done(zram)) > >> > + atomic64_set(&zram->stats.max_used_pages, > >> > + zs_get_total_size(meta->mem_pool)); > >> > + up_read(&zram->init_lock); > >> > + > >> > + return len; > >> return 1; > >> > >> the standard convention is to return used amount of buffer > > > > If I follow your suggestion, len should be 1 right before returning > > so no problem for functionality POV but I agree explicit "1" is better > > for readability so your version is better, better. > > > >> > >> > >> > >> > +} > >> > + > >> > static ssize_t max_comp_streams_store(struct device *dev, > >> > struct device_attribute *attr, const char *buf, size_t len) > >> > { > >> > @@ -461,6 +495,26 @@ out_cleanup: > >> > return ret; > >> > } > >> > > >> > +static bool check_limit(struct zram *zram) > >> > +{ > >> > + unsigned long alloced_pages; > >> > + u64 old_max, cur_max; > >> > + struct zram_meta *meta = zram->meta; > >> > + > >> > + do { > >> > + alloced_pages = zs_get_total_size(meta->mem_pool); > >> > + if (zram->limit_pages && alloced_pages > zram->limit_pages) > >> > + return false; > >> > + > >> > + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages); > >> > + if (alloced_pages > cur_max) > >> > + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages, > >> > + cur_max, alloced_pages); > >> > + } while (old_max != cur_max); > >> > + > >> > + return true; > >> > +} > >> > + > >> > >> Check_limit does more than check limit - it has a substantial side > >> effect of updating max used. > > > > Hmm, Normally, limit check is best place to update the max although > > function name imply just checking the limit and I don't think > > code piece for max updating doesn't hurt readbilty. 
> > If you or other reviewer is strong against, I will be happy to > > factor out part of max updating into another function because > > I think it's just preference problem for small logic and don't want > > to waste argue for that. > > > > If you really want it, pz, ping me again. > > > >> > >> Basically if we already allocated the buffer and our alloced_pages is > >> less than the limit then we are good to go. > > > > Yeb. > > > >> > >> It is the race to update that we need to have the cmpxchg. > >> And maybe a helper function would aid readability - not sure, see next point. > >> > >> I don't believe there is need for the loop either. > >> Any other updater will also be including our allocated pages > >> (and at this point in the code eliminated from roll back) > >> so if they beat us to it, then no problem, their max is better than ours. > > > > Let's assume we don't have the loop. > > > > > > CPU A CPU B > > > > alloced_pages = 2001 > > old_max = cur_max = 2000 > > alloced_pages = 2005 > > old_max = cur_max = 2000 > > > > cmpxchg(2000, 2000, 2001) -> OK > > > > cmpxchg(2001, 2000, 2005) -> FAIL > > > > So, we lose 2005 which is bigger vaule. > > > > Yes - you are absolutely correct - I missed that scenario. > > but there isn't the need to redo zs_get_total_size. > > we only need to loop while our value is still the max. Yes - you are absolutely right. :) > > So the two parts are not closely coupled and the inline code for the > exceeded check is simple enough. > And the loop to apply max would be best in helper function. Okay, you proved helper function would be better for readabilty to indicate limit check and max_used_check is not coupled. Thanks for the review, David! 
> > >> > >> > >> > >> > static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > >> > int offset) > >> > { > >> > @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > >> > goto out; > >> > } > >> > > >> > - if (zram->limit_pages && > >> > - zs_get_total_size(meta->mem_pool) > zram->limit_pages) { > >> > + if (!check_limit(zram)) { > >> > zs_free(meta->mem_pool, handle); > >> > ret = -ENOMEM; > >> > goto out; > >> > @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL); > >> > static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL); > >> > static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show, > >> > mem_limit_store); > >> > +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show, > >> > + mem_used_max_store); > >> > static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR, > >> > max_comp_streams_show, max_comp_streams_store); > >> > static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR, > >> > @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = { > >> > &dev_attr_compr_data_size.attr, > >> > &dev_attr_mem_used_total.attr, > >> > &dev_attr_mem_limit.attr, > >> > + &dev_attr_mem_used_max.attr, > >> > &dev_attr_max_comp_streams.attr, > >> > &dev_attr_comp_algorithm.attr, > >> > NULL, > >> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h > >> > index b7aa9c21553f..29383312d543 100644 > >> > --- a/drivers/block/zram/zram_drv.h > >> > +++ b/drivers/block/zram/zram_drv.h > >> > @@ -90,6 +90,7 @@ struct zram_stats { > >> > atomic64_t notify_free; /* no. of swap slot free notifications */ > >> > atomic64_t zero_pages; /* no. of zero filled pages */ > >> > atomic64_t pages_stored; /* no. of pages currently stored */ > >> > + atomic64_t max_used_pages; /* no. 
of maximum pages stored */ > >> > }; > >> > > >> > struct zram_meta { > >> > -- > >> > 2.0.0 > > -- > > Kind regards, > > Minchan Kim -- Kind regards, Minchan Kim
* Re: [PATCH v2 4/4] zram: report maximum used memory 2014-08-20 7:53 ` Minchan Kim @ 2014-08-20 8:25 ` David Horner 0 siblings, 0 replies; 19+ messages in thread From: David Horner @ 2014-08-20 8:25 UTC (permalink / raw) To: Minchan Kim Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman I'm really surprised there isn't an atomic_set_max function - one that keeps retrying the exchange until the target value is equal to or greater than the provided value (and a complementary atomic_set_min). I would be surprised if this is the only place in the kernel with this scenario. On Wed, Aug 20, 2014 at 3:53 AM, Minchan Kim <minchan@kernel.org> wrote: > On Wed, Aug 20, 2014 at 03:38:27AM -0400, David Horner wrote: >> On Wed, Aug 20, 2014 at 2:53 AM, Minchan Kim <minchan@kernel.org> wrote: >> > On Wed, Aug 20, 2014 at 02:26:50AM -0400, David Horner wrote: >> >> On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote: >> >> > Normally, zram user could get maximum memory usage zram consumed >> >> > via polling mem_used_total with sysfs in userspace. >> >> > >> >> > But it has a critical problem because user can miss peak memory >> >> > usage during update inverval of polling. For avoiding that, >> >> > user should poll it with shorter interval(ie, 0.0000000001s) >> >> > with mlocking to avoid page fault delay when memory pressure >> >> > is heavy. It would be troublesome. >> >> > >> >> > This patch adds new knob "mem_used_max" so user could see >> >> > the maximum memory usage easily via reading the knob and reset >> >> > it via "echo 0 > /sys/block/zram0/mem_used_max".
>> >> > >> >> > Signed-off-by: Minchan Kim <minchan@kernel.org> >> >> > --- >> >> > Documentation/ABI/testing/sysfs-block-zram | 10 +++++ >> >> > Documentation/blockdev/zram.txt | 1 + >> >> > drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++- >> >> > drivers/block/zram/zram_drv.h | 1 + >> >> > 4 files changed, 70 insertions(+), 2 deletions(-) >> >> > >> >> > diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram >> >> > index 025331c19045..ffd1ea7443dd 100644 >> >> > --- a/Documentation/ABI/testing/sysfs-block-zram >> >> > +++ b/Documentation/ABI/testing/sysfs-block-zram >> >> > @@ -120,6 +120,16 @@ Description: >> >> > statistic. >> >> > Unit: bytes >> >> > >> >> > +What: /sys/block/zram<id>/mem_used_max >> >> > +Date: August 2014 >> >> > +Contact: Minchan Kim <minchan@kernel.org> >> >> > +Description: >> >> > + The mem_used_max file is read/write and specifies the amount >> >> > + of maximum memory zram have consumed to store compressed data. >> >> > + For resetting the value, you should do "echo 0". Otherwise, >> >> > + you could see -EINVAL. >> >> > + Unit: bytes >> >> > + >> >> > What: /sys/block/zram<id>/mem_limit >> >> > Date: August 2014 >> >> > Contact: Minchan Kim <minchan@kernel.org> >> >> > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt >> >> > index 9f239ff8c444..3b2247c2d4cf 100644 >> >> > --- a/Documentation/blockdev/zram.txt >> >> > +++ b/Documentation/blockdev/zram.txt >> >> > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful. 
>> >> > orig_data_size >> >> > compr_data_size >> >> > mem_used_total >> >> > + mem_used_max >> >> > >> >> > 8) Deactivate: >> >> > swapoff /dev/zram0 >> >> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c >> >> > index adc91c7ecaef..e4d44842a91d 100644 >> >> > --- a/drivers/block/zram/zram_drv.c >> >> > +++ b/drivers/block/zram/zram_drv.c >> >> > @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev, >> >> > return len; >> >> > } >> >> > >> >> > +static ssize_t mem_used_max_show(struct device *dev, >> >> > + struct device_attribute *attr, char *buf) >> >> > +{ >> >> > + u64 val = 0; >> >> > + struct zram *zram = dev_to_zram(dev); >> >> > + >> >> > + down_read(&zram->init_lock); >> >> > + if (init_done(zram)) >> >> > + val = atomic64_read(&zram->stats.max_used_pages); >> >> > + up_read(&zram->init_lock); >> >> > + >> >> > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT); >> >> > +} >> >> > + >> >> > +static ssize_t mem_used_max_store(struct device *dev, >> >> > + struct device_attribute *attr, const char *buf, size_t len) >> >> > +{ >> >> > + u64 limit; >> >> > + struct zram *zram = dev_to_zram(dev); >> >> > + struct zram_meta *meta = zram->meta; >> >> > + >> >> > - limit = memparse(buf, NULL); >> >> > - if (0 != limit) >> >> >> >> we wanted explicit "0" and nothing else for extensibility >> >> >> >> if (len != 1 || *buf != "0") >> >> >> > >> > I wanted to work with "0", "0K", "0M", "0G" but agree it's meaningless >> > at the moment so your version is better. 
>> > >> > >> >> > + return -EINVAL; >> >> > + >> >> > + down_read(&zram->init_lock); >> >> > + if (init_done(zram)) >> >> > + atomic64_set(&zram->stats.max_used_pages, >> >> > + zs_get_total_size(meta->mem_pool)); >> >> > + up_read(&zram->init_lock); >> >> > + >> >> > + return len; >> >> return 1; >> >> >> >> the standard convention is to return used amount of buffer >> > >> > If I follow your suggestion, len should be 1 right before returning >> > so no problem from a functionality POV, but I agree an explicit "1" is >> > better for readability, so your version is better. >> > >> >> >> >> >> >> >> >> > +} >> >> > + >> >> > static ssize_t max_comp_streams_store(struct device *dev, >> >> > struct device_attribute *attr, const char *buf, size_t len) >> >> > { >> >> > @@ -461,6 +495,26 @@ out_cleanup: >> >> > return ret; >> >> > } >> >> > >> >> > +static bool check_limit(struct zram *zram) >> >> > +{ >> >> > + unsigned long alloced_pages; >> >> > + u64 old_max, cur_max; >> >> > + struct zram_meta *meta = zram->meta; >> >> > + >> >> > + do { >> >> > + alloced_pages = zs_get_total_size(meta->mem_pool); >> >> > + if (zram->limit_pages && alloced_pages > zram->limit_pages) >> >> > + return false; >> >> > + >> >> > + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages); >> >> > + if (alloced_pages > cur_max) >> >> > + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages, >> >> > + cur_max, alloced_pages); >> >> > + } while (old_max != cur_max); >> >> > + >> >> > + return true; >> >> > +} >> >> > + >> >> >> >> Check_limit does more than check limit - it has a substantial side >> >> effect of updating max used. >> > >> > Hmm, normally the limit check is the best place to update the max. >> > Although the function name implies just checking the limit, I don't >> > think the code piece for max updating hurts readability. 
>> > If you or another reviewer is strongly against it, I will be happy to >> > factor out the max updating into another function, because >> > I think it's just a preference problem for small logic and I don't want >> > to waste time arguing over that. >> > >> > If you really want it, please ping me again. >> > >> >> >> >> Basically if we already allocated the buffer and our alloced_pages is >> >> less than the limit then we are good to go. >> > >> > Yeb. >> > >> >> >> >> It is the race to update that we need to have the cmpxchg. >> >> And maybe a helper function would aid readability - not sure, see next point. >> >> >> >> I don't believe there is a need for the loop either. >> >> Any other updater will also be including our allocated pages >> >> (and at this point in the code eliminated from roll back) >> >> so if they beat us to it, then no problem, their max is better than ours. >> > >> > Let's assume we don't have the loop. >> > >> > >> > CPU A CPU B >> > >> > alloced_pages = 2001 >> > old_max = cur_max = 2000 >> > alloced_pages = 2005 >> > old_max = cur_max = 2000 >> > >> > cmpxchg(2000, 2000, 2001) -> OK >> > >> > cmpxchg(2001, 2000, 2005) -> FAIL >> > >> > So, we lose 2005, which is the bigger value. >> > >> >> Yes - you are absolutely correct - I missed that scenario. >> >> but there isn't the need to redo zs_get_total_size. >> >> we only need to loop while our value is still the max. > > Yes - you are absolutely right. :) > >> >> So the two parts are not closely coupled and the inline code for the >> exceeded check is simple enough. >> And the loop to apply max would be best in a helper function. > > Okay, you proved a helper function would be better for readability, > to indicate that the limit check and max_used check are not coupled. > > Thanks for the review, David! 
> >> >> >> >> >> >> >> >> >> > static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, >> >> > int offset) >> >> > { >> >> > @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, >> >> > goto out; >> >> > } >> >> > >> >> > - if (zram->limit_pages && >> >> > - zs_get_total_size(meta->mem_pool) > zram->limit_pages) { >> >> > + if (!check_limit(zram)) { >> >> > zs_free(meta->mem_pool, handle); >> >> > ret = -ENOMEM; >> >> > goto out; >> >> > @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL); >> >> > static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL); >> >> > static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show, >> >> > mem_limit_store); >> >> > +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show, >> >> > + mem_used_max_store); >> >> > static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR, >> >> > max_comp_streams_show, max_comp_streams_store); >> >> > static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR, >> >> > @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = { >> >> > &dev_attr_compr_data_size.attr, >> >> > &dev_attr_mem_used_total.attr, >> >> > &dev_attr_mem_limit.attr, >> >> > + &dev_attr_mem_used_max.attr, >> >> > &dev_attr_max_comp_streams.attr, >> >> > &dev_attr_comp_algorithm.attr, >> >> > NULL, >> >> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h >> >> > index b7aa9c21553f..29383312d543 100644 >> >> > --- a/drivers/block/zram/zram_drv.h >> >> > +++ b/drivers/block/zram/zram_drv.h >> >> > @@ -90,6 +90,7 @@ struct zram_stats { >> >> > atomic64_t notify_free; /* no. of swap slot free notifications */ >> >> > atomic64_t zero_pages; /* no. of zero filled pages */ >> >> > atomic64_t pages_stored; /* no. of pages currently stored */ >> >> > + atomic64_t max_used_pages; /* no. 
of maximum pages stored */ >> >> > }; >> >> > >> >> > struct zram_meta { >> >> > -- >> >> > 2.0.0 >> >> > >> > >> > -- >> > Kind regards, >> > Minchan Kim > > -- > Kind regards, > Minchan Kim
* Re: [PATCH v2 4/4] zram: report maximum used memory 2014-08-20 6:53 ` Minchan Kim 2014-08-20 7:38 ` David Horner @ 2014-08-21 0:06 ` Minchan Kim 1 sibling, 0 replies; 19+ messages in thread From: Minchan Kim @ 2014-08-21 0:06 UTC (permalink / raw) To: David Horner Cc: Andrew Morton, Linux-MM, linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta, Seth Jennings, Dan Streetman On Wed, Aug 20, 2014 at 03:53:18PM +0900, Minchan Kim wrote: > On Wed, Aug 20, 2014 at 02:26:50AM -0400, David Horner wrote: > > On Tue, Aug 19, 2014 at 3:54 AM, Minchan Kim <minchan@kernel.org> wrote: > > > Normally, a zram user could get the maximum memory usage zram consumed > > > via polling mem_used_total with sysfs in userspace. > > > > > > But it has a critical problem because the user can miss the peak memory > > > usage during the polling update interval. To avoid that, the > > > user should poll it with a shorter interval (i.e., 0.0000000001s) > > > with mlocking to avoid page fault delay when memory pressure > > > is heavy. It would be troublesome. > > > > > > This patch adds a new knob "mem_used_max" so the user could see > > > the maximum memory usage easily via reading the knob and reset > > > it via "echo 0 > /sys/block/zram0/mem_used_max". > > > > > > Signed-off-by: Minchan Kim <minchan@kernel.org> > > > --- > > > Documentation/ABI/testing/sysfs-block-zram | 10 +++++ > > > Documentation/blockdev/zram.txt | 1 + > > > drivers/block/zram/zram_drv.c | 60 +++++++++++++++++++++++++++++- > > > drivers/block/zram/zram_drv.h | 1 + > > > 4 files changed, 70 insertions(+), 2 deletions(-) > > > > > > diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram > > > index 025331c19045..ffd1ea7443dd 100644 > > > --- a/Documentation/ABI/testing/sysfs-block-zram > > > +++ b/Documentation/ABI/testing/sysfs-block-zram > > > @@ -120,6 +120,16 @@ Description: > > > statistic. 
> > > Unit: bytes > > > > > > +What: /sys/block/zram<id>/mem_used_max > > > +Date: August 2014 > > > +Contact: Minchan Kim <minchan@kernel.org> > > > +Description: > > > + The mem_used_max file is read/write and specifies the amount > > > + of maximum memory zram have consumed to store compressed data. > > > + For resetting the value, you should do "echo 0". Otherwise, > > > + you could see -EINVAL. > > > + Unit: bytes > > > + > > > What: /sys/block/zram<id>/mem_limit > > > Date: August 2014 > > > Contact: Minchan Kim <minchan@kernel.org> > > > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt > > > index 9f239ff8c444..3b2247c2d4cf 100644 > > > --- a/Documentation/blockdev/zram.txt > > > +++ b/Documentation/blockdev/zram.txt > > > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful. > > > orig_data_size > > > compr_data_size > > > mem_used_total > > > + mem_used_max > > > > > > 8) Deactivate: > > > swapoff /dev/zram0 > > > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c > > > index adc91c7ecaef..e4d44842a91d 100644 > > > --- a/drivers/block/zram/zram_drv.c > > > +++ b/drivers/block/zram/zram_drv.c > > > @@ -149,6 +149,40 @@ static ssize_t mem_limit_store(struct device *dev, > > > return len; > > > } > > > > > > +static ssize_t mem_used_max_show(struct device *dev, > > > + struct device_attribute *attr, char *buf) > > > +{ > > > + u64 val = 0; > > > + struct zram *zram = dev_to_zram(dev); > > > + > > > + down_read(&zram->init_lock); > > > + if (init_done(zram)) > > > + val = atomic64_read(&zram->stats.max_used_pages); > > > + up_read(&zram->init_lock); > > > + > > > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT); > > > +} > > > + > > > +static ssize_t mem_used_max_store(struct device *dev, > > > + struct device_attribute *attr, const char *buf, size_t len) > > > +{ > > > + u64 limit; > > > + struct zram *zram = dev_to_zram(dev); > > > + struct zram_meta *meta = 
zram->meta; > > > + > > > + limit = memparse(buf, NULL); > > > + if (0 != limit) > > > > we wanted explicit "0" and nothing else for extensibility > > > > if (len != 1 || *buf != '0') > > > > I wanted to work with "0", "0K", "0M", "0G" but agree it's meaningless > at the moment so your version is better. When I tested your way, it makes trouble for users who want "echo 0 > /sys/block/zram0/mem_used_max", because normally echo adds a newline. Although we can guide them to use "echo -n", it's not as handy for users, as well as not consistent with other knobs. IMO, we shouldn't force an uncomfortable way on users for our uncertain future expandability. So, I will use kstrtoul. > > > > > + return -EINVAL; > > > + > > > + down_read(&zram->init_lock); > > > + if (init_done(zram)) > > > + atomic64_set(&zram->stats.max_used_pages, > > > + zs_get_total_size(meta->mem_pool)); > > > + up_read(&zram->init_lock); > > > + > > > + return len; > > return 1; > > > > the standard convention is to return used amount of buffer > > If I follow your suggestion, len should be 1 right before returning > so no problem from a functionality POV, but I agree an explicit "1" is > better for readability, so your version is better. 
> > > > > > > > > > +} > > > + > > > static ssize_t max_comp_streams_store(struct device *dev, > > > struct device_attribute *attr, const char *buf, size_t len) > > > { > > > @@ -461,6 +495,26 @@ out_cleanup: > > > return ret; > > > } > > > > > > +static bool check_limit(struct zram *zram) > > > +{ > > > + unsigned long alloced_pages; > > > + u64 old_max, cur_max; > > > + struct zram_meta *meta = zram->meta; > > > + > > > + do { > > > + alloced_pages = zs_get_total_size(meta->mem_pool); > > > + if (zram->limit_pages && alloced_pages > zram->limit_pages) > > > + return false; > > > + > > > + old_max = cur_max = atomic64_read(&zram->stats.max_used_pages); > > > + if (alloced_pages > cur_max) > > > + old_max = atomic64_cmpxchg(&zram->stats.max_used_pages, > > > + cur_max, alloced_pages); > > > + } while (old_max != cur_max); > > > + > > > + return true; > > > +} > > > + > > > > Check_limit does more than check limit - it has a substantial side > > effect of updating max used. > > Hmm, normally the limit check is the best place to update the max. > Although the function name implies just checking the limit, I don't > think the code piece for max updating hurts readability. > If you or another reviewer is strongly against it, I will be happy to > factor out the max updating into another function, because > I think it's just a preference problem for small logic and I don't want > to waste time arguing over that. > > If you really want it, please ping me again. > > > > Basically if we already allocated the buffer and our alloced_pages is > > less than the limit then we are good to go. > > Yeb. > > > > It is the race to update that we need to have the cmpxchg. > > And maybe a helper function would aid readability - not sure, see next point. > > > > I don't believe there is a need for the loop either. 
> > Any other updater will also be including our allocated pages > > (and at this point in the code eliminated from roll back) > > so if they beat us to it, then no problem, their max is better than ours. > > Let's assume we don't have the loop. > > > CPU A CPU B > > alloced_pages = 2001 > old_max = cur_max = 2000 > alloced_pages = 2005 > old_max = cur_max = 2000 > > cmpxchg(2000, 2000, 2001) -> OK > > cmpxchg(2001, 2000, 2005) -> FAIL > > So, we lose 2005, which is the bigger value. > > > > > > > > > > static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > > > int offset) > > > { > > > @@ -541,8 +595,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > > > goto out; > > > } > > > > > > - if (zram->limit_pages && > > > - zs_get_total_size(meta->mem_pool) > zram->limit_pages) { > > > + if (!check_limit(zram)) { > > > zs_free(meta->mem_pool, handle); > > > ret = -ENOMEM; > > > goto out; > > > @@ -897,6 +950,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL); > > > static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL); > > > static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show, > > > mem_limit_store); > > > +static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show, > > > + mem_used_max_store); > > > static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR, > > > max_comp_streams_show, max_comp_streams_store); > > > static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR, > > > @@ -926,6 +981,7 @@ static struct attribute *zram_disk_attrs[] = { > > > &dev_attr_compr_data_size.attr, > > > &dev_attr_mem_used_total.attr, > > > &dev_attr_mem_limit.attr, > > > + &dev_attr_mem_used_max.attr, > > > &dev_attr_max_comp_streams.attr, > > > &dev_attr_comp_algorithm.attr, > > > NULL, > > > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h > > > index b7aa9c21553f..29383312d543 100644 > > > --- a/drivers/block/zram/zram_drv.h > > > +++ 
b/drivers/block/zram/zram_drv.h > > > @@ -90,6 +90,7 @@ struct zram_stats { > > > atomic64_t notify_free; /* no. of swap slot free notifications */ > > > atomic64_t zero_pages; /* no. of zero filled pages */ > > > atomic64_t pages_stored; /* no. of pages currently stored */ > > > + atomic64_t max_used_pages; /* no. of maximum pages stored */ > > > }; > > > > > > struct zram_meta { > > > -- > > > 2.0.0 > > -- > Kind regards, > Minchan Kim -- Kind regards, Minchan Kim
end of thread, other threads:[~2014-08-21 0:05 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
2014-08-19  7:54 [PATCH v2 0/4] zram memory control enhance Minchan Kim
2014-08-19  7:54 ` [PATCH v2 1/4] zsmalloc: move pages_allocated to zs_pool Minchan Kim
2014-08-19  7:54 ` [PATCH v2 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes Minchan Kim
2014-08-19 14:46   ` Seth Jennings
2014-08-19 15:11     ` Seth Jennings
2014-08-19 23:47       ` Minchan Kim
2014-08-19 23:46     ` Minchan Kim
2014-08-20  4:44       ` Seth Jennings
2014-08-19  7:54 ` [PATCH v2 3/4] zram: zram memory size limitation Minchan Kim
2014-08-19  8:06   ` Marc Dietrich
2014-08-19 23:32     ` Minchan Kim
2014-08-20  7:58       ` Marc Dietrich
2014-08-19  7:54 ` [PATCH v2 4/4] zram: report maximum used memory Minchan Kim
2014-08-20  6:26   ` David Horner
2014-08-20  6:53     ` Minchan Kim
2014-08-20  7:38       ` David Horner
2014-08-20  7:53         ` Minchan Kim
2014-08-20  8:25           ` David Horner
2014-08-21  0:06   ` Minchan Kim