* [PATCH] zsmalloc: fix zs_can_compact() integer overflow
@ 2016-05-09 7:35 Sergey Senozhatsky
2016-05-09 8:07 ` Minchan Kim
0 siblings, 1 reply; 3+ messages in thread
From: Sergey Senozhatsky @ 2016-05-09 7:35 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-kernel, linux-mm, Sergey Senozhatsky,
Sergey Senozhatsky, [4.3+]
zs_can_compact() has two race conditions in its core calculation:
unsigned long obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
zs_stat_get(class, OBJ_USED);
1) classes are not locked, so the numbers of allocated and used
objects can change due to concurrent ops happening on other CPUs
2) the shrinker invokes it from a preemptible context
Depending on the circumstances, OBJ_ALLOCATED can become less
than OBJ_USED, which can result in either very high or negative
`total_scan' value calculated in do_shrink_slab().
Take a pool stat snapshot and use it instead of racy zs_stat_get()
calls.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org> [4.3+]
---
mm/zsmalloc.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 107ec06..1bc2a98 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -2262,10 +2262,13 @@ static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage)
 static unsigned long zs_can_compact(struct size_class *class)
 {
 	unsigned long obj_wasted;
+	unsigned long obj_allocated = zs_stat_get(class, OBJ_ALLOCATED);
+	unsigned long obj_used = zs_stat_get(class, OBJ_USED);
 
-	obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
-			zs_stat_get(class, OBJ_USED);
+	if (obj_allocated <= obj_used)
+		return 0;
 
+	obj_wasted = obj_allocated - obj_used;
 	obj_wasted /= get_maxobj_per_zspage(class->size,
 			class->pages_per_zspage);
--
2.8.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
* Re: [PATCH] zsmalloc: fix zs_can_compact() integer overflow
2016-05-09 7:35 [PATCH] zsmalloc: fix zs_can_compact() integer overflow Sergey Senozhatsky
@ 2016-05-09 8:07 ` Minchan Kim
2016-05-09 8:41 ` Sergey Senozhatsky
0 siblings, 1 reply; 3+ messages in thread
From: Minchan Kim @ 2016-05-09 8:07 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Minchan Kim, Andrew Morton, linux-kernel, linux-mm,
Sergey Senozhatsky, [4.3+]
Hello Sergey,
On Mon, May 09, 2016 at 04:35:33PM +0900, Sergey Senozhatsky wrote:
> zs_can_compact() has two race conditions in its core calculation:
>
> unsigned long obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
> zs_stat_get(class, OBJ_USED);
>
> 1) classes are not locked, so the numbers of allocated and used
> objects can change by the concurrent ops happening on other CPUs
> 2) shrinker invokes it from preemptible context
>
> Depending on the circumstances, OBJ_ALLOCATED can become less
> than OBJ_USED, which can result in either very high or negative
> `total_scan' value calculated in do_shrink_slab().
So, do you see the pr_err("shrink_slab: %pF negative objects ...")
in vmscan.c and skip shrinking?
It would be better to explain what happens without this patch and
what the end-user effect is, since this is going to -stable.
At least, I think I have seen the above pr_err before, but at that
time, if I remember correctly, I thought it was a bug I had introduced
during development. Since then, I have not been able to reproduce
the symptom. :)
Good catch!
Comment is below.
>
> Take a pool stat snapshot and use it instead of racy zs_stat_get()
> calls.
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: <stable@vger.kernel.org> [4.3+]
> ---
> mm/zsmalloc.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 107ec06..1bc2a98 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -2262,10 +2262,13 @@ static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage)
It seems this patch is based on my old page migration work?
That work hasn't gone to mainline yet, but your patch, which fixes the
bug, is supposed to go to -stable. So, I hope this patch goes in first.
Thanks.
>  static unsigned long zs_can_compact(struct size_class *class)
>  {
>  	unsigned long obj_wasted;
> +	unsigned long obj_allocated = zs_stat_get(class, OBJ_ALLOCATED);
> +	unsigned long obj_used = zs_stat_get(class, OBJ_USED);
>  
> -	obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
> -			zs_stat_get(class, OBJ_USED);
> +	if (obj_allocated <= obj_used)
> +		return 0;
>  
> +	obj_wasted = obj_allocated - obj_used;
>  	obj_wasted /= get_maxobj_per_zspage(class->size,
>  			class->pages_per_zspage);
> --
> 2.8.2
>
* Re: [PATCH] zsmalloc: fix zs_can_compact() integer overflow
2016-05-09 8:07 ` Minchan Kim
@ 2016-05-09 8:41 ` Sergey Senozhatsky
0 siblings, 0 replies; 3+ messages in thread
From: Sergey Senozhatsky @ 2016-05-09 8:41 UTC (permalink / raw)
To: Minchan Kim
Cc: Sergey Senozhatsky, Andrew Morton, linux-kernel, linux-mm,
Sergey Senozhatsky, [4.3+]
Hello,
On (05/09/16 17:07), Minchan Kim wrote:
[..]
> > Depending on the circumstances, OBJ_ALLOCATED can become less
> > than OBJ_USED, which can result in either very high or negative
> > `total_scan' value calculated in do_shrink_slab().
>
> So, do you see pr_err("shrink_slab: %pF negative objects xxxx)
> in vmscan.c and skip shrinking?
yes
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-64
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-64
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-64
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-64
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
> It would be better to explain what's the result without this patch
> and end-user effect for going -stable.
it seems that not every overflowed value returned from zs_can_compact()
is getting detected in do_shrink_slab():
	freeable = shrinker->count_objects(shrinker, shrinkctl);
	if (freeable == 0)
		return 0;

	/*
	 * copy the current shrinker scan count into a local variable
	 * and zero it so that other concurrent shrinker invocations
	 * don't also do this scanning work.
	 */
	nr = atomic_long_xchg(&shrinker->nr_deferred[nid], 0);

	total_scan = nr;
	delta = (4 * nr_scanned) / shrinker->seeks;
	delta *= freeable;
	do_div(delta, nr_eligible + 1);
	total_scan += delta;
	if (total_scan < 0) {
		pr_err("shrink_slab: %pF negative objects to delete nr=%ld\n",
		       shrinker->scan_objects, total_scan);
		total_scan = freeable;
	}
this calculation can hide the shrinker->count_objects() error. I added
some debugging code (on x86_64), and the output was:
[ 59.041959] vmscan: >> OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615]
[ 59.041963] vmscan: >> but total_scan > 0: 92679974445502
[ 59.041964] vmscan: >> resulting total_scan: 92679974445502
[ 59.192734] vmscan: >> OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615]
[ 59.192737] vmscan: >> but total_scan > 0: 5830197242006811
[ 59.192738] vmscan: >> resulting total_scan: 5830197242006811
[ 59.259805] vmscan: >> OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615]
[ 59.259809] vmscan: >> but total_scan > 0: 23649671889371219
[ 59.259810] vmscan: >> resulting total_scan: 23649671889371219
[ 76.279767] vmscan: >> OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615]
[ 76.279770] vmscan: >> but total_scan > 0: 895907920044174
[ 76.279771] vmscan: >> resulting total_scan: 895907920044174
[ 84.807837] vmscan: >> OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615]
[ 84.807841] vmscan: >> but total_scan > 0: 22634041808232578
[ 84.807842] vmscan: >> resulting total_scan: 22634041808232578
so we can end up with insanely huge total_scan values.
[..]
> > @@ -2262,10 +2262,13 @@ static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage)
>
> It seems this patch is based on my old page migration work?
> It's not go to the mainline yet but your patch which fixes the bug should
> be supposed to go to the -stable. So, I hope this patch first.
oops... my fat fingers! good catch, thanks! I have two versions: for -next and
-mmots (with your LRU rework applied, indeed). somehow I managed to cd to the
wrong dir. sorry, will resend.
-ss