* [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage()

From: Ganesh Mahendran
Date: 2016-05-05 5:17 UTC
To: minchan, ngupta, sergey.senozhatsky.work, akpm
Cc: linux-mm, linux-kernel, Ganesh Mahendran

If we find a zspage with usage == 100%, there is no need to
try other zspages.

Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
---
 mm/zsmalloc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index fda7177..310c7b0 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -765,6 +765,9 @@ static int get_pages_per_zspage(int class_size)
 		if (usedpc > max_usedpc) {
 			max_usedpc = usedpc;
 			max_usedpc_order = i;
+
+			if (max_usedpc == 100)
+				break;
 		}
 	}

-- 
1.7.9.5
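For context, get_pages_per_zspage() with the proposed change applied looks
roughly like this (a sketch reconstructed from the diff hunk above and the
calculation quoted later in the thread, not verbatim kernel source):

	static int get_pages_per_zspage(int class_size)
	{
		int i, max_usedpc = 0;
		/* zspage order which gives maximum used size per KB */
		int max_usedpc_order = 1;

		for (i = 1; i <= ZS_MAX_PAGES_PER_ZSPAGE; i++) {
			int zspage_size, waste, usedpc;

			zspage_size = i * PAGE_SIZE;
			waste = zspage_size % class_size;
			/* percentage of the zspage usable for objects */
			usedpc = (zspage_size - waste) * 100 / zspage_size;

			if (usedpc > max_usedpc) {
				max_usedpc = usedpc;
				max_usedpc_order = i;

				/* proposed early exit: 100% cannot be beaten */
				if (max_usedpc == 100)
					break;
			}
		}

		return max_usedpc_order;
	}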
* Re: [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage()

From: Sergey Senozhatsky
Date: 2016-05-05 10:03 UTC
To: Ganesh Mahendran
Cc: minchan, ngupta, sergey.senozhatsky.work, akpm, linux-mm, linux-kernel

On (05/05/16 13:17), Ganesh Mahendran wrote:
> If we find a zspage with usage == 100%, there is no need to
> try other zspages.

Hello,

well... we iterate there from 0 to 1<<2, which is not awfully
a lot to break in the middle, and we do this only when we
initialize a new pool (for every size class).

the check is
- true 15 times
- false 492 times

so it _sort of_ feels like this new if-condition doesn't
buy us a lot, and most of the time it just sits there with
no particular gain. let's hear from Minchan.

	-ss

[..]
* Re: [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage()

From: Minchan Kim
Date: 2016-05-06 3:09 UTC
To: Sergey Senozhatsky
Cc: Ganesh Mahendran, ngupta, akpm, linux-mm, linux-kernel

On Thu, May 05, 2016 at 07:03:29PM +0900, Sergey Senozhatsky wrote:
> On (05/05/16 13:17), Ganesh Mahendran wrote:
> > If we find a zspage with usage == 100%, there is no need to
> > try other zspages.
>
> well... we iterate there from 0 to 1<<2, which is not awfully
> a lot to break in the middle, and we do this only when we
> initialize a new pool (for every size class).
>
> the check is
> - true 15 times
> - false 492 times

Thanks for the data, Sergey!

> so it _sort of_ feels like this new if-condition doesn't
> buy us a lot, and most of the time it just sits there with
> no particular gain. let's hear from Minchan.

I agree with Sergey.

First of all, I appreciate your patch, Ganesh! But as Sergey pointed
out, I don't see why it improves current zsmalloc. If you feel strongly
that it should be merged, please convince me with a more detailed
rationale.

Thanks.

[..]
* Re: [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage()

From: Ganesh Mahendran
Date: 2016-05-06 4:25 UTC
To: Minchan Kim
Cc: Sergey Senozhatsky, Nitin Gupta, Andrew Morton, Linux-MM, linux-kernel

Hi, Minchan:

2016-05-06 11:09 GMT+08:00 Minchan Kim <minchan@kernel.org>:
> On Thu, May 05, 2016 at 07:03:29PM +0900, Sergey Senozhatsky wrote:
>> the check is
>> - true 15 times
>> - false 492 times
[..]
> I agree with Sergey.
> First of all, I appreciate your patch, Ganesh! But as Sergey pointed
> out, I don't see why it improves current zsmalloc.

This patch does not obviously improve zsmalloc. It just cuts out an
unnecessary code path.

From the data provided by Sergey, 15 * (4 - 1) = 45 loop iterations
would be avoided, so 45 repetitions of the calculation below would be
saved:
---
	zspage_size = i * PAGE_SIZE;
	waste = zspage_size % class_size;
	usedpc = (zspage_size - waste) * 100 / zspage_size;

	if (usedpc > max_usedpc) {
---

Thanks.

[..]
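To make the numbers concrete, assume 4 KiB pages (PAGE_SIZE is
architecture-dependent, and the class sizes below are chosen purely for
illustration); the loop runs i = 1..4:

	class_size = 2048: i = 1: waste = 4096 % 2048 = 0 -> usedpc = 100,
	                   so the break fires immediately, saving 3 iterations
	class_size = 3072: i = 1: usedpc = 75; i = 2: usedpc = 75;
	                   i = 3: waste = 12288 % 3072 = 0 -> usedpc = 100,
	                   so the break fires late, saving only 1 iteration

As Sergey notes later in the thread, not every class hits 100% on the
first iteration, which is why the actual saving is closer to 31
iterations than 45.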
* Re: [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage()

From: Minchan Kim
Date: 2016-05-06 4:37 UTC
To: Ganesh Mahendran
Cc: Sergey Senozhatsky, Nitin Gupta, Andrew Morton, Linux-MM, linux-kernel

Hi Ganesh,

On Fri, May 06, 2016 at 12:25:18PM +0800, Ganesh Mahendran wrote:
[..]
> This patch does not obviously improve zsmalloc. It just cuts out an
> unnecessary code path.
>
> From the data provided by Sergey, 15 * (4 - 1) = 45 loop iterations
> would be avoided, so 45 repetitions of the calculation below would be
> saved:

As well, it bloats the code size without much gain. I don't think it's
worth doing until someone really has trouble with slow zs_create_pool()
performance.

add/remove: 0/0 grow/shrink: 1/0 up/down: 15/0 (15)
function                             old     new   delta
zs_create_pool                       960     975     +15

[..]
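For reference, size deltas in this format are what the kernel's
scripts/bloat-o-meter prints when comparing two builds of an object
file; a typical invocation (the file names here are only an example)
would be:

	$ ./scripts/bloat-o-meter zsmalloc.o.before zsmalloc.o.after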
* Re: [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage()

From: Sergey Senozhatsky
Date: 2016-05-06 9:08 UTC
To: Ganesh Mahendran
Cc: Minchan Kim, Sergey Senozhatsky, Nitin Gupta, Andrew Morton, Linux-MM, linux-kernel

On (05/06/16 12:25), Ganesh Mahendran wrote:
[..]
> This patch does not obviously improve zsmalloc. It just cuts out an
> unnecessary code path.
>
> From the data provided by Sergey, 15 * (4 - 1) = 45 loop iterations
> would be avoided, so 45 repetitions of the calculation below would be
> saved:
> ---
> 	zspage_size = i * PAGE_SIZE;
> 	waste = zspage_size % class_size;
> 	usedpc = (zspage_size - waste) * 100 / zspage_size;
>
> 	if (usedpc > max_usedpc) {
> ---

Hello,

I kinda believe we end up doing more work (instruction-count-wise),
actually. it adds 495 `cmp' for the false case + 15 `cmp je' for the
true case to eliminate 15 `mov cltd idiv mov sub imul cltd idiv cmp' *.

and it's not 45 iterations that we are getting rid of, but around 31:
not every class reaches its ideal 100% ratio on the first iteration.
so, no, sorry, I don't think the patch really does what we want.

*  by the way, we don't even need `cltd' in those calculations. the
   reason why gcc puts cltd there is because ZS_MAX_PAGES_PER_ZSPAGE
   has the 'wrong' data type. the patch to correct it is below (not a
   formal patch).

** well, we force gcc to generate `worse' code in several more places.
   for example, there is no need for `obj_idx' and `obj_offset' to be
   `unsigned long'; they can easily (and probably should) be `unsigned
   int', or simply `int'. that can save some instructions in very-very
   hot paths:

   add/remove: 0/0 grow/shrink: 1/6 up/down: 1/-27 (-26)
   function                          old     new   delta
   obj_free                          234     235      +1
   obj_to_location                    45      44      -1
   obj_malloc                        234     233      -1
   zs_malloc                         817     815      -2
   obj_idx_to_offset                  32      28      -4
   zs_unmap_object                   556     551      -5
   zs_compact                       1611    1597     -14

   I can cook a trivial patch later.

/*
 * on x86_64, gcc 6.1. no idea what the picture looks like on ARM32,
 * but it smells like these two patches combined can make the CPU a
 * little less busy.
 */

=====================================================================

ZS_MAX_PAGES_PER_ZSPAGE is defined as 'unsigned long', which forces the
compiler to generate unneeded sign-extension instructions (`cltd') in
several places. for instance:

 711:	44 89 d0	mov    %r10d,%eax
 714:	99		cltd
 715:	41 f7 fe	idiv   %r14d
 718:	44 89 d0	mov    %r10d,%eax
 71b:	29 d0		sub    %edx,%eax
 71d:	6b c0 64	imul   $0x64,%eax,%eax
 720:	99		cltd
 721:	41 f7 fa	idiv   %r10d

there is no reason to do this; ZS_MAX_PAGES_PER_ZSPAGE can simply be
'int'. the patch reduces the code size a bit:

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-25 (-25)
function                             old     new   delta
zs_malloc                            842     817     -25

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 mm/zsmalloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index f9b58d1..1c28e0f6 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -78,7 +78,7 @@
  * pages. ZS_MAX_ZSPAGE_ORDER defines upper limit on N.
  */
 #define ZS_MAX_ZSPAGE_ORDER 2
-#define ZS_MAX_PAGES_PER_ZSPAGE (_AC(1, UL) << ZS_MAX_ZSPAGE_ORDER)
+#define ZS_MAX_PAGES_PER_ZSPAGE (1 << ZS_MAX_ZSPAGE_ORDER)

 #define ZS_HANDLE_SIZE (sizeof(unsigned long))
-- 
2.8.2
* Re: [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage()

From: Sergey Senozhatsky
Date: 2016-05-06 9:33 UTC
To: Sergey Senozhatsky
Cc: Ganesh Mahendran, Minchan Kim, Nitin Gupta, Andrew Morton, Linux-MM, linux-kernel

On (05/06/16 18:08), Sergey Senozhatsky wrote:
[..]
> and it's not 45 iterations that we are getting rid of, but around 31:
> not every class reaches its ideal 100% ratio on the first iteration.
> so, no, sorry, I don't think the patch really does what we want.

to be clear, what I meant was:

	495 `cmp' + 15 `cmp je'                        IN
	 31 `mov cltd idiv mov sub imul cltd idiv cmp' OUT

	IN > OUT.

CORRECTION here:

> * by the way, we don't even need `cltd' in those calculations. the
>   reason why gcc puts cltd there is because ZS_MAX_PAGES_PER_ZSPAGE
>   has the 'wrong' data type. the patch to correct it is below (not a
>   formal patch).

no, we do need `cltd' there. but ZS_MAX_PAGES_PER_ZSPAGE also affects
ZS_MIN_ALLOC_SIZE, which is used in several places, like
get_size_class_index(). that's why the ZS_MAX_PAGES_PER_ZSPAGE data
type change `improves' zs_malloc().

	-ss
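For reference, the macro dependency being described is, quoting
mm/zsmalloc.c from roughly that period (from memory, so worth
double-checking against the actual tree):

	#define ZS_MIN_ALLOC_SIZE \
		MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS))

narrowing ZS_MAX_PAGES_PER_ZSPAGE from `unsigned long' to `int'
therefore narrows ZS_MIN_ALLOC_SIZE as well, and that constant feeds
get_size_class_index() and, through it, zs_malloc().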
* Re: [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage()

From: Minchan Kim
Date: 2016-05-09 5:01 UTC
To: Sergey Senozhatsky
Cc: Ganesh Mahendran, Minchan Kim, Nitin Gupta, Andrew Morton, Linux-MM, linux-kernel

On Fri, May 06, 2016 at 06:33:42PM +0900, Sergey Senozhatsky wrote:
[..]
> no, we do need `cltd' there. but ZS_MAX_PAGES_PER_ZSPAGE also affects
> ZS_MIN_ALLOC_SIZE, which is used in several places, like
> get_size_class_index(). that's why the ZS_MAX_PAGES_PER_ZSPAGE data
> type change `improves' zs_malloc().

Why not, if such a simple change improves zsmalloc? :)
Please send a patch.

Thanks a lot, Sergey!
* Re: [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage()

From: Sergey Senozhatsky
Date: 2016-05-11 9:34 UTC
To: Minchan Kim
Cc: Sergey Senozhatsky, Ganesh Mahendran, Nitin Gupta, Andrew Morton, Linux-MM, linux-kernel

On (05/09/16 14:01), Minchan Kim wrote:
[..]
> Why not, if such a simple change improves zsmalloc? :)
> Please send a patch.

Hello Minchan,

sorry for the late reply; I decided to investigate this a bit further.
with this patch, gcc 6.1 at -O2 generates 13 *more* instructions, while
at -Os it generates 25 *fewer*. the +13 instruction case is a no-no-no.

	-ss