From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Ganesh Mahendran <opensource.ganesh@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>,
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
Nitin Gupta <ngupta@vflare.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linux-MM <linux-mm@kvack.org>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage()
Date: Fri, 6 May 2016 18:08:02 +0900 [thread overview]
Message-ID: <20160506090801.GA488@swordfish> (raw)
In-Reply-To: <CADAEsF9S4GQE6V+zsvRRVYjdbfN3VRQFcTiN5E_MWw60bfk0Zw@mail.gmail.com>
On (05/06/16 12:25), Ganesh Mahendran wrote:
[..]
> > I agree with Sergey.
> > First of al, I appreciates your patch, Ganesh! But as Sergey pointed
> > out, I don't see why it improves current zsmalloc.
>
> This patch does not obviously improve zsmalloc.
> It just reduces unnecessary code path.
>
> From data provided by Sergey, 15 * (4 - 1) = 45 times loop will be avoided.
> So 45 times of below caculation will be reduced:
> ---
> zspage_size = i * PAGE_SIZE;
> waste = zspage_size % class_size;
> usedpc = (zspage_size - waste) * 100 / zspage_size;
>
> if (usedpc > max_usedpc) {
> ---
Hello,
I kinda believe we end up doing more work (instruction-count-wise),
actually. it adds 495 `cmp' for false case + 15 `cmp je' for true
case to eliminate 15 `mov cltd idiv mov sub imul cltd idiv cmp' *.
and it's not 45 iterations that we are getting rid of, but around 31:
not every class reaches it's ideal 100% ratio on the first iteration.
so, no, sorry, I don't think the patch really does what we want.
* by the way, we don't even need `cltd' in those calculations. the
reason why gcc puts cltd is because ZS_MAX_PAGES_PER_ZSPAGE has the
'wrong' data type. the patch to correct it is below (not a formal
patch).
** well, we force gcc to generate `worse' code in several more places.
for example, there is no need for `obj_idx' and `obj_offset' to be
`unsigned long', it can easly (and probably must) be `unsigned int',
or simply `int'. that can save some instructions in very-very hot paths:
add/remove: 0/0 grow/shrink: 1/6 up/down: 1/-27 (-26)
function old new delta
obj_free 234 235 +1
obj_to_location 45 44 -1
obj_malloc 234 233 -1
zs_malloc 817 815 -2
obj_idx_to_offset 32 28 -4
zs_unmap_object 556 551 -5
zs_compact 1611 1597 -14
I can cook a trivial patch later.
/*
* on x86_64, gcc 6.1. no idea what does the picture look like on ARM32.
* but smells like these two patches combined can make CPU a little less
* busy.
*/
=====================================================================
ZS_MAX_PAGES_PER_ZSPAGE defined as 'unsigned long' which forces
the compiler to generate unneeded signed extension instructions
`cltd' in several places. for instance:
711: 44 89 d0 mov %r10d,%eax
714: 99 cltd
715: 41 f7 fe idiv %r14d
718: 44 89 d0 mov %r10d,%eax
71b: 29 d0 sub %edx,%eax
71d: 6b c0 64 imul $0x64,%eax,%eax
720: 99 cltd
721: 41 f7 fa idiv %r10d
there is no reason to do this and ZS_MAX_PAGES_PER_ZSPAGE can
simply be 'int'.
the patch reduces the code size, a bit:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-25 (-25)
function old new delta
zs_malloc 842 817 -25
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
mm/zsmalloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index f9b58d1..1c28e0f6 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -78,7 +78,7 @@
* pages. ZS_MAX_ZSPAGE_ORDER defines upper limit on N.
*/
#define ZS_MAX_ZSPAGE_ORDER 2
-#define ZS_MAX_PAGES_PER_ZSPAGE (_AC(1, UL) << ZS_MAX_ZSPAGE_ORDER)
+#define ZS_MAX_PAGES_PER_ZSPAGE (1 << ZS_MAX_ZSPAGE_ORDER)
#define ZS_HANDLE_SIZE (sizeof(unsigned long))
--
2.8.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Ganesh Mahendran <opensource.ganesh@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>,
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
Nitin Gupta <ngupta@vflare.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linux-MM <linux-mm@kvack.org>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage()
Date: Fri, 6 May 2016 18:08:02 +0900 [thread overview]
Message-ID: <20160506090801.GA488@swordfish> (raw)
In-Reply-To: <CADAEsF9S4GQE6V+zsvRRVYjdbfN3VRQFcTiN5E_MWw60bfk0Zw@mail.gmail.com>
On (05/06/16 12:25), Ganesh Mahendran wrote:
[..]
> > I agree with Sergey.
> > First of al, I appreciates your patch, Ganesh! But as Sergey pointed
> > out, I don't see why it improves current zsmalloc.
>
> This patch does not obviously improve zsmalloc.
> It just reduces unnecessary code path.
>
> From data provided by Sergey, 15 * (4 - 1) = 45 times loop will be avoided.
> So 45 times of below caculation will be reduced:
> ---
> zspage_size = i * PAGE_SIZE;
> waste = zspage_size % class_size;
> usedpc = (zspage_size - waste) * 100 / zspage_size;
>
> if (usedpc > max_usedpc) {
> ---
Hello,
I kinda believe we end up doing more work (instruction-count-wise),
actually. it adds 495 `cmp' for false case + 15 `cmp je' for true
case to eliminate 15 `mov cltd idiv mov sub imul cltd idiv cmp' *.
and it's not 45 iterations that we are getting rid of, but around 31:
not every class reaches it's ideal 100% ratio on the first iteration.
so, no, sorry, I don't think the patch really does what we want.
* by the way, we don't even need `cltd' in those calculations. the
reason why gcc puts cltd is because ZS_MAX_PAGES_PER_ZSPAGE has the
'wrong' data type. the patch to correct it is below (not a formal
patch).
** well, we force gcc to generate `worse' code in several more places.
for example, there is no need for `obj_idx' and `obj_offset' to be
`unsigned long', it can easly (and probably must) be `unsigned int',
or simply `int'. that can save some instructions in very-very hot paths:
add/remove: 0/0 grow/shrink: 1/6 up/down: 1/-27 (-26)
function old new delta
obj_free 234 235 +1
obj_to_location 45 44 -1
obj_malloc 234 233 -1
zs_malloc 817 815 -2
obj_idx_to_offset 32 28 -4
zs_unmap_object 556 551 -5
zs_compact 1611 1597 -14
I can cook a trivial patch later.
/*
* on x86_64, gcc 6.1. no idea what does the picture look like on ARM32.
* but smells like these two patches combined can make CPU a little less
* busy.
*/
=====================================================================
ZS_MAX_PAGES_PER_ZSPAGE defined as 'unsigned long' which forces
the compiler to generate unneeded signed extension instructions
`cltd' in several places. for instance:
711: 44 89 d0 mov %r10d,%eax
714: 99 cltd
715: 41 f7 fe idiv %r14d
718: 44 89 d0 mov %r10d,%eax
71b: 29 d0 sub %edx,%eax
71d: 6b c0 64 imul $0x64,%eax,%eax
720: 99 cltd
721: 41 f7 fa idiv %r10d
there is no reason to do this and ZS_MAX_PAGES_PER_ZSPAGE can
simply be 'int'.
the patch reduces the code size, a bit:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-25 (-25)
function old new delta
zs_malloc 842 817 -25
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
mm/zsmalloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index f9b58d1..1c28e0f6 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -78,7 +78,7 @@
* pages. ZS_MAX_ZSPAGE_ORDER defines upper limit on N.
*/
#define ZS_MAX_ZSPAGE_ORDER 2
-#define ZS_MAX_PAGES_PER_ZSPAGE (_AC(1, UL) << ZS_MAX_ZSPAGE_ORDER)
+#define ZS_MAX_PAGES_PER_ZSPAGE (1 << ZS_MAX_ZSPAGE_ORDER)
#define ZS_HANDLE_SIZE (sizeof(unsigned long))
--
2.8.2
next prev parent reply other threads:[~2016-05-06 9:06 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-05 5:17 [PATCH] mm/zsmalloc: avoid unnecessary iteration in get_pages_per_zspage() Ganesh Mahendran
2016-05-05 10:03 ` Sergey Senozhatsky
2016-05-05 10:03 ` Sergey Senozhatsky
2016-05-06 3:09 ` Minchan Kim
2016-05-06 3:09 ` Minchan Kim
2016-05-06 4:25 ` Ganesh Mahendran
2016-05-06 4:25 ` Ganesh Mahendran
2016-05-06 4:37 ` Minchan Kim
2016-05-06 4:37 ` Minchan Kim
2016-05-06 9:08 ` Sergey Senozhatsky [this message]
2016-05-06 9:08 ` Sergey Senozhatsky
2016-05-06 9:33 ` Sergey Senozhatsky
2016-05-06 9:33 ` Sergey Senozhatsky
2016-05-09 5:01 ` Minchan Kim
2016-05-09 5:01 ` Minchan Kim
2016-05-11 9:34 ` Sergey Senozhatsky
2016-05-11 9:34 ` Sergey Senozhatsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160506090801.GA488@swordfish \
--to=sergey.senozhatsky.work@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=minchan@kernel.org \
--cc=ngupta@vflare.org \
--cc=opensource.ganesh@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.