From: Andrea Arcangeli <aarcange@redhat.com>
To: lkp@lists.01.org
Subject: Re: [mm] 09bc0443e9: will-it-scale.per_thread_ops -7.2% regression
Date: Mon, 03 May 2021 21:59:11 -0400
Message-ID: <YJCqb8XDz1CVbrzx@redhat.com>
In-Reply-To: <YJCe9omGbKAGPwmK@redhat.com>


Hello,

here's the result of this benchmark work, with all source code released
under GPLv2 on github and kernel.org:

https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?h=main&id=d9c85cf85aeb8de7d1490aa97b19be2feb2a1048

Added:

"This commits increases the SMP scalability of pin_user_pages_fast()
executed by different threads of the same process by more than 4000%."

https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?h=main&id=0d2285c622103f0314ced7485c3b5b43f870c2d3

Added:

"will-it-scale "mmap2" shows no change in performance with enterprise
config as expected.

will-it-scale "pin_fast" retains the > 4000% SMP scalability
performance improvement against upstream as expected.

This is a noop as far as overall performance and SMP scalability are
concerned.
"

Also documented in the summary of the mapcount_deshare branch:

https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?h=mapcount_deshare&id=68e65632c1ce13e846aa39e38f8ea7399d5abbbd

"- >4000% SMP scalability performance improvement to
   pin_user_pages_fast() with 1 thread per CPU and 2 NUMA nodes with
   64 cores each."

I didn't add any Reported-by because the report couldn't be confirmed
and no source code change resulted from it. On the contrary, I
documented the improvement the code already delivers with
will-it-scale (it was already described in the commit header, but
now it's exactly quantified and verified).

Thanks,
Andrea

On Mon, May 03, 2021 at 09:10:14PM -0400, Andrea Arcangeli wrote:
> On Sun, May 02, 2021 at 09:25:54PM -0400, Andrea Arcangeli wrote:
> > === pin_fast will-it-scale follows ===
> > 
> > In addition, to see a related scalability increase, I added a pin_fast
> > testcase to will-it-scale that I'm submitting here to Anton (CC'ed).
> > 
> > To run it in lkp-test you only have to add the attached patch on top
> > of your patch and then call "pin_fast" instead of "mmap2" in the
> > invocation, like below:
> > 
> > # for i in `seq 3`; do python3 runtest.py pin_fast 295 thread `nproc`; done
> > 
> > I recommend adding this to lkp-test since I think it's much more
> > interesting than mmap2 and will show huge differences. Ideally we
> > should also add a FOLL_WRITE test later.
> > 
> > This is aa.git main branch commit 918037878bcf:
> > 
> > tasks,processes,processes_idle,threads,threads_idle,linear
> > 0,0,100,0,100,0
> > 256,0,0.00,1196513,0.19,0
> > tasks,processes,processes_idle,threads,threads_idle,linear
> > 0,0,100,0,100,0
> > 256,0,0.00,1194664,0.19,0
> > tasks,processes,processes_idle,threads,threads_idle,linear
> > 0,0,100,0,100,0
> > 256,0,0.00,1193194,0.19,0
> > 
> > This is mainline, upstream commit 18a3c5f7abfd:
> > 
> > tasks,processes,processes_idle,threads,threads_idle,linear
> > 0,0,100,0,100,0
> > 256,0,0.00,25641,0.17,0
> > tasks,processes,processes_idle,threads,threads_idle,linear
> > 0,0,100,0,100,0
> > 256,0,0.00,25652,0.16,0
> > tasks,processes,processes_idle,threads,threads_idle,linear
> > 0,0,100,0,100,0
> > 256,0,0.00,25559,0.16,0
> 
> I have now verified that the 4668% increase in scalability is, as
> expected, thanks to this very patch:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?id=5a9bd1dce03d0a7c55c5f81992bc06fc6630f78d
> 
> "mm: gup: allow FOLL_PIN to scale in SMP"
> 
> And the patch you flagged as a regression:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?id=d2f271ca96b42cafd701d445f11af0d2ef993772
> 
> changes nothing in performance: it retains the 4668% improvement in
> SMP scalability of FOLL_PIN compared to upstream and changes nothing
> for all other tests, but it saves 64 bits of RAM per vma by packing
> the structure.
> 
> All results match what I described in the commit headers.
> 
> I'll document the 4668% improvement in SMP scalability of FOLL_PIN in
> the first patch commit header now for the next rebase.
> 
> Thanks,
> Andrea
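
For anyone who wants to recompute the speedup from the will-it-scale
output quoted above, here is a minimal sketch. It is not part of
will-it-scale or lkp-tests; the helper name thread_ops() and the
variable names are made up for illustration, and the only data used
are the six runs pasted in this thread (256 tasks, "threads" column).

```python
import csv
import io
from statistics import mean

# CSV output copied verbatim from the runs quoted in this thread.
# aa.git main branch commit 918037878bcf:
PATCHED = """\
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,1196513,0.19,0
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,1194664,0.19,0
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,1193194,0.19,0
"""

# Mainline, upstream commit 18a3c5f7abfd:
UPSTREAM = """\
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,25641,0.17,0
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,25652,0.16,0
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,25559,0.16,0
"""


def thread_ops(csv_text):
    """Return the 'threads' column of every non-baseline data row.

    Hypothetical helper: skips the repeated header lines and the
    tasks=0 baseline rows, keeping one value per run.
    """
    ops = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or row[0] in ("tasks", "0"):
            continue
        ops.append(int(row[3]))  # column 3 is "threads"
    return ops


patched = thread_ops(PATCHED)
upstream = thread_ops(UPSTREAM)

# Ratio of mean per-run throughput, and the same value as a percent increase.
ratio = mean(patched) / mean(upstream)
increase_pct = (ratio - 1.0) * 100.0

print(f"patched mean:  {mean(patched):.0f}")
print(f"upstream mean: {mean(upstream):.0f}")
print(f"ratio:         {ratio:.2f}x")
print(f"increase:      {increase_pct:.0f}%")
```

On the quoted data this gives a ratio of roughly 46.6x between the
two kernels, consistent with the "> 4000%" figure in the commit
messages.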
