From: Nicholas Piggin <npiggin@gmail.com>
To: Christophe LEROY <christophe.leroy@c-s.fr>
Cc: linuxppc-dev@lists.ozlabs.org,
"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
Michael Ellerman <mpe@ellerman.id.au>
Subject: Re: [RFC PATCH 0/5] powerpc/mm/slice: improve slice speed and stack use
Date: Tue, 13 Feb 2018 18:40:17 +1000
Message-ID: <20180213184017.168c31f0@roar.ozlabs.ibm.com>
In-Reply-To: <7364b502-83a7-8b40-3530-b160e3c80523@c-s.fr>
On Mon, 12 Feb 2018 18:42:21 +0100
Christophe LEROY <christophe.leroy@c-s.fr> wrote:
> On 12/02/2018 at 16:24, Nicholas Piggin wrote:
> > On Mon, 12 Feb 2018 16:02:23 +0100
> > Christophe LEROY <christophe.leroy@c-s.fr> wrote:
> >
> >> On 10/02/2018 at 09:11, Nicholas Piggin wrote:
> >>> This series intends to improve performance and reduce stack
> >>> consumption in the slice allocation code. It does it by keeping slice
> >>> masks in the mm_context rather than compute them for each allocation,
> >>> and by reducing bitmaps and slice_masks from stacks, using pointers
> >>> instead where possible.
> >>>
> >>> checkstack.pl gives, before:
> >>> 0x00000de4 slice_get_unmapped_area [slice.o]: 656
> >>> 0x00001b4c is_hugepage_only_range [slice.o]: 512
> >>> 0x0000075c slice_find_area_topdown [slice.o]: 416
> >>> 0x000004c8 slice_find_area_bottomup.isra.1 [slice.o]: 272
> >>> 0x00001aa0 slice_set_range_psize [slice.o]: 240
> >>> 0x00000a64 slice_find_area [slice.o]: 176
> >>> 0x00000174 slice_check_fit [slice.o]: 112
> >>>
> >>> after:
> >>> 0x00000d70 slice_get_unmapped_area [slice.o]: 320
> >>> 0x000008f8 slice_find_area [slice.o]: 144
> >>> 0x00001860 slice_set_range_psize [slice.o]: 144
> >>> 0x000018ec is_hugepage_only_range [slice.o]: 144
> >>> 0x00000750 slice_find_area_bottomup.isra.4 [slice.o]: 128
> >>>
> >>> The benchmark in https://github.com/linuxppc/linux/issues/49 gives, before:
> >>> $ time ./slicemask
> >>> real 0m20.712s
> >>> user 0m5.830s
> >>> sys 0m15.105s
> >>>
> >>> after:
> >>> $ time ./slicemask
> >>> real 0m13.197s
> >>> user 0m5.409s
> >>> sys 0m7.779s
> >>
> >> Hi,
> >>
> >> I tested your series on an 8xx, on top of patch
> >> https://patchwork.ozlabs.org/patch/871675/
> >>
> >> I don't get a result as significant as yours, but there is some
> >> improvement anyway:
> >>
> >> ITERATION 500000
> >>
> >> Before:
> >>
> >> root@vgoip:~# time ./slicemask
> >> real 0m 33.26s
> >> user 0m 1.94s
> >> sys 0m 30.85s
> >>
> >> After:
> >> root@vgoip:~# time ./slicemask
> >> real 0m 29.69s
> >> user 0m 2.11s
> >> sys 0m 27.15s
> >>
> >> The most significant improvement is obtained with the first patch of your series:
> >> root@vgoip:~# time ./slicemask
> >> real 0m 30.85s
> >> user 0m 1.80s
> >> sys 0m 28.57s
> >
> > Okay, thanks. Are you still spending significant time in the slice
> > code?
>
> Do you mean am I still updating my patches? No, I hope we are at the last
Actually I was wondering about CPU time spent for the microbenchmark :)
> run with v4 now that Aneesh has tagged all of them as reviewed-by himself.
> Once the series has been accepted, my next step will be to backport at
> least the first 3 to kernel 4.14
>
> >
> >>
> >> I had to modify your series a bit; if you are interested I can post it.
> >>
> >
> > Sure, that would be good.
>
> OK, let's share it. The patches are not 100% clean.
Those look pretty good, thanks for doing that work.
Thanks,
Nick
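To put the timings quoted above side by side, here is a small sketch that
computes the relative reduction in sys time for each pair of runs. It is not
part of the original thread: the helper names parse_time and
reduction_percent are hypothetical, and the only inputs are the sys values
quoted in the messages above.

```python
import re


def parse_time(text):
    """Convert a `time` field such as '0m20.712s' or '0m 33.26s' to seconds."""
    match = re.fullmatch(r"\s*(\d+)m\s*(\d+(?:\.\d+)?)s\s*", text)
    if match is None:
        raise ValueError(f"unrecognized time field: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, in percent of `before`."""
    b = parse_time(before)
    a = parse_time(after)
    return (b - a) / b * 100.0


if __name__ == "__main__":
    # sys times copied from the quoted benchmark output (before, after).
    runs = {
        "cover letter, full series": ("0m15.105s", "0m7.779s"),
        "8xx, full series": ("0m 30.85s", "0m 27.15s"),
        "8xx, first patch only": ("0m 30.85s", "0m 28.57s"),
    }
    for label, (before, after) in runs.items():
        print(f"{label}: sys time reduced by {reduction_percent(before, after):.1f}%")
```

Comparing sys rather than real time isolates the kernel side of the
microbenchmark, which is where the slice code runs.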
Thread overview:
2018-02-10 8:11 [RFC PATCH 0/5] powerpc/mm/slice: improve slice speed and stack use Nicholas Piggin
2018-02-10 8:11 ` [RFC PATCH 1/5] powerpc/mm/slice: pass pointers to struct slice_mask where possible Nicholas Piggin
2018-02-10 8:11 ` [RFC PATCH 2/5] powerpc/mm/slice: implement a slice mask cache Nicholas Piggin
2018-02-10 8:11 ` [RFC PATCH 3/5] powerpc/mm/slice: implement slice_check_range_fits Nicholas Piggin
2018-02-10 8:11 ` [RFC PATCH 4/5] powerpc/mm/slice: Use const pointers to cached slice masks where possible Nicholas Piggin
2018-02-10 8:11 ` [RFC PATCH 5/5] powerpc/mm/slice: use the dynamic high slice size to limit bitmap operations Nicholas Piggin
2018-02-12 15:02 ` [RFC PATCH 0/5] powerpc/mm/slice: improve slice speed and stack use Christophe LEROY
2018-02-12 15:24 ` Nicholas Piggin
2018-02-12 17:42 ` Christophe LEROY
2018-02-13 8:40 ` Nicholas Piggin [this message]
2018-02-13 11:24 ` Christophe LEROY