From: Jay Patel <jaypatel@linux.ibm.com>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: linux-mm@kvack.org, cl@linux.com, penberg@kernel.org,
rientjes@google.com, iamjoonsoo.kim@lge.com,
akpm@linux-foundation.org, vbabka@suse.cz,
aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com,
piyushs@linux.ibm.com
Subject: Re: [RFC PATCH v4] mm/slub: Optimize slub memory usage
Date: Fri, 18 Aug 2023 12:11:34 +0530
Message-ID: <53ede2dd7f7c0d35b05bb43a9c8a1c60e9a25b59.camel@linux.ibm.com>
In-Reply-To: <CAB=+i9TCZC=qsG2Jmy=KJjSRJqU44dmujCvfP5=DArvX_HDdpg@mail.gmail.com>
On Fri, 2023-08-18 at 14:11 +0900, Hyeonggon Yoo wrote:
> On Fri, Aug 11, 2023 at 3:52 PM Jay Patel <jaypatel@linux.ibm.com>
> wrote:
> > On Fri, 2023-08-11 at 02:54 +0900, Hyeonggon Yoo wrote:
> > > On Thu, Jul 20, 2023 at 7:24 PM Jay Patel <jaypatel@linux.ibm.com> wrote:
> > > > In the current implementation of the slub memory allocator, the
> > > > slab order selection process follows these criteria:
> > > >
> > > > 1) Determine the minimum order required to serve the minimum
> > > > number of objects (min_objects). This calculation is based on the
> > > > formula (order = min_objects * object_size / PAGE_SIZE).
> > > > 2) If the minimum order is greater than the maximum allowed order
> > > > (slub_max_order), set slub_max_order as the order for this slab.
> > > > 3) If the minimum order is less than slub_max_order, iterate from
> > > > the minimum order to slub_max_order and check whether the
> > > > condition (rem <= slab_size / fract_leftover) holds. Here,
> > > > slab_size is (PAGE_SIZE << order), rem is (slab_size %
> > > > object_size), and fract_leftover can be 16, 8, or 4. If the
> > > > condition is true, select that order for the slab (see the
> > > > sketch below).
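> > > >
> > > > For illustration, the step-3 check boils down to something like
> > > > the following (a simplified sketch, not the exact kernel code;
> > > > pick_order is a hypothetical name, and only PAGE_SIZE is a real
> > > > kernel macro here):
> > > >
> > > >	static unsigned int pick_order(unsigned int object_size,
> > > >				       unsigned int min_order,
> > > >				       unsigned int max_order,
> > > >				       unsigned int fract_leftover)
> > > >	{
> > > >		unsigned int order;
> > > >
> > > >		for (order = min_order; order <= max_order; order++) {
> > > >			unsigned int slab_size = PAGE_SIZE << order;
> > > >			unsigned int rem = slab_size % object_size;
> > > >
> > > >			/* take the first order whose leftover fits
> > > >			 * within the allowed fraction of slab_size
> > > >			 */
> > > >			if (rem <= slab_size / fract_leftover)
> > > >				return order;
> > > >		}
> > > >		return max_order + 1; /* no order was acceptable */
> > > >	}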
> > > >
> > > >
> > > > However, in point 3, the allowed leftover (slab_size /
> > > > fract_leftover) can span a large range compared to the remainder
> > > > (rem): from 1 KB down to 256 bytes with a 4K page size, and from
> > > > 16 KB down to 4 KB with a 64K page size at order 0, growing
> > > > further with higher orders. This can lead to the selection of an
> > > > order that results in more memory wastage. To mitigate such
> > > > wastage, point 3 is modified to adjust the value of
> > > > fract_leftover based on the page size, while retaining the
> > > > current value as the default for a 4K page size.
> > > >
> > > > Test results are as follows:
> > > >
> > > > 1) On 160 CPUs with 64K Page size
> > > >
> > > > +----------------+--------------+-----------------+
> > > > |        Total wastage in slub memory             |
> > > > +----------------+--------------+-----------------+
> > > > |                | After Boot   | After Hackbench |
> > > > | Normal         | 932 KB       | 1812 KB         |
> > > > | With Patch     | 729 KB       | 1636 KB         |
> > > > | Wastage reduce | ~22%         | ~10%            |
> > > > +----------------+--------------+-----------------+
> > > >
> > > > +----------------+--------------+-----------------+
> > > > |              Total slub memory                  |
> > > > +----------------+--------------+-----------------+
> > > > |                | After Boot   | After Hackbench |
> > > > | Normal         | 1855296      | 2944576         |
> > > > | With Patch     | 1544576      | 2692032         |
> > > > | Memory reduce  | ~17%         | ~9%             |
> > > > +----------------+--------------+-----------------+
> > > >
> > > > hackbench-process-sockets
> > > > +-------+--------+----------+------------+----------+
> > > > |       | Groups | Normal   | With Patch | Delta    |
> > > > +-------+--------+----------+------------+----------+
> > > > | Amean | 1      | 1.2727   | 1.2450     | ( 2.22%) |
> > > > | Amean | 4      | 1.6063   | 1.5810     | ( 1.60%) |
> > > > | Amean | 7      | 2.4190   | 2.3983     | ( 0.86%) |
> > > > | Amean | 12     | 3.9730   | 3.9347     | ( 0.97%) |
> > > > | Amean | 21     | 6.9823   | 6.8957     | ( 1.26%) |
> > > > | Amean | 30     | 10.1867  | 10.0600    | ( 1.26%) |
> > > > | Amean | 48     | 16.7490  | 16.4853    | ( 1.60%) |
> > > > | Amean | 79     | 28.1870  | 27.8673    | ( 1.15%) |
> > > > | Amean | 110    | 39.8363  | 39.3793    | ( 1.16%) |
> > > > | Amean | 141    | 51.5277  | 51.4907    | ( 0.07%) |
> > > > | Amean | 172    | 62.9700  | 62.7300    | ( 0.38%) |
> > > > | Amean | 203    | 74.5037  | 74.0630    | ( 0.59%) |
> > > > | Amean | 234    | 85.6560  | 85.3587    | ( 0.35%) |
> > > > | Amean | 265    | 96.9883  | 96.3770    | ( 0.63%) |
> > > > | Amean | 296    | 108.6893 | 108.0870   | ( 0.56%) |
> > > > +-------+--------+----------+------------+----------+
> > > >
> > > > 2) On 16 CPUs with 64K Page size
> > > >
> > > > +----------------+--------------+-----------------+
> > > > |        Total wastage in slub memory             |
> > > > +----------------+--------------+-----------------+
> > > > |                | After Boot   | After Hackbench |
> > > > | Normal         | 273 KB       | 544 KB          |
> > > > | With Patch     | 260 KB       | 500 KB          |
> > > > | Wastage reduce | ~5%          | ~9%             |
> > > > +----------------+--------------+-----------------+
> > > >
> > > > +----------------+--------------+-----------------+
> > > > |              Total slub memory                  |
> > > > +----------------+--------------+-----------------+
> > > > |                | After Boot   | After Hackbench |
> > > > | Normal         | 275840       | 412480          |
> > > > | With Patch     | 272768       | 406208          |
> > > > | Memory reduce  | ~1%          | ~2%             |
> > > > +----------------+--------------+-----------------+
> > > >
> > > > hackbench-process-sockets
> > > > +-------+--------+----------+------------+-----------+
> > > > |       | Groups | Normal   | With Patch | Delta     |
> > > > +-------+--------+----------+------------+-----------+
> > > > | Amean | 1      | 0.9513   | 0.9250     | ( 2.77%)  |
> > > > | Amean | 4      | 2.9630   | 2.9570     | ( 0.20%)  |
> > > > | Amean | 7      | 5.1780   | 5.1763     | ( 0.03%)  |
> > > > | Amean | 12     | 8.8833   | 8.8817     | ( 0.02%)  |
> > > > | Amean | 21     | 15.7577  | 15.6883    | ( 0.44%)  |
> > > > | Amean | 30     | 22.2063  | 22.2843    | ( -0.35%) |
> > > > | Amean | 48     | 36.0587  | 36.1390    | ( -0.22%) |
> > > > | Amean | 64     | 49.7803  | 49.3457    | ( 0.87%)  |
> > > > +-------+--------+----------+------------+-----------+
> > > >
> > > > Signed-off-by: Jay Patel <jaypatel@linux.ibm.com>
> > > > ---
> > > > Changes from V3
> > > > 1) Resolved errors and optimised the logic for all arches.
> > > >
> > > > Changes from V2
> > > > 1) Removed all page order selection logic for slab caches based
> > > > on wastage.
> > > > 2) Increased the fraction size based on page size (keeping the
> > > > current value as the default for a 4K page).
> > > >
> > > > Changes from V1
> > > > 1) If min_objects * object_size > PAGE_ALLOC_COSTLY_ORDER, then
> > > > return PAGE_ALLOC_COSTLY_ORDER.
> > > > 2) Similarly, if min_objects * object_size < PAGE_SIZE, then
> > > > return slub_min_order.
> > > > 3) Additionally, changed slub_max_order to 2. There is no
> > > > specific reason for using the value 2, but it provided the best
> > > > results in terms of performance without any noticeable impact.
> > > >
> > > > mm/slub.c | 17 +++++++----------
> > > > 1 file changed, 7 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/mm/slub.c b/mm/slub.c
> > > > index c87628cd8a9a..8f6f38083b94 100644
> > > > --- a/mm/slub.c
> > > > +++ b/mm/slub.c
> > > > @@ -287,6 +287,7 @@ static inline bool kmem_cache_has_cpu_partial(struct kmem_cache *s)
> > > >  #define OO_SHIFT	16
> > > >  #define OO_MASK	((1 << OO_SHIFT) - 1)
> > > >  #define MAX_OBJS_PER_PAGE	32767 /* since slab.objects is u15 */
> > > > +#define SLUB_PAGE_FRAC_SHIFT 12
> > > >
> > > >  /* Internal SLUB flags */
> > > >  /* Poison object */
> > > > @@ -4117,6 +4118,7 @@ static inline int calculate_order(unsigned int size)
> > > >  	unsigned int min_objects;
> > > >  	unsigned int max_objects;
> > > >  	unsigned int nr_cpus;
> > > > +	unsigned int page_size_frac;
> > > >
> > > >  	/*
> > > >  	 * Attempt to find best configuration for a slab. This
> > > > @@ -4145,10 +4147,13 @@ static inline int calculate_order(unsigned int size)
> > > >  	max_objects = order_objects(slub_max_order, size);
> > > >  	min_objects = min(min_objects, max_objects);
> > > >
> > > > -	while (min_objects > 1) {
> > > > +	page_size_frac = ((PAGE_SIZE >> SLUB_PAGE_FRAC_SHIFT) == 1) ? 0
> > > > +		: PAGE_SIZE >> SLUB_PAGE_FRAC_SHIFT;
> > > > +
> > > > +	while (min_objects >= 1) {
> > > >  		unsigned int fraction;
> > > >
> > > > -		fraction = 16;
> > > > +		fraction = 16 + page_size_frac;
> > > >  		while (fraction >= 4) {
> > >
> > > Sorry I'm a bit late for the review.
> > >
> > > IIRC hexagon/powerpc can have ridiculously large page sizes (1MB or
> > > 256KB) (but I don't know if such a config is actually used, tbh),
> > > so I think there should be an upper bound.
> >
> > Hi,
> > I think that might not be required, as an arch with a larger page
> > size will require a larger fraction value per this exit condition
> > (rem <= slab_size / fract_leftover) in calc_slab_order().
>
> Okay, with 256KB pages the fraction will start from 80, and then 40,
> 20, 10, 5, ... and 1/80 of 256KB is about 3KB. So it's to waste less
> even when the machine uses large page sizes, because 1/16 of 256KB is
> still large, right?
Yes, correct. With this approach we reduce both the wastage and the
total slub memory when using larger page sizes :)
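
To make that concrete, here is how the starting fraction scales with
page size (a minimal sketch mirroring the patch's computation; the
variable names are from the patch, and only the arithmetic is shown):

	/* 0 on 4K pages (PAGE_SIZE >> 12 == 1), PAGE_SIZE >> 12 otherwise */
	unsigned int page_size_frac = ((PAGE_SIZE >> 12) == 1) ? 0
		: PAGE_SIZE >> 12;
	unsigned int fraction = 16 + page_size_frac;

	/*   4K pages: fraction starts at 16 (waste cap 1/16 of 4K  = 256B)  */
	/*  64K pages: fraction starts at 32 (waste cap 1/32 of 64K = 2KB)   */
	/* 256K pages: fraction starts at 80 (waste cap 1/80 of 256K ~ 3.2KB) */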
>
> > > > 			order = calc_slab_order(size, min_objects,
> > > > 					slub_max_order, fraction);
> > > > @@ -4159,14 +4164,6 @@ static inline int calculate_order(unsigned int size)
> > > >  		min_objects--;
> > > >  	}
> > > > -	/*
> > > > -	 * We were unable to place multiple objects in a slab. Now
> > > > -	 * lets see if we can place a single object there.
> > > > -	 */
> > > > -	order = calc_slab_order(size, 1, slub_max_order, 1);
> > > > -	if (order <= slub_max_order)
> > > > -		return order;
> > >
> > > I'm not sure if it's okay to remove this?
> > > It was fine in v2 because the least wasteful order was chosen
> > > regardless of fraction but that's not true anymore.
> > >
> > Ok, so my thought is: if a single object in a slab with slab_size =
> > PAGE_SIZE << slub_max_order wastes more than 1/4th of slab_size,
> > then it's better to skip this part and use MAX_ORDER instead of
> > slub_max_order.
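> > For a concrete (hypothetical) example with 4K pages and
> > slub_max_order = 3: a 20K object gives slab_size = 32K and
> > rem = 12K, which is more than slab_size / 4 = 8K, so we would
> > fall through to the MAX_ORDER fallback rather than settle for
> > an order that wastes 12K per slab.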
> > Could you kindly share your perspective on this part?
>
> I simply missed that part! :)
> That looks fine to me.
>
>
> > Thanks,
> > Jay Patel
> > > Otherwise, everything looks fine to me. I'm too dumb to anticipate
> > > the outcome of increasing the slab order :P but this patch does not
> > > sound crazy to me.
> > >
> > > Thanks!
> > > --
> > > Hyeonggon
Thread overview: 11+ messages
2023-07-20 10:23 [RFC PATCH v4] mm/slub: Optimize slub memory usage Jay Patel
2023-08-10 17:54 ` Hyeonggon Yoo
2023-08-11 6:52 ` Jay Patel
2023-08-18 5:11 ` Hyeonggon Yoo
2023-08-18 6:41 ` Jay Patel [this message]
2023-08-11 15:43 ` Vlastimil Babka
2023-08-24 10:52 ` Jay Patel
2023-09-07 13:42 ` Vlastimil Babka
2023-09-14 5:40 ` Jay Patel
2023-09-14 6:38 ` Vlastimil Babka
2023-09-14 12:43 ` Jay Patel