From: David Rientjes <rientjes@google.com>
To: Jay Patel, Brian "Binder" Makin
Cc: linux-mm@kvack.org, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, Andrew Morton, Vlastimil Babka, aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com, piyushs@linux.ibm.com
Subject: Re: [PATCH] [RFC PATCH v2] mm/slub: Optimize slub memory usage
Date: Sun, 2 Jul 2023 17:13:19 -0700 (PDT)
In-Reply-To: <20230628095740.589893-1-jaypatel@linux.ibm.com>
References: <20230628095740.589893-1-jaypatel@linux.ibm.com>
Thanks very much for looking at this, Jay!

My colleague, Binder, has also been looking at opportunities to optimize memory usage when using SLUB. We're preparing to deprecate SLAB internally and shift toward SLUB since SLAB is scheduled for removal after the next LTS kernel.

Binder, do you have an evaluation with this patch similar to what Jay did?
Also, tangentially: we are looking at other opportunities to reduce memory overhead when using SLUB. If you or anybody else are interested in joining a working group with this shared goal, please let me know. We could brainstorm, collaborate, and share data.

Thanks again!

On Wed, 28 Jun 2023, Jay Patel wrote:

> In the previous version [1], we were able to reduce slub memory
> wastage, but the total memory was also increasing, so to solve this
> problem I have modified the patch as follows:
>
> 1) If min_objects * object_size > PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER,
>    then return PAGE_ALLOC_COSTLY_ORDER.
> 2) Similarly, if min_objects * object_size <= PAGE_SIZE, then return
>    slub_min_order.
> 3) Additionally, I changed slub_max_order to 2. There is no specific
>    reason for choosing the value 2, but it provided the best results
>    with no noticeable performance impact.
>
> [1] https://lore.kernel.org/linux-mm/20230612085535.275206-1-jaypatel@linux.ibm.com/
>
> I have conducted tests on systems with 160 CPUs and 16 CPUs using 4K
> and 64K page sizes. The tests showed that the patch successfully
> reduces both the total slub memory and its wastage, without any
> noticeable performance degradation in the hackbench test.
>
> Test Results are as follows:
>
> 1) On 160 CPUs with 4K Page size
>
> +----------------+------------+-----------------+
> |      Total wastage in slub memory             |
> +----------------+------------+-----------------+
> |                | After Boot | After Hackbench |
> | Normal         | 2090 Kb    | 3204 Kb         |
> | With Patch     | 1825 Kb    | 3088 Kb         |
> | Wastage reduce | ~12%       | ~4%             |
> +----------------+------------+-----------------+
>
> +----------------+------------+-----------------+
> |           Total slub memory                   |
> +----------------+------------+-----------------+
> |                | After Boot | After Hackbench |
> | Normal         | 500572     | 713568          |
> | With Patch     | 482036     | 688312          |
> | Memory reduce  | ~4%        | ~3%             |
> +----------------+------------+-----------------+
>
> hackbench-process-sockets
> +-------+-----+----------+------------+-----------+
> |       |     | Normal   | With Patch | Delta     |
> +-------+-----+----------+------------+-----------+
> | Amean | 1   | 1.3237   | 1.2737     | (  3.78%) |
> | Amean | 4   | 1.5923   | 1.6023     | ( -0.63%) |
> | Amean | 7   | 2.3727   | 2.4260     | ( -2.25%) |
> | Amean | 12  | 3.9813   | 4.1290     | ( -3.71%) |
> | Amean | 21  | 6.9680   | 7.0630     | ( -1.36%) |
> | Amean | 30  | 10.1480  | 10.2170    | ( -0.68%) |
> | Amean | 48  | 16.7793  | 16.8780    | ( -0.59%) |
> | Amean | 79  | 28.9537  | 28.8187    | (  0.47%) |
> | Amean | 110 | 39.5507  | 40.0157    | ( -1.18%) |
> | Amean | 141 | 51.5670  | 51.8200    | ( -0.49%) |
> | Amean | 172 | 62.8710  | 63.2540    | ( -0.61%) |
> | Amean | 203 | 74.6417  | 75.2520    | ( -0.82%) |
> | Amean | 234 | 86.0853  | 86.5653    | ( -0.56%) |
> | Amean | 265 | 97.9203  | 98.4617    | ( -0.55%) |
> | Amean | 296 | 108.6243 | 109.8770   | ( -1.15%) |
> +-------+-----+----------+------------+-----------+
>
> 2) On 160 CPUs with 64K Page size
>
> +----------------+------------+-----------------+
> |      Total wastage in slub memory             |
> +----------------+------------+-----------------+
> |                | After Boot | After Hackbench |
> | Normal         | 919 Kb     | 1880 Kb         |
> | With Patch     | 807 Kb     | 1684 Kb         |
> | Wastage reduce | ~12%       | ~10%            |
> +----------------+------------+-----------------+
>
> +----------------+------------+-----------------+
> |           Total slub memory                   |
> +----------------+------------+-----------------+
> |                | After Boot | After Hackbench |
> | Normal         | 1862592    | 3023744         |
> | With Patch     | 1644416    | 2675776         |
> | Memory reduce  | ~12%       | ~11%            |
> +----------------+------------+-----------------+
>
> hackbench-process-sockets
> +-------+-----+----------+------------+-----------+
> |       |     | Normal   | With Patch | Delta     |
> +-------+-----+----------+------------+-----------+
> | Amean | 1   | 1.2547   | 1.2677     | ( -1.04%) |
> | Amean | 4   | 1.5523   | 1.5783     | ( -1.67%) |
> | Amean | 7   | 2.4157   | 2.3883     | (  1.13%) |
> | Amean | 12  | 3.9807   | 3.9793     | (  0.03%) |
> | Amean | 21  | 6.9687   | 6.9703     | ( -0.02%) |
> | Amean | 30  | 10.1403  | 10.1297    | (  0.11%) |
> | Amean | 48  | 16.7477  | 16.6893    | (  0.35%) |
> | Amean | 79  | 27.9510  | 28.0463    | ( -0.34%) |
> | Amean | 110 | 39.6833  | 39.5687    | (  0.29%) |
> | Amean | 141 | 51.5673  | 51.4477    | (  0.23%) |
> | Amean | 172 | 62.9643  | 63.1647    | ( -0.32%) |
> | Amean | 203 | 74.6220  | 73.7900    | (  1.11%) |
> | Amean | 234 | 85.1783  | 85.3420    | ( -0.19%) |
> | Amean | 265 | 96.6627  | 96.7903    | ( -0.13%) |
> | Amean | 296 | 108.2543 | 108.2253   | (  0.03%) |
> +-------+-----+----------+------------+-----------+
>
> 3) On 16 CPUs with 4K Page size
>
> +----------------+------------+-----------------+
> |      Total wastage in slub memory             |
> +----------------+------------+-----------------+
> |                | After Boot | After Hackbench |
> | Normal         | 491 Kb     | 727 Kb          |
> | With Patch     | 483 Kb     | 670 Kb          |
> | Wastage reduce | ~1%        | ~8%             |
> +----------------+------------+-----------------+
>
> +----------------+------------+-----------------+
> |           Total slub memory                   |
> +----------------+------------+-----------------+
> |                | After Boot | After Hackbench |
> | Normal         | 105340     | 153116          |
> | With Patch     | 103620     | 147412          |
> | Memory reduce  | ~1.6%      | ~4%             |
> +----------------+------------+-----------------+
>
> hackbench-process-sockets
> +-------+-----+----------+------------+-----------+
> |       |     | Normal   | With Patch | Delta     |
> +-------+-----+----------+------------+-----------+
> | Amean | 1   | 1.0963   | 1.1070     | ( -0.97%) |
> | Amean | 4   | 3.7963   | 3.7957     | (  0.02%) |
> | Amean | 7   | 6.5947   | 6.6017     | ( -0.11%) |
> | Amean | 12  | 11.1993  | 11.1730    | (  0.24%) |
> | Amean | 21  | 19.4097  | 19.3647    | (  0.23%) |
> | Amean | 30  | 27.7023  | 27.6040    | (  0.35%) |
> | Amean | 48  | 44.1287  | 43.9630    | (  0.38%) |
> | Amean | 64  | 58.8147  | 58.5753    | (  0.41%) |
> +-------+-----+----------+------------+-----------+
>
> 4) On 16 CPUs with 64K Page size
>
> +----------------+------------+-----------------+
> |      Total wastage in slub memory             |
> +----------------+------------+-----------------+
> |                | After Boot | After Hackbench |
> | Normal         | 194 Kb     | 349 Kb          |
> | With Patch     | 191 Kb     | 344 Kb          |
> | Wastage reduce | ~1%        | ~1%             |
> +----------------+------------+-----------------+
>
> +----------------+------------+-----------------+
> |           Total slub memory                   |
> +----------------+------------+-----------------+
> |                | After Boot | After Hackbench |
> | Normal         | 330304     | 472960          |
> | With Patch     | 319808     | 458944          |
> | Memory reduce  | ~3%        | ~3%             |
> +----------------+------------+-----------------+
>
> hackbench-process-sockets
> +-------+-----+----------+------------+-----------+
> |       |     | Normal   | With Patch | Delta     |
> +-------+-----+----------+------------+-----------+
> | Amean | 1   | 1.9030   | 1.8967     | (  0.33%) |
> | Amean | 4   | 7.2117   | 7.1283     | (  1.16%) |
> | Amean | 7   | 12.5247  | 12.3460    | (  1.43%) |
> | Amean | 12  | 21.7157  | 21.4753    | (  1.11%) |
> | Amean | 21  | 38.2693  | 37.6670    | (  1.57%) |
> | Amean | 30  | 54.5930  | 53.8657    | (  1.33%) |
> | Amean | 48  | 87.6700  | 86.3690    | (  1.48%) |
> | Amean | 64  | 117.1227 | 115.4893   | (  1.39%) |
> +-------+-----+----------+------------+-----------+
>
> Signed-off-by: Jay Patel
> ---
>  mm/slub.c | 52 +++++++++++++++++++++++++--------------------------
>  1 file changed, 25 insertions(+), 27 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index c87628cd8a9a..0a1090c528da 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4058,7 +4058,7 @@ EXPORT_SYMBOL(kmem_cache_alloc_bulk);
>   */
>  static unsigned int slub_min_order;
>  static unsigned int slub_max_order =
> -	IS_ENABLED(CONFIG_SLUB_TINY) ? 1 : PAGE_ALLOC_COSTLY_ORDER;
> +	IS_ENABLED(CONFIG_SLUB_TINY) ? 1 : 2;
>  static unsigned int slub_min_objects;
>
> @@ -4087,11 +4087,10 @@ static unsigned int slub_min_objects;
>   * the smallest order which will fit the object.
>   */
>  static inline unsigned int calc_slab_order(unsigned int size,
> -		unsigned int min_objects, unsigned int max_order,
> -		unsigned int fract_leftover)
> +		unsigned int min_objects, unsigned int max_order)
>  {
>  	unsigned int min_order = slub_min_order;
> -	unsigned int order;
> +	unsigned int order, min_wastage = size, min_wastage_order = MAX_ORDER+1;
>
>  	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
>  		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> @@ -4104,11 +4103,17 @@ static inline unsigned int calc_slab_order(unsigned int size,
>
>  		rem = slab_size % size;
>
> -		if (rem <= slab_size / fract_leftover)
> -			break;
> +		if (rem < min_wastage) {
> +			min_wastage = rem;
> +			min_wastage_order = order;
> +		}
>  	}
>
> -	return order;
> +	if (min_wastage_order <= slub_max_order)
> +		return min_wastage_order;
> +	else
> +		return order;
> +
>  }
>
>  static inline int calculate_order(unsigned int size)
> @@ -4142,35 +4147,28 @@ static inline int calculate_order(unsigned int size)
>  		nr_cpus = nr_cpu_ids;
>  		min_objects = 4 * (fls(nr_cpus) + 1);
>  	}
> +
> +	if ((min_objects * size) > (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> +		return PAGE_ALLOC_COSTLY_ORDER;
> +
> +	if ((min_objects * size) <= PAGE_SIZE)
> +		return slub_min_order;
> +
>  	max_objects = order_objects(slub_max_order, size);
>  	min_objects = min(min_objects, max_objects);
>
> -	while (min_objects > 1) {
> -		unsigned int fraction;
> -
> -		fraction = 16;
> -		while (fraction >= 4) {
> -			order = calc_slab_order(size, min_objects,
> -					slub_max_order, fraction);
> -			if (order <= slub_max_order)
> -				return order;
> -			fraction /= 2;
> -		}
> +	while (min_objects >= 1) {
> +		order = calc_slab_order(size, min_objects,
> +				slub_max_order);
> +		if (order <= slub_max_order)
> +			return order;
>  		min_objects--;
>  	}
>
> -	/*
> -	 * We were unable to place multiple objects in a slab. Now
> -	 * lets see if we can place a single object there.
> -	 */
> -	order = calc_slab_order(size, 1, slub_max_order, 1);
> -	if (order <= slub_max_order)
> -		return order;
> -
>  	/*
>  	 * Doh this slab cannot be placed using slub_max_order.
>  	 */
> -	order = calc_slab_order(size, 1, MAX_ORDER, 1);
> +	order = calc_slab_order(size, 1, MAX_ORDER);
>  	if (order <= MAX_ORDER)
>  		return order;
>  	return -ENOSYS;
> --
> 2.39.1