Message-ID: <53ede2dd7f7c0d35b05bb43a9c8a1c60e9a25b59.camel@linux.ibm.com>
Subject: Re: [RFC PATCH v4] mm/slub: Optimize slub memory usage
From: Jay Patel
Reply-To: jaypatel@linux.ibm.com
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: linux-mm@kvack.org, cl@linux.com, penberg@kernel.org,
 rientjes@google.com, iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
 vbabka@suse.cz, aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com,
 piyushs@linux.ibm.com
Date: Fri, 18 Aug 2023 12:11:34 +0530
References: <20230720102337.2069722-1-jaypatel@linux.ibm.com>

On Fri, 2023-08-18 at 14:11 +0900, Hyeonggon Yoo wrote:
> On Fri, Aug 11, 2023 at 3:52 PM Jay Patel wrote:
> > On Fri, 2023-08-11 at 02:54 +0900, Hyeonggon Yoo wrote:
> > > On Thu, Jul 20, 2023 at 7:24 PM Jay Patel wrote:
> > > > In the current implementation of the slub memory allocator, the slab
> > > > order selection process follows these criteria:
> > > >
> > > > 1) Determine the minimum order required to serve the minimum number
> > > > of objects (min_objects). This calculation is based on the formula
> > > > (order = min_objects * object_size / PAGE_SIZE).
> > > > 2) If the minimum order is greater than the maximum allowed order
> > > > (slub_max_order), set slub_max_order as the order for this slab.
> > > > 3) If the minimum order is less than the slub_max_order, iterate
> > > > through a loop from minimum order to slub_max_order and check if the
> > > > condition (rem <= slab_size / fract_leftover) holds true. Here,
> > > > slab_size is calculated as (PAGE_SIZE << order), rem is
> > > > (slab_size % object_size), and fract_leftover can have values of
> > > > 16, 8, or 4. If the condition is true, select that order for the
> > > > slab.
> > > >
> > > > However, in point 3, when calculating the fraction left over, it can
> > > > result in a large range of values (like 1 Kb to 256 bytes on 4K page
> > > > size & 4 Kb to 16 Kb on 64K page size with order 0 and goes on
> > > > increasing with higher order) when compared to the remainder (rem).
> > > > This can lead to the selection of an order that results in more
> > > > memory wastage. To mitigate such wastage, we have modified point 3
> > > > as follows: adjust the value of fract_leftover based on the page
> > > > size, while retaining the current value as the default for a 4K page
> > > > size.
> > > >
> > > > Test results are as follows:
> > > >
> > > > 1) On 160 CPUs with 64K Page size
> > > >
> > > > +-----------------+----------------+----------------+
> > > > |           Total wastage in slub memory            |
> > > > +-----------------+----------------+----------------+
> > > > |                 | After Boot     | After Hackbench|
> > > > | Normal          | 932 Kb         | 1812 Kb        |
> > > > | With Patch      | 729 Kb         | 1636 Kb        |
> > > > | Wastage reduce  | ~22%           | ~10%           |
> > > > +-----------------+----------------+----------------+
> > > >
> > > > +-----------------+----------------+----------------+
> > > > |                 Total slub memory                 |
> > > > +-----------------+----------------+----------------+
> > > > |                 | After Boot     | After Hackbench|
> > > > | Normal          | 1855296        | 2944576        |
> > > > | With Patch      | 1544576        | 2692032        |
> > > > | Memory reduce   | ~17%           | ~9%            |
> > > > +-----------------+----------------+----------------+
> > > >
> > > > hackbench-process-sockets
> > > > +-------+-----+----------+----------+-----------+
> > > > | Amean | 1   | 1.2727   | 1.2450   | ( 2.22%)  |
> > > > | Amean | 4   | 1.6063   | 1.5810   | ( 1.60%)  |
> > > > | Amean | 7   | 2.4190   | 2.3983   | ( 0.86%)  |
> > > > | Amean | 12  | 3.9730   | 3.9347   | ( 0.97%)  |
> > > > | Amean | 21  | 6.9823   | 6.8957   | ( 1.26%)  |
> > > > | Amean | 30  | 10.1867  | 10.0600  | ( 1.26%)  |
> > > > | Amean | 48  | 16.7490  | 16.4853  | ( 1.60%)  |
> > > > | Amean | 79  | 28.1870  | 27.8673  | ( 1.15%)  |
> > > > | Amean | 110 | 39.8363  | 39.3793  | ( 1.16%)  |
> > > > | Amean | 141 | 51.5277  | 51.4907  | ( 0.07%)  |
> > > > | Amean | 172 | 62.9700  | 62.7300  | ( 0.38%)  |
> > > > | Amean | 203 | 74.5037  | 74.0630  | ( 0.59%)  |
> > > > | Amean | 234 | 85.6560  | 85.3587  | ( 0.35%)  |
> > > > | Amean | 265 | 96.9883  | 96.3770  | ( 0.63%)  |
> > > > | Amean | 296 | 108.6893 | 108.0870 | ( 0.56%)  |
> > > > +-------+-----+----------+----------+-----------+
> > > >
> > > > 2) On 16 CPUs with 64K Page size
> > > >
> > > > +----------------+----------------+----------------+
> > > > |          Total wastage in slub memory            |
> > > > +----------------+----------------+----------------+
> > > > |                | After Boot     | After Hackbench|
> > > > | Normal         | 273 Kb         | 544 Kb         |
> > > > | With Patch     | 260 Kb         | 500 Kb         |
> > > > | Wastage reduce | ~5%            | ~9%            |
> > > > +----------------+----------------+----------------+
> > > >
> > > > +-----------------+----------------+----------------+
> > > > |                 Total slub memory                 |
> > > > +-----------------+----------------+----------------+
> > > > |                 | After Boot     | After Hackbench|
> > > > | Normal          | 275840         | 412480         |
> > > > | With Patch      | 272768         | 406208         |
> > > > | Memory reduce   | ~1%            | ~2%            |
> > > > +-----------------+----------------+----------------+
> > > >
> > > > hackbench-process-sockets
> > > > +-------+----+---------+---------+-----------+
> > > > | Amean | 1  | 0.9513  | 0.9250  | ( 2.77%)  |
> > > > | Amean | 4  | 2.9630  | 2.9570  | ( 0.20%)  |
> > > > | Amean | 7  | 5.1780  | 5.1763  | ( 0.03%)  |
> > > > | Amean | 12 | 8.8833  | 8.8817  | ( 0.02%)  |
> > > > | Amean | 21 | 15.7577 | 15.6883 | ( 0.44%)  |
> > > > | Amean | 30 | 22.2063 | 22.2843 | ( -0.35%) |
> > > > | Amean | 48 | 36.0587 | 36.1390 | ( -0.22%) |
> > > > | Amean | 64 | 49.7803 | 49.3457 | ( 0.87%)  |
> > > > +-------+----+---------+---------+-----------+
> > > >
> > > > Signed-off-by: Jay Patel
> > > > ---
> > > > Changes from V3
> > > > 1) Resolved error and optimise logic for all arch
> > > >
> > > > Changes from V2
> > > > 1) removed all page order selection logic for slab cache base on
> > > > wastage.
> > > > 2) Increasing fraction size base on page size (keeping current value
> > > > as default to 4K page)
> > > >
> > > > Changes from V1
> > > > 1) If min_objects * object_size > PAGE_ALLOC_COSTLY_ORDER, then it
> > > > will return with PAGE_ALLOC_COSTLY_ORDER.
> > > > 2) Similarly, if min_objects * object_size < PAGE_SIZE, then it will
> > > > return with slub_min_order.
> > > > 3) Additionally, I changed slub_max_order to 2. There is no specific
> > > > reason for using the value 2, but it provided the best results in
> > > > terms of performance without any noticeable impact.
> > > >
> > > >  mm/slub.c | 17 +++++++----------
> > > >  1 file changed, 7 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/mm/slub.c b/mm/slub.c
> > > > index c87628cd8a9a..8f6f38083b94 100644
> > > > --- a/mm/slub.c
> > > > +++ b/mm/slub.c
> > > > @@ -287,6 +287,7 @@ static inline bool kmem_cache_has_cpu_partial(struct kmem_cache *s)
> > > >  #define OO_SHIFT	16
> > > >  #define OO_MASK		((1 << OO_SHIFT) - 1)
> > > >  #define MAX_OBJS_PER_PAGE	32767 /* since slab.objects is u15 */
> > > > +#define SLUB_PAGE_FRAC_SHIFT 12
> > > >
> > > >  /* Internal SLUB flags */
> > > >  /* Poison object */
> > > > @@ -4117,6 +4118,7 @@ static inline int calculate_order(unsigned int size)
> > > >  	unsigned int min_objects;
> > > >  	unsigned int max_objects;
> > > >  	unsigned int nr_cpus;
> > > > +	unsigned int page_size_frac;
> > > >
> > > >  	/*
> > > >  	 * Attempt to find best configuration for a slab. This
> > > > @@ -4145,10 +4147,13 @@ static inline int calculate_order(unsigned int size)
> > > >  	max_objects = order_objects(slub_max_order, size);
> > > >  	min_objects = min(min_objects, max_objects);
> > > >
> > > > -	while (min_objects > 1) {
> > > > +	page_size_frac = ((PAGE_SIZE >> SLUB_PAGE_FRAC_SHIFT) == 1) ? 0
> > > > +		: PAGE_SIZE >> SLUB_PAGE_FRAC_SHIFT;
> > > > +
> > > > +	while (min_objects >= 1) {
> > > >  		unsigned int fraction;
> > > >
> > > > -		fraction = 16;
> > > > +		fraction = 16 + page_size_frac;
> > > >  		while (fraction >= 4) {
> > >
> > > Sorry I'm a bit late for the review.
> > >
> > > IIRC hexagon/powerpc can have ridiculously large page sizes (1M or
> > > 256KB) (but I don't know if such config is actually used, tbh) so I
> > > think there should be an upper bound.
> >
> > Hi,
> > I think that might not be required, as an arch with a larger page size
> > will require a larger fraction value as per this exit condition
> > (rem <= slab_size / fract_leftover) during calc_slab_order.
>
> Okay, with 256KB pages the fraction will start from 80, and then 40,
> 20, 10, 5, ...
> and 1/80 of 256KB is about 3KB. So it's to waste less even when the
> machine uses large page sizes,
> because 1/16 of 256KB is still large, right?

Yes, correct. With this approach we reduce both the wasted memory and the
total memory used by slub on systems with larger page sizes :)

> > > >  			order = calc_slab_order(size, min_objects,
> > > >  					slub_max_order, fraction);
> > > > @@ -4159,14 +4164,6 @@ static inline int calculate_order(unsigned int size)
> > > >  		min_objects--;
> > > >  	}
> > > >
> > > > -	/*
> > > > -	 * We were unable to place multiple objects in a slab. Now
> > > > -	 * lets see if we can place a single object there.
> > > > -	 */
> > > > -	order = calc_slab_order(size, 1, slub_max_order, 1);
> > > > -	if (order <= slub_max_order)
> > > > -		return order;
> > >
> > > I'm not sure if it's okay to remove this?
> > > It was fine in v2 because the least wasteful order was chosen
> > > regardless of fraction but that's not true anymore.
> > >
> > OK, so my thought is: if a single object in a slab with slab_size =
> > PAGE_SIZE << slub_max_order wastes more than 1/4 of slab_size, then
> > it's better to skip this part and use MAX_ORDER instead of
> > slub_max_order.
> > Could you kindly share your perspective on this part?
>
> I simply missed that part! :)
> That looks fine to me.
>
> > Thanks
> > Jay Patel
> > > Otherwise, everything looks fine to me. I'm too dumb to anticipate
> > > the outcome of increasing the slab order :P but this patch does not
> > > sound crazy to me.
> > >
> > > Thanks!
> > > --
> > > Hyeonggon