From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <2e257eb4b3cc76f78619f5b8c9f95462421762d4.camel@linux.ibm.com>
Subject: Re: [RFC PATCH v4] mm/slub: Optimize slub memory usage
From: Jay Patel
Reply-To: jaypatel@linux.ibm.com
To: Vlastimil Babka, Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: linux-mm@kvack.org, cl@linux.com, penberg@kernel.org,
 rientjes@google.com, iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
 aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com, piyushs@linux.ibm.com
Date: Thu, 14 Sep 2023 11:10:10 +0530
In-Reply-To:
References: <20230720102337.2069722-1-jaypatel@linux.ibm.com>
 <7fdf3f5dfd9fa1b5e210cc4176cac58a9992ecc0.camel@linux.ibm.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0

On Thu, 2023-09-07 at 15:42 +0200, Vlastimil Babka wrote:
> On 8/24/23 12:52, Jay Patel wrote:
> > On Fri, 2023-08-11 at 17:43 +0200, Vlastimil Babka wrote:
> > > On 8/10/23 19:54, Hyeonggon Yoo wrote:
> > > > >  		order = calc_slab_order(size, min_objects,
> > > > >  				slub_max_order, fraction);
> > > > > @@ -4159,14 +4164,6 @@ static inline int calculate_order(unsigned int size)
> > > > >  		min_objects--;
> > > > >  	}
> > > > > -	/*
> > > > > -	 * We were unable to place multiple objects in a slab. Now
> > > > > -	 * lets see if we can place a single object there.
> > > > > -	 */
> > > > > -	order = calc_slab_order(size, 1, slub_max_order, 1);
> > > > > -	if (order <= slub_max_order)
> > > > > -		return order;
> > > >
> > > > I'm not sure if it's okay to remove this?
> > > > It was fine in v2 because the least wasteful order was chosen
> > > > regardless of fraction but that's not true anymore.
> > > >
> > > > Otherwise, everything looks fine to me. I'm too dumb to anticipate
> > > > the outcome of increasing the slab order :P but this patch does
> > > > not sound crazy to me.
> > >
> > > I wanted to have a better idea how the orders change so I hacked up
> > > a patch to print them for all sizes up to 1MB (unnecessarily large
> > > I guess) and also for various page sizes and nr_cpus (that's however
> > > rather invasive and prone to me missing some helper being used that
> > > still relies on real PAGE_SHIFT), then I applied v4 (needed some
> > > conflict fixups with my hack) on top:
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-orders
> > >
> > > As expected, things didn't change with 4k PAGE_SIZE. With 64k
> > > PAGE_SIZE, I thought the patch in v4 form would result in lower
> > > orders, but seems not always?
> > >
> > > I.e. I can see before the patch:
> > >
> > > Calculated slab orders for page_shift 16 nr_cpus 1:
> > >        8  0
> > >     4376  1
> > >
> > > (so until 4368 bytes it keeps order at 0)
> > >
> > > And after:
> > >        8  0
> > >     2264  1
> > >     2272  0
> > >     2344  1
> > >     2352  0
> > >     2432  1
> > >
> > > Not sure this kind of "oscillation" is helpful with a small machine
> > > (1 CPU) and 64kB pages, where the unused part of the page is quite
> > > small.
> > >
> > Hi Vlastimil,
> >
> > With the patch, the fraction_size rises to 32 when utilizing a 64k
> > page size. As a result, the maximum wastage cap for each slab cache
> > will be 2k (64k divided by 32). Any object size exceeding this cap
> > will be moved to order 1 or beyond, which is why this oscillation is
> > seen.
>
> Hi, sorry for the late reply.
>
> > > With 16 cpus, AFAICS the orders are also larger for some sizes.
> > > Hm but you reported reduction of total slab memory which suggests
> > > lower orders were selected somewhere, so maybe I made some mistake.
> >
> > AFAIK total slab memory is reduced because of two reasons (with this
> > patch, for larger page sizes):
> > 1) order for some slab caches is reduced (by increasing fraction_size)
>
> How can increased fraction_size ever result in a lower order? I think
> it can only result in an increased order (or the same order). And the
> simulations with my hack patch don't seem to offer a counterexample to
> that. Note previously I did expect the order to be lower (or same) and
> was surprised by my results, but now I realized I misunderstood the v4
> patch.

Hi, sorry for the late reply as I was on vacation :)

You're absolutely right. Increasing the fraction size won't reduce the
order, and I apologize for any confusion in my previous response.
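To make the oscillation above concrete, here is a minimal userspace
sketch of the waste-fraction check (simplified from calc_slab_order();
pick_order() and its parameters are illustrative names, not the kernel
function itself). Because raising fract_leftover only shrinks the
acceptable leftover, the first order that passes can only stay the same
or grow:

#include <stdio.h>

/* Illustrative model of the order search, not the kernel code. */
static unsigned int pick_order(unsigned int size, unsigned int page_size,
			       unsigned int max_order,
			       unsigned int fract_leftover)
{
	unsigned int order;

	for (order = 0; order < max_order; order++) {
		unsigned int slab_size = page_size << order;

		/* accept the first order whose unused tail is small enough */
		if (slab_size % size <= slab_size / fract_leftover)
			break;
	}
	return order;
}

int main(void)
{
	/* 64KiB pages, waste capped at 1/32 of the slab (2048 bytes) */
	printf("2344 -> order %u\n", pick_order(2344, 65536, 4, 32));
	printf("2352 -> order %u\n", pick_order(2352, 65536, 4, 32));
	return 0;
}

With 64k pages and fract_leftover = 32, size 2344 leaves 2248 unused
bytes at order 0 (over the 2048-byte cap, so it goes to order 1), while
size 2352 leaves only 2032 bytes (accepted at order 0), which is
exactly the oscillation in the table above.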
> > 2) Have also seen a reduction in overall slab cache numbers because
> > of the increased page order
>
> I think your results might be just due to randomness and could turn
> out different with repeating the test, or converge to be the same if
> you average multiple runs. You posted them for "160 CPUs with 64K Page
> size" and if I add that combination to my hack print, I see the same
> result before and after your patch:
>
> Calculated slab orders for page_shift 16 nr_cpus 160:
>        8  0
>     1824  1
>     3648  2
>     7288  3
>   174768  2
>   196608  3
>   524296  4
>
> Still, I might have a bug there. Can you confirm there are actual
> differences in /proc/slabinfo before/after your patch? If there are
> none, any differences observed have to be due to randomness, not
> differences in order.

Indeed, to eliminate randomness, I've consistently gathered data from
/proc/slabinfo, and I can confirm a decrease in the total number of
slabs. Values as on a 160 CPU system with 64k page size:

Without patch: 24892 slabs
With patch:    23891 slabs
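As a side note on why the nr_cpus 160 table is identical before and
after the patch: the order search does not start at order 0 but at the
order needed to hold a CPU-scaled minimum object count, so on a large
machine that floor decides most sizes before the fraction is ever
consulted. A sketch of that floor, based on the min_objects heuristic
in calculate_order() (start_order() and fls_u() are illustrative
helpers, not kernel code):

/* Highest set bit, 1-based; stand-in for the kernel's fls(). */
static unsigned int fls_u(unsigned int x)
{
	unsigned int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/* Smallest order whose slab can hold the CPU-scaled minimum objects. */
static unsigned int start_order(unsigned int size, unsigned int nr_cpus,
				unsigned int page_shift)
{
	unsigned int min_objects = 4 * (fls_u(nr_cpus) + 1); /* 36 for 160 CPUs */
	unsigned long needed = (unsigned long)min_objects * size;
	unsigned int order = 0;

	while ((1UL << (page_shift + order)) < needed)
		order++;
	return order;
}

For example, start_order(1824, 160, 16) is 1, matching the "1824 1" row
above: 36 objects of 1824 bytes need 65664 bytes, just over one 64k
page, while sizes up to 1816 still fit 36 objects at order 0.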
> Going back to the idea behind your patch, I don't think it makes sense
> to try increasing the fraction only for higher orders. Yes, with 1/16
> fraction, the waste with a 64kB page can be 4kB, while with 1/32 it
> will be just 2kB, and with 4kB pages this is only 256 vs 128 bytes.
> However the object sizes and counts don't differ with page size, so
> with 4kB pages we'll have more slabs to host the same number of
> objects, and the waste will accumulate accordingly - i.e. the fraction
> metric should be independent of page size wrt the resulting total
> kilobytes of waste.
>
> So maybe the only thing we need to do is to try setting it to 32 as
> the initial value instead of 16, regardless of page size. That should
> hopefully again show a good tradeoff for 4kB as one of the earlier
> versions did, while on 64kB it shouldn't cause much difference (again,
> none at all with 160 cpus, some difference with less than 128 cpus, if
> my simulations were correct).
>

Yes, we can modify the default fraction size to 32 for all page sizes.
I've noticed that on a 160 CPU system with a 64K page size, the total
memory allocated for slabs decreases noticeably.

Alright, I'll make the necessary changes to the patch, setting the
fraction size default to 32, and I'll post v5 along with some
performance metrics.

> > > Anyway my point here is that this evaluation approach might be
> > > useful, even if it's a non-upstreamable hack, and some
> > > postprocessing of the output is needed for easier comparison of
> > > before/after, so feel free to try that out.
> >
> > Thank you for this detailed test :)
>
> > > BTW I'll be away for 2 weeks from now, so further feedback will
> > > have to come from others in that time...
> > >
> > Do we have any additional feedback from others on the same matter?
> >
> > Thanks,
> > Jay Patel
> > > > Thanks!
> > > > --
> > > > Hyeonggon
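P.S. For anyone skimming the thread: the v5 change discussed above
should boil down to bumping the initial fraction in calculate_order()'s
search loop, roughly as below (a sketch against my reading of the
current mainline loop; the actual v5 patch may differ):

 	while (min_objects > 1) {
 		unsigned int fraction;

-		fraction = 16;
+		fraction = 32;
 		while (fraction >= 4) {
 			order = calc_slab_order(size, min_objects,
 					slub_max_order, fraction);
 			if (order <= slub_max_order)
 				return order;
 			fraction /= 2;
 		}
 		min_objects--;
 	}

Tightening the per-slab waste cap from 1/16 to 1/32 of the slab size
pushes badly fitting object sizes to a higher order where they pack
better, which is where the total slab memory savings reported above
come from.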