Date: Thu, 3 Nov 2022 18:03:24 +0000
From: Catalin Marinas
To: Vlastimil Babka
Cc: Greg Kroah-Hartman, Linus Torvalds, Arnd Bergmann, Will Deacon,
	Marc Zyngier, Andrew Morton, Herbert Xu, Ard Biesheuvel,
	Christoph Hellwig, Isaac Manjarres, Saravana Kannan,
	linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, Feng Tang
Subject: Re: [PATCH v2 2/2] treewide: Add the __GFP_PACKED flag to several non-DMA kmalloc() allocations
References: <20221025205247.3264568-1-catalin.marinas@arm.com>
	<20221025205247.3264568-3-catalin.marinas@arm.com>
	<4c7e3762-ebc3-c2cc-7ea5-71d4ea97e327@suse.cz>
In-Reply-To: <4c7e3762-ebc3-c2cc-7ea5-71d4ea97e327@suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Thu, Nov 03, 2022 at 05:15:51PM +0100, Vlastimil Babka wrote:
> On 10/26/22 11:48, Catalin Marinas wrote:
> >> > diff --git a/lib/kobject.c b/lib/kobject.c
> >> > index a0b2dbfcfa23..2c4acb36925d 100644
> >> > --- a/lib/kobject.c
> >> > +++ b/lib/kobject.c
> >> > @@ -144,7 +144,7 @@ char *kobject_get_path(struct kobject *kobj, gfp_t gfp_mask)
> >> >  	len = get_kobj_path_length(kobj);
> >> >  	if (len == 0)
> >> >  		return NULL;
> >> > -	path = kzalloc(len, gfp_mask);
> >> > +	path = kzalloc(len, gfp_mask | __GFP_PACKED);
> >>
> >> This might not be small, and it's going to be very very short-lived
> >> (within a single function call), why does it need to be allocated this
> >> way?
> >
> > Regarding short-lived objects, you are right, they won't affect
> > slabinfo. My ftrace-fu is not great, I only looked at the allocation
> > hits and they keep adding up without counting how many are
> > freed. So maybe we need tracing free() as well but not always easy to
> > match against the allocation point and infer how many live objects there
> > are.
>
> BTW, since 6.1-rc1 we have a new way with slub_debug to determine how much
> memory is wasted, thanks to commit 6edf2576a6cc ("mm/slub: enable debugging
> memory wasting of kmalloc") by Feng Tang.
>
> You need to boot the kernel with parameter such as:
> slub_debug=U,kmalloc-64,kmalloc-128,kmalloc-192,kmalloc-256
> (or just slub_debug=U,kmalloc-* for all sizes, but I guess you are
> interested mainly in those that are affected by DMA alignment)
> Note it does have some alloc/free CPU overhead and memory overhead, so not
> intended for normal production.
>
> Then you can check e.g.
> cat /sys/kernel/debug/slab/kmalloc-128/alloc_traces | head -n 50
>      77 set_kthread_struct+0x60/0x100 waste=1232/16 age=19492/31067/32465 pid=2 cpus=0-3
>         __kmem_cache_alloc_node+0x102/0x340
>         kmalloc_trace+0x26/0xa0
>         set_kthread_struct+0x60/0x100
>         copy_process+0x1903/0x2ee0
>         kernel_clone+0xf4/0x4f0
>         kernel_thread+0xae/0xe0
>         kthreadd+0x491/0x500
>         ret_from_fork+0x22/0x30
>
> which tells you there are currently 77 live allocations with this exact
> stack trace. The new information in 6.1 is the "waste=1232/16" which
> means these allocations waste 16 bytes each due to rounding up to the
> kmalloc cache size, or 1232 bytes in total (16*77). This should help
> finding the prominent sources of waste.

Thanks. That's a lot more useful than ftrace for this scenario. At a
quick test in a VM, the above reports about 1200 cases but there are
only around 100 unique allocation places (e.g. kstrdup called from
several places with different sizes). So not too bad if we are to go
with a GFP_ flag.

-- 
Catalin
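
As an illustration of the "waste=1232/16" arithmetic described above, here is a minimal Python sketch that parses the first line of one alloc_traces record and checks that the total equals the live-object count times the per-object waste. The helper name parse_header and the regular expression are assumptions made for this example; they are based only on the single sample line quoted in this thread, not on a documented file format.

```python
import re

# First line of one alloc_traces record, copied from the example in this thread.
SAMPLE = ("77 set_kthread_struct+0x60/0x100 waste=1232/16 "
          "age=19492/31067/32465 pid=2 cpus=0-3")

# Assumed layout: <live count> <allocation site> waste=<total>/<per object> ...
HEADER_RE = re.compile(
    r"^\s*(?P<count>\d+)\s+(?P<site>\S+)\s+"
    r"waste=(?P<total>\d+)/(?P<per_object>\d+)"
)


def parse_header(line):
    """Return the waste fields of a record header, or None for other lines
    (for example the stack frames that follow each header)."""
    m = HEADER_RE.match(line)
    if m is None:
        return None
    return {
        "count": int(m["count"]),
        "site": m["site"],
        "total_waste": int(m["total"]),
        "waste_per_object": int(m["per_object"]),
    }


record = parse_header(SAMPLE)
# 77 live objects, each wasting 16 bytes, gives 1232 bytes in total.
assert record["count"] * record["waste_per_object"] == record["total_waste"]
print(record["site"], record["total_waste"])
```

Sorting such records by total_waste would be one way to surface the most prominent sources of waste across the unique allocation sites.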