From: Vlastimil Babka
Date: Sun, 30 Oct 2022 22:30:24 +0100
Subject: Re: [PATCH v6 1/4] mm/slub: enable debugging memory wasting of kmalloc
To: John Thomson, Feng Tang, Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Dmitry Vyukov, Jonathan Corbet, Andrey Konovalov
Cc: Dave Hansen, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com, Robin Murphy, John Garry, Kefeng Wang
References: <20220913065423.520159-1-feng.tang@intel.com> <20220913065423.520159-2-feng.tang@intel.com>

On 10/30/22 20:23, John Thomson wrote:
> On Tue, 13 Sep 2022, at 06:54, Feng Tang wrote:
>> kmalloc's API family is critical for mm, with one nature that it will
>> round up the request size to a fixed one (mostly power of 2). Say
>> when user requests memory for '2^n + 1' bytes, actually 2^(n+1) bytes
>> could be allocated, so in worst case, there is around 50% memory
>> space waste.
>
>
> I have a ralink mt7621 router running Openwrt, using the mips ZBOOT kernel, and appear to have bisected
> a very-nearly-clean kernel v6.1rc-2 boot issue to this commit.
> I have 3 commits atop 6.1-rc2: fix a ZBOOT compile error, use the Openwrt LZMA options,
> and enable DEBUG_ZBOOT for my platform. I am compiling my kernel within the Openwrt build system.
> No guarantees this is not due to something I am doing wrong, but any insight would be greatly appreciated.
>
>
> On UART, No indication of the (once extracted) kernel booting:
>
> transfer started ......................................... transfer ok, time=2.01s
> setting up elf image... OK
> jumping to kernel code
> zimage at: 80BA4100 810D4720
> Uncompressing Linux at load address 80001000
> Copy device tree to address 80B96EE0
> Now, booting the kernel...

It's weird that the commit would cause no output so early, SLUB code is run
only later.

> Nothing follows
>
> 6edf2576a6cc ("mm/slub: enable debugging memory wasting of kmalloc") reverted, normal boot:
> transfer started ......................................... transfer ok, time=2.01s
> setting up elf image... OK
> jumping to kernel code
> zimage at: 80BA4100 810D47A4
> Uncompressing Linux at load address 80001000
> Copy device tree to address 80B96EE0
> Now, booting the kernel...
>
> [    0.000000] Linux version 6.1.0-rc2 (john@john) (mipsel-openwrt-linux-musl-gcc (OpenWrt GCC 11.3.0 r19724+16-1521d5f453) 11.3.0, GNU ld (GNU Binutils) 2.37) #0 SMP Fri Oct 28 03:48:10 2022
> [    0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
> [    0.000000] printk: bootconsole [early0] enabled
> [    0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc)
> [    0.000000] MIPS: machine is MikroTik RouterBOARD 760iGS
> [    0.000000] Initrd not found or empty - disabling initrd
> [    0.000000] VPE topology {2,2} total 4
> [    0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
> [    0.000000] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
> [    0.000000] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
> [    0.000000] Zone ranges:
> [    0.000000]   Normal [mem 0x0000000000000000-0x000000000fffffff]
> [    0.000000]   HighMem empty
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node 0: [mem 0x0000000000000000-0x000000000fffffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000000fffffff]
> [    0.000000] percpu: Embedded 11 pages/cpu s16064 r8192 d20800 u45056
> [    0.000000] Built 1 zonelists, mobility grouping on. Total pages: 64960
> [    0.000000] Kernel command line: console=ttyS0,115200 rootfstype=squashfs,jffs2
> [    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
> [    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
> [    0.000000] Writing ErrCtl register=00019146
> [    0.000000] Readback ErrCtl register=00019146
> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> [    0.000000] Memory: 246220K/262144K available (7455K kernel code, 628K rwdata, 1308K rodata, 3524K init, 245K bss, 15924K reserved, 0K cma-reserved, 0K highmem)
> [    0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
> [    0.000000] rcu: Hierarchical RCU implementation.
>
>
> boot continues as expected
>
>
> possibly relevant config options:
> grep -E '(SLUB|SLAB)' .config
> # SLAB allocator options
> # CONFIG_SLAB is not set
> CONFIG_SLUB=y
> CONFIG_SLAB_MERGE_DEFAULT=y
> # CONFIG_SLAB_FREELIST_RANDOM is not set
> # CONFIG_SLAB_FREELIST_HARDENED is not set
> # CONFIG_SLUB_STATS is not set
> CONFIG_SLUB_CPU_PARTIAL=y
> # end of SLAB allocator options
> # CONFIG_SLUB_DEBUG is not set

Also not having CONFIG_SLUB_DEBUG enabled means most of the code the
patch/commit touches is not even active.
Could this be some miscompile or code layout change exposing some different
bug, hmm.
Is it any different if you do enable CONFIG_SLUB_DEBUG ?
Or change to CONFIG_SLAB? (that would be really weird if not)

>
> With this commit reverted: cpuinfo and meminfo
>
> system type : MediaTek MT7621 ver:1 eco:3
> machine : MikroTik RouterBOARD 760iGS
> processor : 0
> cpu model : MIPS 1004Kc V2.15
> BogoMIPS : 586.13
> wait instruction : yes
> microsecond timers : yes
> tlb_entries : 32
> extra interrupt vector : yes
> hardware watchpoint : yes, count: 4, address/irw mask: [0x0ffc, 0x0ffc, 0x0ffb, 0x0ffb]
> isa : mips1 mips2 mips32r1 mips32r2
> ASEs implemented : mips16 dsp mt
> Options implemented : tlb 4kex 4k_cache prefetch mcheck ejtag llsc pindexed_dcache userlocal vint perf_cntr_intr_bit cdmm perf
> shadow register sets : 1
> kscratch registers : 0
> package : 0
> core : 0
> VPE : 0
> VCED exceptions : not available
> VCEI exceptions : not available
>
> processor : 1
> cpu model : MIPS 1004Kc V2.15
> BogoMIPS : 586.13
> wait instruction : yes
> microsecond timers : yes
> tlb_entries : 32
> extra interrupt vector : yes
> hardware watchpoint : yes, count: 4, address/irw mask: [0x0ffc, 0x0ffc, 0x0ffb, 0x0ffb]
> isa : mips1 mips2 mips32r1 mips32r2
> ASEs implemented : mips16 dsp mt
> Options implemented : tlb 4kex 4k_cache prefetch mcheck ejtag llsc pindexed_dcache userlocal vint perf_cntr_intr_bit cdmm perf
> shadow register sets : 1
> kscratch registers : 0
> package : 0
> core : 0
> VPE : 1
> VCED exceptions : not available
> VCEI exceptions : not available
>
> processor : 2
> cpu model : MIPS 1004Kc V2.15
> BogoMIPS : 586.13
> wait instruction : yes
> microsecond timers : yes
> tlb_entries : 32
> extra interrupt vector : yes
> hardware watchpoint : yes, count: 4, address/irw mask: [0x0ffc, 0x0ffc, 0x0ffb, 0x0ffb]
> isa : mips1 mips2 mips32r1 mips32r2
> ASEs implemented : mips16 dsp mt
> Options implemented : tlb 4kex 4k_cache prefetch mcheck ejtag llsc pindexed_dcache userlocal vint perf_cntr_intr_bit cdmm perf
> shadow register sets : 1
> kscratch registers : 0
> package : 0
> core : 1
> VPE : 0
> VCED exceptions : not available
> VCEI exceptions : not available
>
> processor : 3
> cpu model : MIPS 1004Kc V2.15
> BogoMIPS : 586.13
> wait instruction : yes
> microsecond timers : yes
> tlb_entries : 32
> extra interrupt vector : yes
> hardware watchpoint : yes, count: 4, address/irw mask: [0x0ffc, 0x0ffc, 0x0ffb, 0x0ffb]
> isa : mips1 mips2 mips32r1 mips32r2
> ASEs implemented : mips16 dsp mt
> Options implemented : tlb 4kex 4k_cache prefetch mcheck ejtag llsc pindexed_dcache userlocal vint perf_cntr_intr_bit cdmm perf
> shadow register sets : 1
> kscratch registers : 0
> package : 0
> core : 1
> VPE : 1
> VCED exceptions : not available
> VCEI exceptions : not available
>
> MemTotal: 249744 kB
> MemFree: 211088 kB
> MemAvailable: 187364 kB
> Buffers: 0 kB
> Cached: 8824 kB
> SwapCached: 0 kB
> Active: 1104 kB
> Inactive: 8860 kB
> Active(anon): 1104 kB
> Inactive(anon): 8860 kB
> Active(file): 0 kB
> Inactive(file): 0 kB
> Unevictable: 0 kB
> Mlocked: 0 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 249744 kB
> LowFree: 211088 kB
> SwapTotal: 0 kB
> SwapFree: 0 kB
> Dirty: 0 kB
> Writeback: 0 kB
> AnonPages: 1192 kB
> Mapped: 2092 kB
> Shmem: 8824 kB
> KReclaimable: 1704 kB
> Slab: 9372 kB
> SReclaimable: 1704 kB
> SUnreclaim: 7668 kB
> KernelStack: 592 kB
> PageTables: 264 kB
> SecPageTables: 0 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> CommitLimit: 124872 kB
> Committed_AS: 14676 kB
> VmallocTotal: 1040376 kB
> VmallocUsed: 2652 kB
> VmallocChunk: 0 kB
> Percpu: 272 kB
>
>
> Cheers,
>
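
For reference, the rounding described in the quoted commit message (a request
of 2^n + 1 bytes being served from a 2^(n+1)-byte object) can be sketched in a
few lines. This is only a simplified model, not the kernel's code: it assumes
every kmalloc cache is a pure power of two, and the helper names
kmalloc_bucket_size() and wasted_fraction() are made up for illustration.

```python
def kmalloc_bucket_size(request: int) -> int:
    """Return the smallest power of two >= request.

    Simplified model (assumption): the real allocator only "mostly" uses
    power-of-two sizes and also enforces a minimum object size, both of
    which are ignored here.
    """
    if request <= 0:
        raise ValueError("request must be positive")
    # (request - 1).bit_length() is the exponent of the next power of two.
    return 1 << (request - 1).bit_length()


def wasted_fraction(request: int) -> float:
    """Fraction of the allocated object that the caller did not ask for."""
    size = kmalloc_bucket_size(request)
    return (size - request) / size


if __name__ == "__main__":
    for req in (1024, 1025, 1536, 2048):
        size = kmalloc_bucket_size(req)
        print(f"request={req:5d} allocated={size:5d} "
              f"wasted={wasted_fraction(req):.1%}")
```

Under this model a 1025-byte request lands in a 2048-byte object and wastes
just under half of it, which is the "around 50%" worst case the commit message
refers to.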