Date: Tue, 12 May 2026 15:02:43 -0700
To:
 mm-commits@vger.kernel.org,tj@kernel.org,dennis@kernel.org,zenghongling@kylinos.cn,akpm@linux-foundation.org
From: Andrew Morton
Subject: + mm-percpu-internalh-optimise-pcpu_chunk-struct-to-save-memory.patch added to mm-new branch
Message-Id: <20260512220243.C51CFC2BCB0@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: mm-commits@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:


The patch titled
     Subject: mm/percpu-internal.h: optimise pcpu_chunk struct to save memory
has been added to the -mm mm-new branch.  Its filename is
     mm-percpu-internalh-optimise-pcpu_chunk-struct-to-save-memory.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-percpu-internalh-optimise-pcpu_chunk-struct-to-save-memory.patch

This patch will later appear in the mm-new branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews.  Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.

The mm-new branch of mm.git is not included in linux-next.

If a few days of testing in mm-new is successful, the patch will be moved
into mm.git's mm-unstable branch, which is included in linux-next.

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days

------------------------------------------------------
From: zenghongling
Subject: mm/percpu-internal.h: optimise pcpu_chunk struct to save memory
Date: Mon, 11 May 2026 15:03:09 +0800

Using pahole, we can see that there are some padding holes in the current
pcpu_chunk structure.  Adjusting the layout of pcpu_chunk can reduce these
holes, decreasing its size from 192 bytes to 128 bytes and eliminating a
wasted cache line.

With allmodconfig (CONFIG_PERCPU_STATS + NEED_PCPUOBJ_EXT)
Before:
/* size: 256, cachelines: 4, members: 19 */
After:
/* size: 192, cachelines: 3, members: 19 */

with NEED_PCPUOBJ_EXT

Before:
struct pcpu_chunk {
	struct list_head           list;                 /*     0    16 */
	int                        free_bytes;           /*    16     4 */
	struct pcpu_block_md       chunk_md;             /*    20    32 */

	/* XXX 4 bytes hole, try to pack */

	long unsigned int *        bound_map;            /*    56     8 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	void *                     base_addr __attribute__((__aligned__(64))); /*    64     8 */
	long unsigned int *        alloc_map;            /*    72     8 */
	struct pcpu_block_md *     md_blocks;            /*    80     8 */
	void *                     data;                 /*    88     8 */
	bool                       immutable;            /*    96     1 */
	bool                       isolated;             /*    97     1 */

	/* XXX 2 bytes hole, try to pack */

	int                        start_offset;         /*   100     4 */
	int                        end_offset;           /*   104     4 */

	/* XXX 4 bytes hole, try to pack */

	struct obj_cgroup * *      obj_cgroups;          /*   112     8 */
	int                        nr_pages;             /*   120     4 */
	int                        nr_populated;         /*   124     4 */
	/* --- cacheline 2 boundary (128 bytes) --- */
	int                        nr_empty_pop_pages;   /*   128     4 */

	/* XXX 4 bytes hole, try to pack */

	long unsigned int          populated[];          /*   136     0 */

	/* size: 192, cachelines: 3, members: 17 */
	/* sum members: 122, holes: 4, sum holes: 14 */
	/* padding: 56 */
	/* forced alignments: 1 */
} __attribute__((__aligned__(64)));

After:
struct pcpu_chunk {
	struct list_head           list;                 /*     0    16 */
	int                        free_bytes;           /*    16     4 */
	struct pcpu_block_md       chunk_md;             /*    20    32 */

	/* XXX 4 bytes hole, try to pack */

	long unsigned int *        bound_map;            /*    56     8 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	void *                     base_addr __attribute__((__aligned__(64))); /*    64     8 */
	long unsigned int *        alloc_map;            /*    72     8 */
	struct pcpu_block_md *     md_blocks;            /*    80     8 */
	void *                     data;                 /*    88     8 */
	bool                       immutable;            /*    96     1 */
	bool                       isolated;             /*    97     1 */

	/* XXX 2 bytes hole, try to pack */

	int                        start_offset;         /*   100     4 */
	int                        end_offset;           /*   104     4 */
	int                        nr_pages;             /*   108     4 */
	int                        nr_populated;         /*   112     4 */
	int                        nr_empty_pop_pages;   /*   116     4 */
	struct obj_cgroup * *      obj_cgroups;          /*   120     8 */
	/* --- cacheline 2 boundary (128 bytes) --- */
	long unsigned int          populated[];          /*   128     0 */

	/* size: 128, cachelines: 2, members: 17 */
	/* sum members: 122, holes: 2, sum holes: 6 */
	/* forced alignments: 1 */
} __attribute__((__aligned__(64)));

Link: https://lore.kernel.org/20260511070309.44044-1-zenghongling@kylinos.cn
Signed-off-by: zenghongling
Suggested-by: Dennis Zhou
Cc: Dennis Zhou
Cc: Tejun Heo
Signed-off-by: Andrew Morton
---

 mm/percpu-internal.h |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/mm/percpu-internal.h~mm-percpu-internalh-optimise-pcpu_chunk-struct-to-save-memory
+++ a/mm/percpu-internal.h
@@ -77,13 +77,13 @@ struct pcpu_chunk {
 	int			end_offset;	/* additional area required to
 						   have the region end page
 						   aligned */
+	int			nr_pages;	/* # of pages served by this chunk */
+	int			nr_populated;	/* # of populated pages */
+	int                     nr_empty_pop_pages; /* # of empty populated pages */
 #ifdef NEED_PCPUOBJ_EXT
 	struct pcpuobj_ext	*obj_exts;	/* vector of object cgroups */
 #endif
 
-	int			nr_pages;	/* # of pages served by this chunk */
-	int			nr_populated;	/* # of populated pages */
-	int                     nr_empty_pop_pages; /* # of empty populated pages */
 	unsigned long		populated[];	/* populated bitmap */
 };
 
_

Patches currently in -mm which might be from zenghongling@kylinos.cn are

mm-percpu-internalh-optimise-pcpu_chunk-struct-to-save-memory.patch