All of lore.kernel.org
 help / color / mirror / Atom feed
* + percpu-internal-pcpu_chunk-re-layout-pcpu_chunk-structure-to-reduce-false-sharing.patch added to mm-unstable branch
@ 2023-06-12 21:43 Andrew Morton
  0 siblings, 0 replies; only message in thread
From: Andrew Morton @ 2023-06-12 21:43 UTC (permalink / raw)
  To: mm-commits, tim.c.chen, shakeelb, Liam.Howlett, dennis,
	dave.hansen, dan.j.williams, yu.ma, akpm


The patch titled
     Subject: percpu-internal/pcpu_chunk: re-layout pcpu_chunk structure to reduce false sharing
has been added to the -mm mm-unstable branch.  Its filename is
     percpu-internal-pcpu_chunk-re-layout-pcpu_chunk-structure-to-reduce-false-sharing.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/percpu-internal-pcpu_chunk-re-layout-pcpu_chunk-structure-to-reduce-false-sharing.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Yu Ma <yu.ma@intel.com>
Subject: percpu-internal/pcpu_chunk: re-layout pcpu_chunk structure to reduce false sharing
Date: Fri, 9 Jun 2023 23:07:30 -0400

When running UnixBench/Execl throughput case, false sharing is observed
due to frequent read on base_addr and write on free_bytes, chunk_md.

UnixBench/Execl represents a class of workload where bash scripts are
spawned frequently to do some short jobs.  It will do system call on execl
frequently, and execl will call mm_init to initialize mm_struct of the
process.  mm_init will call __percpu_counter_init for percpu_counters
initialization.  Then pcpu_alloc is called to read the base_addr of
pcpu_chunk for memory allocation.  Inside pcpu_alloc, it will call
pcpu_alloc_area to allocate memory from a specified chunk.  This function
will update "free_bytes" and "chunk_md" to record the rest free bytes and
other meta data for this chunk.  Correspondingly, pcpu_free_area will also
update these 2 members when free memory.

Call trace from perf is as below:
+   57.15%  0.01%  execl   [kernel.kallsyms] [k] __percpu_counter_init
+   57.13%  0.91%  execl   [kernel.kallsyms] [k] pcpu_alloc
-   55.27% 54.51%  execl   [kernel.kallsyms] [k] osq_lock
   - 53.54% 0x654278696e552f34
        main
        __execve
        entry_SYSCALL_64_after_hwframe
        do_syscall_64
        __x64_sys_execve
        do_execveat_common.isra.47
        alloc_bprm
        mm_init
        __percpu_counter_init
        pcpu_alloc
      - __mutex_lock.isra.17

In current pcpu_chunk layout, `base_addr' is in the same cache line with
`free_bytes' and `chunk_md', and `base_addr' is at the last 8 bytes.  This
patch moves `bound_map' up to `base_addr', to let `base_addr' locate in a
new cacheline.

With this change, on Intel Sapphire Rapids 112C/224T platform, based on
v6.4-rc4, the 160 parallel score improves by 24%.

Link: https://lkml.kernel.org/r/20230610030730.110074-1-yu.ma@intel.com
Signed-off-by: Yu Ma <yu.ma@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/percpu-internal.h |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

--- a/mm/percpu-internal.h~percpu-internal-pcpu_chunk-re-layout-pcpu_chunk-structure-to-reduce-false-sharing
+++ a/mm/percpu-internal.h
@@ -41,10 +41,17 @@ struct pcpu_chunk {
 	struct list_head	list;		/* linked to pcpu_slot lists */
 	int			free_bytes;	/* free bytes in the chunk */
 	struct pcpu_block_md	chunk_md;
-	void			*base_addr;	/* base address of this chunk */
+	unsigned long		*bound_map;	/* boundary map */
+
+	/*
+	 * base_addr is the base address of this chunk.
+	 * To reduce false sharing, current layout is optimized to make sure
+	 * base_addr locate in the different cacheline with free_bytes and
+	 * chunk_md.
+	 */
+	void			*base_addr ____cacheline_aligned_in_smp;
 
 	unsigned long		*alloc_map;	/* allocation map */
-	unsigned long		*bound_map;	/* boundary map */
 	struct pcpu_block_md	*md_blocks;	/* metadata blocks */
 
 	void			*data;		/* chunk data */
_

Patches currently in -mm which might be from yu.ma@intel.com are

percpu-internal-pcpu_chunk-re-layout-pcpu_chunk-structure-to-reduce-false-sharing.patch


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2023-06-12 21:43 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-12 21:43 + percpu-internal-pcpu_chunk-re-layout-pcpu_chunk-structure-to-reduce-false-sharing.patch added to mm-unstable branch Andrew Morton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.