All of lore.kernel.org
 help / color / mirror / Atom feed
From: Feng Tang <feng.tang@intel.com>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>, <linux-doc@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	"Bagas Sanjaya" <bagasdotme@gmail.com>,
	Shakeel Butt <shakeelb@google.com>,
	"Arnaldo Carvalho de Melo" <acme@redhat.com>,
	Joe Mario <jmario@redhat.com>, "Ingo Molnar" <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Eric Dumazet <edumazet@google.com>,
	"Thomas Gleixner" <tglx@linutronix.de>, <dave.hansen@intel.com>,
	<ying.huang@intel.com>, <tim.c.chen@intel.com>,
	<andi.kleen@intel.com>
Subject: Re: [PATCH v3] Documentation: Add document for false sharing
Date: Thu, 6 Apr 2023 09:14:27 +0800	[thread overview]
Message-ID: <ZC4c8615XM8dfxa+@feng-clx> (raw)
In-Reply-To: <2520c3d060e7b77560ea32a4b132d8e1a5f14ac9.camel@linux.intel.com>

Hi Tim,

On Wed, Apr 05, 2023 at 03:38:20PM -0700, Tim Chen wrote:
> Thanks for the update. Looks good.  Some minor nits to improve readability.

Many thanks for the review!

> 
> On Tue, 2023-04-04 at 13:22 +0800, Feng Tang wrote:
> > 
> > +
> > +False sharing could easily happen unless they are intentionally
> > +checked, and it is valuable to run specific tools for performance
> > +critical workloads to detect false sharing affecting performance case
> > +and optimize accordingly.
> > +
> > +
> > +How to detect and analysis False Sharing
> 
> s/analysis/analyse

Will correct it.

> > +========================================
> > +perf record/report/stat are widely used for performance tuning, and
> > +once hotspots are detected, tools like 'perf-c2c' and 'pahole' can
> > +be further used to detect and pinpoint the possible false sharing
> > +data structures.  'addr2line' is also good at decoding instruction
> > +pointer when there are multiple layers of inline functions.
[...]
> > +Possible Mitigations
> > +====================
> > +False sharing does not always need to be mitigated.  False sharing
> > +mitigations need to balance performance gains with complexity and
> 
> s/need to/should/
 
Will change.

> > +space consumption.  Sometimes, lower performance is OK, and it's
> > +unnecessary to hyper-optimize every rarely used data structure or
> > +a cold data path.
> > +
> > +False sharing hurting performance cases are seen more frequently with
> > +core count increasing.  Because of these detrimental effects, many
> > +patches have been proposed across variety of subsystems (like
> > +networking and memory management) and merged.  Some common mitigations
> > +(with examples) are:
> > +
> > +* Separate hot global data in its own dedicated cache line, even if it
> > +  is just a 'short' type. The downside is more consumption of memory,
> > +  cache line and TLB entries.
> > +
> > +  - Commit 91b6d3256356 ("net: cache align tcp_memory_allocated, tcp_sockets_allocated")
> > +
> > +* Reorganize the data structure, separate the interfering members to
> > +  different cache lines.  One downside is it may introduce new false
> > +  sharing of other members.
> > +
> > +  - Commit 802f1d522d5f ("mm: page_counter: re-layout structure to reduce false sharing")
> > +
> > +* Replace 'write' with 'read' when possible, especially in loops.
> > +  Like for some global variable, use compare(read)-then-write instead
> > +  of unconditional write. For example, use::
> > +
> > +	if (!test_bit(XXX))
> > +		set_bit(XXX);
> > +
> > +  instead of directly "set_bit(XXX);", similarly for atomic_t data::
> > +
> > +	if (atomic_read(XXX) == AAA)
> > +		atomic_set(XXX, BBB);
> > +
> > +  - Commit 7b1002f7cfe5 ("bcache: fixup bcache_dev_sectors_dirty_add() multithreaded CPU false sharing")
> > +  - Commit 292648ac5cf1 ("mm: gup: allow FOLL_PIN to scale in SMP")
> > +
> > +* Turn hot global data to 'per-cpu data + global data' when possible,
> > +  or reasonably increase the threshold for syncing per-cpu data to
> > +  global data, to reduce or postpone the 'write' to that global data.
> > +
> > +  - Commit 520f897a3554 ("ext4: use percpu_counters for extent_status cache hits/misses")
> > +  - Commit 56f3547bfa4d ("mm: adjust vm_committed_as_batch according to vm overcommit policy")
> > +
> > +Surely, all mitigations should be carefully verified to not cause side
> > +effects.  And to avoid false sharing in advance during coding, it's
> 
> Maybe say
> To avoid introducing false sharing when coding

Yes, this is better.

> > +better to:
> > +
> > +* Be aware of cache line boundaries
> > +* Group mostly read-only fields together
> > +* Group things that are written at the same time together
> > +* Separate known read-mostly and written-mostly fields
> 
> Separate frequently read and frequently written fields on different cache lines

Will change. Thanks!

- Feng

> > +
> > +and better add a comment stating the false sharing consideration.
> > +
> > +One note is, sometimes even after a severe false sharing is detected
> > +and solved, the performance may still have no obvious improvement as
> > +the hotspot switches to a new place.
> > +
> > +
> > +Miscellaneous
> > +=============
> > +One open issue is that kernel has an optional data structure
> > +randomization mechanism, which also randomizes the situation of cache
> > +line sharing of data members.
> > +
> > +
> > +.. [1] https://en.wikipedia.org/wiki/False_sharing
> > +.. [2] https://lore.kernel.org/lkml/CAHk-=whoqV=cX5VC80mmR9rr+Z+yQ6fiQZm36Fb-izsanHg23w@mail.gmail.com/
> > +.. [3] https://joemario.github.io/blog/2016/09/01/c2c-blog/
> > diff --git a/Documentation/kernel-hacking/index.rst b/Documentation/kernel-hacking/index.rst
> > index f53027652290..79c03bac99a2 100644
> > --- a/Documentation/kernel-hacking/index.rst
> > +++ b/Documentation/kernel-hacking/index.rst
> > @@ -9,3 +9,4 @@ Kernel Hacking Guides
> >  
> >     hacking
> >     locking
> > +   false-sharing
> 
> Thanks.
> 
> Tim
> 

      parent reply	other threads:[~2023-04-06  1:20 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-04  5:22 [PATCH v3] Documentation: Add document for false sharing Feng Tang
2023-04-04 19:38 ` Shakeel Butt
2023-04-05 22:38 ` Tim Chen
2023-04-05 23:29   ` Tim Chen
2023-04-06  1:14   ` Feng Tang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZC4c8615XM8dfxa+@feng-clx \
    --to=feng.tang@intel.com \
    --cc=acme@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi.kleen@intel.com \
    --cc=bagasdotme@gmail.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@intel.com \
    --cc=edumazet@google.com \
    --cc=jmario@redhat.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=shakeelb@google.com \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@intel.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.