From: Vlad Zolotarov <vlad@scalemp.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel <linux-kernel@vger.kernel.org>,
Shai@scalemp.com, ido@wizery.com
Subject: Re: [PATCH v4 0/2] Move x86_cpu_to_apicid to the __read_mostly section
Date: Tue, 22 May 2012 18:55:41 +0300 [thread overview]
Message-ID: <5555210.ryZGheQWJb@vlad> (raw)
In-Reply-To: <20120521201958.GB10848@gmail.com>
> >
> > I have no fundamental prefer to either approach, but the
> > direction taken should be justified explicitly, with numbers,
> > arguments, etc. - also a short blurb somewhere in the headers
> > that explains when they should be used, so that others can be
> > aware of vSMP's special needs here.
>
> I.e. *numbers* are needed: roughly how many percpu variables in
> a defconfig of one type versus the other type. This settles the
> question whether we want to identify read-mostly or
> write-frequently variables, to address this particular problem
> ...
Ok, let's start with *numbers*:
- In the defconfig compilation i've got 219 per_cpu variables: 218 declared
as NOT read_mostly and only one (tlb_vector_offset) - as read_mostly.
- From those that are not declared as read_mostly the grep on
"time|lock|state|stats|last|left|work|owned|reason|list|idle|count|warn|head|
rcu|nr_|pvecs|irq", which are a likely candidates to be NOT read_mostly (I
verified a few from each pattern) returned 88. I created the above patterns
list after walking through about the 1/3 of the per_cpu variables list.
Unfortunately I have no time to analyze all 219 variables in depth - I leave
it to the maintainers of the code containing them but it's obvious that there
are quite a lot read-write per_cpu variables in the kernel code and u know,
I'm not surprised - if u consider the reasoning to declare a per-cpu variable
u might notice that in most cases u need it when u actually mean to actively
write to it. This is because a main motivation for using per_cpu variables is
to hide/prevent from other CPUs seeing the *changes* of the local per_cpu
variable: e.g. lists used for lockless stuff, local (per-CPU) locks, counters,
etc.
On the other hand declaration of a read_mostly per-cpu variable performance-
wise is similar to using a regular read_mostly (*not* per_cpu) array and the
main difference is a semantics which I feel like a weaker motivation.
> I.e. it might make more sense to identify the frequenty
> modified percpu variables, and move them to a separate
> section. I think most percpu variables are read mostly, so it
> would be more maintainable in the long run to figure out the
> frequently modified ones, not the frequently not modified ones.
I guess the numbers above tell us the opposite. So, I think we'd better stick
with the read_mostly semantics. ;)
I also would like to draw your attention to the fact that this patch series
doesn't introduce the read_mostly semantics either in a per-cpu context or in
a non-per-cpu context:
Originally we have discovered that x86_cpu_to_apicid variable has a
read_mostly nature and is quite contented and wanted to define it as
__read_mostly as it should be in order to prevent false sharing.
More specifically, lapic_events and x86_cpu_to_apicid shared the same cache
line and the first one was frequently written.
Since it's defined as an EARLY_PER_CPU variable we had to define the missing
XXX_EARLY_PER_CPU_READ_MOSTLY() macros.
Then u asked me to see if other variables from smp.h are also read_mostly,
which I did however it wasn't our intention to change any infrastructure, on
the contrary we used the existing one. I have a feeling that thought the
opposite... ;)
So, pls., tell me what's next? Frankly, I don't think the *numbers* part above
is of any interest to the wide public except for u and me... ;)
However the "lapic_event" fact above might clarify the motivation a bit more.
Pls., comment.
thanks,
vlad
>
> Thanks,
>
> Ingo
next prev parent reply other threads:[~2012-05-22 15:55 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-21 15:02 [PATCH v4 0/2] Move x86_cpu_to_apicid to the __read_mostly section Vlad Zolotarov
2012-05-21 15:23 ` Ingo Molnar
2012-05-21 15:42 ` Vlad Zolotarov
2012-05-21 20:19 ` Ingo Molnar
2012-05-22 15:55 ` Vlad Zolotarov [this message]
2012-05-22 15:59 ` Vlad Zolotarov
2012-05-23 9:16 ` Vlad Zolotarov
2012-06-07 8:18 ` Vlad Zolotarov
2012-06-11 9:00 ` Ingo Molnar
2012-06-11 9:08 ` Vlad Zolotarov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5555210.ryZGheQWJb@vlad \
--to=vlad@scalemp.com \
--cc=Shai@scalemp.com \
--cc=hpa@zytor.com \
--cc=ido@wizery.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox