From: Vlad Zolotarov <vlad@scalemp.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel <linux-kernel@vger.kernel.org>,
Shai@scalemp.com, ido@wizery.com
Subject: Re: [PATCH v4 0/2] Move x86_cpu_to_apicid to the __read_mostly section
Date: Tue, 22 May 2012 18:55:41 +0300 [thread overview]
Message-ID: <5555210.ryZGheQWJb@vlad> (raw)
In-Reply-To: <20120521201958.GB10848@gmail.com>
> >
> > I have no fundamental prefer to either approach, but the
> > direction taken should be justified explicitly, with numbers,
> > arguments, etc. - also a short blurb somewhere in the headers
> > that explains when they should be used, so that others can be
> > aware of vSMP's special needs here.
>
> I.e. *numbers* are needed: roughly how many percpu variables in
> a defconfig of one type versus the other type. This settles the
> question whether we want to identify read-mostly or
> write-frequently variables, to address this particular problem
> ...
Ok, let's start with *numbers*:
- In the defconfig compilation i've got 219 per_cpu variables: 218 declared
as NOT read_mostly and only one (tlb_vector_offset) - as read_mostly.
- From those that are not declared as read_mostly the grep on
"time|lock|state|stats|last|left|work|owned|reason|list|idle|count|warn|head|
rcu|nr_|pvecs|irq", which are a likely candidates to be NOT read_mostly (I
verified a few from each pattern) returned 88. I created the above patterns
list after walking through about the 1/3 of the per_cpu variables list.
Unfortunately I have no time to analyze all 219 variables in depth - I leave
it to the maintainers of the code containing them but it's obvious that there
are quite a lot read-write per_cpu variables in the kernel code and u know,
I'm not surprised - if u consider the reasoning to declare a per-cpu variable
u might notice that in most cases u need it when u actually mean to actively
write to it. This is because a main motivation for using per_cpu variables is
to hide/prevent from other CPUs seeing the *changes* of the local per_cpu
variable: e.g. lists used for lockless stuff, local (per-CPU) locks, counters,
etc.
On the other hand declaration of a read_mostly per-cpu variable performance-
wise is similar to using a regular read_mostly (*not* per_cpu) array and the
main difference is a semantics which I feel like a weaker motivation.
> I.e. it might make more sense to identify the frequenty
> modified percpu variables, and move them to a separate
> section. I think most percpu variables are read mostly, so it
> would be more maintainable in the long run to figure out the
> frequently modified ones, not the frequently not modified ones.
I guess the numbers above tell us the opposite. So, I think we'd better stick
with the read_mostly semantics. ;)
I also would like to draw your attention to the fact that this patch series
doesn't introduce the read_mostly semantics either in a per-cpu context or in
a non-per-cpu context:
Originally we have discovered that x86_cpu_to_apicid variable has a
read_mostly nature and is quite contented and wanted to define it as
__read_mostly as it should be in order to prevent false sharing.
More specifically, lapic_events and x86_cpu_to_apicid shared the same cache
line and the first one was frequently written.
Since it's defined as an EARLY_PER_CPU variable we had to define the missing
XXX_EARLY_PER_CPU_READ_MOSTLY() macros.
Then u asked me to see if other variables from smp.h are also read_mostly,
which I did however it wasn't our intention to change any infrastructure, on
the contrary we used the existing one. I have a feeling that thought the
opposite... ;)
So, pls., tell me what's next? Frankly, I don't think the *numbers* part above
is of any interest to the wide public except for u and me... ;)
However the "lapic_event" fact above might clarify the motivation a bit more.
Pls., comment.
thanks,
vlad
>
> Thanks,
>
> Ingo
next prev parent reply other threads:[~2012-05-22 15:55 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-21 15:02 [PATCH v4 0/2] Move x86_cpu_to_apicid to the __read_mostly section Vlad Zolotarov
2012-05-21 15:23 ` Ingo Molnar
2012-05-21 15:42 ` Vlad Zolotarov
2012-05-21 20:19 ` Ingo Molnar
2012-05-22 15:55 ` Vlad Zolotarov [this message]
2012-05-22 15:59 ` Vlad Zolotarov
2012-05-23 9:16 ` Vlad Zolotarov
2012-06-07 8:18 ` Vlad Zolotarov
2012-06-11 9:00 ` Ingo Molnar
2012-06-11 9:08 ` Vlad Zolotarov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5555210.ryZGheQWJb@vlad \
--to=vlad@scalemp.com \
--cc=Shai@scalemp.com \
--cc=hpa@zytor.com \
--cc=ido@wizery.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.