From: Ingo Molnar <mingo@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>,
Vince Weaver <vincent.weaver@maine.edu>,
Jiri Olsa <jolsa@redhat.com>, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
Date: Sat, 23 May 2015 10:26:01 +0200 [thread overview]
Message-ID: <20150523082601.GB7025@gmail.com> (raw)
In-Reply-To: <20150522134811.GI3644@twins.programming.kicks-ass.net>
* Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, May 22, 2015 at 06:40:49AM -0700, Stephane Eranian wrote:
> > On Fri, May 22, 2015 at 6:36 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > > On Fri, May 22, 2015 at 06:29:47AM -0700, Stephane Eranian wrote:
> > >> On Fri, May 22, 2015 at 6:25 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > >> > On Fri, May 22, 2015 at 06:07:00AM -0700, Stephane Eranian wrote:
> > >> >>
> > >> >> One other thing I noticed is that the --n_excl needs to be protected by the
> > >> >> excl_cntrs->lock in put_excl_constraints().
> > >> >
> > >> > Nah, its strictly per cpu.
> > >>
> > >> No. the excl_cntrs struct is pointed to by cpuc but it is shared between the
> > >> sibling HT. Otherwise this would not work!
> > >
> > > n_excl is per cpuc, see the trickery with has_exclusive vs
> > > exclusive_present on how I avoid the lock.
> >
> > Yes, but I believe you create a store forward penalty with this.
> > You store 16bits and you load 32 bits on the same cache line.
Same cacheline access has no such penalty: only if the partial access
is for the same word.
> The store and load are fairly well spaced -- the entire scheduling
> fast path is in between.
>
> And such a penalty is still cheap compared to locking, no?
The 'penalty' is essentially just a delay in the execution of the
load, if the store has not finished yet: typically less than 10
cycles, around 3 cycles on recent uarchs.
So it should not be a big issue if there's indeed so much code between
them - probably it's not even causing any delay anywhere.
Thanks,
Ingo
next prev parent reply other threads:[~2015-05-23 8:26 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
2015-05-21 11:17 ` [PATCH 01/10] perf,x86: Fix event/group validation Peter Zijlstra
2015-05-21 12:35 ` Stephane Eranian
2015-05-21 12:56 ` Peter Zijlstra
2015-05-21 13:07 ` Stephane Eranian
2015-05-21 13:09 ` Peter Zijlstra
2015-05-21 13:18 ` Stephane Eranian
2015-05-21 13:20 ` Peter Zijlstra
2015-05-21 13:27 ` Stephane Eranian
2015-05-21 13:29 ` Peter Zijlstra
2015-05-21 13:36 ` Stephane Eranian
2015-05-21 14:03 ` Peter Zijlstra
2015-05-21 15:11 ` Stephane Eranian
2015-05-22 6:49 ` Ingo Molnar
2015-05-22 9:26 ` Stephane Eranian
2015-05-22 9:46 ` Ingo Molnar
2015-05-21 14:53 ` Peter Zijlstra
2015-05-21 15:42 ` Stephane Eranian
2015-08-21 20:31 ` Sasha Levin
2015-09-10 4:48 ` Sasha Levin
2015-09-10 8:54 ` Stephane Eranian
2015-09-10 10:01 ` Peter Zijlstra
2015-05-21 11:17 ` [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint Peter Zijlstra
2015-05-22 10:04 ` Stephane Eranian
2015-05-22 11:21 ` Peter Zijlstra
2015-05-22 11:24 ` Stephane Eranian
2015-05-22 11:28 ` Peter Zijlstra
2015-05-22 12:35 ` Stephane Eranian
2015-05-22 12:53 ` Peter Zijlstra
2015-05-22 12:55 ` Stephane Eranian
2015-05-22 12:59 ` Peter Zijlstra
2015-05-22 13:05 ` Stephane Eranian
2015-05-22 13:07 ` Stephane Eranian
2015-05-22 13:25 ` Peter Zijlstra
2015-05-22 13:29 ` Stephane Eranian
2015-05-22 13:36 ` Peter Zijlstra
2015-05-22 13:40 ` Stephane Eranian
2015-05-22 13:48 ` Peter Zijlstra
2015-05-23 8:26 ` Ingo Molnar [this message]
2015-05-22 13:25 ` Peter Zijlstra
2015-05-22 13:10 ` Peter Zijlstra
2015-05-21 11:17 ` [PATCH 03/10] perf/x86: Correct local vs remote sibling state Peter Zijlstra
2015-05-21 13:31 ` Stephane Eranian
2015-05-21 14:10 ` Peter Zijlstra
2015-05-21 11:17 ` [PATCH 04/10] perf/x86: Use lockdep Peter Zijlstra
2015-05-21 11:17 ` [PATCH 05/10] perf/x86: Simplify dynamic constraint code somewhat Peter Zijlstra
2015-05-21 11:17 ` [PATCH 06/10] perf/x86: Make WARNs consistent Peter Zijlstra
2015-05-21 11:17 ` [PATCH 07/10] perf/x86: Move intel_commit_scheduling() Peter Zijlstra
2015-05-21 11:17 ` [PATCH 08/10] perf/x86: Remove pointless tests Peter Zijlstra
2015-05-21 13:24 ` Stephane Eranian
2015-05-21 11:17 ` [PATCH 09/10] perf/x86: Remove intel_excl_states::init_state Peter Zijlstra
2015-05-21 13:39 ` Stephane Eranian
2015-05-21 14:12 ` Peter Zijlstra
2015-05-21 11:17 ` [PATCH 10/10] perf,x86: Simplify logic Peter Zijlstra
2015-05-21 11:48 ` [PATCH 00/10] Various x86 pmu scheduling patches Stephane Eranian
2015-05-21 12:53 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150523082601.GB7025@gmail.com \
--to=mingo@kernel.org \
--cc=eranian@google.com \
--cc=jolsa@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=vincent.weaver@maine.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.