Re: [PATCH 0/2] x86/intel_rdt and perf/x86: Fix lack of coordination with perf

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Peter Zijlstra <peterz@infradead.org>
To: Reinette Chatre <reinette.chatre@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
	tglx@linutronix.de, mingo@redhat.com, fenghua.yu@intel.com,
	tony.luck@intel.com, vikas.shivappa@linux.intel.com,
	gavin.hindman@intel.com, jithu.joseph@intel.com, hpa@zytor.com,
	x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] x86/intel_rdt and perf/x86: Fix lack of coordination with perf
Date: Wed, 8 Aug 2018 09:51:54 +0200	[thread overview]
Message-ID: <20180808075154.GN2494@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <ace0bebb-91ab-5d40-e7d7-d72d48302fa8@intel.com>

On Tue, Aug 07, 2018 at 03:47:15PM -0700, Reinette Chatre wrote:
> > FWIW, how long is that IRQ disabled section? It looks like something
> > that could be taking a bit of time. We have these people that care about
> > IRQ latency.
> 
> We work closely with customers needing low latency as well as customers
> needing deterministic behavior.
> 
> This measurement is triggered by the user as a validation mechanism of
> the pseudo-locked memory region after it has been created as part of
> system setup as well as during runtime if there are any concerns with
> the performance of an application that uses it.
> 
> This measurement would thus be triggered before the sensitive workloads
> start - during system setup, or if an issue is already present. In
> either case the measurement is triggered by the administrator via debugfs.

That does not in fact include the answer to the question. Also, it
assumes a competent operator (something I've found is not always true).

> >  - I don't much fancy people accessing the guts of events like that;
> >    would not an inline function like:
> > 
> >    static inline u64 x86_perf_rdpmc(struct perf_event *event)
> >    {
> > 	u64 val;
> > 
> > 	lockdep_assert_irqs_disabled();
> > 
> > 	rdpmcl(event->hw.event_base_rdpmc, val);
> > 	return val;
> >    }
> > 
> >    Work for you?
> 
> No. This does not provide accurate results. Implementing the above produces:
> pseudo_lock_mea-366   [002] ....    34.950740: pseudo_lock_l2: hits=4096
> miss=4

But it being an inline function should allow the compiler to optimize
and lift the event->hw.event_base_rdpmc load like you now do manually.
Also, like Tony already suggested, you can prime that load just fine by
doing an extra invocation.

(and note that the above function is _much_ simpler than
perf_event_read_local())

> >  - native_read_pmc(); are you 100% sure this code only ever runs on
> >    native and not in some dodgy virt environment?
> 
> My understanding is that a virtual environment would be a customer of a
> RDT allocation (cache or memory bandwidth). I do not see if/where this
> is restricted though - I'll move to rdpmcl() but the usage of a cache
> allocation feature like this from a virtual machine needs more
> investigation.

I can imagine that hypervisors that allow physical partitioning could
allow delegating the rdt crud to their guests when they 'own' a full
socket or whatever the domain is for this.

> Will do. I created the following helper function that can be used after
> interrupts are disabled:
> 
> static inline int perf_event_error_state(struct perf_event *event)
> {
>         int ret = 0;
>         u64 tmp;
> 
>         ret = perf_event_read_local(event, &tmp, NULL, NULL);
>         if (ret < 0)
>                 return ret;
> 
>         if (event->attr.pinned && event->oncpu != smp_processor_id())
>                 return -EBUSY;
> 
>         return ret;
> }

Nah, stick the test in perf_event_read_local(), that actually needs it.

> > Also, while you disable IRQs, your fancy pants loop is still subject to
> > NMIs that can/will perturb your measurements, how do you deal with
> > those?

> Customers interested in this feature are familiar with dealing with them
> (and also SMIs). The user space counterpart is able to detect such an
> occurrence.

You're very optimistic about your customers capabilities. And this might
be true for the current people you're talking to, but once this is
available and public, joe monkey will have access and he _will_ screw it
up.

> Please note that if an NMI arrives it would be handled with the
> currently active cache capacity bitmask so none of the pseudo-locked
> memory will be evicted since no capacity bitmask overlaps with the
> pseudo-locked region.

So exceptions change / have their own bitmask?

next prev parent reply	other threads:[~2018-08-08  7:52 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-31 19:38 [PATCH 0/2] x86/intel_rdt and perf/x86: Fix lack of coordination with perf Reinette Chatre
2018-07-31 19:38 ` [PATCH 1/2] perf/x86: Expose PMC hardware reservation Reinette Chatre
2018-07-31 19:38 ` [PATCH 2/2] x86/intel_rdt: Coordinate performance monitoring with perf Reinette Chatre
2018-08-02 12:39 ` [PATCH 0/2] x86/intel_rdt and perf/x86: Fix lack of coordination " Peter Zijlstra
2018-08-02 16:14   ` Reinette Chatre
2018-08-02 16:18     ` Peter Zijlstra
2018-08-02 16:44       ` Reinette Chatre
2018-08-02 17:37         ` Peter Zijlstra
2018-08-02 18:18           ` Dave Hansen
2018-08-02 19:54             ` Peter Zijlstra
2018-08-02 20:06               ` Dave Hansen
2018-08-02 20:13                 ` Peter Zijlstra
2018-08-02 20:43                   ` Reinette Chatre
2018-08-03 10:49                     ` Peter Zijlstra
2018-08-03 15:18                       ` Reinette Chatre
2018-08-03 15:25                         ` Peter Zijlstra
2018-08-03 18:37                           ` Reinette Chatre
2018-08-06 19:50                             ` Reinette Chatre
2018-08-06 22:12                               ` Peter Zijlstra
2018-08-06 23:07                                 ` Reinette Chatre
2018-08-07  9:36                                   ` Peter Zijlstra
     [not found]                                     ` <ace0bebb-91ab-5d40-e7d7-d72d48302fa8@intel.com>
2018-08-08  1:28                                       ` Luck, Tony
2018-08-08  5:44                                         ` Reinette Chatre
2018-08-08  7:41                                           ` Peter Zijlstra
2018-08-08 15:55                                             ` Luck, Tony
2018-08-08 16:47                                               ` Peter Zijlstra
2018-08-08 16:51                                                 ` Reinette Chatre
2018-08-08  7:51                                       ` Peter Zijlstra [this message]
2018-08-08 17:33                                         ` Reinette Chatre
2018-08-10 16:25                                           ` Reinette Chatre
2018-08-10 17:52                                             ` Reinette Chatre

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180808075154.GN2494@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=dave.hansen@intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=gavin.hindman@intel.com \
    --cc=hpa@zytor.com \
    --cc=jithu.joseph@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=reinette.chatre@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=vikas.shivappa@linux.intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox