From: "Liang, Kan" <kan.liang@linux.intel.com>
To: John Stultz <jstultz@google.com>
Cc: peterz@infradead.org, mingo@redhat.com, tglx@linutronix.de,
sboyd@kernel.org, linux-kernel@vger.kernel.org,
eranian@google.com, namhyung@kernel.org, ak@linux.intel.com
Subject: Re: [PATCH 3/3] perf/x86/intel/ds: Support monotonic clock for PEBS
Date: Tue, 24 Jan 2023 10:17:29 -0500
Message-ID: <8cb0ebd6-59ea-5c01-dc2a-d3f11730ab43@linux.intel.com>
In-Reply-To: <CANDhNCqMaqg1S4Vt_6Pe6M-9seGwA8Hxb8vR5KnLaByvG1JANg@mail.gmail.com>
On 2023-01-24 1:56 a.m., John Stultz wrote:
> On Mon, Jan 23, 2023 at 10:27 AM <kan.liang@linux.intel.com> wrote:
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> Users try to reconcile user samples with PEBS samples and require a
>> common clock source. However, the current PEBS code only converts to
>> sched_clock, which is not available from user space.
>>
>> Only support conversion to CLOCK_MONOTONIC. Having one common clock
>> source is good enough to fulfill the requirement.
>>
>> Enable the large PEBS for the monotonic clock to reduce the PEBS
>> overhead.
>>
>> There are a few rare cases in which the conversion may fail, for
>> example when the TSC overflows or cycle_last changes between samples.
>> The time then falls back to the inaccurate SW time. These cases are
>> extremely unlikely to happen.
>>
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> ---
>
> Thanks for sending this out!
> A few minor style issues below and a warning.
Thanks.
>
>> The patch has to be on top of the below patch
>> https://lore.kernel.org/all/20230123172027.125385-1-kan.liang@linux.intel.com/
>>
>>  arch/x86/events/intel/core.c |  2 +-
>>  arch/x86/events/intel/ds.c   | 30 ++++++++++++++++++++++++++----
>>  2 files changed, 27 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 14f0a746257d..ea194556cc73 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -3777,7 +3777,7 @@ static unsigned long intel_pmu_large_pebs_flags(struct perf_event *event)
>>  {
>>  	unsigned long flags = x86_pmu.large_pebs_flags;
>>
>> -	if (event->attr.use_clockid)
>> +	if (event->attr.use_clockid && (event->attr.clockid != CLOCK_MONOTONIC))
>>  		flags &= ~PERF_SAMPLE_TIME;
>>  	if (!event->attr.exclude_kernel)
>>  		flags &= ~PERF_SAMPLE_REGS_USER;
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> index 7980e92dec64..d7f0eaf4405c 100644
>> --- a/arch/x86/events/intel/ds.c
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -1570,13 +1570,33 @@ static u64 get_data_src(struct perf_event *event, u64 aux)
>>  	return val;
>>  }
>>
>> +static int pebs_get_synctime(struct system_counterval_t *system,
>> +			     void *ctx)
>
> Just because the abstract function type taken by
> get_mono_fast_from_given_time is vague, doesn't mean the
> implementation needs to be.
> ctx is really a tsc value, right? So let's call it that to make this a
> bit more readable.
Sure, I will rename ctx to tsc.
>
>> +{
>> +	*system = set_tsc_system_counterval(*(u64 *)ctx);
>> +	return 0;
>> +}
>> +
>> +static inline int pebs_clockid_time(clockid_t clk_id, u64 tsc, u64 *clk_id_time)
>
> clk_id_time is maybe a bit too fuzzy. It is really a mono_ns value,
> right? Let's keep that explicit here.
Yes. Will make it explicit.
>
>> +{
>> +	/* Only support converting to clock monotonic */
>> +	if (clk_id != CLOCK_MONOTONIC)
>> +		return -EINVAL;
>> +
>> +	return get_mono_fast_from_given_time(pebs_get_synctime, &tsc, clk_id_time);
>> +}
>> +
>>  static void setup_pebs_time(struct perf_event *event,
>>  			    struct perf_sample_data *data,
>>  			    u64 tsc)
>>  {
>> -	/* Converting to a user-defined clock is not supported yet. */
>> -	if (event->attr.use_clockid != 0)
>> -		return;
>> +	u64 time;
>
> Again, "time" is too generic a term without any context here.
> mono_nsec or something would be more clear.
Sure.
>
>> +
>> +	if (event->attr.use_clockid != 0) {
>> +		if (pebs_clockid_time(event->attr.clockid, tsc, &time))
>> +			return;
>> +		goto done;
>> +	}
>
> Apologies for this warning/rant:
>
> So, I do get that the NMI safety of the "fast" time accessors (along
> with the "high performance" sounding name!) is attractive, but as their
> use expands I worry the downsides of this interface aren't made clear
> enough.
>
> The fast accessors *can* see time discontinuities! Because the logic
> is done without holding the tk_core.seq lock, if you are reading in
> the middle of an NTP adjustment, you may find the current value to be
> larger than the value you get the next time you read the time. These
> discontinuities are likely to be very small, but a negative delta will
> look very large as a u64. So part of using these "fast *and unsafe*"
> interfaces is you get to keep both pieces when it breaks. Make sure the
> code here that is using these interfaces guards against this (zeroing
> out negative deltas).
>
Thanks for the warning.
I will add more comments and handle this case explicitly here.
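
A minimal user-space sketch of the kind of guard described above, for
illustration only. The helper name mono_delta_clamped() and its
parameters are hypothetical; this is not a kernel API and not the code
in this patch. It assumes two nanosecond readings taken from a "fast"
accessor and zeroes out the delta when the second reading is smaller
than the first.

```c
#include <stdint.h>

/*
 * Hypothetical helper (an assumption for illustration, not a kernel API):
 * return now_ns - prev_ns, clamped to zero when the clock appears to have
 * stepped backwards between the two reads.
 */
static inline uint64_t mono_delta_clamped(uint64_t prev_ns, uint64_t now_ns)
{
	/*
	 * Unsigned subtraction wraps, so a small backwards step would show
	 * up as a huge u64 value. Reinterpreting the difference as signed
	 * turns that wrap into a small negative number we can test for.
	 */
	int64_t delta = (int64_t)(now_ns - prev_ns);

	return delta < 0 ? 0 : (uint64_t)delta;
}

/* The unguarded form, shown only to make the failure mode visible. */
static inline uint64_t mono_delta_unguarded(uint64_t prev_ns, uint64_t now_ns)
{
	return now_ns - prev_ns;
}
```

With prev_ns = 1005 and now_ns = 1000, the unguarded form returns
2^64 - 5, while the clamped form returns 0. The signed reinterpretation
relies on two's complement conversion, which holds for the compilers
the kernel is built with.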
Thanks,
Kan