From: "Liang, Kan" <kan.liang@linux.intel.com>
To: John Stultz <jstultz@google.com>
Cc: peterz@infradead.org, mingo@redhat.com, tglx@linutronix.de,
sboyd@kernel.org, linux-kernel@vger.kernel.org,
eranian@google.com, namhyung@kernel.org, ak@linux.intel.com
Subject: Re: [PATCH 1/3] timekeeping: NMI safe converter from a given time to monotonic
Date: Tue, 24 Jan 2023 15:12:50 -0500
Message-ID: <1fb59dfa-1ab9-51ad-98c6-89431aa56918@linux.intel.com>
In-Reply-To: <CANDhNCp_0Os+e0A0LZ7yKw16mWai9MAPMPYL0p1NkcVxifh88w@mail.gmail.com>
On 2023-01-24 1:43 p.m., John Stultz wrote:
> On Tue, Jan 24, 2023 at 7:09 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>> On 2023-01-24 2:01 a.m., John Stultz wrote:
>>> On Mon, Jan 23, 2023 at 10:27 AM <kan.liang@linux.intel.com> wrote:
>>>> + /*
>>>> + * Check whether the given timestamp is on the current
>>>> + * timekeeping interval.
>>>> + */
>>>> + now = tk_clock_read(tkr);
>>>> + interval_start = tkr->cycle_last;
>>>> + if (!cycle_between(interval_start, cycles, now))
>>>> + return -EOPNOTSUPP;
>>>
>>> So. I've not fully thought this out, but it seems like it would be
>>> quite likely that you'd run into the case where the cycle_last value
>>> is updated and your earlier TSC timestamp isn't valid for the current
>>> interval. The get_device_system_crosststamp() logic has a big chunk of
>>> complex code to try to handle this case by interpolating the cycle
>>> value back in time. How well does just failing in this case work out?
>>>
>>
>> For the case, perf fallback to the time captured in the NMI handler, via
>> ktime_get_mono_fast_ns().
>
> This feels like *very* subtle behavior. Maybe I'm misunderstanding,
> but the goal seems to be to have more accurate timestamps on the hw
> events, and using the captured tsc timestamp avoids the measuring
> latency reading the time again. But if every timekeeping update
> interval (~tick) you transparently get a delayed value due to the
> fallback, it makes it hard to understand which timestamps are better
> or worse. The latency between two reads may be real or it may be just
> bad luck. This doesn't intuitively seem like a great benefit over more
> consistent latency of just using the ktime_get_mono_fast()
> timestamping.

Your understanding is correct. We want a more accurate timestamp for the
analysis work.

As I understand it, timekeeping updates should not happen very often. If
I read the code correctly, they should happen only when adjusting NTP or
suspending/resuming. If so, I think the drawback should not impact
normal analysis work. I will call out the drawbacks in the comments
where the function is used.
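
To make the behavior under discussion concrete, here is a minimal
userspace sketch (not the patch itself) of the interval check and the
fallback. All sketch_* names and the simplified struct are assumptions
for illustration only. In the patch, "now" comes from tk_clock_read()
and the fallback time from ktime_get_mono_fast_ns(); here both are
passed in as plain parameters so the logic can be followed in isolation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's read base. */
struct sketch_tkr {
	uint64_t cycle_last;	/* counter value at the start of the interval */
	uint64_t base_ns;	/* monotonic time in ns at cycle_last */
	uint32_t mult;		/* cycles -> ns multiplier */
	uint32_t shift;		/* cycles -> ns shift */
};

/*
 * True if test lies strictly between before and after. Unsigned
 * subtraction keeps the comparison valid across a counter wrap.
 * (Illustrative helper, not the kernel's cycle_between().)
 */
static bool sketch_cycle_between(uint64_t before, uint64_t test, uint64_t after)
{
	uint64_t delta = test - before;

	return delta != 0 && delta < after - before;
}

/*
 * Convert a hardware-captured cycle value to monotonic ns, but only if
 * it falls inside the current interval (cycle_last, now).
 */
static int sketch_cycles_to_mono(const struct sketch_tkr *tkr, uint64_t now,
				 uint64_t cycles, uint64_t *mono_ns)
{
	if (!sketch_cycle_between(tkr->cycle_last, cycles, now))
		return -EOPNOTSUPP;

	*mono_ns = tkr->base_ns +
		   (((cycles - tkr->cycle_last) * tkr->mult) >> tkr->shift);
	return 0;
}

/*
 * Caller side: prefer the converted hardware timestamp, otherwise use
 * the time that was read in the handler (fallback_ns).
 */
static uint64_t sketch_sample_time(const struct sketch_tkr *tkr, uint64_t now,
				   uint64_t hw_cycles, uint64_t fallback_ns)
{
	uint64_t ns;

	if (sketch_cycles_to_mono(tkr, now, hw_cycles, &ns))
		return fallback_ns;
	return ns;
}
```

In this sketch a sample whose cycle value predates cycle_last is not
converted, and the caller silently gets the later, handler-side time
instead. That is the drawback I intend to document.
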
>
>> The TSC in PEBS is captured by HW when the sample was generated. There
>> should be a small delta compared with the time captured in the NMI
>> handler. But I think the delta should be acceptable as a backup solution
>> for the most analysis cases. Also, I don't think the case (the
>> cycle_last value is updated during the monitoring) should occur very
>> often either. So I drop the history support to simplify the function.
>
> So the reads and this function are *always* used in NMI context? Has
> this been stressed with things like SMIs to see how it does if
> interrupted in those cases?
Yes, it's *always* and only used in NMI context.
>
> My worry is that (as I bored everyone earlier), the
> ktime_get_*_fast_ns() interfaces already have some sharp edges and
> need a fair amount of thought as to when they should be used. This is
> sort of compounding that adding an interface that has further special
> cases where it can fail, making it difficult to fully understand and
> easier to accidentally misuse.
>
> My other concern is that interfaces always get stretched and used
> beyond anything they were initially planned for (see the
> ktime_get_*fast_ns() interfaces here as an example! :), and in this
> case the logic seems to have lots of implicit dependencies on the
> facts of your specific use case, so it seems a bit fragile should
> folks on other architectures with other constraints try to use it.
>
> So I just want to push a bit to think how you might be able to
> extend/generalize the existing get_system_crosststamp for your
> purposes, or alternatively find a way to simplify the logic's behavior
> so its less tied to specific constraints ("this works most of the time
> from NMI, but otherwise no promises"). Or at least some better
> documentation around the code, its uses and its constraints? ( "NMI
> safe" is not the same as "Only safe to use from NMI" :)

Since our usage is fixed (NMI context only), I prefer the latter. I think
extending/generalizing the existing function would only make it
extremely complex and inefficient. The new function should have the
same constraints as the existing ktime_get_mono_fast_ns(). Since perf
can already live with ktime_get_mono_fast_ns(), the constraints of the
new function should not be a problem. I will add more comments to
clarify the usage and constraints, to avoid misuse of the new function.

Thanks,
Kan