public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Liang, Kan" <kan.liang@linux.intel.com>
To: John Stultz <jstultz@google.com>
Cc: peterz@infradead.org, mingo@redhat.com, tglx@linutronix.de,
	sboyd@kernel.org, linux-kernel@vger.kernel.org,
	eranian@google.com, namhyung@kernel.org, ak@linux.intel.com
Subject: Re: [PATCH 1/3] timekeeping: NMI safe converter from a given time to monotonic
Date: Tue, 24 Jan 2023 15:12:50 -0500	[thread overview]
Message-ID: <1fb59dfa-1ab9-51ad-98c6-89431aa56918@linux.intel.com> (raw)
In-Reply-To: <CANDhNCp_0Os+e0A0LZ7yKw16mWai9MAPMPYL0p1NkcVxifh88w@mail.gmail.com>



On 2023-01-24 1:43 p.m., John Stultz wrote:
> On Tue, Jan 24, 2023 at 7:09 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>> On 2023-01-24 2:01 a.m., John Stultz wrote:
>>> On Mon, Jan 23, 2023 at 10:27 AM <kan.liang@linux.intel.com> wrote:
>>>> +               /*
>>>> +                * Check whether the given timestamp is on the current
>>>> +                * timekeeping interval.
>>>> +                */
>>>> +               now = tk_clock_read(tkr);
>>>> +               interval_start = tkr->cycle_last;
>>>> +               if (!cycle_between(interval_start, cycles, now))
>>>> +                       return -EOPNOTSUPP;
>>>
>>> So. I've not fully thought this out, but it seems like it would be
>>> quite likely that you'd run into the case where the cycle_last value
>>> is updated and your earlier TSC timestamp isn't valid for the current
>>> interval. The get_device_system_crosststamp() logic has a big chunk of
>>> complex code to try to handle this case by interpolating the cycle
>>> value back in time. How well does just failing in this case work out?
>>>
>>
>> For the case, perf fallback to the time captured in the NMI handler, via
>> ktime_get_mono_fast_ns().
> 
> This feels like *very* subtle behavior. Maybe I'm misunderstanding,
> but the goal seems to be to have more accurate timestamps on the hw
> events, and using the captured tsc timestamp avoids the measuring
> latency reading the time again. But if every timekeeping update
> interval (~tick) you transparently get a delayed value due to the
> fallback, it makes it hard to understand which timestamps are better
> or worse. The latency between two reads may be real or it may be just
> bad luck. This doesn't intuitively seem like a great benefit over more
> consistent latency of just using the ktime_get_mono_fast()
> timestamping.

Your understand is correct. We want a more accurate timestamp for the
analysis work.

As my understanding, the timekeeping update should not be very often. If
I read the code correctly, it should happen only when adjusting NTP or
suspending/resuming. If so, I think the drawback should not impact the
normal analysis work. I will call out the drwabacks in the comments
where the function is used.

> 
>> The TSC in PEBS is captured by HW when the sample was generated. There
>> should be a small delta compared with the time captured in the NMI
>> handler. But I think the delta should be acceptable as a backup solution
>> for the most analysis cases. Also, I don't think the case (the
>> cycle_last value is updated during the monitoring) should occur very
>> often either. So I drop the history support to simplify the function.
> 
> So the reads and this function are *always* used in NMI context?   Has
> this been stressed with things like SMIs to see how it does if
> interrupted in those cases?

Yes, it's *always* and only used in NMI context.

> 
> My worry is that (as I bored everyone earlier), the
> ktime_get_*_fast_ns() interfaces already have some sharp edges and
> need a fair amount of thought as to when they should be used. This is
> sort of compounding that adding an interface that has further special
> cases where it can fail, making it difficult to fully understand and
> easier to accidentally misuse.
> 
> My other concern is that interfaces always get stretched and used
> beyond anything they were initially planned for (see the
> ktime_get_*fast_ns() interfaces here as an example! :), and in this
> case the logic seems to have lots of implicit dependencies on the
> facts of your specific use case, so it seems a bit fragile should
> folks on other architectures with other constraints try to use it.
> 
> So I just want to push a bit to think how you might be able to
> extend/generalize the existing get_system_crosststamp for your
> purposes, or alternatively find a way to simplify the logic's behavior
> so its less tied to specific constraints ("this works most of the time
> from NMI, but otherwise no promises").  Or at least some better
> documentation around the code, its uses and its constraints? ( "NMI
> safe" is not the same as "Only safe to use from NMI" :)

Since our usage is fixed (only in NMI), I prefer the latter. I think
extending/generalizing the existing function only makes the function
extremely complex and low efficient. The new function should have the
same constraints as the existing ktime_get_mono_fast_ns(). Since perf
can live with the ktime_get_mono_fast_ns(), there should be no problem
with the new function for the constraints. I will add more comments to
clarify the usage and constraints to avoid the abuse of the new function.

Thanks,
Kan


  reply	other threads:[~2023-01-24 20:13 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-23 18:27 [PATCH 0/3] Convert TSC to monotonic clock for PEBS kan.liang
2023-01-23 18:27 ` [PATCH 1/3] timekeeping: NMI safe converter from a given time to monotonic kan.liang
2023-01-24  7:01   ` John Stultz
2023-01-24 15:09     ` Liang, Kan
2023-01-24 18:43       ` John Stultz
2023-01-24 20:12         ` Liang, Kan [this message]
2023-01-24 20:33           ` John Stultz
2023-01-24 22:08             ` Liang, Kan
2023-01-24 22:40               ` John Stultz
2023-01-25 16:44                 ` Liang, Kan
2023-01-24  8:51   ` Thomas Gleixner
2023-01-24  9:10     ` Stephane Eranian
2023-01-24 16:06       ` Liang, Kan
2023-01-27 13:30       ` Thomas Gleixner
2023-01-23 18:27 ` [PATCH 2/3] x86/tsc: Add set_tsc_system_counterval kan.liang
2023-01-23 18:27 ` [PATCH 3/3] perf/x86/intel/ds: Support monotonic clock for PEBS kan.liang
2023-01-24  6:56   ` John Stultz
2023-01-24 15:17     ` Liang, Kan
2023-01-24  6:13 ` [PATCH 0/3] Convert TSC to " John Stultz
2023-01-24 15:04   ` Liang, Kan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1fb59dfa-1ab9-51ad-98c6-89431aa56918@linux.intel.com \
    --to=kan.liang@linux.intel.com \
    --cc=ak@linux.intel.com \
    --cc=eranian@google.com \
    --cc=jstultz@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=sboyd@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox