From: "Liang, Kan" <kan.liang@linux.intel.com>
To: John Stultz <jstultz@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
peterz@infradead.org, mingo@redhat.com,
linux-kernel@vger.kernel.org, sboyd@kernel.org,
eranian@google.com, namhyung@kernel.org, ak@linux.intel.com,
adrian.hunter@intel.com
Subject: Re: [RFC PATCH V2 2/9] perf: Extend ABI to support post-processing monotonic raw conversion
Date: Tue, 14 Feb 2023 12:00:04 -0500
Message-ID: <0df181b9-fb34-78e8-1376-65d45f7f938f@linux.intel.com>
In-Reply-To: <6898b1c8-9dbf-67ce-46e6-15d5307ced25@linux.intel.com>
On 2023-02-14 9:51 a.m., Liang, Kan wrote:
>
>
> On 2023-02-13 5:22 p.m., John Stultz wrote:
>> On Mon, Feb 13, 2023 at 1:40 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>> On 2023-02-13 2:37 p.m., John Stultz wrote:
>>>> On Mon, Feb 13, 2023 at 11:08 AM <kan.liang@linux.intel.com> wrote:
>>>>>
>>>>> From: Kan Liang <kan.liang@linux.intel.com>
>>>>>
>>>>> The monotonic raw clock is not affected by NTP/PTP correction. The
>>>>> calculation of the monotonic raw clock can be done in the
>>>>> post-processing, which can reduce the kernel overhead.
>>>>>
>>>>> Add hw_time to struct perf_event_attr to tell the kernel to dump the
>>>>> raw HW time to user space. The perf tool will calculate the HW time
>>>>> in post-processing.
>>>>> Currently, only the monotonic raw conversion is supported.
>>>>> Only dump the raw HW time with PERF_RECORD_SAMPLE, because the accurate
>>>>> HW time can only be provided in a sample by HW. For other types of
>>>>> records, the user-requested clock is returned as usual. Nothing
>>>>> is changed.
>>>>>
>>>>> Add perf_event_mmap_page::cap_user_time_mono_raw ABI to dump the
>>>>> conversion information. The cap_user_time_mono_raw also indicates
>>>>> whether the monotonic raw conversion information is available.
>>>>> If yes, the clock monotonic raw can be calculated as
>>>>> mono_raw = base + (((cyc - last) * mult + nsec) >> shift)
>>>>
>>>> Again, I appreciate you reworking and resending this series out, I
>>>> know it took some effort.
>>>>
>>>> But oof, I'd really like to make sure we're not exporting timekeeping
>>>> internals to userland.
>>>>
>>>> I think Thomas' suggestion of doing the timestamp conversion in
>>>> post-processing was more about interpolating collected system times
>>>> with the counter (tsc) values captured.
>>>>
>>>
>>> Thomas, could you please clarify your suggestion regarding "the relevant
>>> conversion information" provided by the kernel?
>>> https://lore.kernel.org/lkml/87ilgsgl5f.ffs@tglx/
>>>
>>> Is it only the interpolation information or the entire conversion
>>> information (Mult, shift etc.)?
>>>
>>> If it's only the interpolation information, user space will lack the
>>> information needed to handle all the cases. If I understand John's comments
>>> correctly, it could also introduce some interpolation error which can only
>>> be addressed by the mult/shift conversion.
>>
>
>
> Thanks for the details John.
>
>> "Only" is maybe too strong a word. I think having the driver use
>> kernel timekeeping accessors to get CLOCK_MONOTONIC_RAW time with
>> counter values will minimize the error.
>>
>
> The key motivation for using the TSC in the PEBS record is to get an
> accurate timestamp for each record. We definitely want the conversion to
> have minimal error.
>
>
>> But again, it's not yet established that any interpolation error using
>> existing interfaces is great enough to be problematic here.
>>
>> The interpolation is pretty easy to do:
>>
>> do {
>>         start = readtsc();
>>         clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
>>         end = readtsc();
>>         delta = end - start;
>> } while (delta > THRESHOLD); // make sure the reads were not preempted
>> mid = start + (delta + 1) / 2; // midpoint of the two reads, rounded to nearest
>>
>
> How should the THRESHOLD be chosen? The THRESHOLD value also seems to
> affect the accuracy.
>
>
>> and be able to get you a fairly close matching of TSC to
>> CLOCK_MONOTONIC_RAW value.
>>
>> Once you have that mapping you can take a few samples and establish
>> the linear function.
>>
>> But that will have some error, so quantifying that error helps
>> establish why being able to get an atomic mapping of TSC ->
>> CLOCK_MONOTONIC_RAW would help.
>>
>> So I really don't think we need to expose the kernel internal values
>> to userland, but I'm willing to guess the atomic mapping (which the
>> driver will have access to, not userland) may be helpful for the fine
>> granularity you want in the trace.
>>
>
> If I understand correctly, the idea is to let the user space tool run
> the above interpolation algorithm several times to 'guess' the atomic
> mapping, then use that mapping to convert the TSC from the PEBS
> record. Is my understanding correct?
>
> If so, to be honest, I doubt we can get the accuracy we want.
>

I implemented a simple test to evaluate the error.

I collected the TSC -> CLOCK_MONOTONIC_RAW mapping using the above algorithm
at the start and end of the perf command.

        MONO_RAW          TSC
start   89553516545645    223619715214239
end     89562251233830    223641517000376

Here is what I get via the mult/shift conversion from this patch.

        MONO_RAW          TSC
PEBS    89555942691466    223625770878571

Then I used the time information from start and end to create a linear
function and 'guess' the MONO_RAW of the PEBS record from its TSC. I get
89555942692721.

There is a 1255 ns difference.
I tried several different PEBS records. The error is ~1000 ns.
I think that is an observable error.
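
For reference, the linear 'guess' can be sketched roughly as below. This
is only an illustration of the arithmetic, not code from the patch set:
struct tsc_sample and interp_mono_raw() are hypothetical names, and the
sketch assumes a compiler with unsigned __int128 (GCC/Clang), because the
intermediate product does not fit in 64 bits.

```c
#include <stdint.h>

/* One (TSC, CLOCK_MONOTONIC_RAW) pair captured in user space. */
struct tsc_sample {
	uint64_t tsc;      /* raw TSC value */
	uint64_t mono_raw; /* CLOCK_MONOTONIC_RAW, in ns */
};

/*
 * Hypothetical helper: linearly interpolate the MONO_RAW time for 'tsc'
 * from two reference samples. Assumes start.tsc <= tsc and
 * start.tsc < end.tsc.
 */
static uint64_t interp_mono_raw(struct tsc_sample start,
				struct tsc_sample end, uint64_t tsc)
{
	/* (tsc delta) * (ns span) overflows 64 bits, so widen first. */
	unsigned __int128 num = (unsigned __int128)(tsc - start.tsc) *
				(end.mono_raw - start.mono_raw);

	return start.mono_raw + (uint64_t)(num / (end.tsc - start.tsc));
}
```

Feeding it the start/end pairs above and the PEBS TSC gives the guessed
MONO_RAW value, which can then be compared with the mult/shift result.
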
Thanks,
Kan