From: "Liang, Kan" <kan.liang@linux.intel.com>
To: John Stultz <jstultz@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
peterz@infradead.org, mingo@redhat.com,
linux-kernel@vger.kernel.org, sboyd@kernel.org,
eranian@google.com, namhyung@kernel.org, ak@linux.intel.com,
adrian.hunter@intel.com
Subject: Re: [RFC PATCH V2 2/9] perf: Extend ABI to support post-processing monotonic raw conversion
Date: Tue, 21 Mar 2023 11:26:55 -0400 [thread overview]
Message-ID: <3ec02222-4096-224d-d3c7-4cea811d76ab@linux.intel.com> (raw)
In-Reply-To: <CANDhNComKRDdZJ8SJECNdoAzQhmR3vu9yKAtp7NKDmECxff=fg@mail.gmail.com>
Hi John,
On 2023-03-18 2:02 a.m., John Stultz wrote:
> On Mon, Mar 13, 2023 at 2:19 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>>
>>
>> On 2023-03-11 12:55 a.m., John Stultz wrote:
>>> On Thu, Mar 9, 2023 at 8:56 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>>> On 2023-03-08 8:17 p.m., John Stultz wrote:
>>>>> So I spent a little bit of time today adding some trace_printks to the
>>>>> timekeeping code so I could record the actual TSC and timestamps being
>>>>> calculated from CLOCK_MONOTONIC_RAW.
>>>>>
>>>>> I did catch one error in the test code, which unfortunately I'm to blame for:
>>>>> mid = start + (delta +(delta/2))/2; //round-closest
>>>>>
>>>>> That should be
>>>>> mid = start + (delta +(2/2))/2 //round-closest
>>>>> or more simply
>>>>> mid = start + (delta +1)/2; //round-closest
>>>>>
>>>>> Generalized rounding should be: (value + (DIV/2))/DIV), but I'm
>>>>> guessing with two as the divisor, my brain mixed it up and typed
>>>>> "delta". My apologies!
>>>>>
>>>>> With that fix, I'm seeing closer to ~500ns of error in the
>>>>> interpolation, just using the userland sampling. Now, I've also
>>>>> disabled vsyscalls for this (otherwise I wouldn't be able to
>>>>> trace_printk), so the error likely would be higher than with
>>>>> vsyscalls.
>>>>>
>>>>> Now, part of the error is that:
>>>>> start= rdtsc();
>>>>> clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
>>>>> end = rdtsc();
>>>>>
>>>>> Ends up looking like
>>>>> start= rdtsc();
>>>>> clock_gettime() {
>>>>> now = rdtsc();
>>>>> delta = now - last;
>>>>> ns = (delta * mult) >> shift
>>>>> [~midpoint~]
>>>>> ts->nsec = base_ns + ns;
>>>>> ts->sec = base_sec;
>>>>> normalize_ts(ts)
>>>>> }
>>>>> end = rdtsc();
>>>>>
>>>>> And so by taking the mid-point we're always a little skewed from where
>>>>> the tsc was actually read. Looking at the data for my case the tsc
>>>>> read seems to be ~12% in, so you could instead try:
>>>>>
>>>>> delta = end - start;
>>>>> p12 = start + ((delta * 12) + (100/2))/100;
>>>>>
>>>>> With that adjustment, I'm seeing error around ~40ns.
>>>>>
>>>>> Mind giving that a try?
>>>>
>>>> I tried both the new mid and p12. The error becomes even larger.
>>>>
>>>> With new mid (start + (delta +1)/2), the error is now ~3800ns
>>>> With p12 adjustment, the error is ~6700ns.
>>>>
>>>>
>>>> Here is how I run the test.
>>>> $./time
>>>> $perf record -e cycles:upp --clockid monotonic_raw $some_workaround
>>>> $./time
>>>>
>>>> Here are some raw data.
>>>>
>>>> For the first ./time,
>>>> start: 961886196018
>>>> end: 961886215603
>>>> MONO_RAW: 341485848531
>>>>
>>>> For the second ./time,
>>>> start: 986870117783
>>>> end: 986870136152
>>>> MONO_RAW: 351495432044
>>>>
>>>> Here is the time generated from one PEBS record.
>>>> TSC: 968210217271
>>>> PEBS_MONO_RAW (calculated via kernel conversion information): 344019503072
>>>>
>>>> Using new mid (start + (delta +1)/2), the guessed PEBS_MONO_RAW is
>>>> 344019506897. The error is 3825ns.
>>>> Using p12 adjustment, the guessed PEBS_MONO_RAW is 344019509831.
>>>> The error is 6759ns
>>>
>>> Huh. I dunno. That seems wild that the error increased.
>>>
>>> Just in case something is going astray with the PEBS_MONO_RAW logic,
>>> can you apply the hack patch I was using to display the MONOTONIC_RAW
>>> values the kernel calculates?
>>> https://github.com/johnstultz-work/linux-dev/commit/8d7896b078965b059ea5e8cc21841580557f6df6
>>>
>>> It uses trace_printk, so you'll have to cat /sys/kernel/tracing/trace
>>> to get the output.
>>>
>>
>>
>> $ ./time_3
>> start: 7358368893806 end: 7358368902944 delta: 9138
>> MONO_RAW: 2899739790738
>> MID: 7358368898375
>> P12: 7358368894903
>> $ sudo cat /sys/kernel/tracing/trace | grep time_3
>> time_3-1443 [002] ..... 2899.858936: ktime_get_raw_ts64:
>> JDB: timekeeping_get_delta cycle_now: 7358368897679
>> time_3-1443 [002] ..... 2899.858937: ktime_get_raw_ts64:
>> JDB: ktime_get_raw_ts64: 2899739790738
>>
>> The error between MID and cycle_now is -696ns
>> The error between P12 and cycle_now is 2776ns
>
> Hey Kan,
> So I'm terribly sorry, I'm a bit underwater right now and haven't
> had time to look deeper at this. The MID case you have above looks
> closer to what I was seeing but I can't explain why the 12% case is
> worse.
>
> Since I feel it's not really fair to object to your patch but not have
> the time to work through an alternative with you, I'm going to
> withdraw my objection (though others may persist!).
> I'd still really prefer if we avoided exposing internal timekeeping
> state directly to userland, and it would be good to see some further
> exploration in other directions, but there is the existing perf mmap
> precedence (even if I dislike it). Sorry I can't be of more help to
> find a better approach here. :(
>
Thank you all the same. I think we learnt that there should be more work
for the pure user space solution. It is not a solution for the monotonic
raw conversion for now.
I have no idea how to do the post-processing conversion without the
internal conversion information.
So, for now, there seems only two candidate solutions.
- Pure kernel solution (Similar to V1).
- Expose the internal conversion information to the user space and does
post-processing conversion. (V2)
I will ping Thomas in the other thread and see if he has any suggestions.
Thanks,
Kan
next prev parent reply other threads:[~2023-03-21 15:27 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-13 19:07 [RFC PATCH V2 0/9] Convert TSC to monotonic raw clock for PEBS kan.liang
2023-02-13 19:07 ` [RFC PATCH V2 1/9] timekeeping: Expose the conversion information of monotonic raw kan.liang
2023-02-13 19:28 ` John Stultz
2023-02-13 19:07 ` [RFC PATCH V2 2/9] perf: Extend ABI to support post-processing monotonic raw conversion kan.liang
2023-02-13 19:37 ` John Stultz
2023-02-13 21:40 ` Liang, Kan
2023-02-13 22:22 ` John Stultz
2023-02-14 10:43 ` Peter Zijlstra
2023-02-14 17:46 ` Liang, Kan
2023-02-14 19:37 ` John Stultz
2023-02-14 20:09 ` Liang, Kan
2023-02-14 20:21 ` John Stultz
2023-03-12 20:50 ` Andi Kleen
2023-02-14 19:34 ` John Stultz
2023-02-14 14:51 ` Liang, Kan
2023-02-14 17:00 ` Liang, Kan
2023-02-14 20:11 ` John Stultz
2023-02-14 20:38 ` Liang, Kan
2023-02-17 23:11 ` John Stultz
2023-03-08 18:44 ` Liang, Kan
2023-03-09 1:17 ` John Stultz
2023-03-09 16:56 ` Liang, Kan
2023-03-11 5:55 ` John Stultz
2023-03-13 21:19 ` Liang, Kan
2023-03-18 6:02 ` John Stultz
2023-03-21 15:26 ` Liang, Kan [this message]
2023-02-14 19:52 ` John Stultz
2023-02-13 19:07 ` [RFC PATCH V2 3/9] perf/x86: Factor out x86_pmu_sample_preload() kan.liang
2023-02-13 19:07 ` [RFC PATCH V2 4/9] perf/x86: Enable post-processing monotonic raw conversion kan.liang
2023-02-14 20:02 ` Thomas Gleixner
2023-02-14 20:21 ` Liang, Kan
2023-02-14 20:55 ` Thomas Gleixner
2023-03-21 15:38 ` Liang, Kan
2023-02-13 19:07 ` [RFC PATCH V2 5/9] perf/x86/intel: Enable large PEBS for monotonic raw kan.liang
2023-02-13 19:07 ` [RFC PATCH V2 6/9] tools headers UAPI: Sync linux/perf_event.h with the kernel sources kan.liang
2023-02-13 19:07 ` [RFC PATCH V2 7/9] perf session: Support the monotonic raw clock conversion information kan.liang
2023-02-13 19:07 ` [RFC PATCH V2 8/9] perf evsel, tsc: Support the monotonic raw clock conversion kan.liang
2023-02-13 19:07 ` [RFC PATCH V2 9/9] perf evsel: Enable post-processing monotonic raw conversion by default kan.liang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3ec02222-4096-224d-d3c7-4cea811d76ab@linux.intel.com \
--to=kan.liang@linux.intel.com \
--cc=adrian.hunter@intel.com \
--cc=ak@linux.intel.com \
--cc=eranian@google.com \
--cc=jstultz@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=sboyd@kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox