From: Thomas Gleixner <tglx@linutronix.de>
To: Delyan Kratunov <delyan@delyan.me>, linux-kernel@vger.kernel.org
Subject: Re: Posix process cpu timer inaccuracies
Date: Fri, 01 Mar 2024 22:27:26 +0100
Message-ID: <87jzmlwsld.ffs@tglx>
In-Reply-To: <4547873.LvFx2qVVIh@discovery>
Delyan!
On Mon, Feb 26 2024 at 16:29, Delyan Kratunov wrote:
>> I don't know and those assumptions have been clearly wrong at the point
>> where the tool was written.
>
> That was my impression as well, thanks for confirming. (I've found at least 3
> tools with this same incorrect belief)
The wonders of error proliferation by mindless copy & pasta and/or design
borrowing.
> Absolutely, the ability to write a profiler with perf_event_open is not in
> question at all. However, not every situation allows for PMU or
> perf_event_open access. Timers could form a nice middle ground, in exactly the
> way people have tried to use them.
>
> I'd like to push back a little on the "CLOCK_THREAD_CPUTIME_ID fixes things"
> point, though. From an application and library point of view, the per-thread
> clocks are harder to use - you need to either orchestrate every thread to
> participate voluntarily or poll the thread ids and create timers from another
> thread. In perf_event_open, this is solved via the .inherit/.inherit_thread
> bits.
I did not say it's easy and fixes all problems magically :)
As accessing a different thread/process requires ptrace permissions this
might be solvable via ptrace, which might turn out to be too heavy weight.
Though it would be certainly possible to implement inheritance for those
timers and let the kernel set them up for all existing and future threads.
That's a bit tricky vs. accounting on behalf of and association to the
profiler thread in the (v)fork() case and also needs some thought about
how the profiler thread gets informed of the newly associated timer_id,
but I think it's doable.
> More importantly, they don't work for all workloads. If I have 10 threads that
> each run for 5ms, a 10ms process timer would fire 5 times, while per-thread
> 10ms timers would never fire. You can easily imagine an application that
> accrues all its cpu time in a way that doesn't generate a single signal (in
> the extreme, threads only living a single tick).
That's true, but you have to look at the life time rules of those
timers.
A CLOCK_THREAD_CPUTIME_ID timer is owned by the thread which creates it,
no matter what the monitored target thread is. So when the monitored
thread exits then it disarms the timer, but the timer itself stays
accessible to the owner. That means the owner can still query the timer.
As of today a timer_get(CLOCK_THREAD_CPUTIME_ID) after the monitored
thread exited results in { .it_value = 0, .it_interval = 0 }.
We can't change that in general, but if we go and do the inheritance mode,
then the timer would be owned by the profiler thread. Even without
inheritance mode we can handle a special flag for timer_create() to
denote that this is a magic timer :)
So that magic flag would preserve the accumulated runtime in the timer
in some way when the thread exits. We would then either return that in
timer_get() along with some magic to denote that the monitored thread is
gone, or add a new timer_get_foo() syscall for it.
Whether the profiler then polls the timers periodically or acts on an
exit signal is a user space implementation detail.
Thanks,
tglx