From: micpof@gmail.com (Michael O'Farrell)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] Add cap_user_time aarch64
Date: Fri, 13 Jul 2018 12:24:51 -0700 [thread overview]
Message-ID: <20180713192451.49280-1-micpof@gmail.com> (raw)
It is useful to get the running time of a thread. Doing so in an
efficient manner can be important for performance of user applications.
Avoiding system calls in `clock_gettime` when handling
CLOCK_THREAD_CPUTIME_ID is important. Other clocks are handled in the
VDSO, but CLOCK_THREAD_CPUTIME_ID falls back on the system call.
CLOCK_THREAD_CPUTIME_ID is not handled in the VDSO since it would have
costs associated with maintaining updated user space accessible time
offsets. These offsets have to be updated everytime the a thread is
scheduled/descheduled. However, for programs regularly checking the
running time of a thread, this is a performance improvement.
This patch takes a middle ground, and adds support for cap_user_time an
optional feature of the perf_event API. This way costs are only
incurred when the perf_event api is enabled. This is done the same way
as it is in x86. This patch also enables the cap_user_time facility of
perf_event since it comes free with the stable clock.
Ultimately this allows calculating the thread running time in userspace
on aarch64 as follows (adapted from perf_event_open manpage):
u32 seq, time_mult, time_shift;
u64 running, count, time_offset, quot, rem, delta;
struct perf_event_mmap_page *pc;
pc = buf; // buf is the perf event mmaped page as documented in the API.
if (pc->cap_usr_time) {
do {
seq = pc->lock;
barrier();
running = pc->time_running;
count = readCNTVCT_EL0(); // Read ARM hardware clock.
time_offset = pc->time_offset;
time_mult = pc->time_mult;
time_shift = pc->time_shift;
barrier();
} while (pc->lock != seq);
quot = (count >> time_shift);
rem = count & (((u64)1 << time_shift) - 1);
delta = time_offset + quot * time_mult +
((rem * time_mult) >> time_shift);
running += delta;
// running now has the current nanosecond level thread time.
}
Summary of changes in the patch:
For aarch64 systems with a stable scheduler clock, make
arch_perf_update_userpage update the timing information stored in the
perf_event page. Requiring the following calculations:
- Calculate the appropriate time_mult, and time_shift factors to convert
ticks to nano seconds for the current clock frequency.
- Adjust the mult and shift factors to avoid shift factors of 32 bits.
(possibly unnecessary)
- The time_offset userspace should apply when doing calculations:
negative the current sched time (now), because time_running and
time_enabled fields of the perf_event page have just been updated.
- The time_zero field is zero since we have are using the stable
sched_clock, and it started at zero.
Toggle bits to appropriate values:
- Enable cap_user_time
- Enable cap_user_time_zero. Since the sched clock is stable, and
it started at zero this is able to be true
Signed-off-by: Michael O'Farrell <micpof@gmail.com>
---
arch/arm64/kernel/perf_event.c | 39 ++++++++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 33147aacdafd..c63f31caf7ac 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -25,6 +25,7 @@
#include <asm/virt.h>
#include <linux/acpi.h>
+#include <linux/clocksource.h>
#include <linux/of.h>
#include <linux/perf/arm_pmu.h>
#include <linux/platform_device.h>
@@ -1127,3 +1128,41 @@ static int __init armv8_pmu_driver_init(void)
return arm_pmu_acpi_probe(armv8_pmuv3_init);
}
device_initcall(armv8_pmu_driver_init)
+
+#ifndef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
+void arch_perf_update_userpage(struct perf_event *event,
+ struct perf_event_mmap_page *userpg, u64 now)
+{
+ u32 freq;
+ u32 shift;
+
+ /*
+ * Internal timekeeping for enabled/running/stopped times
+ * is always computed with the sched_clock.
+ */
+ freq = arch_timer_get_rate();
+ userpg->cap_user_time = 1;
+
+ clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
+ NSEC_PER_SEC, 0);
+ /*
+ * time_shift is not expected to be greater than 31 due to
+ * the original published conversion algorithm shifting a
+ * 32-bit value (now specifies a 64-bit value) - refer
+ * perf_event_mmap_page documentation in perf_event.h.
+ */
+ if (shift == 32) {
+ shift = 31;
+ userpg->time_mult >>= 1;
+ }
+ userpg->time_shift = (u16)shift;
+ userpg->time_offset = -now;
+
+ /*
+ * We are always using the sched_clock as the base, so
+ * cap_user_time_zero makes sense.
+ */
+ userpg->cap_user_time_zero = 1;
+ userpg->time_zero = 0;
+}
+#endif /* !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK */
--
2.17.1
next reply other threads:[~2018-07-13 19:24 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-13 19:24 Michael O'Farrell [this message]
2018-07-24 15:14 ` [PATCH] Add cap_user_time aarch64 Will Deacon
2018-07-24 15:45 ` Peter Zijlstra
2018-07-24 22:41 ` Michael O'Farrell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180713192451.49280-1-micpof@gmail.com \
--to=micpof@gmail.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).