From: Kurt Kanzenbach <kurt@linutronix.de>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Jesper Dangaard Brouer <jbrouer@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Joanne Koong <joannelkoong@gmail.com>,
Jiri Olsa <jolsa@kernel.org>,
Dave Marchevsky <davemarchevsky@fb.com>,
Lorenzo Bianconi <lorenzo@kernel.org>,
Geliang Tang <geliang.tang@suse.com>,
Jakub Sitnicki <jakub@cloudflare.com>,
Network Development <netdev@vger.kernel.org>,
bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next] bpf: Add BPF-helper for accessing CLOCK_TAI
Date: Wed, 08 Jun 2022 08:13:28 +0200 [thread overview]
Message-ID: <87zginy96f.fsf@kurt> (raw)
In-Reply-To: <CAADnVQKqo1XfrPO8OYA1VpArKHZotuDjGNtxM0AftUj_R+vU7g@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 4272 bytes --]
On Tue Jun 07 2022, Alexei Starovoitov wrote:
> On Tue, Jun 7, 2022 at 12:35 PM Jesper Dangaard Brouer
> <jbrouer@redhat.com> wrote:
>>
>>
>> On 07/06/2022 11.14, Thomas Gleixner wrote:
>> > Alexei,
>> >
>> > On Mon, Jun 06 2022 at 08:57, Alexei Starovoitov wrote:
>> >> On Mon, Jun 6, 2022 at 3:38 AM Kurt Kanzenbach <kurt@linutronix.de> wrote:
>> >>>
>> >>> From: Jesper Dangaard Brouer <brouer@redhat.com>
>> >>>
>> >>> Commit 3dc6ffae2da2 ("timekeeping: Introduce fast accessor to clock tai")
>> >>> introduced a fast and NMI-safe accessor for CLOCK_TAI. Especially in time
>> >>> sensitive networks (TSN), where all nodes are synchronized by Precision Time
>> >>> Protocol (PTP), it's helpful to have the possibility to generate timestamps
>> >>> based on CLOCK_TAI instead of CLOCK_MONOTONIC. With a BPF helper for TAI in
>> >>> place, it becomes very convenient to correlate activity across different
>> >>> machines in the network.
>> >>
>> >> That's a fresh feature. It feels risky to bake it into uapi already.
>> >
>> > What? That's just support for a different CLOCK. What's so risky about
>> > it?
>>
>> I didn't think it was "risky" as this is already exported as:
>> EXPORT_SYMBOL_GPL(ktime_get_tai_fast_ns);
>>
>> Correct me if I'm wrong, but this simple gives BPF access to CLOCK_TAI
>> (see man clock_gettime(2)), right?
>> And CLOCK_TAI is not really a new/fresh type of CLOCK.
>>
>> Especially for networking we need this CLOCK_TAI time as HW LaunchTime
>> need this (e.g. see qdisc's sch_etf.c and sch_taprio.c).
In addition to Tx launchtime I have two other uses cases in mind:
Timestamping and policing.
>
> I see. I interpreted the commit log that commit 3dc6ffae2da2
> introduced TAI into the kernel for the first time.
> But it introduced the NMI safe version of it, right?
Yes, exactly. The clock itself is nothing new. Only the NMI safe version
of it is. It is designated to be used e.g., in tracing or bpf.
I'll update the changelog.
>
>> >
>> >> imo it would be better to annotate tk_core variable in vmlinux BTF.
>> >> Then progs will be able to read all possible timekeeper offsets.
>> >
>> > We are exposing APIs. APIs can be supported, but exposing implementation
>> > details creates ABIs of the worst sort because that prevents the kernel
>> > from changing the implementation. We've seen the fallout with the recent
>> > tracepoint changes already.
>>
>> Hmm... annotate tk_core variable in vmlinux BTF and letting BPF progs
>> access this seems like an unsafe approach and we tempt BPF-developers to
>> think other parts are okay to access.
>
> It is safe to access.
> Whether garbage will be read it's a different story.
>
> The following works (with lose definition of 'works'):
>
> extern const void tk_core __ksym;
>
> struct timekeeper {
> long long offs_tai;
> } __attribute__((preserve_access_index));
>
> struct seqcount_raw_spinlock {
> } __attribute__((preserve_access_index));
>
> long get_clock_tai(void)
> {
> long long off = 0;
> void *addr = (void *)&tk_core +
> ((bpf_core_type_size(struct seqcount_raw_spinlock) + 7) / 8) * 8 +
> bpf_core_field_offset(struct timekeeper, offs_tai);
>
> bpf_probe_read_kernel(&off, sizeof(off), addr);
> return bpf_ktime_get_ns() + off;
> }
Thanks for the example.
>
> It's ugly, but no kernel changes are necessary.
> If you need to access clock_tai you can do so on older kernels too.
> Even on android 4.19 kernels.
>
> It's not foolproof. Future kernel changes will break it,
> but CO-RE will detect the breakage.
> The prog author would have to adjust the prog every time.
>
> People do these kinds of tricks all the time.
>
> Note that above was possible even before CO-RE came to existence.
> The first day bpf_probe_read_kernel() became available 8 years ago
> the tracing progs could read any kernel memory.
> Even earlier kprobes could read it for 10+ years.
>
> And in all those years bpf progs accessing kernel internals
> didn't freeze kernel development. bpf progs don't extend
> uapi surface unless the changes are in uapi/bpf.h.
>
> Anyway I guess new helper bpf_ktime_get_tai_ns() is ok, since
> it's so trivial, but selftest is necessary.
OK. I'll add a selftest and resolve the warnings detected by netdev ci.
Thanks,
Kurt
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 861 bytes --]
next prev parent reply other threads:[~2022-06-08 7:28 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-06 10:37 [PATCH bpf-next] bpf: Add BPF-helper for accessing CLOCK_TAI Kurt Kanzenbach
2022-06-06 15:57 ` Alexei Starovoitov
2022-06-07 9:14 ` Thomas Gleixner
2022-06-07 19:35 ` Jesper Dangaard Brouer
2022-06-08 2:01 ` Alexei Starovoitov
2022-06-08 6:13 ` Kurt Kanzenbach [this message]
2022-08-02 7:06 ` Kurt Kanzenbach
2022-08-02 15:03 ` Alexei Starovoitov
2022-08-03 6:29 ` Kurt Kanzenbach
2022-08-03 9:28 ` Maciej Fijalkowski
2022-08-03 17:29 ` Andrii Nakryiko
2022-08-03 17:28 ` Andrii Nakryiko
2022-08-04 6:40 ` Kurt Kanzenbach
2022-08-09 0:28 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87zginy96f.fsf@kurt \
--to=kurt@linutronix.de \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=brouer@redhat.com \
--cc=daniel@iogearbox.net \
--cc=davemarchevsky@fb.com \
--cc=geliang.tang@suse.com \
--cc=jakub@cloudflare.com \
--cc=jbrouer@redhat.com \
--cc=joannelkoong@gmail.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kafai@fb.com \
--cc=kpsingh@kernel.org \
--cc=lorenzo@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=songliubraving@fb.com \
--cc=tglx@linutronix.de \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).