Re: [PATCH bpf-next 1/3] bpf: Introduce bpf_timer

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: davem@davemloft.net, daniel@iogearbox.net, andrii@kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH bpf-next 1/3] bpf: Introduce bpf_timer
Date: Thu, 03 Jun 2021 12:59:22 +0200	[thread overview]
Message-ID: <874kef9pth.fsf@toke.dk> (raw)
In-Reply-To: <20210603015809.l2dez754vzxjueew@ast-mbp.dhcp.thefacebook.com>

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Thu, Jun 03, 2021 at 12:21:05AM +0200, Toke Høiland-Jørgensen wrote:
>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>> 
>> > On Thu, May 27, 2021 at 06:57:17PM +0200, Toke Høiland-Jørgensen wrote:
>> >> >     if (val) {
>> >> >         bpf_timer_init(&val->timer, timer_cb2, 0);
>> >> >         bpf_timer_start(&val->timer, 1000 /* call timer_cb2 in 1 msec */);
>> >> 
>> >> nit: there are 1M nanoseconds in a millisecond :)
>> >
>> > oops :)
>> >
>> >> >     }
>> >> > }
>> >> >
>> >> > This patch adds helper implementations that rely on hrtimers
>> >> > to call bpf functions as timers expire.
>> >> > The following patch adds necessary safety checks.
>> >> >
>> >> > Only programs with CAP_BPF are allowed to use bpf_timer.
>> >> >
>> >> > The amount of timers used by the program is constrained by
>> >> > the memcg recorded at map creation time.
>> >> >
>> >> > The bpf_timer_init() helper is receiving hidden 'map' and 'prog' arguments
>> >> > supplied by the verifier. The prog pointer is needed to do refcnting of bpf
>> >> > program to make sure that program doesn't get freed while timer is armed.
>> >> >
>> >> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>> >> 
>> >> Overall this LGTM, and I believe it will be usable for my intended use
>> >> case. One question:
>> >> 
>> >> With this, it will basically be possible to create a BPF daemon, won't
>> >> it? I.e., if a program includes a timer and the callback keeps re-arming
>> >> itself this will continue indefinitely even if userspace closes all refs
>> >> to maps and programs? Not saying this is a problem, just wanted to check
>> >> my understanding (i.e., that there's not some hidden requirement on
>> >> userspace keeping a ref open that I'm missing)...
>> >
>> > That is correct.
>> > Another option would be to auto-cancel the timer when the last reference
>> > to the prog drops. That may feel safer, since forever
>> > running bpf daemon is a certainly new concept.
>> > The main benefits of doing prog_refcnt++ from bpf_timer_start are ease
>> > of use and no surprises.
>> > Disappearing timer callback when prog unloads is outside of bpf prog control.
>> > For example the tracing bpf prog might collect some data and periodically
>> > flush it to user space. The prog would arm the timer and when callback
>> > is invoked it would send the data via ring buffer and start another
>> > data collection cycle.
>> > When the user space part of the service exits it doesn't have
>> > an ability to enforce the flush of the last chunk of data.
>> > It could do prog_run cmd that will call the timer callback,
>> > but it's racy.
>> > The solution to this problem could be __init and __fini
>> > sections that will be invoked when the prog is loaded
>> > and when the last refcnt drops.
>> > It's a complementary feature though.
>> > The prog_refcnt++ from bpf_timer_start combined with a prog
>> > explicitly doing bpf_timer_cancel from __fini
>> > would be the most flexible combination.
>> > This way the prog can choose to be a daemon or it can choose
>> > to cancel its timers and do data flushing when the last prog
>> > reference drops.
>> > The prog refcnt would be split (similar to uref). The __fini callback
>> > will be invoked when refcnt reaches zero, but all increments
>> > done by bpf_timer_start will be counted separately.
>> > The user space wouldn't need to do the prog_run command.
>> > It would detach the prog and close(prog_fd).
>> > That will trigger __fini callback that will cancel the timers
>> > and the prog will be fully unloaded.
>> > That would make bpf progs resemble kernel modules even more.
>> 
>> I like the idea of a "destructor" that will trigger on refcnt drop to
>> zero. And I do think a "bpf daemon" is potentially a useful, if novel,
>> concept.
>
> I think so too. Long ago folks requested periodic bpf progs to
> do sampling in tracing. All these years attaching bpf prog
> to a perf_event was a workaround for such feature request.
> perf_event bpf prog can be pinned in perf_event array,
> so "bpf daemon" kinda exist today. Just more convoluted.

Right, agreed - triggering periodic sampling directly from BPF does seem
like the right direction.

>> The __fini thing kinda supposes a well-behaved program, though, right?
>> I.e., it would be fairly trivial to write a program that spins forever
>> by repeatedly scheduling the timer with a very short interval (whether
>> by malice or bugginess).
>
> It's already possible without bpf_timer.

Hmm, fair point.

>> So do we need a 'bpfkill' type utility to nuke
>> buggy programs, or how would resource constraints be enforced?
>
> That is possible without 'bpfkill'.
> bpftool can delete map element that contains bpf_timer and
> that will cancel it. I'll add tests to make sure it's the case.

Ah, right, of course! Thanks, LGTM then :)

-Toke

next prev parent reply	other threads:[~2021-06-03 10:59 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-27  4:02 [PATCH bpf-next 0/3] bpf: Introduce BPF timers Alexei Starovoitov
2021-05-27  4:02 ` [PATCH bpf-next 1/3] bpf: Introduce bpf_timer Alexei Starovoitov
2021-05-27 16:57   ` Toke Høiland-Jørgensen
2021-06-02  1:46     ` Alexei Starovoitov
2021-06-02 22:21       ` Toke Høiland-Jørgensen
2021-06-03  1:58         ` Alexei Starovoitov
2021-06-03 10:59           ` Toke Høiland-Jørgensen [this message]
2021-06-03 17:21             ` Andrii Nakryiko
2021-06-02 22:08   ` Andrii Nakryiko
2021-06-03  1:53     ` Alexei Starovoitov
2021-06-03 17:10       ` Andrii Nakryiko
2021-06-04  1:12         ` Alexei Starovoitov
2021-06-04  4:17           ` Andrii Nakryiko
2021-06-04 18:16             ` Alexei Starovoitov
2021-05-27  4:02 ` [PATCH bpf-next 2/3] bpf: Add verifier checks for bpf_timer Alexei Starovoitov
2021-06-02 22:34   ` Andrii Nakryiko
2021-06-02 22:38     ` Andrii Nakryiko
2021-06-03  2:04     ` Alexei Starovoitov
2021-06-03 17:35       ` Andrii Nakryiko
2021-05-27  4:02 ` [PATCH bpf-next 3/3] selftests/bpf: Add bpf_timer test Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=874kef9pth.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).