From: Jiri Olsa <olsajiri@gmail.com>
To: Song Liu <song@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Hao Sun <sunhao.th@gmail.com>,
bpf@vger.kernel.org, Martin KaFai Lau <kafai@fb.com>,
Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@chromium.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>
Subject: Re: [PATCH bpf-next] bpf: Remove trace_printk_lock lock
Date: Tue, 13 Dec 2022 22:53:33 +0100 [thread overview]
Message-ID: <Y5j0XYQyiDP1Uu68@krava> (raw)
In-Reply-To: <CAPhsuW5sQaspOhurLWm0igDUJk+V9ihmt0WnjaKsq1gJ66F6Gw@mail.gmail.com>
On Tue, Dec 13, 2022 at 10:48:43AM -0800, Song Liu wrote:
> On Tue, Dec 13, 2022 at 6:09 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Both bpf_trace_printk and bpf_trace_vprintk helpers use static buffer
> > guarded with trace_printk_lock spin lock.
> >
> > The spin lock contention causes issues with bpf programs attached to
> > contention_begin tracepoint [1] [2].
> >
> > Andrii suggested we could get rid of the contention by using trylock,
> > but we could actually get rid of the spinlock completely by using
> > percpu buffers the same way as for bin_args in bpf_bprintf_prepare
> > function.
> >
> > Adding 4 per cpu buffers (1k each) which should be enough for all
> > possible nesting contexts (normal, softirq, irq, nmi) or possible
> > (yet unlikely) probe within the printk helpers.
> >
> > In very unlikely case we'd run out of the nesting levels the printk
> > will be omitted.
> >
> > [1] https://lore.kernel.org/bpf/CACkBjsakT_yWxnSWr4r-0TpPvbKm9-OBmVUhJb7hV3hY8fdCkw@mail.gmail.com/
> > [2] https://lore.kernel.org/bpf/CACkBjsaCsTovQHFfkqJKto6S4Z8d02ud1D7MPESrHa1cVNNTrw@mail.gmail.com/
> >
> > Reported-by: Hao Sun <sunhao.th@gmail.com>
> > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> > kernel/trace/bpf_trace.c | 61 +++++++++++++++++++++++++++++++---------
> > 1 file changed, 47 insertions(+), 14 deletions(-)
> >
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 3bbd3f0c810c..b9287b3a5540 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -369,33 +369,62 @@ static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
> > return &bpf_probe_write_user_proto;
> > }
> >
> > -static DEFINE_RAW_SPINLOCK(trace_printk_lock);
> > -
> > #define MAX_TRACE_PRINTK_VARARGS 3
> > #define BPF_TRACE_PRINTK_SIZE 1024
> > +#define BPF_TRACE_PRINTK_LEVELS 4
> > +
> > +struct trace_printk_buf {
> > + char data[BPF_TRACE_PRINTK_LEVELS][BPF_TRACE_PRINTK_SIZE];
> > + int level;
> > +};
> > +static DEFINE_PER_CPU(struct trace_printk_buf, printk_buf);
> > +
> > +static void put_printk_buf(struct trace_printk_buf __percpu *buf)
> > +{
> > + if (WARN_ON_ONCE(this_cpu_read(buf->level) == 0))
> > + return;
> > + this_cpu_dec(buf->level);
> > + preempt_enable();
> > +}
> > +
> > +static bool get_printk_buf(struct trace_printk_buf __percpu *buf, char **data)
> > +{
> > + int level;
> > +
> > + preempt_disable();
>
> Can we use migrate_disable() instead?
I think that should work.. while checking on that I found
comment in in include/linux/preempt.h (though dated):
The end goal must be to get rid of migrate_disable
but looks like both should work here and there are trade offs
for using each of them
>
> > + level = this_cpu_inc_return(buf->level);
> > + if (level > BPF_TRACE_PRINTK_LEVELS) {
>
> Maybe add WARN_ON_ONCE() here?
ok, will add
thanks,
jirka
next prev parent reply other threads:[~2022-12-13 21:56 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-13 14:08 [PATCH bpf-next] bpf: Remove trace_printk_lock lock Jiri Olsa
2022-12-13 18:48 ` Song Liu
2022-12-13 21:53 ` Jiri Olsa [this message]
2022-12-13 23:52 ` Yonghong Song
2022-12-14 9:33 ` Jiri Olsa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y5j0XYQyiDP1Uu68@krava \
--to=olsajiri@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=kafai@fb.com \
--cc=kpsingh@chromium.org \
--cc=sdf@google.com \
--cc=song@kernel.org \
--cc=songliubraving@fb.com \
--cc=sunhao.th@gmail.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.