All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Andrii Nakryiko <andrii@kernel.org>
Cc: linux-trace-kernel@vger.kernel.org, rostedt@goodmis.org,
	mhiramat@kernel.org, bpf@vger.kernel.org,
	mathieu.desnoyers@efficios.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH bpf-next 3/3] uprobes: add speculative lockless system-wide uprobe filter check
Date: Wed, 13 Mar 2024 14:19:26 +0100	[thread overview]
Message-ID: <20240313131926.GA19986@redhat.com> (raw)
In-Reply-To: <20240312210233.1941599-4-andrii@kernel.org>

I forgot everything about this code, plus it has changed a lot since
I looked at it many years ago, but ...

I think this change is fine but the changelog looks a bit confusing
(overcomplicated) to me.

On 03/12, Andrii Nakryiko wrote:
>
> This patch adds a speculative check before grabbing that rwlock. If
> nr_systemwide is non-zero, lock is skipped and event is passed through.
> From examining existing logic it looks correct and safe to do. If
> nr_systemwide is being modified under rwlock in parallel, we have to
> consider basically just one important race condition: the case when
> nr_systemwide is dropped from one to zero (from
> trace_uprobe_filter_remove()) under filter->rwlock, but
> uprobe_perf_filter() raced and saw it as >0.

Unless I am totally confused, there is nothing new. Even without
this change trace_uprobe_filter_remove() can clear nr_systemwide
right after uprobe_perf_filter() drops filter->rwlock.

And of course, trace_uprobe_filter_add() can change nr_systemwide
from 0 to 1. In this case uprobe_perf_func() can "wrongly" return
UPROBE_HANDLER_REMOVE but we can't avoid this and afaics this is
fine even if handler_chain() does unapply_uprobe(), uprobe_perf_open()
will do uprobe_apply() after that, we can rely on ->register_rwsem.

> In case we speculatively read nr_systemwide as zero, while it was
> incremented in parallel, we'll proceed to grabbing filter->rwlock and
> re-doing the check, this time in lock-protected and non-racy way.

See above...


So I think uprobe_perf_filter() needs filter->rwlock only to iterate
the list, it can check nr_systemwide lockless and this means that you
can also remove the same check in __uprobe_perf_filter(), other callers
trace_uprobe_filter_add/remove check it themselves.


> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -1351,6 +1351,10 @@ static bool uprobe_perf_filter(struct uprobe_consumer *uc,
>  	tu = container_of(uc, struct trace_uprobe, consumer);
>  	filter = tu->tp.event->filter;
>
> +	/* speculative check */
> +	if (READ_ONCE(filter->nr_systemwide))
> +		return true;
> +
>  	read_lock(&filter->rwlock);
>  	ret = __uprobe_perf_filter(filter, mm);
>  	read_unlock(&filter->rwlock);

ACK,

but see above. I think the changelog should be simplified and the
filter->nr_systemwide check in __uprobe_perf_filter() should be
removed. But I won't insist and perhaps I missed something...

Oleg.


  reply	other threads:[~2024-03-13 13:20 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-12 21:02 [PATCH bpf-next 0/3] uprobes: two common case speed ups Andrii Nakryiko
2024-03-12 21:02 ` [PATCH bpf-next 1/3] uprobes: encapsulate preparation of uprobe args buffer Andrii Nakryiko
2024-03-13 15:15   ` Oleg Nesterov
2024-03-13 16:52     ` Andrii Nakryiko
2024-03-12 21:02 ` [PATCH bpf-next 2/3] uprobes: prepare uprobe args buffer lazily Andrii Nakryiko
2024-03-13 15:47   ` Oleg Nesterov
2024-03-13 16:57     ` Andrii Nakryiko
2024-03-12 21:02 ` [PATCH bpf-next 3/3] uprobes: add speculative lockless system-wide uprobe filter check Andrii Nakryiko
2024-03-13 13:19   ` Oleg Nesterov [this message]
2024-03-13 17:01     ` Andrii Nakryiko
2024-03-13  9:41 ` [PATCH bpf-next 0/3] uprobes: two common case speed ups Jiri Olsa
2024-03-13 17:33   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240313131926.GA19986@redhat.com \
    --to=oleg@redhat.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.