From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: rostedt <rostedt@goodmis.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Ingo Molnar <mingo@redhat.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Dmitry Vyukov <dvyukov@google.com>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>, Andrii Nakryiko <andriin@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@chromium.org>, netdev <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>, Kees Cook <keescook@chromium.org>,
	Florian Weimer <fw@deneb.enyo.de>,
	syzbot+83aa762ef23b6f0d1991@syzkaller.appspotmail.com,
	syzbot+d29e58bb557324e55e5e@syzkaller.appspotmail.com,
	Matt Mullins <mmullins@mmlx.us>
Subject: Re: [for-next][PATCH 14/15] tracepoint: Do not fail unregistering a probe due to memory failure
Date: Wed, 3 Feb 2021 12:57:24 -0500 (EST)
Message-ID: <1836191179.6214.1612375044968.JavaMail.zimbra@efficios.com>
In-Reply-To: <20210203160550.710877069@goodmis.org>



----- On Feb 3, 2021, at 11:05 AM, rostedt rostedt@goodmis.org wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> The list of tracepoint callbacks is managed by an array that is protected
> by RCU. To update this array, a new array is allocated, the updates are
> copied over to the new array, and then the list of functions for the
> tracepoint is switched over to the new array. After a completion of an RCU
> grace period, the old array is freed.
> 
> This process happens for both adding a callback as well as removing one.
> But on removing a callback, if the new array fails to be allocated, the
> callback is not removed, and may be used after it is freed by the clients
> of the tracepoint.
> 
> There's really no reason to fail if the allocation for a new array fails
> when removing a function. Instead, the function can simply be replaced by a
> stub function that could be cleaned up on the next modification of the
> array. That is, instead of calling the function registered to the
> tracepoint, it would call a stub function in its place.
> 
> Link: https://lore.kernel.org/r/20201115055256.65625-1-mmullins@mmlx.us
> Link: https://lore.kernel.org/r/20201116175107.02db396d@gandalf.local.home
> Link: https://lore.kernel.org/r/20201117211836.54acaef2@oasis.local.home
> Link: https://lkml.kernel.org/r/20201118093405.7a6d2290@gandalf.local.home
> 
> [ Note, this version does use undefined compiler behavior (assuming that
>  a stub function with no parameters or return, can be called by a location
>  that thinks it has parameters but still no return value. Static calls
>  do the same thing, so this trick is not without precedent.
> 
>  There's another solution that uses RCU tricks and is more complex, but
>  can be an alternative if this solution becomes an issue.
> 
>  Link: https://lore.kernel.org/lkml/20210127170721.58bce7cc@gandalf.local.home/
> ]
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Martin KaFai Lau <kafai@fb.com>
> Cc: Song Liu <songliubraving@fb.com>
> Cc: Yonghong Song <yhs@fb.com>
> Cc: Andrii Nakryiko <andriin@fb.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: KP Singh <kpsingh@chromium.org>
> Cc: netdev <netdev@vger.kernel.org>
> Cc: bpf <bpf@vger.kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Florian Weimer <fw@deneb.enyo.de>
> Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
> Reported-by: syzbot+83aa762ef23b6f0d1991@syzkaller.appspotmail.com
> Reported-by: syzbot+d29e58bb557324e55e5e@syzkaller.appspotmail.com
> Reported-by: Matt Mullins <mmullins@mmlx.us>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> Tested-by: Matt Mullins <mmullins@mmlx.us>
> ---
> kernel/tracepoint.c | 80 ++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 64 insertions(+), 16 deletions(-)
> 
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index 7261fa0f5e3c..e8f20ae29c18 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -53,6 +53,12 @@ struct tp_probes {
> 	struct tracepoint_func probes[];
> };
> 
> +/* Called in removal of a func but failed to allocate a new tp_funcs */
> +static void tp_stub_func(void)
> +{
> +	return;
> +}
> +
> static inline void *allocate_probes(int count)
> {
> 	struct tp_probes *p  = kmalloc(struct_size(p, probes, count),
> @@ -131,6 +137,7 @@ func_add(struct tracepoint_func **funcs, struct
> tracepoint_func *tp_func,
> {
> 	struct tracepoint_func *old, *new;
> 	int nr_probes = 0;
> +	int stub_funcs = 0;
> 	int pos = -1;
> 
> 	if (WARN_ON(!tp_func->func))
> @@ -147,14 +154,34 @@ func_add(struct tracepoint_func **funcs, struct
> tracepoint_func *tp_func,
> 			if (old[nr_probes].func == tp_func->func &&
> 			    old[nr_probes].data == tp_func->data)
> 				return ERR_PTR(-EEXIST);
> +			if (old[nr_probes].func == tp_stub_func)
> +				stub_funcs++;
> 		}
> 	}
> -	/* + 2 : one for new probe, one for NULL func */
> -	new = allocate_probes(nr_probes + 2);
> +	/* + 2 : one for new probe, one for NULL func - stub functions */
> +	new = allocate_probes(nr_probes + 2 - stub_funcs);
> 	if (new == NULL)
> 		return ERR_PTR(-ENOMEM);
> 	if (old) {
> -		if (pos < 0) {
> +		if (stub_funcs) {

Considering that we end up implementing a case where we carefully copy over
each item, I recommend we replace the two "memcpy" branches with a single
item-wise implementation. It's a slow path anyway, and reducing the overall
complexity benefits slow paths: fewer bugs, less code to review, and it is
easier to reach decent state-space coverage in testing.
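
For illustration only, a minimal userspace sketch of what a single item-wise
copy could look like. This is not the follow-up patch: func_add_sketch(),
the probe_*() functions and the use of calloc() are assumptions made so the
example builds outside the kernel, and error reporting is reduced to a NULL
return. One pass counts live entries, a second pass copies them while
dropping stubs and inserting the new probe in priority order.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-in for the kernel's struct tracepoint_func. */
struct tracepoint_func {
	void (*func)(void);
	void *data;
	int prio;
};

/* Placeholder left behind when a removal could not allocate a new array. */
static void tp_stub_func(void) { }

/* Hypothetical probes, only here so the sketch has something to register. */
static void probe_a(void) { }
static void probe_b(void) { }
static void probe_c(void) { }

/*
 * Hypothetical helper: build a new NULL-terminated array from 'old' with
 * 'tp_func' inserted in priority order (highest first) and stubs dropped.
 * Returns NULL on allocation failure or if the probe is already present.
 */
static struct tracepoint_func *
func_add_sketch(const struct tracepoint_func *old,
		const struct tracepoint_func *tp_func)
{
	struct tracepoint_func *dst;
	size_t iter, nr_probes = 0;
	int inserted = 0;

	/* Pass 1: count live entries and reject duplicates. */
	for (iter = 0; old && old[iter].func; iter++) {
		if (old[iter].func == tp_stub_func)
			continue;
		if (old[iter].func == tp_func->func &&
		    old[iter].data == tp_func->data)
			return NULL;
		nr_probes++;
	}

	/* + 2: one for the new probe, one for the NULL terminator. */
	dst = calloc(nr_probes + 2, sizeof(*dst));
	if (!dst)
		return NULL;

	/* Pass 2: copy item by item, inserting before the first lower prio. */
	nr_probes = 0;
	for (iter = 0; old && old[iter].func; iter++) {
		if (old[iter].func == tp_stub_func)
			continue;
		if (!inserted && old[iter].prio < tp_func->prio) {
			dst[nr_probes++] = *tp_func;
			inserted = 1;
		}
		dst[nr_probes++] = old[iter];
	}
	if (!inserted)
		dst[nr_probes++] = *tp_func;
	dst[nr_probes].func = NULL;	/* nr_probes is now the end of dst */
	return dst;
}
```

With this shape there is a single copy loop, and nr_probes only ever counts
entries of the array being built.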

> +			/* Need to copy one at a time to remove stubs */
> +			int probes = 0;
> +
> +			pos = -1;
> +			for (nr_probes = 0; old[nr_probes].func; nr_probes++) {
> +				if (old[nr_probes].func == tp_stub_func)
> +					continue;
> +				if (pos < 0 && old[nr_probes].prio < prio)
> +					pos = probes++;
> +				new[probes++] = old[nr_probes];
> +			}
> +			nr_probes = probes;

Repurposing "nr_probes" from accounting for the number of items in the old
array to counting the number of items in the new array in the middle of the
function is confusing.

> +			if (pos < 0)
> +				pos = probes;
> +			else
> +				nr_probes--; /* Account for insertion */

This is probably why you need to play tricks with nr_probes here.

> +		} else if (pos < 0) {
> 			pos = nr_probes;
> 			memcpy(new, old, nr_probes * sizeof(struct tracepoint_func));
> 		} else {
> @@ -188,8 +215,9 @@ static void *func_remove(struct tracepoint_func **funcs,
> 	/* (N -> M), (N > 1, M >= 0) probes */
> 	if (tp_func->func) {
> 		for (nr_probes = 0; old[nr_probes].func; nr_probes++) {
> -			if (old[nr_probes].func == tp_func->func &&
> -			     old[nr_probes].data == tp_func->data)
> +			if ((old[nr_probes].func == tp_func->func &&
> +			     old[nr_probes].data == tp_func->data) ||
> +			    old[nr_probes].func == tp_stub_func)
> 				nr_del++;
> 		}
> 	}
> @@ -208,14 +236,32 @@ static void *func_remove(struct tracepoint_func **funcs,
> 		/* N -> M, (N > 1, M > 0) */
> 		/* + 1 for NULL */
> 		new = allocate_probes(nr_probes - nr_del + 1);
> -		if (new == NULL)
> -			return ERR_PTR(-ENOMEM);
> -		for (i = 0; old[i].func; i++)
> -			if (old[i].func != tp_func->func
> -					|| old[i].data != tp_func->data)
> -				new[j++] = old[i];
> -		new[nr_probes - nr_del].func = NULL;
> -		*funcs = new;
> +		if (new) {
> +			for (i = 0; old[i].func; i++)
> +				if ((old[i].func != tp_func->func
> +				     || old[i].data != tp_func->data)
> +				    && old[i].func != tp_stub_func)
> +					new[j++] = old[i];
> +			new[nr_probes - nr_del].func = NULL;
> +			*funcs = new;
> +		} else {
> +			/*
> +			 * Failed to allocate, replace the old function
> +			 * with calls to tp_stub_func.
> +			 */
> +			for (i = 0; old[i].func; i++)
> +				if (old[i].func == tp_func->func &&
> +				    old[i].data == tp_func->data) {
> +					old[i].func = tp_stub_func;

This updates "func" while readers are loading it concurrently. I would recommend
using WRITE_ONCE here paired with READ_ONCE within __traceiter_##_name.
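
As a rough userspace model of that pairing (an assumption-laden sketch, not
kernel code): the READ_ONCE()/WRITE_ONCE() macros below are simplified
stand-ins built on volatile casts and the GCC/Clang __typeof__ extension,
and stub_out(), call_probes(), probe_a() and nr_calls are hypothetical names
used only for illustration. The point is that the updater performs one
marked store, and the reader performs one marked load per entry and then
uses that loaded value for both the NULL check and the call.

```c
#include <stddef.h>

/*
 * Simplified stand-ins for the kernel macros (assumption: GCC/Clang
 * __typeof__; the real kernel definitions do additional checking).
 */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

struct tracepoint_func {
	void (*func)(void);
	void *data;
	int prio;
};

static int nr_calls;	/* counts real probe invocations, for illustration */

static void probe_a(void) { nr_calls++; }
static void tp_stub_func(void) { }

/* Updater side: swap a live probe for the stub with a single marked store. */
static void stub_out(struct tracepoint_func *funcs, void (*func)(void))
{
	size_t i;

	for (i = 0; funcs[i].func; i++)
		if (funcs[i].func == func)
			WRITE_ONCE(funcs[i].func, tp_stub_func);
}

/* Reader side: load each pointer once, then test and call that value. */
static void call_probes(struct tracepoint_func *funcs)
{
	struct tracepoint_func *it = funcs;
	void (*fn)(void);

	while ((fn = READ_ONCE(it->func)) != NULL) {
		fn();
		it++;
	}
}
```

A reader racing with stub_out() then calls either the original probe or the
stub for a given entry, never a value loaded a second time after the check.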

> +					/* Set the prio to the next event. */

I don't get why the priority needs to be changed here. Could it simply stay
at its original value? The entry is already in the correct priority order.

> +					if (old[i + 1].func)
> +						old[i].prio =
> +							old[i + 1].prio;
> +					else
> +						old[i].prio = -1;
> +				}
> +			*funcs = old;

I'm not sure what setting *funcs to old achieves. Isn't it already pointing
to old?
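
To make the last two remarks concrete, here is a hedged userspace sketch of
a removal path whose fallback only swaps the function pointer. It is not the
kernel implementation: func_remove_sketch(), the injectable 'alloc'
parameter, failing_alloc() and the probe_*() functions are assumptions
introduced so that allocation failure can be simulated, and the case where
every probe is removed is left out for brevity. On failure the stubbed entry
keeps its original prio, so the array stays sorted, and the caller keeps
using the same array pointer.

```c
#include <stddef.h>
#include <stdlib.h>

struct tracepoint_func {
	void (*func)(void);
	void *data;
	int prio;
};

static void tp_stub_func(void) { }
static void probe_a(void) { }	/* hypothetical probes for the example */
static void probe_b(void) { }

/* Hypothetical allocator that always fails, to exercise the fallback. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * Hypothetical removal helper. Returns the array readers should use
 * afterwards: a fresh copy without the probe (and without old stubs) on
 * success, or 'old' itself with the probe stubbed out if 'alloc' fails.
 */
static struct tracepoint_func *
func_remove_sketch(struct tracepoint_func *old,
		   const struct tracepoint_func *tp_func,
		   void *(*alloc)(size_t))
{
	struct tracepoint_func *dst;
	size_t i, j = 0, nr_keep = 0;

	for (i = 0; old[i].func; i++)
		if (old[i].func != tp_stub_func &&
		    (old[i].func != tp_func->func ||
		     old[i].data != tp_func->data))
			nr_keep++;

	dst = alloc((nr_keep + 1) * sizeof(*dst));	/* + 1 for NULL */
	if (dst) {
		for (i = 0; old[i].func; i++)
			if (old[i].func != tp_stub_func &&
			    (old[i].func != tp_func->func ||
			     old[i].data != tp_func->data))
				dst[j++] = old[i];
		dst[j] = (struct tracepoint_func){ 0 };
		return dst;
	}

	/* Fallback: stub in place; prio and the array pointer are untouched. */
	for (i = 0; old[i].func; i++)
		if (old[i].func == tp_func->func &&
		    old[i].data == tp_func->data)
			old[i].func = tp_stub_func;
	return old;
}
```

In a concurrent setting the in-place store would also need the marked
access discussed above; it is written as a plain store here to keep the
sketch focused on the prio and pointer handling.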

I'll send a patch which applies on top of yours implementing my recommendations.
It shrinks the code complexity nicely:

 include/linux/tracepoint.h |  2 +-
 kernel/tracepoint.c        | 80 +++++++++++++-------------------------
 2 files changed, 28 insertions(+), 54 deletions(-)

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
