public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
To: Jiri Olsa <jolsa@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, Martin KaFai Lau <kafai@fb.com>,
	Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@chromium.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCHv2 bpf-next 1/9] uprobe: Add support for session consumer
Date: Wed, 3 Jul 2024 08:55:33 +0900	[thread overview]
Message-ID: <20240703085533.820f90544c3fc42edf79468d@kernel.org> (raw)
In-Reply-To: <20240701164115.723677-2-jolsa@kernel.org>

Hi Jiri,

On Mon,  1 Jul 2024 18:41:07 +0200
Jiri Olsa <jolsa@kernel.org> wrote:

> Adding support for uprobe consumer to be defined as session and have
> new behaviour for consumer's 'handler' and 'ret_handler' callbacks.
> 
> The session means that 'handler' and 'ret_handler' callbacks are
> connected in a way that allows to:
> 
>   - control execution of 'ret_handler' from 'handler' callback
>   - share data between 'handler' and 'ret_handler' callbacks
> 
> The session is enabled by setting new 'session' bool field to true
> in uprobe_consumer object.
> 
> We keep count of session consumers for uprobe and allocate session_consumer
> object for each in return_instance object. This allows us to store
> return values of 'handler' callbacks and data pointers of shared
> data between both handlers.
> 
> The session concept fits to our common use case where we do filtering
> on entry uprobe and based on the result we decide to run the return
> uprobe (or not).
> 
> It's also convenient to share the data between session callbacks.
> 
> The control of 'ret_handler' callback execution is done via return
> value of the 'handler' callback. If it's 0 we install and execute
> return uprobe, if it's 1 we do not.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  include/linux/uprobes.h     |  16 ++++-
>  kernel/events/uprobes.c     | 129 +++++++++++++++++++++++++++++++++---
>  kernel/trace/bpf_trace.c    |   6 +-
>  kernel/trace/trace_uprobe.c |  12 ++--
>  4 files changed, 144 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> index f46e0ca0169c..903a860a8d01 100644
> --- a/include/linux/uprobes.h
> +++ b/include/linux/uprobes.h
> @@ -34,15 +34,18 @@ enum uprobe_filter_ctx {
>  };
>  
>  struct uprobe_consumer {
> -	int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs);
> +	int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs, __u64 *data);
>  	int (*ret_handler)(struct uprobe_consumer *self,
>  				unsigned long func,
> -				struct pt_regs *regs);
> +				struct pt_regs *regs, __u64 *data);
>  	bool (*filter)(struct uprobe_consumer *self,
>  				enum uprobe_filter_ctx ctx,
>  				struct mm_struct *mm);
>  
>  	struct uprobe_consumer *next;
> +
> +	bool			session;	/* marks uprobe session consumer */
> +	unsigned int		session_id;	/* set when uprobe_consumer is registered */

Hmm, why this has both session and session_id?
I also think we can use the address of uprobe_consumer itself as a unique id.

Also, if we can set session enabled by default, and skip ret_handler by handler's
return value, it is more simpler. (If handler returns a specific value, skip ret_handler)

>  };
>  
>  #ifdef CONFIG_UPROBES
> @@ -80,6 +83,12 @@ struct uprobe_task {
>  	unsigned int			depth;
>  };
>  
> +struct session_consumer {
> +	__u64		cookie;

And this cookie looks not scalable. If we can pass a data to handler, I would like to
reuse it to pass the target function parameters to ret_handler as kretprobe/fprobe does.

	int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs, void *data);

uprobes can collect its uc's required sizes and allocate the memory (shadow stack frame)
at handler_chain().

> +	unsigned int	id;
> +	int		rc;
> +};
> +
>  struct return_instance {
>  	struct uprobe		*uprobe;
>  	unsigned long		func;
> @@ -88,6 +97,9 @@ struct return_instance {
>  	bool			chained;	/* true, if instance is nested */
>  
>  	struct return_instance	*next;		/* keep as stack */
> +
> +	int			sessions_cnt;
> +	struct session_consumer	sessions[];

In that case, we don't have this array, but 

	char data[];

And decode data array, which is a slice of variable length structure;

struct session_consumer {
	struct uprobe_consumer *uc;
	char data[];
};

The size of session_consumer is uc->session_data_size + sizeof(uc).

What would you think?

Thank you,

>  };
>  
>  enum rp_check {
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 2c83ba776fc7..4da410460f2a 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -63,6 +63,8 @@ struct uprobe {
>  	loff_t			ref_ctr_offset;
>  	unsigned long		flags;
>  
> +	unsigned int		sessions_cnt;
> +
>  	/*
>  	 * The generic code assumes that it has two members of unknown type
>  	 * owned by the arch-specific code:
> @@ -750,11 +752,30 @@ static struct uprobe *alloc_uprobe(struct inode *inode, loff_t offset,
>  	return uprobe;
>  }
>  
> +static void
> +uprobe_consumer_account(struct uprobe *uprobe, struct uprobe_consumer *uc)
> +{
> +	static unsigned int session_id;
> +
> +	if (uc->session) {
> +		uprobe->sessions_cnt++;
> +		uc->session_id = ++session_id ?: ++session_id;
> +	}
> +}
> +
> +static void
> +uprobe_consumer_unaccount(struct uprobe *uprobe, struct uprobe_consumer *uc)
> +{
> +	if (uc->session)
> +		uprobe->sessions_cnt--;
> +}
> +
>  static void consumer_add(struct uprobe *uprobe, struct uprobe_consumer *uc)
>  {
>  	down_write(&uprobe->consumer_rwsem);
>  	uc->next = uprobe->consumers;
>  	uprobe->consumers = uc;
> +	uprobe_consumer_account(uprobe, uc);
>  	up_write(&uprobe->consumer_rwsem);
>  }
>  
> @@ -773,6 +794,7 @@ static bool consumer_del(struct uprobe *uprobe, struct uprobe_consumer *uc)
>  		if (*con == uc) {
>  			*con = uc->next;
>  			ret = true;
> +			uprobe_consumer_unaccount(uprobe, uc);
>  			break;
>  		}
>  	}
> @@ -1744,6 +1766,23 @@ static struct uprobe_task *get_utask(void)
>  	return current->utask;
>  }
>  
> +static size_t ri_size(int sessions_cnt)
> +{
> +	struct return_instance *ri __maybe_unused;
> +
> +	return sizeof(*ri) + sessions_cnt * sizeof(ri->sessions[0]);
> +}
> +
> +static struct return_instance *alloc_return_instance(int sessions_cnt)
> +{
> +	struct return_instance *ri;
> +
> +	ri = kzalloc(ri_size(sessions_cnt), GFP_KERNEL);
> +	if (ri)
> +		ri->sessions_cnt = sessions_cnt;
> +	return ri;
> +}
> +
>  static int dup_utask(struct task_struct *t, struct uprobe_task *o_utask)
>  {
>  	struct uprobe_task *n_utask;
> @@ -1756,11 +1795,11 @@ static int dup_utask(struct task_struct *t, struct uprobe_task *o_utask)
>  
>  	p = &n_utask->return_instances;
>  	for (o = o_utask->return_instances; o; o = o->next) {
> -		n = kmalloc(sizeof(struct return_instance), GFP_KERNEL);
> +		n = alloc_return_instance(o->sessions_cnt);
>  		if (!n)
>  			return -ENOMEM;
>  
> -		*n = *o;
> +		memcpy(n, o, ri_size(o->sessions_cnt));
>  		get_uprobe(n->uprobe);
>  		n->next = NULL;
>  
> @@ -1853,9 +1892,9 @@ static void cleanup_return_instances(struct uprobe_task *utask, bool chained,
>  	utask->return_instances = ri;
>  }
>  
> -static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
> +static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs,
> +			      struct return_instance *ri)
>  {
> -	struct return_instance *ri;
>  	struct uprobe_task *utask;
>  	unsigned long orig_ret_vaddr, trampoline_vaddr;
>  	bool chained;
> @@ -1874,9 +1913,11 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
>  		return;
>  	}
>  
> -	ri = kmalloc(sizeof(struct return_instance), GFP_KERNEL);
> -	if (!ri)
> -		return;
> +	if (!ri) {
> +		ri = alloc_return_instance(0);
> +		if (!ri)
> +			return;
> +	}
>  
>  	trampoline_vaddr = get_trampoline_vaddr();
>  	orig_ret_vaddr = arch_uretprobe_hijack_return_addr(trampoline_vaddr, regs);
> @@ -2065,35 +2106,85 @@ static struct uprobe *find_active_uprobe(unsigned long bp_vaddr, int *is_swbp)
>  	return uprobe;
>  }
>  
> +static struct session_consumer *
> +session_consumer_next(struct return_instance *ri, struct session_consumer *sc,
> +		      int session_id)
> +{
> +	struct session_consumer *next;
> +
> +	next = sc ? sc + 1 : &ri->sessions[0];
> +	next->id = session_id;
> +	return next;
> +}
> +
> +static struct session_consumer *
> +session_consumer_find(struct return_instance *ri, int *iter, int session_id)
> +{
> +	struct session_consumer *sc;
> +	int idx = *iter;
> +
> +	for (sc = &ri->sessions[idx]; idx < ri->sessions_cnt; idx++, sc++) {
> +		if (sc->id == session_id) {
> +			*iter = idx + 1;
> +			return sc;
> +		}
> +	}
> +	return NULL;
> +}
> +
>  static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
>  {
>  	struct uprobe_consumer *uc;
>  	int remove = UPROBE_HANDLER_REMOVE;
> +	struct session_consumer *sc = NULL;
> +	struct return_instance *ri = NULL;
>  	bool need_prep = false; /* prepare return uprobe, when needed */
>  
>  	down_read(&uprobe->register_rwsem);
> +	if (uprobe->sessions_cnt) {
> +		ri = alloc_return_instance(uprobe->sessions_cnt);
> +		if (!ri)
> +			goto out;
> +	}
> +
>  	for (uc = uprobe->consumers; uc; uc = uc->next) {
> +		__u64 *cookie = NULL;
>  		int rc = 0;
>  
> +		if (uc->session) {
> +			sc = session_consumer_next(ri, sc, uc->session_id);
> +			cookie = &sc->cookie;
> +		}
> +
>  		if (uc->handler) {
> -			rc = uc->handler(uc, regs);
> +			rc = uc->handler(uc, regs, cookie);
>  			WARN(rc & ~UPROBE_HANDLER_MASK,
>  				"bad rc=0x%x from %ps()\n", rc, uc->handler);
>  		}
>  
> -		if (uc->ret_handler)
> +		if (uc->session) {
> +			sc->rc = rc;
> +			need_prep |= !rc;
> +		} else if (uc->ret_handler) {
>  			need_prep = true;
> +		}
>  
>  		remove &= rc;
>  	}
>  
> +	/* no removal if there's at least one session consumer */
> +	remove &= !uprobe->sessions_cnt;
> +
>  	if (need_prep && !remove)
> -		prepare_uretprobe(uprobe, regs); /* put bp at return */
> +		prepare_uretprobe(uprobe, regs, ri); /* put bp at return */
> +	else
> +		kfree(ri);
>  
>  	if (remove && uprobe->consumers) {
>  		WARN_ON(!uprobe_is_active(uprobe));
>  		unapply_uprobe(uprobe, current->mm);
>  	}
> + out:
>  	up_read(&uprobe->register_rwsem);
>  }
>  
> @@ -2101,12 +2192,28 @@ static void
>  handle_uretprobe_chain(struct return_instance *ri, struct pt_regs *regs)
>  {
>  	struct uprobe *uprobe = ri->uprobe;
> +	struct session_consumer *sc;
>  	struct uprobe_consumer *uc;
> +	int session_iter = 0;
>  
>  	down_read(&uprobe->register_rwsem);
>  	for (uc = uprobe->consumers; uc; uc = uc->next) {
> +		__u64 *cookie = NULL;
> +
> +		if (uc->session) {
> +			/*
> +			 * Consumers could be added and removed, but they will not
> +			 * change position, so we can iterate sessions just once
> +			 * and keep the last found session as base for next search.
> +			 */
> +			sc = session_consumer_find(ri, &session_iter, uc->session_id);
> +			if (!sc || sc->rc)
> +				continue;
> +			cookie = &sc->cookie;
> +		}
> +
>  		if (uc->ret_handler)
> -			uc->ret_handler(uc, ri->func, regs);
> +			uc->ret_handler(uc, ri->func, regs, cookie);
>  	}
>  	up_read(&uprobe->register_rwsem);
>  }
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index cd098846e251..02d052639dfe 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -3332,7 +3332,8 @@ uprobe_multi_link_filter(struct uprobe_consumer *con, enum uprobe_filter_ctx ctx
>  }
>  
>  static int
> -uprobe_multi_link_handler(struct uprobe_consumer *con, struct pt_regs *regs)
> +uprobe_multi_link_handler(struct uprobe_consumer *con, struct pt_regs *regs,
> +			  __u64 *data)
>  {
>  	struct bpf_uprobe *uprobe;
>  
> @@ -3341,7 +3342,8 @@ uprobe_multi_link_handler(struct uprobe_consumer *con, struct pt_regs *regs)
>  }
>  
>  static int
> -uprobe_multi_link_ret_handler(struct uprobe_consumer *con, unsigned long func, struct pt_regs *regs)
> +uprobe_multi_link_ret_handler(struct uprobe_consumer *con, unsigned long func, struct pt_regs *regs,
> +			      __u64 *data)
>  {
>  	struct bpf_uprobe *uprobe;
>  
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index c98e3b3386ba..7068c279a244 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -88,9 +88,11 @@ static struct trace_uprobe *to_trace_uprobe(struct dyn_event *ev)
>  static int register_uprobe_event(struct trace_uprobe *tu);
>  static int unregister_uprobe_event(struct trace_uprobe *tu);
>  
> -static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs);
> +static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs,
> +			     __u64 *data);
>  static int uretprobe_dispatcher(struct uprobe_consumer *con,
> -				unsigned long func, struct pt_regs *regs);
> +				unsigned long func, struct pt_regs *regs,
> +				__u64 *data);
>  
>  #ifdef CONFIG_STACK_GROWSUP
>  static unsigned long adjust_stack_addr(unsigned long addr, unsigned int n)
> @@ -1504,7 +1506,8 @@ trace_uprobe_register(struct trace_event_call *event, enum trace_reg type,
>  	}
>  }
>  
> -static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs)
> +static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs,
> +			     __u64 *data)
>  {
>  	struct trace_uprobe *tu;
>  	struct uprobe_dispatch_data udd;
> @@ -1534,7 +1537,8 @@ static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs)
>  }
>  
>  static int uretprobe_dispatcher(struct uprobe_consumer *con,
> -				unsigned long func, struct pt_regs *regs)
> +				unsigned long func, struct pt_regs *regs,
> +				__u64 *data)
>  {
>  	struct trace_uprobe *tu;
>  	struct uprobe_dispatch_data udd;
> -- 
> 2.45.2
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

  parent reply	other threads:[~2024-07-02 23:55 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-01 16:41 [PATCHv2 bpf-next 0/9] uprobe, bpf: Add session support Jiri Olsa
2024-07-01 16:41 ` [PATCHv2 bpf-next 1/9] uprobe: Add support for session consumer Jiri Olsa
2024-07-02 13:04   ` Peter Zijlstra
2024-07-02 16:10     ` Jiri Olsa
2024-07-02 20:52       ` Andrii Nakryiko
2024-07-03 15:31         ` Jiri Olsa
2024-07-03 16:20           ` Jiri Olsa
2024-07-03 21:41           ` Andrii Nakryiko
2024-07-02 20:51   ` Andrii Nakryiko
2024-07-03  8:10     ` Peter Zijlstra
2024-07-03 18:31       ` Andrii Nakryiko
2024-07-03 20:36         ` Kees Cook
2024-07-05  7:10           ` Peter Zijlstra
2024-07-05 23:10             ` Kees Cook
2024-07-03 17:13     ` Jiri Olsa
2024-07-03 21:48       ` Andrii Nakryiko
2024-07-05 13:29         ` Jiri Olsa
2024-07-02 23:55   ` Masami Hiramatsu [this message]
2024-07-03  0:13     ` Andrii Nakryiko
2024-07-03 16:09       ` Jiri Olsa
2024-07-03 21:43         ` Andrii Nakryiko
2024-07-05  8:35           ` Masami Hiramatsu
2024-07-05 13:38             ` Jiri Olsa
2024-07-08  9:41               ` Peter Zijlstra
2024-07-01 16:41 ` [PATCHv2 bpf-next 2/9] bpf: Add support for uprobe multi session attach Jiri Olsa
2024-07-02 21:30   ` Andrii Nakryiko
2024-07-01 16:41 ` [PATCHv2 bpf-next 3/9] bpf: Add support for uprobe multi session context Jiri Olsa
2024-07-02 21:31   ` Andrii Nakryiko
2024-07-01 16:41 ` [PATCHv2 bpf-next 4/9] libbpf: Add support for uprobe multi session attach Jiri Olsa
2024-07-02 21:34   ` Andrii Nakryiko
2024-07-03 17:14     ` Jiri Olsa
2024-07-01 16:41 ` [PATCHv2 bpf-next 5/9] libbpf: Add uprobe session attach type names to attach_type_name Jiri Olsa
2024-07-02 21:56   ` Andrii Nakryiko
2024-07-03 17:15     ` Jiri Olsa
2024-07-01 16:41 ` [PATCHv2 bpf-next 6/9] selftests/bpf: Add uprobe session test Jiri Olsa
2024-07-02 21:57   ` Andrii Nakryiko
2024-07-01 16:41 ` [PATCHv2 bpf-next 7/9] selftests/bpf: Add uprobe session cookie test Jiri Olsa
2024-07-02 21:58   ` Andrii Nakryiko
2024-07-01 16:41 ` [PATCHv2 bpf-next 8/9] selftests/bpf: Add uprobe session recursive test Jiri Olsa
2024-07-02 22:01   ` Andrii Nakryiko
2024-07-03 17:16     ` Jiri Olsa
2024-07-01 16:41 ` [PATCHv2 bpf-next 9/9] selftests/bpf: Add uprobe session consumers test Jiri Olsa
2024-07-02 22:10   ` Andrii Nakryiko
2024-07-03 17:22     ` Jiri Olsa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240703085533.820f90544c3fc42edf79468d@kernel.org \
    --to=mhiramat@kernel.org \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kafai@fb.com \
    --cc=kpsingh@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sdf@google.com \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox