All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kris Van Hees <kris.van.hees@oracle.com>
To: eugene.loh@oracle.com
Cc: dtrace@lists.linux.dev, dtrace-devel@oss.oracle.com
Subject: Re: [PATCH] Copy fprobes entry args with BPF helper function
Date: Wed, 19 Mar 2025 14:58:55 -0400	[thread overview]
Message-ID: <Z9sT70iGDV5bM2Df@oracle.com> (raw)
In-Reply-To: <20250319063230.28171-3-eugene.loh@oracle.com>

On Wed, Mar 19, 2025 at 02:32:28AM -0400, eugene.loh@oracle.com wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
> 
> With commit a6b626a89 ("Fix fprobe/kprobe selection"), fprobes were
> effectively turned on.  Unfortunately, with this fix, some tests like
> test/unittest/stack/tst.stack_fbt.sh encountered problems on UEK7
> since the BPF verifier would complain about the prototypes of some of
> the probe arguments.  E.g., when loading arg3 in fprobe_trampoline()
> from fbt::vfs_write:entry from %r8+24, the BPF verifier complains
> 
>         func 'vfs_write' arg3 type INT is not a struct
>         invalid bpf_context access off=24 size=8
> 
> We can bypass this problem by using a BPF helper function to copy the
> arguments onto the BPF stack and then load the arguments into mstate
> from there.
> 
> There is also a BPF get_func_arg() helper function, but it is not
> introduced until 5.17 -- that is, it appears after UEK7.  See Linux
> commit f92c1e1 ("bpf: Add get_func_[arg|ret|arg_cnt] helpers").

I'm OK with merging this (see r-b below) but I think that we should perhaps
see a follow-up patch soon that implements using bpf_get_func_arg() if
available, and otherwise use bpf_probe_read_kernel().  Not that it is (at
this point) necessary but it might be a good idea to adjust to newer ways
to access this data in case e.g. future kernels will hide access to the
argument data behind that bpf_get_fucn_arg() helper.

> While the already mentioned test signals the problem and the fix, we
> also add an additional test that actually checks the correctness of
> the arguments.
> 
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>

Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>

> ---
>  libdtrace/dt_prov_fbt.c                     |  14 ++-
>  test/unittest/fbtprovider/tst.entryargs2.r  |  29 ++++++
>  test/unittest/fbtprovider/tst.entryargs2.sh | 105 ++++++++++++++++++++
>  3 files changed, 147 insertions(+), 1 deletion(-)
>  create mode 100644 test/unittest/fbtprovider/tst.entryargs2.r
>  create mode 100755 test/unittest/fbtprovider/tst.entryargs2.sh
> 
> diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c
> index 8aa53d643..50fa0d9dc 100644
> --- a/libdtrace/dt_prov_fbt.c
> +++ b/libdtrace/dt_prov_fbt.c
> @@ -285,8 +285,20 @@ static int fprobe_trampoline(dt_pcb_t *pcb, uint_t exitlbl)
>  	if (strcmp(pcb->pcb_probe->desc->prb, "entry") == 0) {
>  		int	i;
>  
> +		/*
> +		 * We want to copy entry args from %r8 to %r7 (plus offsets).
> +		 * Unfortunately, for fprobes, the BPF verifier can reject
> +		 * certain argument types.  We work around this by copying
> +		 * the arguments onto the BPF stack and loading them from there.
> +		 */
> +		emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_FP));
> +		emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, DT_TRAMP_SP_SLOT(prp->argc - 1)));
> +		emit(dlp, BPF_MOV_IMM(BPF_REG_2, 8 * prp->argc));
> +		emit(dlp, BPF_MOV_REG(BPF_REG_3, BPF_REG_8));
> +		emit(dlp, BPF_CALL_HELPER(dtp->dt_bpfhelper[BPF_FUNC_probe_read_kernel]));
> +
>  		for (i = 0; i < prp->argc; i++) {
> -			emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_8, i * 8));
> +			emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_FP, DT_TRAMP_SP_SLOT(prp->argc - 1) + i * 8));
>  			emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(i), BPF_REG_0));
>  		}
>  	} else {
> diff --git a/test/unittest/fbtprovider/tst.entryargs2.r b/test/unittest/fbtprovider/tst.entryargs2.r
> new file mode 100644
> index 000000000..efc4685f9
> --- /dev/null
> +++ b/test/unittest/fbtprovider/tst.entryargs2.r
> @@ -0,0 +1,29 @@
> +mode READ           : no
> +mode WRITE          : yes
> +mode LSEEK          : yes
> +mode PREAD          : yes
> +mode PWRITE         : yes
> +mode WRITER         : yes
> +mode CAN_READ       : no
> +mode CAN_WRITE      : yes
> +mode OPENED         : yes
> +buf: =========================
> +count: 8
> +pos: 20
> +abcdefghijklmnopqrst========CDEFGHIJKLMNOPQRSTUVWXYZ0123456789
> +
> +mode READ           : yes
> +mode WRITE          : yes
> +mode LSEEK          : yes
> +mode PREAD          : yes
> +mode PWRITE         : yes
> +mode WRITER         : yes
> +mode CAN_READ       : yes
> +mode CAN_WRITE      : yes
> +mode OPENED         : yes
> +buf: =========================
> +count: 8
> +pos: 20
> +abcdefghijklmnopqrst========CDEFGHIJKLMNOPQRSTUVWXYZ0123456789
> +
> +success
> diff --git a/test/unittest/fbtprovider/tst.entryargs2.sh b/test/unittest/fbtprovider/tst.entryargs2.sh
> new file mode 100755
> index 000000000..f5b435f56
> --- /dev/null
> +++ b/test/unittest/fbtprovider/tst.entryargs2.sh
> @@ -0,0 +1,105 @@
> +#!/bin/bash
> +#
> +# Oracle Linux DTrace.
> +# Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
> +# Licensed under the Universal Permissive License v 1.0 as shown at
> +# http://oss.oracle.com/licenses/upl.
> +#
> +# Another test of entry args.
> +#
> +
> +dtrace=$1
> +CC=${CC:-/usr/bin/gcc}
> +
> +# Set up test directory.
> +
> +DIRNAME=$tmpdir/entryargs2.$$.$RANDOM
> +mkdir -p $DIRNAME
> +cd $DIRNAME
> +
> +# Make the trigger.
> +
> +cat << EOF > main.c
> +#include <stdio.h>
> +#include <fcntl.h>  // open()
> +#include <unistd.h> // lseek(), write(), close()
> +
> +int main(int c, char **v) {
> +	int fd = open("tmp.txt", c == 1 ? O_WRONLY : O_RDWR);
> +
> +	if (fd == -1)
> +		return 1;
> +
> +	/* Move the offset, then write to the file. */
> +	/* (We will overwrite some "middle section" of the file with "========".) */
> +	lseek(fd, 20, SEEK_SET);
> +	write(fd, "=========================", 8);
> +	close(fd);
> +
> +	return 0;
> +}
> +EOF
> +
> +# Build the trigger.           FIXME do consistent with Sam's changes.
> +
> +$CC $test_cppflags $test_ldflags main.c
> +if [ $? -ne 0 ]; then
> +	echo "failed to link final executable" >&2
> +	exit 1
> +fi
> +
> +# Prepare the D script.
> +
> +cat << EOF > D.d
> +/* these definitions come from kernel header include/linux/fs.h */
> +#define FMODE_READ	(1 << 0)
> +#define FMODE_WRITE	(1 << 1)
> +#define FMODE_LSEEK	(1 << 2)
> +#define FMODE_PREAD	(1 << 3)
> +#define FMODE_PWRITE	(1 << 4)
> +
> +#define FMODE_WRITER	(1 << 16)
> +#define FMODE_CAN_READ	(1 << 17)
> +#define FMODE_CAN_WRITE	(1 << 18)
> +#define FMODE_OPENED	(1 << 19)
> +
> +fbt::vfs_write:entry
> +/pid == \$target/
> +{
> +	mode = ((struct file *)arg0)->f_mode;
> +	printf("mode READ           : %s\n", mode & FMODE_READ      ? "yes" : "no");
> +	printf("mode WRITE          : %s\n", mode & FMODE_WRITE     ? "yes" : "no");
> +	printf("mode LSEEK          : %s\n", mode & FMODE_LSEEK     ? "yes" : "no");
> +	printf("mode PREAD          : %s\n", mode & FMODE_PREAD     ? "yes" : "no");
> +	printf("mode PWRITE         : %s\n", mode & FMODE_PWRITE    ? "yes" : "no");
> +	printf("mode WRITER         : %s\n", mode & FMODE_WRITER    ? "yes" : "no");
> +	printf("mode CAN_READ       : %s\n", mode & FMODE_CAN_READ  ? "yes" : "no");
> +	printf("mode CAN_WRITE      : %s\n", mode & FMODE_CAN_WRITE ? "yes" : "no");
> +	printf("mode OPENED         : %s\n", mode & FMODE_OPENED    ? "yes" : "no");
> +
> +	printf("buf: %s\n", stringof(copyinstr(arg1)));
> +	printf("count: %d\n", arg2);
> +	printf("pos: %d", *((loff_t *)arg3));
> +	exit(0);
> +}
> +EOF
> +
> +# Run the D script and trigger twice, once with O_WRONLY and then O_RDWR.
> +
> +for args in "" "dummy"; do
> +
> +	# Prepare the file to be (over)written.
> +	rm -f tmp.txt
> +	echo abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 > tmp.txt
> +
> +	# Run the D script and trigger.
> +	$dtrace $dt_flags -c "./a.out $args" -Cqs D.d
> +
> +        # Report the output file.
> +	cat tmp.txt
> +	echo
> +
> +done
> +
> +echo success
> +exit 0
> -- 
> 2.43.5
> 

  reply	other threads:[~2025-03-19 18:59 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-19  6:32 [PATCH] test: Account for pid:::entry ucaller being correct eugene.loh
2025-03-19  6:32 ` [PATCH] Fix dt_bvar_probedesc() for late USDT processes eugene.loh
2025-03-19 18:54   ` Kris Van Hees
2025-03-19  6:32 ` [PATCH] Copy fprobes entry args with BPF helper function eugene.loh
2025-03-19 18:58   ` Kris Van Hees [this message]
2025-03-19  6:32 ` [PATCH] test: Expect USDT argmap to fail on ARM on older kernels eugene.loh
2025-03-19 19:04   ` [DTrace-devel] " Kris Van Hees
2025-04-11 20:37     ` Kris Van Hees
2025-03-19  6:32 ` [PATCH] Get execargs from user space eugene.loh
2025-03-19 19:07   ` [DTrace-devel] " Kris Van Hees
2025-03-19 19:19     ` Eugene Loh
2025-03-19 18:53 ` [PATCH] test: Account for pid:::entry ucaller being correct Kris Van Hees
2025-04-14 23:33 ` [DTrace-devel] " Sam James

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z9sT70iGDV5bM2Df@oracle.com \
    --to=kris.van.hees@oracle.com \
    --cc=dtrace-devel@oss.oracle.com \
    --cc=dtrace@lists.linux.dev \
    --cc=eugene.loh@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.