From: Dave Marchevsky <davemarchevsky@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Rik van Riel <riel@surriel.com>,
Ilya Leoshkevich <iii@linux.ibm.com>, Yonghong Song <yhs@fb.com>,
Kernel Team <kernel-team@fb.com>
Subject: Re: [RFC PATCH bpf-next 3/5] libbpf: usdt lib wiring of xmm reads
Date: Wed, 18 May 2022 04:20:21 -0400 [thread overview]
Message-ID: <9b7cefd5-4906-35d6-ad61-bf7d2ee06033@fb.com> (raw)
In-Reply-To: <CAEf4BzaM0SC3D66NC3djt1fsEQcJ-af0-EgPx5UV8YLDLu8ibg@mail.gmail.com>
On 5/16/22 7:26 PM, Andrii Nakryiko wrote:
> On Thu, May 12, 2022 at 12:43 AM Dave Marchevsky <davemarchevsky@fb.com> wrote:
>>
>> Handle xmm0,...,xmm15 registers when parsing USDT arguments. Currently
>> only the first 64 bits of the fetched value are returned as I haven't
>> seen the rest of the register used in practice.
>>
>> This patch also handles floats in USDT arg spec by ignoring the fact
>> that they're floats and considering them scalar. Currently we can't do
>> float math in BPF programs anyways, so might as well support passing to
>> userspace and converting there.
>>
>> We can use existing ARG_REG sscanf + logic, adding XMM-specific logic
>> when calc_pt_regs_off fails. If the reg is xmm, arg_spec's reg_off is
>> repurposed to hold reg_no, which is passed to bpf_get_reg_val. Since the
>> helper does the digging around in fxregs_state it's not necessary to
>> calculate offset in bpf code for these regs.
>>
>> NOTE: Changes here cause verification failure for existing USDT tests.
>> Specifically, BPF_USDT prog 'usdt12' fails to verify due to too many
>> insns despite not having its insn count significantly changed.
>>
>> Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
>> ---
>> tools/lib/bpf/usdt.bpf.h | 36 ++++++++++++++++++++--------
>> tools/lib/bpf/usdt.c | 51 ++++++++++++++++++++++++++++++++++++----
>> 2 files changed, 73 insertions(+), 14 deletions(-)
>>
>> diff --git a/tools/lib/bpf/usdt.bpf.h b/tools/lib/bpf/usdt.bpf.h
>> index 4181fddb3687..7b5ed4cbaa2f 100644
>> --- a/tools/lib/bpf/usdt.bpf.h
>> +++ b/tools/lib/bpf/usdt.bpf.h
>> @@ -43,6 +43,7 @@ enum __bpf_usdt_arg_type {
>> BPF_USDT_ARG_CONST,
>> BPF_USDT_ARG_REG,
>> BPF_USDT_ARG_REG_DEREF,
>> + BPF_USDT_ARG_XMM_REG,
>> };
>>
>> struct __bpf_usdt_arg_spec {
>> @@ -129,7 +130,9 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
>> {
>> struct __bpf_usdt_spec *spec;
>> struct __bpf_usdt_arg_spec *arg_spec;
>> - unsigned long val;
>> + struct pt_regs *btf_regs;
>> + struct task_struct *btf_task;
>> + struct { __u64 a; __u64 unused; } val = {};
>> int err, spec_id;
>>
>> *res = 0;
>> @@ -151,7 +154,7 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
>> /* Arg is just a constant ("-4@$-9" in USDT arg spec).
>> * value is recorded in arg_spec->val_off directly.
>> */
>> - val = arg_spec->val_off;
>> + val.a = arg_spec->val_off;
>> break;
>> case BPF_USDT_ARG_REG:
>> /* Arg is in a register (e.g, "8@%rax" in USDT arg spec),
>> @@ -159,7 +162,20 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
>> * struct pt_regs. To keep things simple user-space parts
>> * record offsetof(struct pt_regs, <regname>) in arg_spec->reg_off.
>> */
>> - err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec->reg_off);
>> + err = bpf_probe_read_kernel(&val.a, sizeof(val.a), (void *)ctx + arg_spec->reg_off);
>> + if (err)
>> + return err;
>> + break;
>> + case BPF_USDT_ARG_XMM_REG:
>
> nit: a bit too XMM-specific name here, we probably want to keep it a bit
Agreed.
>
>> + /* Same as above, but arg is an xmm reg, so can't look
>> + * in pt_regs, need to use special helper.
>> + * reg_off is the regno ("xmm0" -> regno 0, etc)
>> + */
>> + btf_task = bpf_get_current_task_btf();
>> + btf_regs = (struct pt_regs *)bpf_task_pt_regs(btf_task);
>
> I'd like to avoid taking dependency on bpf_get_current_task_btf() for
> rare case of XMM register, which makes it impossible to do USDT on
> older kernels. It seems like supporting reading registers from current
> (and maybe current pt_regs context) should cover a lot of practical
> uses.
>
Yep. We talked about this today. Will remove.
>> + err = bpf_get_reg_val(&val, sizeof(val),
>
> But regardless of the above, we'll need to use CO-RE to detect support
> for this new BPF helper (probably using bpf_core_enum_value_exists()?)
> to allow using USDTs on older kernels.
>
Will add.
>
>> + ((u64)arg_spec->reg_off + BPF_GETREG_X86_XMM0) << 32,
>> + btf_regs, btf_task);
>> if (err)
>> return err;
>> break;
>> @@ -171,14 +187,14 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
>> * from pt_regs, then do another user-space probe read to
>> * fetch argument value itself.
>> */
>> - err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec->reg_off);
>> + err = bpf_probe_read_kernel(&val.a, sizeof(val.a), (void *)ctx + arg_spec->reg_off);
>> if (err)
>> return err;
>> - err = bpf_probe_read_user(&val, sizeof(val), (void *)val + arg_spec->val_off);
>> + err = bpf_probe_read_user(&val.a, sizeof(val.a), (void *)val.a + arg_spec->val_off);
>
> is the useful value in xmm register normally in lower 64-bits of it?
> is it possible to just request reading just the first 64 bits from
> bpf_get_reg_val() and avoid this ugly union?
For USDT usecase, I've only seen lower 64 bits used. Should be possible to just
grab those, will see if there's a clean way to integrate such an option.
>
>> if (err)
>> return err;
>> #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
>> - val >>= arg_spec->arg_bitshift;
>> + val.a >>= arg_spec->arg_bitshift;
>> #endif
>> break;
>> default:
>
> [...]
>
>> +static int calc_xmm_regno(const char *reg_name)
>> +{
>> + static struct {
>> + const char *name;
>> + __u16 regno;
>> + } xmm_reg_map[] = {
>> + { "xmm0", 0 },
>> + { "xmm1", 1 },
>> + { "xmm2", 2 },
>> + { "xmm3", 3 },
>> + { "xmm4", 4 },
>> + { "xmm5", 5 },
>> + { "xmm6", 6 },
>> + { "xmm7", 7 },
>> +#ifdef __x86_64__
>> + { "xmm8", 8 },
>> + { "xmm9", 9 },
>> + { "xmm10", 10 },
>> + { "xmm11", 11 },
>> + { "xmm12", 12 },
>> + { "xmm13", 13 },
>> + { "xmm14", 14 },
>> + { "xmm15", 15 },
>
> no-x86 arches parse this generically with sscanf(), seems like we can
> do this simple approach here as well?
>
Agreed.
>
>> +#endif
>> + };
>> + int i;
>> +
>> + for (i = 0; i < ARRAY_SIZE(xmm_reg_map); i++) {
>> + if (strcmp(reg_name, xmm_reg_map[i].name) == 0)
>> + return xmm_reg_map[i].regno;
>> + }
>> +
>> return -ENOENT;
>> }
>>
>
> [...]
next prev parent reply other threads:[~2022-05-18 8:23 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-12 7:43 [RFC PATCH bpf-next 0/5] bpf: add get_reg_val helper Dave Marchevsky
2022-05-12 7:43 ` [RFC PATCH bpf-next 1/5] x86/fpu: Move context.h to include/asm Dave Marchevsky
2022-05-12 13:56 ` David Vernet
2022-05-14 0:44 ` Alexei Starovoitov
2022-05-12 7:43 ` [RFC PATCH bpf-next 2/5] bpf: add get_reg_val helper Dave Marchevsky
2022-05-12 15:29 ` David Vernet
2022-05-18 8:07 ` Dave Marchevsky
2022-05-14 0:41 ` Alexei Starovoitov
2022-05-18 7:35 ` Dave Marchevsky
2022-05-12 7:43 ` [RFC PATCH bpf-next 3/5] libbpf: usdt lib wiring of xmm reads Dave Marchevsky
2022-05-14 0:43 ` Alexei Starovoitov
2022-05-16 23:26 ` Andrii Nakryiko
2022-05-18 8:20 ` Dave Marchevsky [this message]
2022-05-12 7:43 ` [RFC PATCH bpf-next 4/5] selftests/bpf: Add test for USDT parse of xmm reg Dave Marchevsky
2022-05-16 23:31 ` Andrii Nakryiko
2022-05-17 1:17 ` Alexei Starovoitov
2022-05-18 23:56 ` Andrii Nakryiko
2022-05-12 7:43 ` [RFC PATCH bpf-next 5/5] selftests/bpf: get_reg_val test exercising fxsave fetch Dave Marchevsky
2022-05-12 17:47 ` Dave Marchevsky
2022-05-16 23:28 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9b7cefd5-4906-35d6-ad61-bf7d2ee06033@fb.com \
--to=davemarchevsky@fb.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=iii@linux.ibm.com \
--cc=kernel-team@fb.com \
--cc=riel@surriel.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox