Re: [PATCH bpf-next 1/4] samples: bpf: fix bpf programs with kprobe/sys_connect event

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Yonghong Song <yhs@fb.com>
To: "Daniel T. Lee" <danieltimlee@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Andrii Nakryiko <andriin@fb.com>,
	netdev <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next 1/4] samples: bpf: fix bpf programs with kprobe/sys_connect event
Date: Thu, 2 Jul 2020 09:04:19 -0700	[thread overview]
Message-ID: <41ca5ad1-2b79-dbc2-5f6e-e466712fe7a9@fb.com> (raw)
In-Reply-To: <CAEKGpzhU31p=i=xbD3Fk2vJh_btrk73CgkJXMXDgM1umsEaEpg@mail.gmail.com>



On 7/2/20 4:13 AM, Daniel T. Lee wrote:
> On Thu, Jul 2, 2020 at 2:13 PM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 7/1/20 7:16 PM, Daniel T. Lee wrote:
>>> Currently, BPF programs with kprobe/sys_connect does not work properly.
>>>
>>> Commit 34745aed515c ("samples/bpf: fix kprobe attachment issue on x64")
>>> This commit modifies the bpf_load behavior of kprobe events in the x64
>>> architecture. If the current kprobe event target starts with "sys_*",
>>> add the prefix "__x64_" to the front of the event.
>>>
>>> Appending "__x64_" prefix with kprobe/sys_* event was appropriate as a
>>> solution to most of the problems caused by the commit below.
>>>
>>>       commit d5a00528b58c ("syscalls/core, syscalls/x86: Rename struct
>>>       pt_regs-based sys_*() to __x64_sys_*()")
>>>
>>> However, there is a problem with the sys_connect kprobe event that does
>>> not work properly. For __sys_connect event, parameters can be fetched
>>> normally, but for __x64_sys_connect, parameters cannot be fetched.
>>>
>>> Because of this problem, this commit fixes the sys_connect event by
>>> specifying the __sys_connect directly and this will bypass the
>>> "__x64_" appending rule of bpf_load.
>>
>> In the kernel code, we have
>>
>> SYSCALL_DEFINE3(connect, int, fd, struct sockaddr __user *, uservaddr,
>>                   int, addrlen)
>> {
>>           return __sys_connect(fd, uservaddr, addrlen);
>> }
>>
>> Depending on compiler, there is no guarantee that __sys_connect will
>> not be inlined. I would prefer to still use the entry point
>> __x64_sys_* e.g.,
>>      SEC("kprobe/" SYSCALL(sys_write))
>>
> 
> As you mentioned, there is clearly a possibility that problems may arise
> because the symbol does not exist according to the compiler.
> 
> However, in x64, when using Kprobe for __x64_sys_connect event, the
> tests are not working properly because the parameters cannot be fetched,
> and the test under selftests/bpf is using "kprobe/_sys_connect" directly.

This is the assembly code for __x64_sys_connect.

ffffffff818d3520 <__x64_sys_connect>:
ffffffff818d3520: e8 fb df 32 00        callq   0xffffffff81c01520 
<__fentry__>
ffffffff818d3525: 48 8b 57 60           movq    96(%rdi), %rdx
ffffffff818d3529: 48 8b 77 68           movq    104(%rdi), %rsi
ffffffff818d352d: 48 8b 7f 70           movq    112(%rdi), %rdi
ffffffff818d3531: e8 1a ff ff ff        callq   0xffffffff818d3450 
<__sys_connect>
ffffffff818d3536: 48 98                 cltq
ffffffff818d3538: c3                    retq
ffffffff818d3539: 0f 1f 80 00 00 00 00  nopl    (%rax)

In bpf program, the step is:
       struct pt_regs *real_regs = PT_REGS_PARM1(pt_regs);
       param1 = PT_REGS_PARM1(real_regs);
       param2 = PT_REGS_PARM2(real_regs);
       param3 = PT_REGS_PARM3(real_regs);
The same for s390.

For other architectures, no above indirection is needed.

I guess you can abstract the above into trace_common.h?

> 
> I'm not sure how to deal with this problem. Any advice and suggestions
> will be greatly appreciated.
> 
> Thanks for your time and effort for the review.
> Daniel
> 
>>>
>>> Fixes: 34745aed515c ("samples/bpf: fix kprobe attachment issue on x64")
>>> Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
>>> ---
>>>    samples/bpf/map_perf_test_kern.c         | 2 +-
>>>    samples/bpf/test_map_in_map_kern.c       | 2 +-
>>>    samples/bpf/test_probe_write_user_kern.c | 2 +-
>>>    3 files changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/samples/bpf/map_perf_test_kern.c b/samples/bpf/map_perf_test_kern.c
>>> index 12e91ae64d4d..cebe2098bb24 100644
>>> --- a/samples/bpf/map_perf_test_kern.c
>>> +++ b/samples/bpf/map_perf_test_kern.c
>>> @@ -154,7 +154,7 @@ int stress_percpu_hmap_alloc(struct pt_regs *ctx)
>>>        return 0;
>>>    }
>>>
>>> -SEC("kprobe/sys_connect")
>>> +SEC("kprobe/__sys_connect")
>>>    int stress_lru_hmap_alloc(struct pt_regs *ctx)
>>>    {
>>>        char fmt[] = "Failed at stress_lru_hmap_alloc. ret:%dn";
>>> diff --git a/samples/bpf/test_map_in_map_kern.c b/samples/bpf/test_map_in_map_kern.c
>>> index 6cee61e8ce9b..b1562ba2f025 100644
>>> --- a/samples/bpf/test_map_in_map_kern.c
>>> +++ b/samples/bpf/test_map_in_map_kern.c
>>> @@ -102,7 +102,7 @@ static __always_inline int do_inline_hash_lookup(void *inner_map, u32 port)
>>>        return result ? *result : -ENOENT;
>>>    }
>>>
>>> -SEC("kprobe/sys_connect")
>>> +SEC("kprobe/__sys_connect")
>>>    int trace_sys_connect(struct pt_regs *ctx)
>>>    {
>>>        struct sockaddr_in6 *in6;
>>> diff --git a/samples/bpf/test_probe_write_user_kern.c b/samples/bpf/test_probe_write_user_kern.c
>>> index 6579639a83b2..9b3c3918c37d 100644
>>> --- a/samples/bpf/test_probe_write_user_kern.c
>>> +++ b/samples/bpf/test_probe_write_user_kern.c
>>> @@ -26,7 +26,7 @@ struct {
>>>     * This example sits on a syscall, and the syscall ABI is relatively stable
>>>     * of course, across platforms, and over time, the ABI may change.
>>>     */
>>> -SEC("kprobe/sys_connect")
>>> +SEC("kprobe/__sys_connect")
>>>    int bpf_prog1(struct pt_regs *ctx)
>>>    {
>>>        struct sockaddr_in new_addr, orig_addr = {};
>>>

next prev parent reply	other threads:[~2020-07-02 16:04 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-02  2:16 [PATCH bpf-next 0/4] samples: bpf: refactor BPF map test with libbpf Daniel T. Lee
2020-07-02  2:16 ` [PATCH bpf-next 1/4] samples: bpf: fix bpf programs with kprobe/sys_connect event Daniel T. Lee
2020-07-02  5:12   ` Yonghong Song
2020-07-02 11:13     ` Daniel T. Lee
2020-07-02 16:04       ` Yonghong Song [this message]
2020-07-06 10:26         ` Daniel T. Lee
2020-07-06 23:49           ` Andrii Nakryiko
2020-07-07  2:33             ` Daniel T. Lee
2020-07-07  5:15               ` Andrii Nakryiko
2020-07-07  5:46                 ` Daniel T. Lee
2020-07-02  2:16 ` [PATCH bpf-next 2/4] samples: bpf: refactor BPF map in map test with libbpf Daniel T. Lee
2020-07-02  4:25   ` Andrii Nakryiko
2020-07-02  9:58     ` Daniel T. Lee
2020-07-02  2:16 ` [PATCH bpf-next 3/4] samples: bpf: refactor BPF map performance " Daniel T. Lee
2020-07-02  4:33   ` Andrii Nakryiko
2020-07-02 11:24     ` Daniel T. Lee
2020-07-06 23:27       ` Andrii Nakryiko
2020-07-02  2:16 ` [PATCH bpf-next 4/4] selftests: bpf: remove unused bpf_map_def_legacy struct Daniel T. Lee
2020-07-02  4:39   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=41ca5ad1-2b79-dbc2-5f6e-e466712fe7a9@fb.com \
    --to=yhs@fb.com \
    --cc=andriin@fb.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=danieltimlee@gmail.com \
    --cc=kafai@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).