public inbox for bpf@vger.kernel.org
From: Jackie Liu <liu.yun@linux.dev>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Jiri Olsa <olsajiri@gmail.com>
Cc: andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
	yhs@fb.com, bpf@vger.kernel.org, liuyun01@kylinos.cn
Subject: Re: [PATCH v4] libbpf: kprobe.multi: Filter with available_filter_functions
Date: Thu, 8 Jun 2023 08:57:15 +0800
Message-ID: <7fecf93e-eaf9-1ffe-4f1d-64f530828363@linux.dev>
In-Reply-To: <CAEf4BzaNpxNZ12N1JY4=EijXv14oWQMQpjF8t4zt-ZaYNp+U=Q@mail.gmail.com>



On 2023/6/8 08:00, Andrii Nakryiko wrote:
> On Wed, Jun 7, 2023 at 4:22 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>>
>> On Fri, Jun 02, 2023 at 10:27:31AM -0700, Andrii Nakryiko wrote:
>>> On Thu, May 25, 2023 at 6:38 PM Jackie Liu <liu.yun@linux.dev> wrote:
>>>>
>>>> Hi Andrii.
>>>>
>>>> On 2023/5/26 04:43, Andrii Nakryiko wrote:
>>>>> On Thu, May 25, 2023 at 3:28 AM Jackie Liu <liu.yun@linux.dev> wrote:
>>>>>>
>>>>>> From: Jackie Liu <liuyun01@kylinos.cn>
>>>>>>
>>>>>> When using regular expression matching with "kprobe multi", libbpf scans
>>>>>> all the functions in "/proc/kallsyms" that match. However, not all
>>>>>> of them can be traced by kprobe.multi, and if any one of them fails
>>>>>> to attach, the attachment fails for all of them. The best
>>>>>> approach is to filter out the functions that cannot be traced, so
>>>>>> that the remaining functions attach properly.
>>>>>>
>>>>>> Check available_filter_functions first; if that fails, fall back to
>>>>>> kallsyms.
>>>>>>
>>>>>> Here is the test eBPF program [1].
>>>>>> [1] https://github.com/JackieLiu1/ketones/commit/a9e76d1ba57390e533b8b3eadde97f7a4535e867
>>>>>>
>>>>>> Suggested-by: Jiri Olsa <olsajiri@gmail.com>
>>>>>> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
>>>>>> ---
>>>>>>    tools/lib/bpf/libbpf.c | 92 +++++++++++++++++++++++++++++++++++++-----
>>>>>>    1 file changed, 83 insertions(+), 9 deletions(-)
>>>>>>
>>>>>
>>>>> Question to you and Jiri: what happens when multi-kprobe's syms has
>>>>> duplicates? Will the program be attached multiple times? If yes, then
>>>>> it sounds like a problem? Both available_filters and kallsyms can have
>>>>> duplicate function names in them, right?
>>>>
>>>> If I understand correctly, registering the same function repeatedly
>>>> should not be a problem. kprobe.multi does not do this work itself;
>>>> the bottom layer registers the addrs through fprobe, and fprobe is
>>>> based on ftrace, which stores the registered addrs in a hash. That
>>>> is, a repeated address should be filtered out.
>>>>
>>>
>>> Looking at the kernel code, it seems the kernel will actually return an
>>> error if the user specifies duplicated names, because the kernel will
>>> bsearch() to the first instance and never resolve the second
>>> duplicated instance. It will then assume that not all symbols were
>>> resolved.
>>
>> right, as I wrote here [1], it will fail
>>
>> [1] https://lore.kernel.org/bpf/ZHB0xNEbjmwHv18d@krava/
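
To illustrate the failure mode described above, here is a simplified
userspace model of name resolution with bsearch(). It is a sketch of the
behaviour as described in this thread, not the kernel's code:
resolve_names() and its parameters are hypothetical, and an address of 0 is
assumed to mean "slot not resolved yet".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Compare two array elements of type "const char *" by string content. */
static int cmp_name(const void *a, const void *b)
{
	return strcmp(*(const char * const *)a, *(const char * const *)b);
}

/*
 * Simplified model: `req` is the sorted list of requested names. Every
 * known symbol is looked up in `req` with bsearch(), and the slot that
 * bsearch() lands on receives the symbol's address.
 * Returns 0 if every request slot was filled, -ESRCH otherwise.
 */
static int resolve_names(const char **req, unsigned long *addrs, size_t nreq,
			 const char **sym_names,
			 const unsigned long *sym_addrs, size_t nsyms)
{
	size_t i, found = 0;

	assert(req && addrs);
	memset(addrs, 0, nreq * sizeof(*addrs));

	for (i = 0; i < nsyms && found < nreq; i++) {
		const char *key = sym_names[i];
		const char **hit;

		hit = bsearch(&key, req, nreq, sizeof(*req), cmp_name);
		if (!hit)
			continue;	/* symbol was not requested */
		if (addrs[hit - req])
			continue;	/* this slot is already resolved */

		addrs[hit - req] = sym_addrs[i];
		found++;
	}

	/* A duplicated name leaves one slot empty, so found < nreq. */
	return found == nreq ? 0 : -ESRCH;
}
```

In this model a request list that contains the same name twice can never be
fully resolved: both lookups of that name land on the same slot, so the
other slot stays empty and the call reports an error.
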
>>
>>>
>>> So, it worries me that we'll switch from kallsyms to available_filters
>>> by default, because that introduces new failure modes.
>>
>> we did not care about duplicates with kallsyms because we used addresses,
>> and I think with duplicate addresses the kprobe_multi link will probably
>> attach (need to check), while with duplicate symbols it won't.
>>
>> perhaps we could make sure we don't pass duplicate symbols?
> 
> I think we have to stick to kallsyms and addresses. What if I actually
> want to attach to all instances of type_show? We should take into
> account available_filter_functions, but still use addresses from
> kallsyms.
> 
> I'd also advocate working on a version of available_filter_functions
> that reports not just function names, but also their associated
> addresses. That would actually eliminate the need for kallsyms.
>
> I chatted with Steven Rostedt about this at the last LSF/MM/BPF
> conference, and I think we both agreed that a) we have all the
> information in the kernel to implement this and b) it's a good idea to
> expose all of that to user space. For backwards compat reasons it will
> have to be a separate file, but it's generated on the fly, so it's not
> a big deal in terms of resource usage.
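
A minimal sketch of the approach suggested above (take the set of traceable
names from available_filter_functions, but keep attaching by address from
kallsyms), assuming both files have already been parsed into memory. The
struct ksym type and the collect_traceable_addrs() helper are hypothetical
names used for illustration and are not part of libbpf; pattern matching is
left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One parsed /proc/kallsyms entry (hypothetical type, for illustration). */
struct ksym {
	unsigned long addr;
	const char *name;
};

/* Compare two array elements of type "const char *" by string content. */
static int cmp_traceable(const void *a, const void *b)
{
	return strcmp(*(const char * const *)a, *(const char * const *)b);
}

/*
 * Collect the address of every kallsyms entry whose name appears in
 * `traceable`, the list of names read from available_filter_functions,
 * which must already be sorted with cmp_traceable(). Entries that share
 * a name each contribute their own address.
 * Returns the number of addresses written to `out` (at most `out_cap`).
 */
static size_t collect_traceable_addrs(const struct ksym *syms, size_t nsyms,
				      const char **traceable, size_t ntraceable,
				      unsigned long *out, size_t out_cap)
{
	size_t i, cnt = 0;

	assert(out || out_cap == 0);

	for (i = 0; i < nsyms && cnt < out_cap; i++) {
		const char *key = syms[i].name;

		if (bsearch(&key, traceable, ntraceable,
			    sizeof(*traceable), cmp_traceable))
			out[cnt++] = syms[i].addr;
	}
	return cnt;
}
```

Because the lookup runs once per kallsyms entry, two functions that share a
name, such as type_show, both contribute their addresses, while names that
are missing from available_filter_functions are skipped.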

Yes. I noticed that the latest version of the kernel has added
touched_functions and enabled_functions; are those the files you mean?
I'm not sure. Perhaps we can wait for such an interface to appear, then
switch to it directly and submit this patch again.

-- 
Jackie Liu

> 
> 
>>
>> we do the kprobe_multi bench with symbol names read from available_filter_functions
>> and we filter out duplicates
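
Filtering out duplicate names, as mentioned above, can be sketched as a
sort followed by a single compaction pass. This is an illustration only:
dedup_syms() is a hypothetical helper, not the code used by the
kprobe_multi bench, and it assumes the names are already in an in-memory
array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Compare two array elements of type "const char *" by string content. */
static int cmp_sym(const void *a, const void *b)
{
	return strcmp(*(const char * const *)a, *(const char * const *)b);
}

/*
 * Sort `syms` in place and drop repeated names, keeping one copy of each.
 * Returns the number of unique names left at the front of the array.
 */
static size_t dedup_syms(const char **syms, size_t cnt)
{
	size_t i, last = 0;

	if (cnt == 0)
		return 0;
	assert(syms);

	qsort(syms, cnt, sizeof(*syms), cmp_sym);

	/* After sorting, equal names are adjacent; keep the first of each run. */
	for (i = 1; i < cnt; i++) {
		if (strcmp(syms[last], syms[i]) != 0)
			syms[++last] = syms[i];
	}
	return last + 1;
}
```

Note that this drops the extra instances of a name entirely, so it avoids
the duplicate-symbol failure at the cost of attaching to only one function
per name.
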
>>
>> jirka
>>
>>>
>>> Either way, let's add a selftest that uses a duplicate function name
>>> and see what happens?
>>>
>>>> The main problem here is not repeated registration of functions, but
>>>> that some functions are not allowed to be hooked. For example, when
>>>> I trace vfs_*, vfs_set_acl_prepare_kgid and vfs_set_acl_prepare_kuid
>>>> cannot be hooked. They exist in kallsyms but not in
>>>> available_filter_functions. From what I have observed for a while,
>>>> matching through available_filter_functions effectively prevents this
>>>> failure.
>>>
>>> Yeah, I understand that. My point above is that a)
>>> available_filter_functions contains duplicates and b) doesn't contain
>>> addresses. So we are forced to rely on kernel string -> addr
>>> resolution, which doesn't seem to handle duplicate entries well (let's
>>> test).
>>>
>>> So it's a regression to switch to that without taking any other precautions.
>>>
>>>>
>>>>>
>>>>>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>>>>>> index ad1ec893b41b..3dd72d69cdf7 100644
>>>>>> --- a/tools/lib/bpf/libbpf.c
>>>>>> +++ b/tools/lib/bpf/libbpf.c
>>>>>> @@ -10417,13 +10417,14 @@ static bool glob_match(const char *str, const char *pat)
>>>>>>    struct kprobe_multi_resolve {
>>>>>>           const char *pattern;
>>>>>>           unsigned long *addrs;
>>>>>> +       const char **syms;
>>>>>>           size_t cap;
>>>>>>           size_t cnt;
>>>>>>    };
>>>>>>
>>>
>>> [...]

Thread overview: 24+ messages
2023-05-23 13:25 [PATCH] libbpf: kprobe.multi: Filter with blacklist and available_filter_functions Jackie Liu
2023-05-23 16:17 ` Jiri Olsa
2023-05-23 18:22   ` Andrii Nakryiko
2023-05-24  7:03     ` Jiri Olsa
2023-05-24  1:03   ` Jackie Liu
2023-05-24  1:19     ` Jackie Liu
2023-05-24  6:47       ` Jiri Olsa
2023-05-24  7:06         ` Jackie Liu
2023-05-24  8:41         ` [PATCH v3] libbpf: kprobe.multi: Filter with available_filter_functions Jackie Liu
2023-05-25  8:44           ` Jiri Olsa
2023-05-25 10:27             ` [PATCH v4] " Jackie Liu
2023-05-25 20:43               ` Andrii Nakryiko
2023-05-26  1:38                 ` Jackie Liu
2023-05-26  8:58                   ` Jiri Olsa
2023-06-02 17:27                   ` Andrii Nakryiko
2023-06-07  6:01                     ` Jackie Liu
2023-06-07 22:37                       ` Andrii Nakryiko
2023-06-07 23:22                     ` Jiri Olsa
2023-06-08  0:00                       ` Andrii Nakryiko
2023-06-08  0:57                         ` Jackie Liu [this message]
2023-05-26  2:10                 ` [PATCH v5] " Jackie Liu
2023-05-26  9:53                   ` Jiri Olsa
2023-05-26 12:18                     ` Jackie Liu
2023-05-24  3:44   ` [PATCH v2] " Jackie Liu
