From: Jackie Liu <liu.yun@linux.dev>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Jiri Olsa <olsajiri@gmail.com>
Cc: andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
	yhs@fb.com, bpf@vger.kernel.org, liuyun01@kylinos.cn
Subject: Re: [PATCH v4] libbpf: kprobe.multi: Filter with available_filter_functions
Date: Thu, 8 Jun 2023 08:57:15 +0800
Message-ID: <7fecf93e-eaf9-1ffe-4f1d-64f530828363@linux.dev>
In-Reply-To: <CAEf4BzaNpxNZ12N1JY4=EijXv14oWQMQpjF8t4zt-ZaYNp+U=Q@mail.gmail.com>



On 2023/6/8 08:00, Andrii Nakryiko wrote:
> On Wed, Jun 7, 2023 at 4:22 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>>
>> On Fri, Jun 02, 2023 at 10:27:31AM -0700, Andrii Nakryiko wrote:
>>> On Thu, May 25, 2023 at 6:38 PM Jackie Liu <liu.yun@linux.dev> wrote:
>>>>
>>>> Hi Andrii.
>>>>
>>>> On 2023/5/26 04:43, Andrii Nakryiko wrote:
>>>>> On Thu, May 25, 2023 at 3:28 AM Jackie Liu <liu.yun@linux.dev> wrote:
>>>>>>
>>>>>> From: Jackie Liu <liuyun01@kylinos.cn>
>>>>>>
>>>>>> When using regular expression matching with "kprobe multi", it scans all
>>>>>> the functions under "/proc/kallsyms" that can be matched. However, not all
>>>>>> of them can be traced by kprobe.multi. If any one of the functions fails
>>>>>> to be traced, it will result in the failure of all functions. The best
>>>>>> approach is to filter out the functions that cannot be traced to ensure
>>>>>> proper tracking of the functions.
>>>>>>
>>>>>> Check available_filter_functions first; if that fails, fall back to
>>>>>> kallsyms.
>>>>>>
>>>>>> Here is the test eBPF program [1].
>>>>>> [1] https://github.com/JackieLiu1/ketones/commit/a9e76d1ba57390e533b8b3eadde97f7a4535e867
>>>>>>
>>>>>> Suggested-by: Jiri Olsa <olsajiri@gmail.com>
>>>>>> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
>>>>>> ---
>>>>>>    tools/lib/bpf/libbpf.c | 92 +++++++++++++++++++++++++++++++++++++-----
>>>>>>    1 file changed, 83 insertions(+), 9 deletions(-)
>>>>>>
>>>>>
>>>>> Question to you and Jiri: what happens when multi-kprobe's syms has
>>>>> duplicates? Will the program be attached multiple times? If yes, then
>>>>> it sounds like a problem? Both available_filters and kallsyms can have
>>>>> duplicate function names in them, right?
>>>>
>>>> If I understand correctly, repeated function registration should not be
>>>> a problem, because the bottom layer registers addrs through fprobe;
>>>> kprobe.multi itself does not do this work. fprobe is based on ftrace,
>>>> which registers each addr in a hash, so if the same address appears
>>>> twice, the duplicate should be filtered out.
>>>>
>>>
>>> Looking at kernel code, it seems kernel will actually return error if
>>> user specifies multiple duplicated names. Because kernel will
>>> bsearch() to the first instance, and never resolve the second
>>> duplicated instance. And then will assume that not all symbols are
>>> resolved.
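The failure mode described in the paragraph above can be sketched in user space. This is a simplified model and not the kernel's code: it assumes the requested names are sorted, that each symbol-table entry is looked up with bsearch(), and that a result slot is filled at most once. `struct ksym`, `resolve_names()` and the sample data in the note below are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one kallsyms entry. */
struct ksym {
	const char *name;
	unsigned long addr;
};

/* bsearch() comparator: key is a C string, elem points at a C string. */
static int cmp_name(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char *const *)elem);
}

/*
 * Resolve `cnt` sorted names against `table`, storing one address per
 * name slot. Returns how many slots were filled. A caller that sees
 * fewer than `cnt` would treat the request as "not all symbols resolved".
 */
static size_t resolve_names(const char *const *names, size_t cnt,
			    const struct ksym *table, size_t table_cnt,
			    unsigned long *addrs)
{
	size_t found = 0;

	assert(names && table && addrs);
	memset(addrs, 0, cnt * sizeof(*addrs));

	for (size_t i = 0; i < table_cnt; i++) {
		const char *const *hit = bsearch(table[i].name, names, cnt,
						 sizeof(*names), cmp_name);
		size_t idx;

		if (!hit)
			continue;
		idx = (size_t)(hit - names);
		if (addrs[idx])
			continue; /* bsearch() keeps landing on the same slot */
		addrs[idx] = table[i].addr;
		found++;
	}
	return found;
}
```

With the names { "type_show", "type_show", "vfs_read" }, every lookup of "type_show" lands on the same slot, so only two of the three slots are filled even when the table holds two distinct type_show entries. In this model the duplicate name is what makes the whole request look unresolved.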
>>
>> right, as I wrote in here [1] it will fail
>>
>> [1] https://lore.kernel.org/bpf/ZHB0xNEbjmwHv18d@krava/
>>
>>>
>>> So, it worries me that we'll switch from kallsyms to available_filters
>>> by default, because that introduces new failure modes.
>>
>> we did not care about duplicates with kallsyms because we used addresses,
>> and I think with duplicate addresses the kprobe_multi link will probably
>> attach (need to check), while with duplicate symbols it won't..
>>
>> perhaps we could make sure we don't pass duplicate symbols?
> 
> I think we have to stick to kallsyms and addresses. What if I actually
> want to attach to all instances of type_show? We should take into
> account available_filter_functions, but still use addresses from
> kallsyms.
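One way to read the suggestion above, as a rough sketch rather than what libbpf does: treat the names from available_filter_functions as an allow-list, but collect addresses from kallsyms, so that every instance of a repeated name such as type_show is kept. `struct ksym` and `collect_traceable_addrs()` are hypothetical, and the sketch assumes the allow-list is already sorted and that every instance of a listed name is traceable, which a name-only file cannot confirm.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one kallsyms entry. */
struct ksym {
	const char *name;
	unsigned long addr;
};

/* bsearch() comparator: key is a C string, elem points at a C string. */
static int cmp_key(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char *const *)elem);
}

/*
 * Copy into `addrs` the address of every kallsyms entry whose name is
 * present in the sorted `traceable` list. `addrs` must have room for
 * `ksym_cnt` entries. Returns the number of addresses stored.
 */
static size_t collect_traceable_addrs(const struct ksym *ksyms, size_t ksym_cnt,
				      const char *const *traceable,
				      size_t traceable_cnt,
				      unsigned long *addrs)
{
	size_t n = 0;

	assert(ksyms && traceable && addrs);

	for (size_t i = 0; i < ksym_cnt; i++) {
		/* Names missing from the allow-list are skipped entirely. */
		if (!bsearch(ksyms[i].name, traceable, traceable_cnt,
			     sizeof(*traceable), cmp_key))
			continue;
		/* Duplicate names are fine here: each has its own address. */
		addrs[n++] = ksyms[i].addr;
	}
	return n;
}
```

Because the result is a list of addresses, two functions named type_show yield two entries, while a symbol such as vfs_set_acl_prepare_kgid that is absent from the allow-list is dropped before attachment.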
> 
> I'd also advocate working on having an available_filter_functions
> version reporting not just function names, but also its associated
> address. That would actually eliminate the need for kallsyms.
> 
> I chatted with Steven Rostedt about this at the last LSF/MM/BPF
> conference, and I think we agreed that a) we have all the
> information in the kernel to implement this and b) it's a good idea to
> expose all that to user space. For backwards compat reasons it will
> have to be a separate file, but it's generated on the fly, so it's not
> a big deal in terms of resource usage.

Yes, I noticed that the latest kernel has added touched_functions and
enabled_functions. Are those the interfaces you mean? I'm not sure.
Perhaps we can wait for such an interface to appear, switch to it
directly, and then resubmit this patch.

-- 
Jackie Liu

> 
> 
>>
>> we do the kprobe_multi bench with symbol names read from available_filter_functions
>> and we filter out duplicates
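Filtering out duplicate names before passing them on could look roughly like the following. This is an illustrative sketch, not the code used by the bench; `dedup_syms()` is a hypothetical helper that sorts the array in place and keeps one copy of each name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of C strings. */
static int cmp_str(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/*
 * Sort `syms` in place and drop repeated names.
 * Returns the number of unique names left at the front of the array.
 */
static size_t dedup_syms(const char **syms, size_t cnt)
{
	size_t out = 0;

	assert(syms || cnt == 0);
	if (cnt == 0)
		return 0;

	qsort(syms, cnt, sizeof(*syms), cmp_str);
	for (size_t i = 1; i < cnt; i++) {
		/* Keep a name only if it differs from the last one kept. */
		if (strcmp(syms[out], syms[i]) != 0)
			syms[++out] = syms[i];
	}
	return out + 1;
}
```

The trade-off raised elsewhere in the thread still applies: once names are deduplicated, a name that refers to several functions identifies only one entry, so attaching to every instance needs addresses instead.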
>>
>> jirka
>>
>>>
>>> Either way, let's add a selftest that uses a duplicate function name
>>> and see what happens?
>>>
>>>> The main problem here is not repeated registration of functions, but
>>>> that some functions are not allowed to be hooked. For example, when I
>>>> trace vfs_*, vfs_set_acl_prepare_kgid and vfs_set_acl_prepare_kuid are
>>>> not allowed to be hooked. They exist in kallsyms but not in
>>>> available_filter_functions. From what I have observed over a while,
>>>> matching through available_filter_functions effectively prevents this
>>>> from happening.
>>>
>>> Yeah, I understand that. My point above is that a)
>>> available_filter_functions contains duplicates and b) doesn't contain
>>> addresses. So we are forced to rely on kernel string -> addr
>>> resolution, which doesn't seem to handle duplicate entries well (let's
>>> test).
>>>
>>> So it's a regression to switch to that without taking any other precautions.
>>>
>>>>
>>>>>
>>>>>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>>>>>> index ad1ec893b41b..3dd72d69cdf7 100644
>>>>>> --- a/tools/lib/bpf/libbpf.c
>>>>>> +++ b/tools/lib/bpf/libbpf.c
>>>>>> @@ -10417,13 +10417,14 @@ static bool glob_match(const char *str, const char *pat)
>>>>>>    struct kprobe_multi_resolve {
>>>>>>           const char *pattern;
>>>>>>           unsigned long *addrs;
>>>>>> +       const char **syms;
>>>>>>           size_t cap;
>>>>>>           size_t cnt;
>>>>>>    };
>>>>>>
>>>
>>> [...]

Thread overview: 24+ messages
2023-05-23 13:25 [PATCH] libbpf: kprobe.multi: Filter with blacklist and available_filter_functions Jackie Liu
2023-05-23 16:17 ` Jiri Olsa
2023-05-23 18:22   ` Andrii Nakryiko
2023-05-24  7:03     ` Jiri Olsa
2023-05-24  1:03   ` Jackie Liu
2023-05-24  1:19     ` Jackie Liu
2023-05-24  6:47       ` Jiri Olsa
2023-05-24  7:06         ` Jackie Liu
2023-05-24  8:41         ` [PATCH v3] libbpf: kprobe.multi: Filter with available_filter_functions Jackie Liu
2023-05-25  8:44           ` Jiri Olsa
2023-05-25 10:27             ` [PATCH v4] " Jackie Liu
2023-05-25 20:43               ` Andrii Nakryiko
2023-05-26  1:38                 ` Jackie Liu
2023-05-26  8:58                   ` Jiri Olsa
2023-06-02 17:27                   ` Andrii Nakryiko
2023-06-07  6:01                     ` Jackie Liu
2023-06-07 22:37                       ` Andrii Nakryiko
2023-06-07 23:22                     ` Jiri Olsa
2023-06-08  0:00                       ` Andrii Nakryiko
2023-06-08  0:57                         ` Jackie Liu [this message]
2023-05-26  2:10                 ` [PATCH v5] " Jackie Liu
2023-05-26  9:53                   ` Jiri Olsa
2023-05-26 12:18                     ` Jackie Liu
2023-05-24  3:44   ` [PATCH v2] " Jackie Liu
