BPF List
From: Kui-Feng Lee <sinquersw@gmail.com>
To: Alan Maguire <alan.maguire@oracle.com>,
	acme@kernel.org, yhs@fb.com, ast@kernel.org, olsajiri@gmail.com,
	timo@incline.eu
Cc: daniel@iogearbox.net, andrii@kernel.org, songliubraving@fb.com,
	john.fastabend@gmail.com, kpsingh@chromium.org, sdf@google.com,
	haoluo@google.com, martin.lau@kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH dwarves 4/5] btf_encoder: represent "."-suffixed optimized functions (".isra.0") in BTF
Date: Thu, 26 Jan 2023 09:43:26 -0800
Message-ID: <9d2a5966-7cef-0c35-8990-368fc6de930d@gmail.com>
In-Reply-To: <e719fbaf-9387-7818-c9dd-7deb545eb60e@oracle.com>


On 1/25/23 10:59, Alan Maguire wrote:
> On 25/01/2023 17:54, Kui-Feng Lee wrote:
>> On 1/24/23 05:45, Alan Maguire wrote:
>>> +/*
>>> + * static functions with suffixes are not added yet - we need to
>>> + * observe across all CUs to see if the static function has
>>> + * optimized parameters in any CU, since in such a case it should
>>> + * not be included in the final BTF.  NF_HOOK.constprop.0() is
>>> + * a case in point - it has optimized-out parameters in some CUs
>>> + * but not others.  In order to have consistency (since we do not
>>> + * know which instance the BTF-specified function signature will
>>> + * apply to), we simply skip adding functions which have optimized
>>> + * out parameters anywhere.
>>> + */
>>> +static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct function *fn)
>>> +{
>>> +    struct btf_encoder *parent = encoder->parent ? encoder->parent : encoder;
>>> +    const char *name = function__name(fn);
>>> +    struct function **nodep;
>>> +    int ret = 0;
>>> +
>>> +    pthread_mutex_lock(&parent->saved_func_lock);
>> Do you have the number of static functions with suffixes?
>>
> There are a few thousand, and around 25000 static functions
> overall ("."-suffixed are all static) that will participate in
> the tree representations (see patch 5).  This equates to roughly
> half of the vmlinux BTF functions.


To evaluate the cost of your patchset, I ran the following command
with and without it:

`time env LLVM_OBJCOPY=objcopy pahole -J --btf_gen_floats 
--lang_exclude=rust -j .tmp_vmlinux.btf`.


On my machine, it took about

  - 9s w/o the patchset (3s waiting for the worker threads)

  - 13s w/ the patchset (7s waiting for the worker threads)

That is a difference of about 4s.

If I turned multi-threading off (w/o -j), it took

  - 28s w/o the patchset.

  - 32s w/ the patchset.

That is a difference of about 4s as well.


So the work added by this patchset does not benefit from
multi-threading: it costs about 4s with or without -j. Lock contention
likely plays a large part here. For scale, an incremental Linux build
after a trivial change to the source tree takes about 96s~108s, so
these four seconds are approximately 4% of the build time.
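
To spell out the arithmetic behind those figures, here is a minimal
sketch. The helper names are made up for illustration; the inputs are
the timings quoted above.

```c
/* Extra wall-clock seconds attributable to the patchset. */
static double added_cost(double without_patch, double with_patch)
{
	return with_patch - without_patch;
}

/* Share of a build, in percent, taken up by `cost` seconds. */
static double share_pct(double cost, double build_secs)
{
	return 100.0 * cost / build_secs;
}
```

With the numbers above, added_cost(9, 13) and added_cost(28, 32) both
give 4, and share_pct(4, 96) and share_pct(4, 108) bracket the "about
4%" estimate.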


Given previous experience, where BTF info processing time (not
including loading and I/O) was reduced to 13%, I am not sure whether it
is worth investing my time in reducing the 4s to <1s. That said, saving
3 seconds every single time I need to rebuild the tree after some small
change might be worth it.


>
>> If the number of static functions with suffixes is high, the contention of the lock would be an issue.
>>
>> Is it possible to keep a local pool of static functions with suffixes? The pool will be combined with its parent either at the completion of a CU, before ending the thread or when merging into the main thread.
>>
> It's possible alright. I'll try to lay out the possibilities so we
> can figure out the best way forward.
>
> Option 1: global tree of static functions, created during DWARF loading
>
> Pro: Quick addition/lookup, we can flag optimizations or inconsistent prototypes as
> we encounter them.
> Con: Lock contention between encoder threads.
>
> Option 2: store static functions in a per-encoder tree, traverse them all
> prior to BTF merging to eliminate unwanted functions
>
> Pro: limits contention.
> Con: for each static function in each encoder, we need to look it up in all other
> encoder trees. In option 1 we paid that price as the function was added, here
> we pay it later on prior to merging. So processing here is
> O(number_functions * num_encoders). There may be a cleverer way to handle
> this but I can't see it right now.
>
> There may be other approaches to this of course, but these were the two I
> could come up with. What do you think?


Option 2 appears to be the more convenient and effective solution, 
whereas Option 1, I guess, will require considerable effort for a 
successful outcome.
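
To make the per-encoder idea concrete, here is a rough sketch, not a
tested patch. It assumes each encoder thread owns a private tsearch()
tree, so saving needs no lock, and that the main thread folds the local
trees into one tree after the workers are joined. struct saved_func,
local_save_func() and merge_saved_funcs() are hypothetical stand-ins,
not names from pahole.

```c
#include <search.h>
#include <string.h>

/* Hypothetical stand-in for the per-function state; not pahole's struct function. */
struct saved_func {
	const char *name;
	int optimized_parms;	/* set if any CU saw optimized-out parameters */
};

static int saved_func_cmp(const void *a, const void *b)
{
	const struct saved_func *fa = a, *fb = b;

	return strcmp(fa->name, fb->name);
}

/* Add fn to a tree; if the name is already there, merge the flag. */
static int local_save_func(void **tree, struct saved_func *fn)
{
	struct saved_func **nodep = tsearch(fn, tree, saved_func_cmp);

	if (nodep == NULL)
		return -1;	/* out of memory */
	if (*nodep != fn)
		(*nodep)->optimized_parms |= fn->optimized_parms;
	return 0;
}

/* twalk() has no user-data argument, so the destination lives here. */
static void **merge_dst;
static int merge_err;

static void merge_visit(const void *nodep, VISIT which, int depth)
{
	struct saved_func *fn = *(struct saved_func *const *)nodep;

	(void)depth;
	/* postorder + leaf together visit every node exactly once */
	if (which != postorder && which != leaf)
		return;
	if (local_save_func(merge_dst, fn) < 0)
		merge_err = -1;
}

/* Fold one encoder's local tree into the parent tree (single-threaded). */
static int merge_saved_funcs(void **parent_tree, const void *local_root)
{
	merge_dst = parent_tree;
	merge_err = 0;
	twalk(local_root, merge_visit);
	return merge_err;
}
```

Because merge_dst is file-scope, merge_saved_funcs() is only meant to
be called from one thread, e.g. the main thread once the workers have
finished. Each merge costs one tsearch() per local entry, rather than a
lookup in every other encoder's tree.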


> Alan
>
>>> +    nodep = tsearch(fn, &parent->saved_func_tree, function__compare);
>>> +    if (nodep == NULL) {
>>> +        fprintf(stderr, "error: out of memory adding local function '%s'\n",
>>> +            name);
>>> +        ret = -1;
>>> +        goto out;
>>> +    }
>>> +    /* If we find an existing entry, we want to merge observations
>>> +     * across both functions, checking that the "seen optimized-out
>>> +     * parameters" status is reflected in our tree entry.
>>> +     * If the entry is new, record encoder state required
>>> +     * to add the local function later (encoder + type_id_off)
>>> +     * such that we can add the function later.
>>> +     */
>>> +    if (*nodep != fn) {
>>> +        (*nodep)->proto.optimized_parms |= fn->proto.optimized_parms;
>>> +    } else {
>>> +        struct btf_encoder_state *state = zalloc(sizeof(*state));
>>> +
>>> +        if (state == NULL) {
>>> +            fprintf(stderr, "error: out of memory adding local function '%s'\n",
>>> +                name);
>>> +            ret = -1;
>>> +            goto out;
>>> +        }
>>> +        state->encoder = encoder;
>>> +        state->type_id_off = encoder->type_id_off;
>>> +        fn->priv = state;
>>> +        encoder->saved_func_cnt++;
>>> +        if (encoder->verbose)
>>> +            printf("added local function '%s'%s\n", name,
>>> +                   fn->proto.optimized_parms ?
>>> +                   ", optimized-out params" : "");
>>> +    }
>>> +out:
>>> +    pthread_mutex_unlock(&parent->saved_func_lock);
>>> +    return ret;
>>> +}
>>> +
