linux-modules.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>,
	Jiri Kosina <jikos@kernel.org>, Miroslav Benes <mbenes@suse.cz>,
	Joe Lawrence <joe.lawrence@redhat.com>,
	live-patching@vger.kernel.org, linux-kernel@vger.kernel.org,
	Masahiro Yamada <masahiroy@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>, Jiri Olsa <jolsa@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	linux-modules@vger.kernel.org
Subject: Re: [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y
Date: Tue, 20 Sep 2022 19:26:43 +0200	[thread overview]
Message-ID: <Yyn305PlgTZixR0V@alley> (raw)
In-Reply-To: <20220909130016.727-2-thunder.leizhen@huawei.com>

On Fri 2022-09-09 21:00:09, Zhen Lei wrote:
> Currently, to search for a symbol, we need to expand the symbols in
> 'kallsyms_names' one by one, and then use the expanded string for
> comparison. This is very slow.
> 
> In fact, we can first compress the name being looked up and then use
> it for comparison when traversing 'kallsyms_names'.

This does not explain how this patch modifies the compressed data
and why it is needed.


> This increases the size of 'kallsyms_names'. About 48KiB, 2.67%, on x86
> with defconfig.
> Before: kallsyms_num_syms=131392, sizeof(kallsyms_names)=1823659
> After : kallsyms_num_syms=131392, sizeof(kallsyms_names)=1872418
> 
> However, if CONFIG_KALLSYMS_ALL is not set, the size of 'kallsyms_names'
> does not change.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  scripts/kallsyms.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
> index f18e6dfc68c5839..ab6fe7cd014efd1 100644
> --- a/scripts/kallsyms.c
> +++ b/scripts/kallsyms.c
> @@ -60,6 +60,7 @@ static unsigned int table_size, table_cnt;
>  static int all_symbols;
>  static int absolute_percpu;
>  static int base_relative;
> +static int sym_start_idx;
>  
>  static int token_profit[0x10000];
>  
> @@ -511,7 +512,7 @@ static void learn_symbol(const unsigned char *symbol, int len)
>  {
>  	int i;
>  
> -	for (i = 0; i < len - 1; i++)
> +	for (i = sym_start_idx; i < len - 1; i++)
>  		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]++;

This skips the first character in the @symbol string. I do not see how
this is used in the new code, for example, in
kallsyms_on_each_match_symbol(), in the 5th patch. It seems to iterate
the compressed data from the 0th index:

	for (i = 0, off = 0; i < kallsyms_num_syms; i++)

>  }
>  
> @@ -520,7 +521,7 @@ static void forget_symbol(const unsigned char *symbol, int len)
>  {
>  	int i;
>  
> -	for (i = 0; i < len - 1; i++)
> +	for (i = sym_start_idx; i < len - 1; i++)
>  		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]--;
>  }
>  
> @@ -538,7 +539,7 @@ static unsigned char *find_token(unsigned char *str, int len,
>  {
>  	int i;
>  
> -	for (i = 0; i < len - 1; i++) {
> +	for (i = sym_start_idx; i < len - 1; i++) {
>  		if (str[i] == token[0] && str[i+1] == token[1])
>  			return &str[i];
>  	}
> @@ -780,6 +781,14 @@ int main(int argc, char **argv)
>  	} else if (argc != 1)
>  		usage();
>  
> +	/*
> +	 * Skip the symbol type, do not compress it to optimize the performance
> +	 * of finding or traversing symbols in kernel, this is good for modules
> +	 * such as livepatch.

I see. The type is added as the first character here.

in static struct sym_entry *read_symbol(FILE *in)
{
[...]
	/* include the type field in the symbol name, so that it gets
	 * compressed together */
[...]
	sym->sym[0] = type;
	strcpy(sym_name(sym), name);

It sounds a bit crazy. read_symbol() makes a trick so that the type
can be compressed. This patch does another trick to avoid it.


> +	 */
> +	if (all_symbols)
> +		sym_start_idx = 1;

This looks a bit fragile. My understanding is that the new code in
kernel/kallsyms.c and kernel/module/kallsyms.c depends on this change.

The faster search is used when CONFIG_KALLSYMS_ALL is defined.
But the data are compressed this way when this script is called
with --all-symbols.

Is it guaranteed that this script will generate the needed data
when CONFIG_KALLSYMS_ALL is defined?

What about 3rd party modules?

I would personally suggest to store the symbol type into a separate
sym->type entry in struct sym_entry and never compress it.

IMHO, the size win is not worth the code complexity.

Well, people compiling the kernel for small devices might think
different. But they probably disable kallsyms completely.

Best Regards,
Petr

  reply	other threads:[~2022-09-20 17:26 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-09 13:00 [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
2022-09-09 13:00 ` [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y Zhen Lei
2022-09-20 17:26   ` Petr Mladek [this message]
2022-09-21  2:42     ` Leizhen (ThunderTown)
2022-09-21  6:51       ` Leizhen (ThunderTown)
2022-09-21  7:13       ` Petr Mladek
2022-09-09 13:00 ` [PATCH v2 2/8] scripts/kallsyms: rename build_initial_tok_table() Zhen Lei
2022-09-09 13:00 ` [PATCH v2 3/8] kallsyms: Adjust the types of some local variables Zhen Lei
2022-09-09 13:00 ` [PATCH v2 4/8] kallsyms: Improve the performance of kallsyms_lookup_name() Zhen Lei
2022-09-09 13:00 ` [PATCH v2 5/8] kallsyms: Add helper kallsyms_on_each_match_symbol() Zhen Lei
2022-09-09 13:00 ` [PATCH v2 6/8] livepatch: Use kallsyms_on_each_match_symbol() to improve performance Zhen Lei
2022-09-09 13:00 ` [PATCH v2 7/8] livepatch: Improve the search performance of module_kallsyms_on_each_symbol() Zhen Lei
2022-09-20 12:08   ` Petr Mladek
2022-09-20 14:01     ` Leizhen (ThunderTown)
2022-09-21  6:56       ` Petr Mladek
2022-09-09 13:00 ` [PATCH v2 8/8] kallsyms: Add self-test facility Zhen Lei
2022-09-17  8:07   ` Kees Cook
2022-09-17 12:40     ` Leizhen (ThunderTown)
2022-09-16  1:27 ` [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Leizhen (ThunderTown)
2022-09-16  3:17   ` Leizhen (ThunderTown)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yyn305PlgTZixR0V@alley \
    --to=pmladek@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=ast@kernel.org \
    --cc=jikos@kernel.org \
    --cc=joe.lawrence@redhat.com \
    --cc=jolsa@kernel.org \
    --cc=jpoimboe@kernel.org \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=masahiroy@kernel.org \
    --cc=mbenes@suse.cz \
    --cc=mcgrof@kernel.org \
    --cc=thunder.leizhen@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).