From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-wr1-f44.google.com (mail-wr1-f44.google.com [209.85.221.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9B8F53033F7 for ; Thu, 23 Apr 2026 14:00:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.44 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776952811; cv=none; b=NfW1Pk52GCwpMYUOkUbCx01iBv+h+IB+kjJnFdmn+3l952p/ZqdIN5cZ4OnQ2ocuyuCMjT03SaPdQ6P7HKIuJ40184IuC9UPatg5K8afUzwcFCB3hYeZFEL2zhU42XX8E7zgRlfOW5Jdmie5hH3iOJEt2wJrCRKkuRI3nB5nnIc= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776952811; c=relaxed/simple; bh=e6wZXuQNZj/lX9I9qWia7hKXHCqhnnmYtXf6RYXsHNg=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=UW9uyX/GUMOwrJTJcrf1oUTW3ZfIqE2wtnRiTIk6HVUMUYKTDsxIQ0TtS1fbTwtL8mqflsTnHRz3ey9dXkMW7ymGhytPJNY2s7x2rqqLL6n+ZWsnYEAB7yM1kg3vMKZCuG3/glH2mvnbCLeHRzoTfYFo9GeRrBh5bOjQ+IERSAo= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com; spf=pass smtp.mailfrom=suse.com; dkim=pass (2048-bit key) header.d=suse.com header.i=@suse.com header.b=FMpo88C1; arc=none smtp.client-ip=209.85.221.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=suse.com header.i=@suse.com header.b="FMpo88C1" Received: by mail-wr1-f44.google.com with SMTP id ffacd0b85a97d-43fde5b81a1so5026559f8f.0 for ; Thu, 23 Apr 2026 07:00:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1776952807; x=1777557607; darn=vger.kernel.org; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=XRQ+3xe0xHI/qgQNJnJ3B7Leq+8E0R1UPebMSgZhgzM=; b=FMpo88C1KZXh+i0ReTiYxHQn+drFez/cF1qEd6tWtMJhU5VqricSWUO5RAUiDVqz3C cAk6AVO81wJz9tcHUfDsVOyvcPkQcXoRWnurtgfm5vv1SkHL/dSSVQ5BRbWMS5f0yn0m aXTNq9kCFmUJoxRPGliUc4oGanUifw4sZcNrbUPiKQmAoLIZPqZIIjvHJGZ2yy5VVpns j5dpf+7ZD/N4dqruc1KPxdh1NE3kDLYhQwkjJEFVI/hCmoiZli2N48wqSHsBhyooseTj coeBxSln9/PAGyC/hqBq2A3WkUjaFHDDElCZcezBjeW7x+5XuwOnnOK0PAOvTDcH72yp /37Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776952807; x=1777557607; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :x-gm-gg:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=XRQ+3xe0xHI/qgQNJnJ3B7Leq+8E0R1UPebMSgZhgzM=; b=QpBg63s7MN2iu59FUwsU/rvv1mA36uxMF9XAZxDVnnzb37QPOzd893CwjHv80UhYkk 5wig3Ki2oJt6XZJI8nb7IvAT2aTWDqJyX01YbU5xA2pXCW2OnwlLPIEqMdERtx4Tkp9i j72VFE2/dSRM4sLLM5kTFOYqu5qOXdusn+Mrv7dK4+O5bxTSJPLGKkzdhMq911ZQUHiw GANc64kO/R+6s35OjJnGIUL51Y3NXZjwaGoYJk5eVLW8sLbSKvPOXqA2mpees/sVcv8j P4Gn4+fGW/Gpq9dY/m8Pdga7Dy7RGxL0ew8Nhw/ejkAlv0+j4CfsAfTQHc50uNZbrqLQ hnLw== X-Gm-Message-State: AOJu0YyukOORhxT0Tro9njTjMp8jtIkqc/X63Ml/HvE29E3emXyVMTjI kiDnYuRWemVqcxqdAD+w3myVVwVfB54XVej1+Z2on6SmHf7p6BbO1MEzqV2Oe5rpkoM= X-Gm-Gg: AeBDieu0a7pxbxvrjxQ0cyhrH7vlfm04zTAG0DJHpsk/i7ogQVTHR4Vstt+Hx7fnblw J6nILndXA0py+oUjYWQiuUZnU2UEHnSwrJ/kVKWShT41uRP0kBCx5RjE4VleLhjg1bQRC4Jtxy2 9hSwJYYys1Oj8GsSD9Ot90PV0z84xccS0J+fPRlyQUq8SwX+IRGjFhscIalUUB99Eb7DBE3i++s xPH0YpqoMBfSgyz/EoL+FeAD1OiPCtgsvd/OZCx8D02PyBInjGgj8twYQYavZdLu5/X6vatL7xp ntjUU6Jh/0dbN6HCaeqOIFTPweLn18MAvR56lOC6Ad1MCGXuRP5i8MYMm+twgV7/8BV9NX0fM02 2NHkskbjx1z4pnnsvjXQWyDjbcApA5GkchF+5PjauOX1WCL9ZpTQk0TAa5U0xK78mg3zhInXXt7 UV2OfCDudu7ySHLntNx0Yzq79Bb9BVDiTwW16qr891AYbBUh5BqjISW+o= X-Received: by 2002:a5d:5f92:0:b0:43b:9c73:2933 with SMTP id ffacd0b85a97d-43fe3dc1795mr42546195f8f.15.1776952806300; Thu, 23 Apr 2026 07:00:06 -0700 (PDT) Received: from [10.100.51.209] (nat2.prg.suse.com. [195.250.132.146]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-43fe4cb1176sm54804409f8f.3.2026.04.23.07.00.05 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 23 Apr 2026 07:00:05 -0700 (PDT) Message-ID: <11c8e139-f9f3-4b22-863a-4e021a3947e7@suse.com> Date: Thu, 23 Apr 2026 16:00:04 +0200 Precedence: bulk X-Mailing-List: linux-modules@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v2 2/2] module/kallsyms: sort function symbols and use binary search To: Stanislaw Gruszka Cc: linux-modules@vger.kernel.org, Sami Tolvanen , Luis Chamberlain , linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, live-patching@vger.kernel.org, Daniel Gomez , Aaron Tomlin , Steven Rostedt , Masami Hiramatsu , Jordan Rome , Viktor Malik References: <20260327110005.16499-1-stf_xl@wp.pl> <20260327110005.16499-2-stf_xl@wp.pl> Content-Language: en-US From: Petr Pavlu In-Reply-To: <20260327110005.16499-2-stf_xl@wp.pl> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit On 3/27/26 12:00 PM, Stanislaw Gruszka wrote: > Module symbol lookup via find_kallsyms_symbol() performs a linear scan > over the entire symtab when resolving an address. The number of symbols > in module symtabs has grown over the years, largely due to additional > metadata in non-standard sections, making this lookup very slow. > > Improve this by separating function symbols during module load, placing > them at the beginning of the symtab, sorting them by address, and using > binary search when resolving addresses in module text. > > This also should improve times for linear symbol name lookups, as valid > function symbols are now located at the beginning of the symtab. > > The cost of sorting is small relative to module load time. In repeated > module load tests [1], depending on .config options, this change > increases load time between 2% and 4%. With cold caches, the difference > is not measurable, as memory access latency dominates. > > The sorting theoretically could be done in compile time, but much more > complicated as we would have to simulate kernel addresses resolution > for symbols, and then correct relocation entries. That would be risky > if get out of sync. > > The improvement can be observed when listing ftrace filter functions. > > Before: > > root@nano:~# time cat /sys/kernel/tracing/available_filter_functions | wc -l > 74908 > > real 0m1.315s > user 0m0.000s > sys 0m1.312s > > After: > > root@nano:~# time cat /sys/kernel/tracing/available_filter_functions | wc -l > 74911 > > real 0m0.167s > user 0m0.004s > sys 0m0.175s > > (there are three more symbols introduced by the patch) > > For livepatch modules, the symtab layout is preserved and the existing > linear search is used. For this case, it should be possible to keep > the original ELF symtab instead of copying it 1:1, but that is outside > the scope of this patch. > > Link: https://gist.github.com/sgruszka/09f3fb1dad53a97b1aad96e1927ab117 [1] > Signed-off-by: Stanislaw Gruszka Sorry for the delay reviewing this patch. > --- > v1 -> v2: > - fix searching data symbols for CONFIG_KALLSYMS_ALL > - use kallsyms_symbol_value() in elf_sym_cmp() > > include/linux/module.h | 1 + > kernel/module/internal.h | 1 + > kernel/module/kallsyms.c | 171 +++++++++++++++++++++++++++++---------- > 3 files changed, 130 insertions(+), 43 deletions(-) > > diff --git a/include/linux/module.h b/include/linux/module.h > index ac254525014c..67c053afa882 100644 > --- a/include/linux/module.h > +++ b/include/linux/module.h > @@ -379,6 +379,7 @@ struct module_memory { > struct mod_kallsyms { > Elf_Sym *symtab; > unsigned int num_symtab; > + unsigned int num_func_syms; > char *strtab; > char *typetab; > }; > diff --git a/kernel/module/internal.h b/kernel/module/internal.h > index 618202578b42..6a4d498619b1 100644 > --- a/kernel/module/internal.h > +++ b/kernel/module/internal.h > @@ -73,6 +73,7 @@ struct load_info { > bool sig_ok; > #ifdef CONFIG_KALLSYMS > unsigned long mod_kallsyms_init_off; > + unsigned long num_func_syms; > #endif > #ifdef CONFIG_MODULE_DECOMPRESS > #ifdef CONFIG_MODULE_STATS > diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c > index f23126d804b2..d69e99e67707 100644 > --- a/kernel/module/kallsyms.c > +++ b/kernel/module/kallsyms.c > @@ -10,6 +10,7 @@ > #include > #include > #include > +#include > #include "internal.h" > > /* Lookup exported symbol in given range of kernel_symbols */ > @@ -103,6 +104,95 @@ static bool is_core_symbol(const Elf_Sym *src, const Elf_Shdr *sechdrs, > return true; > } > > +static inline bool is_func_symbol(const Elf_Sym *sym) > +{ > + return sym->st_shndx != SHN_UNDEF && sym->st_size != 0 && > + ELF_ST_TYPE(sym->st_info) == STT_FUNC; > +} > + > +static unsigned int bsearch_func_symbol(struct mod_kallsyms *kallsyms, > + unsigned long addr, > + unsigned long *bestval, > + unsigned long *nextval) > + > +{ > + unsigned int mid, low = 1, high = kallsyms->num_func_syms + 1; > + unsigned int best = 0; > + unsigned long thisval; > + > + while (low < high) { > + mid = low + (high - low) / 2; > + thisval = kallsyms_symbol_value(&kallsyms->symtab[mid]); > + > + if (thisval <= addr) { > + *bestval = thisval; > + best = mid; > + low = mid + 1; If thisval == addr, the search moves to the right and finds the last symbol with the same address. I believe it should do the opposite and return the first symbol to match the behavior of search_kallsyms_symbol(). > + } else { > + *nextval = thisval; > + high = mid; > + } > + } > + > + return best; > +} > + > +static const char *kallsyms_symbol_name(struct mod_kallsyms *kallsyms, > + unsigned int symnum) > +{ > + return kallsyms->strtab + kallsyms->symtab[symnum].st_name; > +} > + > +static unsigned int search_kallsyms_symbol(struct mod_kallsyms *kallsyms, > + unsigned long addr, > + unsigned long *bestval, > + unsigned long *nextval) > +{ > + unsigned int i, best = 0; > + > + /* > + * Scan for closest preceding symbol and next symbol. (ELF starts > + * real symbols at 1). Skip the initial function symbols range > + * if num_func_syms is non-zero, those are handled separately for > + * the core TEXT segment lookup. > + */ > + for (i = 1 + kallsyms->num_func_syms; i < kallsyms->num_symtab; i++) { > + const Elf_Sym *sym = &kallsyms->symtab[i]; > + unsigned long thisval = kallsyms_symbol_value(sym); > + > + if (sym->st_shndx == SHN_UNDEF) > + continue; > + > + /* > + * We ignore unnamed symbols: they're uninformative > + * and inserted at a whim. > + */ > + if (*kallsyms_symbol_name(kallsyms, i) == '\0' || > + is_mapping_symbol(kallsyms_symbol_name(kallsyms, i))) > + continue; > + > + if (thisval <= addr && thisval > *bestval) { > + best = i; > + *bestval = thisval; > + } > + if (thisval > addr && thisval < *nextval) > + *nextval = thisval; > + } > + > + return best; > +} > + > +static int elf_sym_cmp(const void *a, const void *b) > +{ > + unsigned long val_a = kallsyms_symbol_value((const Elf_Sym *)a); > + unsigned long val_b = kallsyms_symbol_value((const Elf_Sym *)b); > + > + if (val_a < val_b) > + return -1; > + > + return val_a > val_b; Does this comparison function and the sort() call result in stable sorting? If val_a and val_b are the same, the sorting should preserve the original order. > +} > + > /* > * We only allocate and copy the strings needed by the parts of symtab > * we keep. This is simple, but has the effect of making multiple > @@ -115,9 +205,10 @@ void layout_symtab(struct module *mod, struct load_info *info) > Elf_Shdr *symsect = info->sechdrs + info->index.sym; > Elf_Shdr *strsect = info->sechdrs + info->index.str; > const Elf_Sym *src; > - unsigned int i, nsrc, ndst, strtab_size = 0; > + unsigned int i, nsrc, ndst, nfunc, strtab_size = 0; > struct module_memory *mod_mem_data = &mod->mem[MOD_DATA]; > struct module_memory *mod_mem_init_data = &mod->mem[MOD_INIT_DATA]; > + bool is_lp_mod = is_livepatch_module(mod); > > /* Put symbol section at end of init part of module. */ > symsect->sh_flags |= SHF_ALLOC; > @@ -129,12 +220,14 @@ void layout_symtab(struct module *mod, struct load_info *info) > nsrc = symsect->sh_size / sizeof(*src); > > /* Compute total space required for the core symbols' strtab. */ > - for (ndst = i = 0; i < nsrc; i++) { > - if (i == 0 || is_livepatch_module(mod) || > + for (ndst = nfunc = i = 0; i < nsrc; i++) { > + if (i == 0 || is_lp_mod || > is_core_symbol(src + i, info->sechdrs, info->hdr->e_shnum, > info->index.pcpu)) { > strtab_size += strlen(&info->strtab[src[i].st_name]) + 1; > ndst++; > + if (!is_lp_mod && is_func_symbol(src + i)) > + nfunc++; > } > } > > @@ -156,6 +249,7 @@ void layout_symtab(struct module *mod, struct load_info *info) > mod_mem_init_data->size = ALIGN(mod_mem_init_data->size, > __alignof__(struct mod_kallsyms)); > info->mod_kallsyms_init_off = mod_mem_init_data->size; > + info->num_func_syms = nfunc; > > mod_mem_init_data->size += sizeof(struct mod_kallsyms); > info->init_typeoffs = mod_mem_init_data->size; > @@ -169,7 +263,7 @@ void layout_symtab(struct module *mod, struct load_info *info) > */ > void add_kallsyms(struct module *mod, const struct load_info *info) > { > - unsigned int i, ndst; > + unsigned int i, di, nfunc, ndst; > const Elf_Sym *src; > Elf_Sym *dst; > char *s; > @@ -178,6 +272,7 @@ void add_kallsyms(struct module *mod, const struct load_info *info) > void *data_base = mod->mem[MOD_DATA].base; > void *init_data_base = mod->mem[MOD_INIT_DATA].base; > struct mod_kallsyms *kallsyms; > + bool is_lp_mod = is_livepatch_module(mod); > > kallsyms = init_data_base + info->mod_kallsyms_init_off; This code is followed by the initialization of kallsyms: kallsyms->symtab = (void *)symsec->sh_addr; kallsyms->num_symtab = symsec->sh_size / sizeof(Elf_Sym); /* Make sure we get permanent strtab: don't use info->strtab. */ kallsyms->strtab = (void *)info->sechdrs[info->index.str].sh_addr; kallsyms->typetab = init_data_base + info->init_typeoffs; I suggest adding 'kallsyms->num_func_syms = 0;' after the initialization of kallsyms->num_symtab. > > @@ -194,19 +289,28 @@ void add_kallsyms(struct module *mod, const struct load_info *info) > mod->core_kallsyms.symtab = dst = data_base + info->symoffs; > mod->core_kallsyms.strtab = s = data_base + info->stroffs; > mod->core_kallsyms.typetab = data_base + info->core_typeoffs; > + > strtab_size = info->core_typeoffs - info->stroffs; > src = kallsyms->symtab; > - for (ndst = i = 0; i < kallsyms->num_symtab; i++) { > + ndst = info->num_func_syms + 1; > + > + for (nfunc = i = 0; i < kallsyms->num_symtab; i++) { > kallsyms->typetab[i] = elf_type(src + i, info); > - if (i == 0 || is_livepatch_module(mod) || > + if (i == 0 || is_lp_mod || > is_core_symbol(src + i, info->sechdrs, info->hdr->e_shnum, > info->index.pcpu)) { > ssize_t ret; > > - mod->core_kallsyms.typetab[ndst] = > - kallsyms->typetab[i]; > - dst[ndst] = src[i]; > - dst[ndst++].st_name = s - mod->core_kallsyms.strtab; > + if (i == 0) > + di = 0; > + else if (!is_lp_mod && is_func_symbol(src + i)) > + di = 1 + nfunc++; > + else > + di = ndst++; > + > + mod->core_kallsyms.typetab[di] = kallsyms->typetab[i]; > + dst[di] = src[i]; > + dst[di].st_name = s - mod->core_kallsyms.strtab; > ret = strscpy(s, &kallsyms->strtab[src[i].st_name], > strtab_size); > if (ret < 0) > @@ -216,9 +320,13 @@ void add_kallsyms(struct module *mod, const struct load_info *info) > } > } > > + WARN_ON_ONCE(nfunc != info->num_func_syms); > + sort(dst + 1, nfunc, sizeof(Elf_Sym), elf_sym_cmp, NULL); > + The code sorts mod->core_kallsyms.symtab but mod->core_kallsyms.typetab is not reordered accordingly. > /* Set up to point into init section. */ > rcu_assign_pointer(mod->kallsyms, kallsyms); > mod->core_kallsyms.num_symtab = ndst; > + mod->core_kallsyms.num_func_syms = nfunc; > } > > #if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID) > @@ -241,11 +349,6 @@ void init_build_id(struct module *mod, const struct load_info *info) > } > #endif > > -static const char *kallsyms_symbol_name(struct mod_kallsyms *kallsyms, unsigned int symnum) > -{ > - return kallsyms->strtab + kallsyms->symtab[symnum].st_name; > -} > - > /* > * Given a module and address, find the corresponding symbol and return its name > * while providing its size and offset if needed. > @@ -255,7 +358,10 @@ static const char *find_kallsyms_symbol(struct module *mod, > unsigned long *size, > unsigned long *offset) > { > - unsigned int i, best = 0; > + unsigned int (*search)(struct mod_kallsyms *kallsyms, > + unsigned long addr, unsigned long *bestval, > + unsigned long *nextval); > + unsigned int best; > unsigned long nextval, bestval; > struct mod_kallsyms *kallsyms = rcu_dereference(mod->kallsyms); > struct module_memory *mod_mem = NULL; > @@ -266,6 +372,11 @@ static const char *find_kallsyms_symbol(struct module *mod, > continue; > #endif > if (within_module_mem_type(addr, mod, type)) { > + if (type == MOD_TEXT && kallsyms->num_func_syms > 0) > + search = bsearch_func_symbol; I'm not sure if it is ok to limit the search only to function symbols when the address lies in MOD_TEXT. The text can theoretically contain non-function symbols. Could this optimization be adjusted to sort all MOD_TEXT symbols (excluding anonymous and mapping symbols) and move them to the front of the symbol table? > + else > + search = search_kallsyms_symbol; > + > mod_mem = &mod->mem[type]; > break; > } > @@ -278,33 +389,7 @@ static const char *find_kallsyms_symbol(struct module *mod, > nextval = (unsigned long)mod_mem->base + mod_mem->size; > bestval = (unsigned long)mod_mem->base - 1; > > - /* > - * Scan for closest preceding symbol, and next symbol. (ELF > - * starts real symbols at 1). > - */ > - for (i = 1; i < kallsyms->num_symtab; i++) { > - const Elf_Sym *sym = &kallsyms->symtab[i]; > - unsigned long thisval = kallsyms_symbol_value(sym); > - > - if (sym->st_shndx == SHN_UNDEF) > - continue; > - > - /* > - * We ignore unnamed symbols: they're uninformative > - * and inserted at a whim. > - */ > - if (*kallsyms_symbol_name(kallsyms, i) == '\0' || > - is_mapping_symbol(kallsyms_symbol_name(kallsyms, i))) > - continue; > - > - if (thisval <= addr && thisval > bestval) { > - best = i; > - bestval = thisval; > - } > - if (thisval > addr && thisval < nextval) > - nextval = thisval; > - } > - > + best = search(kallsyms, addr, &bestval, &nextval); > if (!best) > return NULL; > -- Thanks, Petr