All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eugene Loh <eugene.loh@oracle.com>
Cc: rostedt@goodmis.org, corbet@lwn.net,
	yamada.masahiro@socionext.com, michal.lkml@markovi.net,
	jeyu@kernel.org, linux-kbuild@vger.kernel.org, maz@kernel.org,
	songliubraving@fb.com, tglx@linutronix.de,
	jacob.e.keller@intel.com,
	Kris Van Hees <kris.van.hees@oracle.com>,
	Nick Alcock <nick.alcock@oracle.com>
Subject: Re: [PATCH v4] kallsyms: add names of built-in modules
Date: Wed, 18 Dec 2019 15:55:18 -0800	[thread overview]
Message-ID: <2a535000-e71e-fab9-cf6a-e7e5fb8053d8@oracle.com> (raw)
In-Reply-To: <20191210174826.5433-1-eugene.loh@oracle.com>

Ping.


On 12/10/2019 09:48 AM, eugene.loh@oracle.com wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
>
> /proc/kallsyms is very useful for tracers and other tools that need
> to map kernel symbols to addresses.
>
> It would be useful if there were a mapping between kernel symbol and
> module name that only changed when the kernel source code is changed.
> This mapping should not vanish simply because a module becomes built
> into the kernel.
>
> Therefore:
>
> - Generate a file "modules_thick.builtin" that maps from thin
>    archives that make up built-in modules to their constituent
>    object files.
>
> - Generate a linker map ".tmp_vmlinux.map", converting it into
>    ".tmp_vmlinux.ranges", mapping address ranges to object files.
>
> - Read "modules_thick.builtin" and ".tmp_vmlinux.ranges" to
>    map symbol addresses to built-in-module names.  Write those
>    module names (kallsyms_modules) and that per-symbol module
>    information (kallsyms_symbol_modules) to the *.s output file.
>
> - Use kallsyms_modules and kallsyms_symbol_modules to add
>    built-in-module information to /proc/kallsyms.
>
> Note that kernel symbols for built-in modules appear in ascending
> order by address, as usual, and thus will appear interspersed with
> symbols that are part of other built-in modules or of the kernel.
>
> Also, while it is possible for an object to appear in multiple
> built-in modules, making an unambiguous mapping of symbol to module
> impossible in such cases, this patch addresses the typical cases.
>
> Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
> Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
> ---
>   .gitignore                  |   1 +
>   Documentation/dontdiff      |   1 +
>   Makefile                    |  41 +++--
>   kernel/kallsyms.c           |  12 +-
>   scripts/Makefile            |   5 +
>   scripts/Makefile.modbuiltin |  20 ++-
>   scripts/kallsyms.c          | 298 +++++++++++++++++++++++++++++++++++-
>   scripts/link-vmlinux.sh     |  17 ++
>   scripts/modules_thick.c     | 104 +++++++++++++
>   scripts/modules_thick.h     |  27 ++++
>   scripts/namespace.pl        |   5 +
>   11 files changed, 509 insertions(+), 22 deletions(-)
>   create mode 100644 scripts/modules_thick.c
>   create mode 100644 scripts/modules_thick.h
>
> diff --git a/.gitignore b/.gitignore
> index 72ef86a5570d..0b9c88f1d388 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -46,6 +46,7 @@
>   Module.symvers
>   modules.builtin
>   modules.order
> +modules_thick.builtin
>   
>   #
>   # Top-level generic files
> diff --git a/Documentation/dontdiff b/Documentation/dontdiff
> index 72fc2e9e2b63..9d0db2ef3a51 100644
> --- a/Documentation/dontdiff
> +++ b/Documentation/dontdiff
> @@ -181,6 +181,7 @@ modules.builtin
>   modules.builtin.modinfo
>   modules.nsdeps
>   modules.order
> +modules_thick.builtin
>   modversions.h*
>   nconf
>   nconf-cfg
> diff --git a/Makefile b/Makefile
> index 73e3c2802927..430d49d3a93e 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1073,7 +1073,7 @@ cmd_link-vmlinux =                                                 \
>   	$(CONFIG_SHELL) $< $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_vmlinux) ;    \
>   	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
>   
> -vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) FORCE
> +vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) modules_thick.builtin FORCE
>   	+$(call if_changed,link-vmlinux)
>   
>   targets := vmlinux
> @@ -1284,17 +1284,6 @@ modules: $(if $(KBUILD_BUILTIN),vmlinux) modules.order modules.builtin
>   modules.order: descend
>   	$(Q)$(AWK) '!x[$$0]++' $(addsuffix /$@, $(build-dirs)) > $@
>   
> -modbuiltin-dirs := $(addprefix _modbuiltin_, $(build-dirs))
> -
> -modules.builtin: $(modbuiltin-dirs)
> -	$(Q)$(AWK) '!x[$$0]++' $(addsuffix /$@, $(build-dirs)) > $@
> -
> -PHONY += $(modbuiltin-dirs)
> -# tristate.conf is not included from this Makefile. Add it as a prerequisite
> -# here to make it self-healing in case somebody accidentally removes it.
> -$(modbuiltin-dirs): include/config/tristate.conf
> -	$(Q)$(MAKE) $(modbuiltin)=$(patsubst _modbuiltin_%,%,$@)
> -
>   # Target to prepare building external modules
>   PHONY += modules_prepare
>   modules_prepare: prepare
> @@ -1347,6 +1336,33 @@ modules modules_install:
>   
>   endif # CONFIG_MODULES
>   
> +# modules.builtin has a 'thick' form which maps from kernel modules (or rather
> +# the object file names they would have had had they not been built in) to their
> +# constituent object files: kallsyms uses this to determine which modules any
> +# given object file is part of.  (We cannot eliminate the slight redundancy
> +# here without double-expansion.)
> +
> +modbuiltin-dirs := $(addprefix _modbuiltin_, $(build-dirs))
> +
> +modbuiltin-thick-dirs := $(addprefix _modbuiltin_thick_, $(build-dirs))
> +
> +modules.builtin: $(modbuiltin-dirs)
> +	$(Q)$(AWK) '!x[$$0]++' $(addsuffix /$@, $(build-dirs)) > $@
> +
> +modules_thick.builtin: $(modbuiltin-thick-dirs)
> +	$(Q)$(AWK) '!x[$$0]++' $(addsuffix /$@, $(build-dirs)) > $@
> +
> +PHONY += $(modbuiltin-dirs) $(modbuiltin-thick-dirs)
> +# tristate.conf is not included from this Makefile. Add it as a prerequisite
> +# here to make it self-healing in case somebody accidentally removes it.
> +$(modbuiltin-dirs): include/config/tristate.conf
> +	$(Q)$(MAKE) $(modbuiltin)=$(patsubst _modbuiltin_%,%,$@) \
> +			builtin-file=modules.builtin
> +
> +$(modbuiltin-thick-dirs): include/config/tristate.conf
> +	$(Q)$(MAKE) $(modbuiltin)=$(patsubst _modbuiltin_thick_%,%,$@) \
> +			builtin-file=modules_thick.builtin
> +
>   ###
>   # Cleaning is done on three levels.
>   # make clean     Delete most generated files
> @@ -1712,6 +1728,7 @@ clean: $(clean-dirs)
>   		-o -name '*.asn1.[ch]' \
>   		-o -name '*.symtypes' -o -name 'modules.order' \
>   		-o -name modules.builtin -o -name '.tmp_*.o.*' \
> +		-o -name modules_thick.builtin \
>   		-o -name '*.c.[012]*.*' \
>   		-o -name '*.ll' \
>   		-o -name '*.gcno' \) -type f -print | xargs rm -f
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index 136ce049c4ad..ce8576503e35 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -46,6 +46,8 @@ __attribute__((weak, section(".rodata")));
>   
>   extern const u8 kallsyms_token_table[] __weak;
>   extern const u16 kallsyms_token_index[] __weak;
> +extern const char kallsyms_modules[] __weak;
> +extern const u32 kallsyms_symbol_modules[] __weak;
>   
>   extern const unsigned int kallsyms_markers[] __weak;
>   
> @@ -508,8 +510,16 @@ static int get_ksymbol_bpf(struct kallsym_iter *iter)
>   static unsigned long get_ksymbol_core(struct kallsym_iter *iter)
>   {
>   	unsigned off = iter->nameoff;
> +	u32 mod_index = 0;
>   
> -	iter->module_name[0] = '\0';
> +	if (kallsyms_symbol_modules)
> +		mod_index = kallsyms_symbol_modules[iter->pos];
> +
> +	if (mod_index == 0 || kallsyms_modules == NULL)
> +		iter->module_name[0] = '\0';
> +	else
> +		strcpy(iter->module_name, &kallsyms_modules[mod_index]);
> +	iter->exported = 0;
>   	iter->value = kallsyms_sym_address(iter->pos);
>   
>   	iter->type = kallsyms_get_symbol_type(off);
> diff --git a/scripts/Makefile b/scripts/Makefile
> index 00c47901cb06..44641cabb261 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -26,6 +26,11 @@ HOSTLDLIBS_extract-cert = -lcrypto
>   
>   always		:= $(hostprogs-y) $(hostprogs-m)
>   
> +kallsyms-objs	:= kallsyms.o
> +kallsyms-objs	+= modules_thick.o
> +
> +HOSTCFLAGS_modules_thick.o := -I$(srctree)/scripts
> +HOSTCFLAGS_kallsyms.o := -I$(srctree)/scripts
>   # The following hostprogs-y programs are only build on demand
>   hostprogs-y += unifdef
>   
> diff --git a/scripts/Makefile.modbuiltin b/scripts/Makefile.modbuiltin
> index 7d4711b88656..06f31e58111e 100644
> --- a/scripts/Makefile.modbuiltin
> +++ b/scripts/Makefile.modbuiltin
> @@ -1,6 +1,6 @@
>   # SPDX-License-Identifier: GPL-2.0
>   # ==========================================================================
> -# Generating modules.builtin
> +# Generating modules.builtin and modules_thick.builtin
>   # ==========================================================================
>   
>   src := $(obj)
> @@ -30,19 +30,29 @@ __subdir-Y     := $(patsubst %/,%,$(filter %/, $(obj-Y)))
>   subdir-Y       += $(__subdir-Y)
>   subdir-ym      := $(sort $(subdir-y) $(subdir-Y) $(subdir-m))
>   subdir-ym      := $(addprefix $(obj)/,$(subdir-ym))
> -obj-Y          := $(addprefix $(obj)/,$(obj-Y))
> +pathobj-Y      := $(addprefix $(obj)/,$(obj-Y))
>   
>   modbuiltin-subdirs := $(patsubst %,%/modules.builtin, $(subdir-ym))
> -modbuiltin-mods    := $(filter %.ko, $(obj-Y:.o=.ko))
> +modbuiltin-mods    := $(filter %.ko, $(pathobj-Y:.o=.ko))
>   modbuiltin-target  := $(obj)/modules.builtin
> +modthickbuiltin-subdirs := $(patsubst %,%/modules_thick.builtin, $(subdir-ym))
> +modthickbuiltin-target  := $(obj)/modules_thick.builtin
>   
> -__modbuiltin: $(modbuiltin-target) $(subdir-ym)
> +__modbuiltin: $(obj)/$(builtin-file) $(subdir-ym)
>   	@:
>   
>   $(modbuiltin-target): $(subdir-ym) FORCE
>   	$(Q)(for m in $(modbuiltin-mods); do echo $$m; done;	\
>   	cat /dev/null $(modbuiltin-subdirs)) > $@
>   
> +$(modthickbuiltin-target): $(subdir-ym) FORCE
> +	$(Q) $(foreach mod-o, $(filter %.o,$(obj-Y)),\
> +		printf "%s:" $(addprefix $(obj)/,$(mod-o)) >> $@; \
> +		printf " %s" $(sort $(strip $(addprefix $(obj)/,$($(mod-o:.o=-objs)) \
> +			$($(mod-o:.o=-y)) $($(mod-o:.o=-Y))))) >> $@; \
> +		printf "\n" >> $@; ) \
> +	cat /dev/null $(modthickbuiltin-subdirs) >> $@;
> +
>   PHONY += FORCE
>   
>   FORCE:
> @@ -52,6 +62,6 @@ FORCE:
>   
>   PHONY += $(subdir-ym)
>   $(subdir-ym):
> -	$(Q)$(MAKE) $(modbuiltin)=$@
> +	$(Q)$(MAKE) $(modbuiltin)=$@ builtin-file=$(builtin-file)
>   
>   .PHONY: $(PHONY)
> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
> index fb55f262f42d..7368996e5d7b 100644
> --- a/scripts/kallsyms.c
> +++ b/scripts/kallsyms.c
> @@ -5,7 +5,8 @@
>    * This software may be used and distributed according to the terms
>    * of the GNU General Public License, incorporated herein by reference.
>    *
> - * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols] > symbols.S
> + * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols]
> + *                   [--absolute-percpu] [--base-relative] > symbols.S
>    *
>    *      Table compression uses all the unused char codes on the symbols and
>    *  maps these to the most used substrings (tokens). For instance, it might
> @@ -18,12 +19,15 @@
>    *
>    */
>   
> +#define _GNU_SOURCE 1
>   #include <stdbool.h>
>   #include <stdio.h>
>   #include <stdlib.h>
>   #include <string.h>
>   #include <ctype.h>
>   #include <limits.h>
> +#include <errno.h>
> +#include "modules_thick.h"
>   
>   #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(arr[0]))
>   
> @@ -35,6 +39,7 @@ struct sym_entry {
>   	unsigned int start_pos;
>   	unsigned char *sym;
>   	unsigned int percpu_absolute;
> +	unsigned int module;
>   };
>   
>   struct addr_range {
> @@ -68,10 +73,101 @@ static unsigned char best_table[256][2];
>   static unsigned char best_table_len[256];
>   
>   
> +static unsigned int strhash(const char *s)
> +{
> +	/* fnv32 hash */
> +	unsigned int hash = 2166136261U;
> +
> +	for (; *s; s++)
> +		hash = (hash ^ *s) * 0x01000193;
> +	return hash;
> +}
> +
> +#define OBJ2MOD_BITS 10
> +#define OBJ2MOD_N (1 << OBJ2MOD_BITS)
> +#define OBJ2MOD_MASK (OBJ2MOD_N - 1)
> +struct obj2mod_elem {
> +	char *obj;
> +	int mod;
> +	struct obj2mod_elem *next;
> +};
> +
> +static struct obj2mod_elem *obj2mod[OBJ2MOD_N];
> +
> +static void obj2mod_put(const char *obj, int mod)
> +{
> +	int i = strhash(obj) & OBJ2MOD_MASK;
> +	struct obj2mod_elem *elem = malloc(sizeof(struct obj2mod_elem));
> +
> +	if (!elem) {
> +		fprintf(stderr, "kallsyms: out of memory\n");
> +		exit(1);
> +	}
> +
> +	elem->obj = strdup(obj);
> +	if (!elem->obj) {
> +		fprintf(stderr, "kallsyms: out of memory\n");
> +		free(elem);
> +		exit(1);
> +	}
> +
> +	elem->mod = mod;
> +	elem->next = obj2mod[i];
> +	obj2mod[i] = elem;
> +}
> +
> +static int obj2mod_get(const char *obj)
> +{
> +	int i = strhash(obj) & OBJ2MOD_MASK;
> +	struct obj2mod_elem *elem;
> +
> +	for (elem = obj2mod[i]; elem; elem = elem->next)
> +		if (strcmp(elem->obj, obj) == 0)
> +			return elem->mod;
> +	return 0;
> +}
> +
> +static void obj2mod_free(void)
> +{
> +	int i;
> +
> +	for (i = 0; i < OBJ2MOD_N; i++) {
> +		struct obj2mod_elem *elem = obj2mod[i];
> +		struct obj2mod_elem *next;
> +
> +		while (elem) {
> +			next = elem->next;
> +			free(elem->obj);
> +			free(elem);
> +			elem = next;
> +		}
> +	}
> +}
> +
> +/*
> + * The builtin module names.  The "offset" points to the name as if
> + * all builtin module names were concatenated to a single string.
> + */
> +static unsigned int builtin_module_size;	/* number allocated */
> +static unsigned int builtin_module_len;		/* number assigned */
> +static char **builtin_modules;			/* array of module names */
> +static unsigned int *builtin_module_offsets;	/* offset */
> +
> +/*
> + * An ordered list of address ranges and how they map to built-in modules.
> + */
> +struct addrmap_entry {
> +	unsigned long long addr;
> +	unsigned long long size;
> +	unsigned int module;
> +};
> +static struct addrmap_entry *addrmap;
> +static int addrmap_num, addrmap_alloced;
> +
>   static void usage(void)
>   {
> -	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
> -			"[--base-relative] < in.map > out.S\n");
> +	fprintf(stderr, "Usage: kallsyms [--all-symbols] [--absolute-percpu] "
> +			"[--base-relative] < nm_vmlinux.out > symbols.S\n");
>   	exit(1);
>   }
>   
> @@ -98,6 +194,8 @@ static bool is_ignored_symbol(const char *name, char type)
>   		"kallsyms_markers",
>   		"kallsyms_token_table",
>   		"kallsyms_token_index",
> +		"kallsyms_symbol_modules",
> +		"kallsyms_modules",
>   		/* Exclude linker generated symbols which vary between passes */
>   		"_SDA_BASE_",		/* ppc */
>   		"_SDA2_BASE_",		/* ppc */
> @@ -174,10 +272,23 @@ static void check_symbol_range(const char *sym, unsigned long long addr,
>   	}
>   }
>   
> +static int addrmap_compare(const void *keyp, const void *rangep)
> +{
> +	unsigned long long addr = *((const unsigned long long *)keyp);
> +	const struct addrmap_entry *range = rangep;
> +
> +	if (addr < range->addr)
> +		return -1;
> +	if (addr < range->addr + range->size)
> +		return 0;
> +	return 1;
> +}
> +
>   static int read_symbol(FILE *in, struct sym_entry *s)
>   {
>   	char sym[500], stype;
>   	int rc;
> +	struct addrmap_entry *range;
>   
>   	rc = fscanf(in, "%llx %c %499s\n", &s->addr, &stype, sym);
>   	if (rc != 3) {
> @@ -202,6 +313,14 @@ static int read_symbol(FILE *in, struct sym_entry *s)
>   	check_symbol_range(sym, s->addr, text_ranges, ARRAY_SIZE(text_ranges));
>   	check_symbol_range(sym, s->addr, &percpu_range, 1);
>   
> +	/* try to find a module that this address belongs to */
> +	range = bsearch(&s->addr,
> +	    addrmap, addrmap_num, sizeof(*addrmap), &addrmap_compare);
> +	if (range)
> +		s->module = builtin_module_offsets[range->module];
> +	else
> +		s->module = 0;
> +
>   	/* include the type field in the symbol name, so that it gets
>   	 * compressed together */
>   	s->len = strlen(sym) + 1;
> @@ -469,6 +588,19 @@ static void write_src(void)
>   	for (i = 0; i < 256; i++)
>   		printf("\t.short\t%d\n", best_idx[i]);
>   	printf("\n");
> +
> +	output_label("kallsyms_modules");
> +	for (i = 0; i < builtin_module_len; i++)
> +		printf("\t.asciz\t\"%s\"\n", builtin_modules[i]);
> +	printf("\n");
> +
> +	for (i = 0; i < builtin_module_len; i++)
> +		free(builtin_modules[i]);
> +
> +	output_label("kallsyms_symbol_modules");
> +	for (i = 0; i < table_cnt; i++)
> +		printf("\t.int\t%d\n", table[i].module);
> +	printf("\n");
>   }
>   
>   
> @@ -734,12 +866,169 @@ static void record_relative_base(void)
>   		}
>   }
>   
> +/*
> + * Reallocate the builtin modules list.
> + */
> +static void realloc_builtin_modules(void)
> +{
> +	builtin_module_size += 50;
> +
> +	builtin_modules = realloc(builtin_modules,
> +				  sizeof(*builtin_modules) *
> +				  builtin_module_size);
> +	builtin_module_offsets = realloc(builtin_module_offsets,
> +					 sizeof(*builtin_module_offsets) *
> +					 builtin_module_size);
> +
> +	if (!builtin_modules || !builtin_module_offsets) {
> +		fprintf(stderr, "kallsyms failure: out of memory.\n");
> +		exit(EXIT_FAILURE);
> +	}
> +}
> +
> +/*
> + * Add a single built-in module (possibly composed of many files) to the
> + * modules list.  Take the offset of the current module and return it
> + * (purely for simplicity's sake in the caller).
> + */
> +static size_t add_builtin_module(const char *module_name, char **module_paths,
> +				 size_t offset)
> +{
> +	/* map the module's object paths to the module offset */
> +	while (*module_paths) {
> +		obj2mod_put(*module_paths, builtin_module_len);
> +		module_paths++;
> +	}
> +
> +	/* add the module name */
> +	if (builtin_module_size <= builtin_module_len)
> +		realloc_builtin_modules();
> +	builtin_modules[builtin_module_len] = strdup(module_name);
> +	builtin_module_offsets[builtin_module_len] = offset;
> +	builtin_module_len++;
> +
> +	return (offset + strlen(module_name) + 1);
> +}
> +
> +/*
> + * Read the linker map.
> + */
> +static void read_linker_map(void)
> +{
> +	unsigned long long addr, size;
> +	char obj[PATH_MAX+1];
> +	FILE *f = fopen(".tmp_vmlinux.ranges", "r");
> +
> +	if (!f) {
> +		fprintf(stderr, "Cannot open '.tmp_vmlinux.ranges'.\n");
> +		exit(1);
> +	}
> +
> +	addrmap_num = 0;
> +	addrmap_alloced = 4096;
> +	addrmap = malloc(sizeof(*addrmap) * addrmap_alloced);
> +	if (!addrmap)
> +		goto oom;
> +
> +	/*
> +	 * For each address range (addr,size) and object, add to addrmap
> +	 * the range and the built-in module to which the object maps.
> +	 */
> +	while (fscanf(f, "%llx %llx %s\n", &addr, &size, obj) == 3) {
> +		int m = obj2mod_get(obj);
> +
> +		if (addr == 0 || size == 0 || m == 0)
> +			continue;
> +
> +		if (addrmap_num >= addrmap_alloced) {
> +			addrmap_alloced *= 2;
> +			addrmap = realloc(addrmap,
> +			    sizeof(*addrmap) * addrmap_alloced);
> +			if (!addrmap)
> +				goto oom;
> +		}
> +
> +		addrmap[addrmap_num].addr = addr;
> +		addrmap[addrmap_num].size = size;
> +		addrmap[addrmap_num].module = m;
> +		addrmap_num++;
> +	}
> +	fclose(f);
> +	return;
> +
> +oom:
> +	fprintf(stderr, "kallsyms: out of memory\n");
> +	exit(1);
> +}
> +
> +/*
> + * Read "modules_thick.builtin" (the list of built-in modules).  Construct:
> + *   - builtin_modules: array of built-in-module names
> + *   - builtin_module_offsets: array of offsets that will later be
> + *       used to access a concatenated list of built-in-module names
> + *   - obj2mod: a temporary, many-to-one, hash mapping
> + *       from object-file paths to built-in-module names
> + * Read ".tmp_vmlinux.ranges" (the linker map).
> + *   - addrmap[] maps address ranges to built-in module names (using obj2mod)
> + */
> +static void read_modules(void)
> +{
> +	FILE *f;
> +	char *line;
> +	size_t line_size;
> +	size_t offset = 0;
> +
> +	realloc_builtin_modules(); /* initial allocation */
> +
> +	builtin_modules[0] = strdup(""); /* a symbol that cannot be modular */
> +	builtin_module_offsets[0] = 0;
> +	builtin_module_len = 1;
> +	offset++;
> +
> +	/*
> +	 * Iterate over all modules in modules_thick.builtin and add each.
> +	 */
> +	f = fopen("modules_thick.builtin", "r");
> +	if (f == NULL) {
> +		fprintf(stderr, "Cannot open modules_thick.builtin: %s\n",
> +		    strerror(errno));
> +		exit(1);
> +	}
> +
> +	while (getline(&line, &line_size, f) > 0) {
> +		char **paths;
> +		char *module_name = NULL;
> +
> +		paths = modules_thick_parse(line, &module_name);
> +		if (paths == NULL)
> +			break;
> +		offset = add_builtin_module(module_name, paths, offset);
> +		free(paths);
> +		free(module_name);
> +	}
> +	if (ferror(f)) {
> +		fprintf(stderr, "Error reading from modules_thick file: %s\n",
> +		    strerror(errno));
> +		exit(1);
> +	}
> +
> +	fclose(f);
> +	free(line);
> +
> +	/*
> +	 * Read linker map.
> +	 */
> +	read_linker_map();
> +
> +	obj2mod_free();
> +}
> +
>   int main(int argc, char **argv)
>   {
>   	if (argc >= 2) {
>   		int i;
>   		for (i = 1; i < argc; i++) {
> -			if(strcmp(argv[i], "--all-symbols") == 0)
> +			if (strcmp(argv[i], "--all-symbols") == 0)
>   				all_symbols = 1;
>   			else if (strcmp(argv[i], "--absolute-percpu") == 0)
>   				absolute_percpu = 1;
> @@ -751,6 +1040,7 @@ int main(int argc, char **argv)
>   	} else if (argc != 1)
>   		usage();
>   
> +	read_modules();
>   	read_map(stdin);
>   	shrink_table();
>   	if (absolute_percpu)
> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> index 436379940356..ac14d292387a 100755
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@ -76,6 +76,7 @@ vmlinux_link()
>   			--start-group				\
>   			${KBUILD_VMLINUX_LIBS}			\
>   			--end-group				\
> +			-Map=.tmp_vmlinux.map			\
>   			${@}"
>   
>   		${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux}	\
> @@ -88,6 +89,7 @@ vmlinux_link()
>   			-Wl,--start-group			\
>   			${KBUILD_VMLINUX_LIBS}			\
>   			-Wl,--end-group				\
> +			-Wl,-Map=.tmp_vmlinux.map		\
>   			${@}"
>   
>   		${CC} ${CFLAGS_vmlinux}				\
> @@ -140,6 +142,19 @@ kallsyms()
>   	info KSYM ${2}
>   	local kallsymopt;
>   
> +	# read the linker map to identify ranges of addresses:
> +	#   - for each *.o file, report address, size, pathname
> +	#       - most such lines will have four fields
> +	#       - but sometimes there is a line break after the first field
> +	#   - start reading at "Linker script and memory map"
> +	#   - stop reading at ".brk"
> +	${AWK} '
> +	    /\.o$/ && start==1 && NF>=3 { print $(NF-2), $(NF-1), $NF }
> +	    /^Linker script and memory map/ { start = 1 }
> +	    /^\.brk/ { exit(0) }
> +	' .tmp_vmlinux.map | sort > .tmp_vmlinux.ranges
> +
> +	# get kallsyms options
>   	if [ -n "${CONFIG_KALLSYMS_ALL}" ]; then
>   		kallsymopt="${kallsymopt} --all-symbols"
>   	fi
> @@ -152,11 +167,13 @@ kallsyms()
>   		kallsymopt="${kallsymopt} --base-relative"
>   	fi
>   
> +	# set up compilation
>   	local aflags="${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL}               \
>   		      ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS}"
>   
>   	local afile="`basename ${2} .o`.S"
>   
> +	# construct file and compile
>   	${NM} -n ${1} | scripts/kallsyms ${kallsymopt} > ${afile}
>   	${CC} ${aflags} -c -o ${2} ${afile}
>   }
> diff --git a/scripts/modules_thick.c b/scripts/modules_thick.c
> new file mode 100644
> index 000000000000..2d33b92060f9
> --- /dev/null
> +++ b/scripts/modules_thick.c
> @@ -0,0 +1,104 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * A simple modules_thick reader.
> + *
> + * (C) 2014, 2019 Oracle, Inc.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <errno.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include "modules_thick.h"
> +
> +/*
> + * Parse a line from "modules_thick.builtin".  Allocate and return a module name
> + * and a null-terminated array of object paths (file names).  The name and array
> + * should be freed by the caller; the strings the array points to are in "line".
> + *
> + * Modules can consist of multiple paths: in this case, the portion before the
> + * colon is the path to the module, while the portion after the colon is a
> + * space-separated list of object paths.  In this case, the portion before the
> + * colon is an "object file" that does not actually exist: it is merged into
> + * built-in.a without ever being written out.
> + */
> +char ** __attribute__((__nonnull__))
> +modules_thick_parse(char *line, char **module_name)
> +{
> +	size_t npaths = 1;
> +	char **paths;
> +	char *tmp;
> +	char *olist;
> +
> +	/* find object-file list after the colon, if any */
> +	olist = strchr(line, ':');
> +	if (olist != NULL) {
> +		*olist = '\0';
> +		olist++;
> +		olist += strspn(olist, " \n");
> +		if (*olist != '\0') {
> +			/* replace any trailing \n with \0 */
> +			tmp = strchr(olist, '\n');
> +			if (tmp != NULL)
> +				*tmp = '\0';
> +		} else
> +			olist = NULL;
> +	}
> +
> +	/* get pathless module_name, starting after the last '/', if any */
> +	tmp = strrchr(line, '/');
> +	*module_name = strdup(tmp ? tmp + 1 : line);
> +
> +	/* replace '-' with '_' as is done to names when built as modules */
> +	for (tmp = *module_name; *tmp != '\0'; tmp++)
> +		if (*tmp == '-')
> +			*tmp = '_';
> +
> +	/* terminate at the last '.' to remove any suffix */
> +	tmp = strrchr(*module_name, '.');
> +	if (tmp != NULL)
> +		*tmp = '\0';
> +
> +	/*
> +	 * Count the number of paths by counting the number of spaces.
> +	 * This could be an overestimate.
> +	 */
> +	if (olist) {
> +		npaths = 0;
> +		for (tmp = olist; tmp != NULL; tmp = strchr(tmp + 1, ' '))
> +			npaths++;
> +	}
> +
> +	paths = malloc((npaths + 1) * sizeof(char *));
> +	if (!paths) {
> +		fprintf(stderr, "%s: out of memory\n", __func__);
> +		exit(1);
> +	}
> +
> +	/* copy the paths in */
> +	if (olist) {
> +		size_t i = 0;
> +
> +		while ((tmp = strsep(&olist, " ")) != NULL) {
> +			if (i >= npaths) {
> +				fprintf(stderr,
> +				    "%s bug: npaths overflow on module %s\n",
> +				    __func__, *module_name);
> +				exit(1);
> +			}
> +			paths[i++] = tmp;
> +		}
> +		npaths = i;
> +	} else
> +		paths[0] = line;	/* untransformed module name */
> +
> +	paths[npaths] = NULL;
> +
> +	return paths;
> +}
> diff --git a/scripts/modules_thick.h b/scripts/modules_thick.h
> new file mode 100644
> index 000000000000..7e2c0309c731
> --- /dev/null
> +++ b/scripts/modules_thick.h
> @@ -0,0 +1,27 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * A simple modules_thick reader.
> + *
> + * (C) 2014, 2019 Oracle, Inc.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#ifndef _LINUX_MODULES_THICK_H
> +#define _LINUX_MODULES_THICK_H
> +
> +#include <stdio.h>
> +#include <stddef.h>
> +
> +/*
> + * Parse a line from "modules_thick.builtin".  Return a module name
> + * and a null-terminated array of object paths (file names).
> + */
> +
> +char ** __attribute__((__nonnull__))
> +modules_thick_parse(char *line, char **module_name);
> +
> +#endif
> diff --git a/scripts/namespace.pl b/scripts/namespace.pl
> index 1da7bca201a4..4c7615e720de 100755
> --- a/scripts/namespace.pl
> +++ b/scripts/namespace.pl
> @@ -120,6 +120,11 @@ my %nameexception = (
>       'kallsyms_addresses'=> 1,
>       'kallsyms_offsets'	=> 1,
>       'kallsyms_relative_base'=> 1,
> +    'kallsyms_token_table'=> 1,
> +    'kallsyms_token_index'=> 1,
> +    'kallsyms_markers'	=> 1,
> +    'kallsyms_modules'	=> 1,
> +    'kallsyms_symbol_modules'=> 1,
>       '__this_module'	=> 1,
>       '_etext'		=> 1,
>       '_edata'		=> 1,

  reply	other threads:[~2019-12-18 23:56 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-14 22:30 [PATCH] kallsyms: new /proc/kallmodsyms with builtin modules and symbol sizes eugene.loh
2019-11-15 16:47 ` Steven Rostedt
2019-11-15 17:26   ` Linus Torvalds
2019-11-16 17:58     ` Eugene Loh
2019-11-17  0:32       ` Linus Torvalds
2019-11-19 22:42         ` [PATCH v2] kallsyms: add names of built-in modules eugene.loh
2019-11-20  4:59           ` [PATCH v3] " eugene.loh
2019-11-22 10:00             ` Masahiro Yamada
2019-11-22 15:23               ` Nick Alcock
2019-11-22 17:04                 ` Eugene Loh
2019-12-10 17:45               ` Eugene Loh
2019-12-10 17:48                 ` [PATCH v4] " eugene.loh
2019-12-18 23:55                   ` Eugene Loh [this message]
2019-12-19  3:29                     ` Steven Rostedt
2019-12-19  4:28                       ` Masahiro Yamada
2019-12-19 10:22                         ` Masahiro Yamada
2020-01-08 18:32                         ` Eugene Loh
2020-01-20  6:37                           ` Masahiro Yamada
2020-01-24 18:08                             ` Eugene Loh
2019-12-19  9:43                       ` Jessica Yu
2019-11-20  0:11         ` [PATCH] kallsyms: new /proc/kallmodsyms with builtin modules and symbol sizes Eugene Loh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2a535000-e71e-fab9-cf6a-e7e5fb8053d8@oracle.com \
    --to=eugene.loh@oracle.com \
    --cc=corbet@lwn.net \
    --cc=jacob.e.keller@intel.com \
    --cc=jeyu@kernel.org \
    --cc=kris.van.hees@oracle.com \
    --cc=linux-kbuild@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=michal.lkml@markovi.net \
    --cc=nick.alcock@oracle.com \
    --cc=rostedt@goodmis.org \
    --cc=songliubraving@fb.com \
    --cc=tglx@linutronix.de \
    --cc=yamada.masahiro@socionext.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.