* [RFC] fix kallsyms to allow discrimination of local symbols
@ 2008-07-21 21:43 James Bottomley
From: James Bottomley @ 2008-07-21 21:43 UTC (permalink / raw)
  To: linux-kernel, systemtap

The problem is that local symbols, being hidden from the linker, might
not be unique.  Thus, they don't make good anchors for the symbol
relative addressing used by kprobes (it takes the first occurrence it
finds).  Likewise, when they appear in stack traces, it's sometimes not
obvious which local symbol it is (although context usually allows an
easy guess).

Fix all of this by prefixing each local symbol with the name of the C
file it occurs in, separated by '|'.  (I had to use '|' since ':' is
already in use for module prefixes in kallsyms lookups.)
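
As an illustration of the naming scheme only, here is a minimal C sketch,
not the code from the patch below.  The helper name qualify_symbol() and
the example file and symbol names are hypothetical.  Unlike the patch, the
sketch leaves the name unprefixed when no source file is known, and it
bounds the output buffer; both choices are assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): build the name for one
 * nm entry.  Local symbols (lowercase nm type) with a known source file
 * become "file|symbol"; everything else stays "symbol".
 * Returns 0 on success, -1 if the result does not fit in out[outlen].
 */
static int qualify_symbol(char *out, size_t outlen, char type,
			  const char *file, const char *sym)
{
	int n;

	assert(out && file && sym);

	if (islower((unsigned char)type) && file[0] != '\0')
		n = snprintf(out, outlen, "%s|%s", file, sym);
	else
		n = snprintf(out, outlen, "%s", sym);

	/* snprintf reports the length it wanted; treat truncation as failure */
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With this sketch, a local 't' symbol init_once found in fs/ext4/super.c
would come out as "fs/ext4/super.c|init_once", while a global 'T' symbol
keeps its bare name.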

I also had to rewrite mksysmap in Perl because doing the necessary text
reformatting in shell is painfully slow.

Comments?

James

---

diff --git a/Makefile b/Makefile
index 6192922..a416b35 100644
--- a/Makefile
+++ b/Makefile
@@ -685,7 +685,7 @@ quiet_cmd_vmlinux_version = GEN     .version
 
 # Generate System.map
 quiet_cmd_sysmap = SYSMAP
-      cmd_sysmap = $(CONFIG_SHELL) $(srctree)/scripts/mksysmap
+      cmd_sysmap = $(PERL) $(srctree)/scripts/mksysmap
 
 # Link of vmlinux
 # If CONFIG_KALLSYMS is set .version is already updated
@@ -759,7 +759,7 @@ endef
 
 # Generate .S file with all kernel symbols
 quiet_cmd_kallsyms = KSYM    $@
-      cmd_kallsyms = $(NM) -n $< | $(KALLSYMS) \
+      cmd_kallsyms = $(NM) -n -l $< | sed "s|`pwd`/||" | $(KALLSYMS) \
                      $(if $(CONFIG_KALLSYMS_ALL),--all-symbols) > $@
 
 .tmp_kallsyms1.o .tmp_kallsyms2.o .tmp_kallsyms3.o: %.o: %.S scripts FORCE
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index ad2434b..0badae2 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -63,11 +63,12 @@ static inline int is_arm_mapping_symbol(const char *str)
 
 static int read_symbol(FILE *in, struct sym_entry *s)
 {
-	char str[500];
+	char str[500], file[500];
 	char *sym, stype;
-	int rc;
+	int rc, c, line;
 
-	rc = fscanf(in, "%llx %c %499s\n", &s->addr, &stype, str);
+	file[0] = '\0';
+	rc = fscanf(in, "%llx %c %499s", &s->addr, &stype, str);
 	if (rc != 3) {
 		if (rc != EOF) {
 			/* skip line */
@@ -75,6 +76,12 @@ static int read_symbol(FILE *in, struct sym_entry *s)
 		}
 		return -1;
 	}
+	c = fgetc(in);
+	if (c != '\n') {
+		rc = fscanf(in, "%499[^:]:%d\n", file, &line);
+		if (rc != 2)
+			file[0] = '\0';
+	}
 
 	sym = str;
 	/* skip prefix char */
@@ -115,13 +122,22 @@ static int read_symbol(FILE *in, struct sym_entry *s)
 	/* include the type field in the symbol name, so that it gets
 	 * compressed together */
 	s->len = strlen(str) + 1;
+	if (islower(stype))
+		s->len += strlen(file) + 1;
 	s->sym = malloc(s->len + 1);
 	if (!s->sym) {
 		fprintf(stderr, "kallsyms failure: "
 			"unable to allocate required amount of memory\n");
 		exit(EXIT_FAILURE);
 	}
-	strcpy((char *)s->sym + 1, str);
+	if (islower(stype)) {
+		char *ss = (char *)s->sym + 1;
+
+		strcpy(ss, file);
+		strcat(ss, "|");
+		strcat(ss, str);
+	} else
+		strcpy((char *)s->sym + 1, str);
 	s->sym[0] = stype;
 
 	return 0;
diff --git a/scripts/mksysmap b/scripts/mksysmap
index 6e133a0..496cadd 100644
--- a/scripts/mksysmap
+++ b/scripts/mksysmap
@@ -1,4 +1,4 @@
-#!/bin/sh -x
+#!/usr/bin/perl
 # Based on the vmlinux file create the System.map file
 # System.map is used by module-init tools and some debugging
 # tools to retrieve the actual addresses of symbols in the kernel.
@@ -41,5 +41,21 @@
 # so we just ignore them to let readprofile continue to work.
 # (At least sparc64 has __crc_ in the middle).
 
-$NM -n $1 | grep -v '\( [aNUw] \)\|\(__crc_\)\|\( \$[adt]\)' > $2
+chomp($cwd = `pwd`);
+open(I, "nm -n -l $ARGV[0]|") || die;
+open(O, ">$ARGV[1]") || die;
+foreach(<I>) {
+    chomp;
+    ($addr, $type, $symbol, $file_and_line) = split(/[ 	]/, $_);
+    next if ($type =~ /[aNUw]/ || $symbol =~ /^\$[adt]/);
+    next if ($symbol=~ /__crc_/);
+    if ($type =~ /[a-z]/ && $file_and_line) {
+	($_) = split(/:/, $file_and_line);
+	(undef, $file) = split(/^$cwd\//, $_);
+	$symbol = $file."|".$symbol;
+    }
+    print O "$addr $type $symbol\n";
+}
+
+
 



* Re: [RFC] fix kallsyms to allow discrimination of local symbols
@ 2008-07-24 16:03 Frank Ch. Eigler
From: Frank Ch. Eigler @ 2008-07-24 16:03 UTC (permalink / raw)
  To: Theodore Tso; +Cc: James Bottomley, linux-kernel, systemtap

Hi -


On Wed, Jul 23, 2008 at 12:25:05PM -0400, Theodore Tso wrote:
> > I also proposed a compromise where systemtap would use the
> > symbol+offset interface, but choose a single convenient symbol as base
> > for all probes in a particular elf file (/section).
> 
> I guess I don't see the value of that over just using the address
> directly.  James' point wasn't just to use the symbol+offset feature
> just for the sake of using it, but rather as a better way of
> specifying how to insert a probe into a kernel.

Right, I understand that this is the theory, but I believe the
difference between symbol+offset vs. _stext+offset
vs. absolute-address is almost entirely aesthetic rather than
functional.
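
To make the equivalence concrete, here is a minimal C sketch under stated
assumptions: struct ksym and resolve_sym_offset() are hypothetical names,
and any addresses fed to them are made up.  It resolves "name+offset"
against a small table, using the first matching entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol table entry; not a kernel data structure. */
struct ksym {
	const char *name;
	unsigned long addr;
};

/*
 * Resolve "name+offset" to an absolute address using the first entry
 * whose name matches, or return 0 if the name is unknown.
 */
static unsigned long resolve_sym_offset(const struct ksym *tab, size_t n,
					const char *name, unsigned long offset)
{
	size_t i;

	assert(tab && name);

	for (i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return tab[i].addr + offset;
	return 0;
}
```

Given a table holding _stext and ext4_fill_super, resolving
ext4_fill_super+0x10 and resolving _stext plus the distance to that same
location produce one and the same absolute address; only the spelling of
the request differs.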


> For example, it may be that by allowing the kernel to have a bit
> more semantic knowledge of where a probe is going, it could more
> easily do various safety-related checks that can't be done if all it
> is given is a raw address.

This is unlikely to be the case.  The kernel can map from addresses to
symbols internally on demand, should such extra safety checks come
into existence.  It can already check for __kprobes marked-ness,
regardless of the API.
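
The address-to-symbol direction can be sketched as a binary search over a
table sorted by address.  This is only an illustration, not the kernel's
kallsyms code; struct addr_sym and lookup_addr() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry; the table must be sorted by ascending addr. */
struct addr_sym {
	unsigned long addr;
	const char *name;
};

/*
 * Return the entry with the greatest address <= addr, or NULL if addr
 * lies below the first entry.  The offset into the symbol is then
 * addr - entry->addr.
 */
static const struct addr_sym *lookup_addr(const struct addr_sym *tab,
					  size_t n, unsigned long addr)
{
	size_t lo = 0, hi = n;

	assert(tab || n == 0);

	/* find the first index whose address is strictly above addr */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tab[mid].addr <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo ? &tab[lo - 1] : NULL;
}
```

A caller that starts from a raw address can recover "symbol+offset" this
way, which is why a check keyed on the symbol does not require the symbol
to be passed in through the API.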


> > As a quality-of-implementation matter, systemtap checks at translation
> > time that such probes make sense -- that "ext4_fill_super" even
> > exists.  (That is needed also to expand wildcards.)  The only way it
> > can do that is if it has dwarf or separate textual symbol table data
> > (see above).  Both of those carry addresses as well, so we might as
> > well use them.
> 
> True, though I'll note for modern kernels, with /proc/kallsyms, we
> should hopefully be able to do this (offset=0 probes) without DWARF
> headers. [...]

Yes, that's what I was referring to ("... or separate textual symbol
table").  Note that this table contains addresses too.


> BTW, one of the things which I have wondered is whether DWARF was
> really the right approach after all, given how bloated and
> space-inefficient it seems to be.  [...]

Yeah, it is probably the main source of pain in using systemtap at its
fullest.


> [...]  So there is a big difference between "please do X, Y, and Z
> first and the patch would then be acceptable" versus "for reasons A,
> B and C this patch is totally unacceptable and in fact what you are
> trying to do is pointless".

I am sorry for my part in lowering the tone of the discussion.  We
will work out how to put James' patch in, with backward-compatibility
extensions.


- FChE
