[PATCH] kallsyms data size reduction / lookup speedup

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] kallsyms data size reduction / lookup speedup
@ 2004-08-25  4:04 pmarques
  2004-08-25 17:39 ` Matt Mackall
  2004-08-25 20:51 ` Sam Ravnborg
  0 siblings, 2 replies; 18+ messages in thread
From: pmarques @ 2004-08-25  4:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: bcasavan

[-- Attachment #1: Type: text/plain, Size: 2824 bytes --]


This patch is an improvement over my first kallsyms speedup patch posted about 2
weeks ago.

It changes scripts/kallsyms as to produce a different format for kallsyms_names
and extra data to speedup lookups. The compression algorithm is quite simple:
it uses all the char codes not actually used in symbols to build a lookup table
that translates these codes into small strings. For instance, in my test runs
the code 0xFE was being translated into "acpi_" giving a 4 byte save on every
translation.

The advantage of this algorithm is that to translate a symbol we only require
information that is stored on that symbol position, and never need to go back
on the compressed stream to get information from other symbols.

To give an idea about the benefits of this algorithm here are some benchmark
results on a P4 2.8GHz with a symbol table with 10000 entries:

kallsyms_lookup average time:
  vanilla           1346.0 us
  speedup             14.4 us
  with this patch      0.5 us

total data produced by scripts/kallsyms:
  uncompressed         169 Kb
  vanilla              134 Kb
  with this patch       91 Kb

(speedup was my latest patch, that only changed the way kallsyms_lookup worked
and not the data format)

I removed a cond_resched() from the proc/kallsyms handling code path, because
using stem compression, if the current position went backwards, the hole stream
would be uncompressed up to the current position. It seemed that by removing
this loop it would be safe to remove the conditional reschedule altogether.

There is just one catch with this patch: the time it takes to compile the kernel
goes up just a bit (about 0.8s on a P4 2.8GHz with defconfig). If this delay is
not acceptable I can change the compression algorithm so that it can use the
previous table (calculating a new table is what consumes most of the time, and
not doing the actual compression) and check to see if it obtains a similar
compression ratio. If it does, then this is a sign that the symbol patterns
haven't changed that much and this table is still good to use. This would not
only cut the time down to half on any compilation (because of the 2 pass symbol
build method), but in frequent cases where a developer is compiling a single
file and linking everything over and over again, the table optimization process
would never run.

I'm CC'ing Brent Casavant on this email, because last june he sent a patch
trying a different approach that used a 32 entry symbol cache, because there
was a problem with the time "top" took to read "proc/<pid>/wchan". I was
hopping he would be willing to test this patch and comment on the results.

As always, comments, suggestions, flames will be greatly appreciated :)



Signed-off-by: Paulo Marques <pmarques@grupopie.com>


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: kallsyms_patch --]
[-- Type: text/x-diff; name="kallsyms_patch", Size: 17637 bytes --]

diff -uprN -X dontdiff linux-2.6.9-rc1/kernel/kallsyms.c linux-2.6.9-rc1-kall/kernel/kallsyms.c
--- linux-2.6.9-rc1/kernel/kallsyms.c	2004-08-24 23:16:39.000000000 +0100
+++ linux-2.6.9-rc1-kall/kernel/kallsyms.c	2004-08-25 03:59:27.000000000 +0100
@@ -5,6 +5,12 @@
  * module loader:
  *   Copyright 2002 Rusty Russell <rusty@rustcorp.com.au> IBM Corporation
  * Stem compression by Andi Kleen.
+ *
+ * ChangeLog:
+ *
+ * (25/Aug/2004) Paulo Marques
+ *      Changed the compression method from stem compression to "table lookup"
+ *      compression
  */
 #include <linux/kallsyms.h>
 #include <linux/module.h>
@@ -17,7 +23,12 @@
 /* These will be re-linked against their real values during the second link stage */
 extern unsigned long kallsyms_addresses[] __attribute__((weak));
 extern unsigned long kallsyms_num_syms __attribute__((weak));
-extern char kallsyms_names[] __attribute__((weak));
+extern u8 kallsyms_names[] __attribute__((weak));
+
+extern u8 kallsyms_token_table[] __attribute__((weak));
+extern u16 kallsyms_token_index[] __attribute__((weak));
+
+extern unsigned long kallsyms_markers[] __attribute__((weak));
 
 /* Defined by the linker script. */
 extern char _stext[], _etext[], _sinittext[], _einittext[];
@@ -37,21 +48,56 @@ static inline int is_kernel_text(unsigne
 	return 0;
 }
 
+static unsigned int kallsyms_expand_symbol(unsigned int off, char *result)
+{
+	int len, tlen;
+	u8 *tptr, *data;
+
+	data = &kallsyms_names[off];
+	
+	len=*data++;
+	off += len + 1;
+	while(len) {
+		tptr=&kallsyms_token_table[kallsyms_token_index[*data]];
+		data++;
+		len--;
+
+		tlen=*tptr++;
+		while(tlen) {
+			*result++=*tptr++;
+			tlen--;
+		}
+	}
+
+	*result = 0;
+
+	return off;
+}
+
+static unsigned int get_symbol_offset(unsigned long pos)
+{
+	u8 *name;
+	int i;
+
+	name = &kallsyms_names[ kallsyms_markers[pos>>8] ];
+	for(i = 0; i < (pos&0xFF); i++)
+		name = name + (*name) + 1;
+
+	return name - kallsyms_names;
+}
+
 /* Lookup the address for this symbol. Returns 0 if not found. */
 unsigned long kallsyms_lookup_name(const char *name)
 {
 	char namebuf[KSYM_NAME_LEN+1];
 	unsigned long i;
-	char *knames;
+	unsigned int off;
 
-	for (i = 0, knames = kallsyms_names; i < kallsyms_num_syms; i++) {
-		unsigned prefix = *knames++;
+	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
+		off = kallsyms_expand_symbol(off, namebuf);
 
-		strlcpy(namebuf + prefix, knames, KSYM_NAME_LEN - prefix);
 		if (strcmp(namebuf, name) == 0)
 			return kallsyms_addresses[i];
-
-		knames += strlen(knames) + 1;
 	}
 	return module_kallsyms_lookup_name(name);
 }
@@ -62,7 +108,7 @@ const char *kallsyms_lookup(unsigned lon
 			    unsigned long *offset,
 			    char **modname, char *namebuf)
 {
-	unsigned long i, best = 0;
+	unsigned long i, low, high, mid;
 
 	/* This kernel should never had been booted. */
 	BUG_ON(!kallsyms_addresses);
@@ -71,40 +117,41 @@ const char *kallsyms_lookup(unsigned lon
 	namebuf[0] = 0;
 
 	if (is_kernel_text(addr) || is_kernel_inittext(addr)) {
-		unsigned long symbol_end;
-		char *name = kallsyms_names;
+		unsigned long symbol_end=0;
 
-		/* They're sorted, we could be clever here, but who cares? */
-		for (i = 0; i < kallsyms_num_syms; i++) {
-			if (kallsyms_addresses[i] > kallsyms_addresses[best] &&
-			    kallsyms_addresses[i] <= addr)
-				best = i;
+		/* do a binary search on the sorted kallsyms_addresses array */
+		low = 0;
+		high = kallsyms_num_syms;
+		while (high-low > 1) {
+			mid = (low + high) / 2;
+			if (kallsyms_addresses[mid] <= addr) low = mid;
+			else high = mid;
 		}
+		while (low && kallsyms_addresses[low-1] == kallsyms_addresses[low]) 
+			--low;
 
 		/* Grab name */
-		for (i = 0; i <= best; i++) { 
-			unsigned prefix = *name++;
-			strncpy(namebuf + prefix, name, KSYM_NAME_LEN - prefix);
-			name += strlen(name) + 1;
-		}
-
-		/* At worst, symbol ends at end of section. */
-		if (is_kernel_inittext(addr))
-			symbol_end = (unsigned long)_einittext;
-		else
-			symbol_end = (unsigned long)_etext;
+		kallsyms_expand_symbol(get_symbol_offset(low), namebuf);
 
 		/* Search for next non-aliased symbol */
-		for (i = best+1; i < kallsyms_num_syms; i++) {
-			if (kallsyms_addresses[i] > kallsyms_addresses[best]) {
+		for (i = low + 1; i < kallsyms_num_syms; i++) {
+			if (kallsyms_addresses[i] > kallsyms_addresses[low]) {
 				symbol_end = kallsyms_addresses[i];
 				break;
 			}
 		}
 
-		*symbolsize = symbol_end - kallsyms_addresses[best];
+		if (!symbol_end) {
+			/* At worst, symbol ends at end of section. */
+			if (is_kernel_inittext(addr))
+				symbol_end = (unsigned long)_einittext;
+			else
+				symbol_end = (unsigned long)_etext;
+		}
+
+		*symbolsize = symbol_end - kallsyms_addresses[low];
 		*modname = NULL;
-		*offset = addr - kallsyms_addresses[best];
+		*offset = addr - kallsyms_addresses[low];
 		return namebuf;
 	}
 
@@ -168,15 +215,10 @@ static int get_ksymbol_mod(struct kallsy
 /* Returns space to next name. */
 static unsigned long get_ksymbol_core(struct kallsym_iter *iter)
 {
-	unsigned stemlen, off = iter->nameoff;
+	unsigned off = iter->nameoff;
+
+	off = kallsyms_expand_symbol(off, iter->name);
 
-	/* First char of each symbol name indicates prefix length
-	   shared with previous name (stem compression). */
-	stemlen = kallsyms_names[off++];
-
-	strlcpy(iter->name+stemlen, kallsyms_names + off,
-		KSYM_NAME_LEN+1-stemlen);
-	off += strlen(kallsyms_names + off) + 1;
 	iter->owner = NULL;
 	iter->value = kallsyms_addresses[iter->pos];
 	if (is_kernel_text(iter->value) || is_kernel_inittext(iter->value))
@@ -188,11 +230,11 @@ static unsigned long get_ksymbol_core(st
 	return off - iter->nameoff;
 }
 
-static void reset_iter(struct kallsym_iter *iter)
+static void reset_iter(struct kallsym_iter *iter, loff_t new_pos)
 {
 	iter->name[0] = '\0';
-	iter->nameoff = 0;
-	iter->pos = 0;
+	iter->nameoff = get_symbol_offset(new_pos);
+	iter->pos = new_pos;
 }
 
 /* Returns false if pos at or past end of file. */
@@ -204,16 +246,13 @@ static int update_iter(struct kallsym_it
 		return get_ksymbol_mod(iter);
 	}
 	
-	/* If we're past the desired position, reset to start. */
-	if (pos < iter->pos)
-		reset_iter(iter);
-
-	/* We need to iterate through the previous symbols: can be slow */
-	for (; iter->pos != pos; iter->pos++) {
-		iter->nameoff += get_ksymbol_core(iter);
-		cond_resched();
-	}
-	get_ksymbol_core(iter);
+	/* If we're not on the desired position, reset to new position. */
+	if (pos != iter->pos)
+		reset_iter(iter, pos);
+
+	iter->nameoff += get_ksymbol_core(iter);
+	iter->pos++;
+
 	return 1;
 }
 
@@ -274,7 +313,7 @@ static int kallsyms_open(struct inode *i
 	iter = kmalloc(sizeof(*iter), GFP_KERNEL);
 	if (!iter)
 		return -ENOMEM;
-	reset_iter(iter);
+	reset_iter(iter, 0);
 
 	ret = seq_open(file, &kallsyms_op);
 	if (ret == 0)
diff -uprN -X dontdiff linux-2.6.9-rc1/scripts/kallsyms.c linux-2.6.9-rc1-kall/scripts/kallsyms.c
--- linux-2.6.9-rc1/scripts/kallsyms.c	2004-08-24 23:16:39.000000000 +0100
+++ linux-2.6.9-rc1-kall/scripts/kallsyms.c	2004-08-25 03:30:30.000000000 +0100
@@ -6,6 +6,13 @@
  * of the GNU General Public License, incorporated herein by reference.
  *
  * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols] > symbols.S
+ *
+ * ChangeLog:
+ *
+ * (25/Aug/2004) Paulo Marques
+ *      Changed the compression method from stem compression to "table lookup"
+ *      compression
+ *
  */
 
 #include <stdio.h>
@@ -13,10 +20,30 @@
 #include <string.h>
 #include <ctype.h>
 
+/* compression tunning settings */
+#define MAX_TOK_SIZE		11
+#define KSYM_NAME_LEN		127
+
+/* we use only a subset of the complete symbol table to gather the token count,
+   to speed up compression, at the expense of a little compression ratio
+*/
+#define WORKING_SET		1024
+#define GOOD_BAD_THRESHOLD	10
+
+#define HASH_BITS		18
+#define HASH_TABLE_SIZE		(1 << HASH_BITS)
+#define HASH_MASK		(HASH_TABLE_SIZE - 1)
+#define HASH_BASE_OFFSET	2166136261U
+#define HASH_FOLD(a)		((a)&(HASH_MASK))
+
+
 struct sym_entry {
 	unsigned long long addr;
 	char type;
-	char *sym;
+	char sample;
+	char valid;
+	unsigned char len;
+	unsigned char *sym;
 };
 
 
@@ -25,6 +52,24 @@ static int size, cnt;
 static unsigned long long _stext, _etext, _sinittext, _einittext;
 static int all_symbols = 0;
 
+
+struct token { 
+	unsigned char data[MAX_TOK_SIZE];
+	unsigned char len;
+	int profit;
+	struct token *next;
+	struct token *right;
+	struct token *left;
+	struct token *smaller;
+	};
+
+struct token bad_head, good_head;
+struct token *hash_table[HASH_TABLE_SIZE];
+
+unsigned char best_table[256][MAX_TOK_SIZE+1];
+unsigned char best_table_len[256];
+
+
 static void
 usage(void)
 {
@@ -60,6 +105,7 @@ read_symbol(FILE *in, struct sym_entry *
 		return -1;
 
 	s->sym = strdup(str);
+	s->len = strlen(str);
 	return 0;
 }
 
@@ -108,11 +154,42 @@ read_map(FILE *in)
 	}
 }
 
+static void output_label(char *label)
+{
+	printf(".globl %s\n",label);
+	printf("\tALGN\n");
+	printf("%s:\n",label);
+}
+
+static int expand_symbol(unsigned char *data, int len, char *result)
+{
+	int c, rlen, total=0;
+
+	while (len) {
+		c = *data;
+		if (best_table[c][0]==c && best_table_len[c]==1) {
+			*result++ = c;
+			total++;
+		} else { 
+			rlen = expand_symbol(best_table[c], best_table_len[c], result);
+			total += rlen;
+			result += rlen;
+		}
+		data++;
+		len--;
+	}
+	*result=0;
+
+	return total;
+}
+
 static void
 write_src(void)
 {
-	int i, valid = 0;
-	char *prev;
+	int i, k, off, valid;
+	unsigned int best_idx[256];
+	unsigned int *markers;
+	char buf[KSYM_NAME_LEN+1];
 
 	printf("#include <asm/types.h>\n");
 	printf("#if BITS_PER_LONG == 64\n");
@@ -125,43 +202,349 @@ write_src(void)
 
 	printf(".data\n");
 
-	printf(".globl kallsyms_addresses\n");
-	printf("\tALGN\n");
-	printf("kallsyms_addresses:\n");
+	output_label("kallsyms_addresses");
+	valid = 0;
 	for (i = 0; i < cnt; i++) {
-		if (!symbol_valid(&table[i]))
-			continue;
-
-		printf("\tPTR\t%#llx\n", table[i].addr);
-		valid++;
+		if (table[i].valid) {
+			printf("\tPTR\t%#llx\n", table[i].addr);
+			valid++;
+		}
 	}
 	printf("\n");
 
-	printf(".globl kallsyms_num_syms\n");
-	printf("\tALGN\n");
-	printf("kallsyms_num_syms:\n");
+	output_label("kallsyms_num_syms");
 	printf("\tPTR\t%d\n", valid);
 	printf("\n");
 
-	printf(".globl kallsyms_names\n");
-	printf("\tALGN\n");
-	printf("kallsyms_names:\n");
-	prev = ""; 
+	markers = (unsigned int *) malloc(sizeof(unsigned int)*((valid + 255) / 256));
+	
+	output_label("kallsyms_names");
+	valid = 0;
+	off = 0;
 	for (i = 0; i < cnt; i++) {
-		int k;
 
-		if (!symbol_valid(&table[i]))
+		if (!table[i].valid)
 			continue;
 
-		for (k = 0; table[i].sym[k] && table[i].sym[k] == prev[k]; ++k)
-			; 
+		if ((valid & 0xFF) == 0)
+			markers[valid >> 8] = off;
+
+		printf("\t.byte 0x%02x", table[i].len);
+		for (k = 0; k < table[i].len; k++)
+			printf(", 0x%02x", table[i].sym[k]);
+		printf("\n");
 
-		printf("\t.byte 0x%02x\n\t.asciz\t\"%s\"\n", k, table[i].sym + k);
-		prev = table[i].sym;
+		off += table[i].len + 1;
+		valid++;
 	}
 	printf("\n");
+
+	output_label("kallsyms_markers");
+	for (i = 0; i < ((valid + 255) >> 8); i++)
+		printf("\tPTR\t%d\n", markers[i]);
+	printf("\n");
+
+	output_label("kallsyms_token_table");
+	off = 0;
+	for (i = 0; i < 256; i++) {
+		best_idx[i] = off;
+		expand_symbol(best_table[i],best_table_len[i],buf);
+		k = strlen(buf);
+		printf("\t.byte 0x%02x\n\t.ascii\t\"%s\"\n", k, buf);
+		off += k + 1;
+	}
+	printf("\n");
+
+	output_label("kallsyms_token_index");
+	for (i = 0; i < 256; i++)
+		printf("\t.word\t%d\n", best_idx[i]);
+	printf("\n");
+}
+
+
+/* table lookup compression functions */
+
+static inline unsigned int rehash_token(unsigned int hash, unsigned char data)
+{
+	return ((hash * 16777619) ^ data);
+}
+
+static unsigned int hash_token(unsigned char *data, int len)
+{
+	unsigned int hash=HASH_BASE_OFFSET;
+	int i;
+
+	for (i = 0; i < len; i++)
+		hash = rehash_token(hash, data[i]);
+
+	return HASH_FOLD(hash);
+}
+
+static struct token *find_token_hash(unsigned char *data, int len, unsigned int hash)
+{
+	struct token *ptr;
+
+	ptr = hash_table[hash];
+
+	while (ptr) {
+		if ((ptr->len == len) && (memcmp(ptr->data, data, len) == 0))
+			return ptr;
+		ptr=ptr->next;
+	}
+
+	return NULL;
 }
 
+static inline void insert_token_in_group(struct token *head, struct token *ptr)
+{
+	ptr->right = head->right;
+	ptr->right->left = ptr;
+	head->right = ptr;
+	ptr->left = head;
+}
+
+static inline void remove_token_from_group(struct token *ptr)
+{
+	ptr->left->right = ptr->right;
+	ptr->right->left = ptr->left;
+}
+
+static void learn_token(unsigned char *data, int len)
+{
+	struct token *ptr,*last_ptr;
+	int i, newprofit;
+	unsigned int hash = HASH_BASE_OFFSET;
+	unsigned int hashes[MAX_TOK_SIZE + 1];
+
+	if (len > MAX_TOK_SIZE) 
+		len = MAX_TOK_SIZE;
+
+	hash = rehash_token(hash, data[0]);
+	for (i = 2; i <= len; i++) {
+		hash = rehash_token(hash, data[i-1]);
+		hashes[i] = HASH_FOLD(hash);
+	}
+
+	last_ptr = NULL;
+	ptr = NULL;
+
+	for (i = len; i >= 2; i--) {
+		hash = hashes[i];
+
+		if (!ptr) ptr = find_token_hash(data, i, hash);
+
+		if (!ptr) {
+			ptr = (struct token *) malloc(sizeof(*ptr));
+			memcpy(ptr->data, data, i);
+			ptr->len = i;
+			ptr->profit = -GOOD_BAD_THRESHOLD;
+			ptr->next = hash_table[hash];
+			hash_table[hash] = ptr;
+
+			insert_token_in_group(&bad_head, ptr);
+
+			ptr->smaller = NULL;
+		} else {
+			newprofit = ptr->profit + (ptr->len - 1);
+			if((ptr->profit < 0) && (newprofit >= 0)) {
+				remove_token_from_group(ptr);
+				insert_token_in_group(&good_head,ptr);
+			}
+		ptr->profit = newprofit;
+		}
+
+		if (last_ptr) last_ptr->smaller = ptr;
+		last_ptr = ptr;
+
+		ptr = ptr->smaller;
+	}
+}
+
+static void forget_token(unsigned char *data, int len)
+{
+	struct token *ptr;
+	int i, newprofit;
+	unsigned int hash=0;
+
+	if (len > MAX_TOK_SIZE) len = MAX_TOK_SIZE;
+
+	hash = hash_token(data, len);
+	ptr = find_token_hash(data, len, hash);
+
+	for (i = len; i >= 2; i--) {
+
+		newprofit = ptr->profit - (ptr->len - 1);
+		if ((ptr->profit >= 0) && (newprofit < 0)) {
+			remove_token_from_group(ptr);
+			insert_token_in_group(&bad_head, ptr);
+		}
+		ptr->profit=newprofit;
+
+		ptr=ptr->smaller;
+	}
+}
+
+static void learn_symbol(unsigned char *symbol, int len)
+{
+	int i;
+
+	for (i = 0; i < len - 1; i++)
+		learn_token(symbol + i, len - i);
+}
+
+static void forget_symbol(unsigned char *symbol, int len)
+{
+	int i;
+
+	for (i = 0; i < len - 1; i++)
+		forget_token(symbol + i, len - i);
+}
+
+static void build_initial_tok_table(void)
+{
+	int i, use_it, valid;
+
+	valid = 0;
+	for (i = 0; i < cnt; i++) {
+		table[i].valid = symbol_valid(&table[i]);
+		if (table[i].valid) valid++;
+	}
+
+	use_it = 0;
+	for (i = 0; i < cnt; i++) {
+		table[i].sample = 0;
+		if (table[i].valid) {
+			use_it += WORKING_SET;
+			if (use_it >= valid) {
+				table[i].sample = 1;
+				use_it -= valid;
+			}
+		}
+		if (table[i].sample)
+			learn_symbol(table[i].sym, table[i].len);
+	}
+}
+
+static void compress_symbols(unsigned char *str, int tlen, int idx)
+{
+	int i, len, learn, size;
+	unsigned char *p;
+
+	for (i = 0; i < cnt; i++) {
+		if (!table[i].valid) continue;
+		
+		len = table[i].len;
+		learn = 0;
+		p = table[i].sym;
+
+		do {
+			p = (unsigned char *) strstr((char *) p, (char *) str);
+			if (!p) break;
+
+			if (!learn) {
+				if (table[i].sample) 
+					forget_symbol(table[i].sym, len);
+				learn = 1;
+			}
+
+			*p = idx;
+			size = (len - (p - table[i].sym)) - tlen + 1;
+			memmove(p + 1, p + tlen, size);
+			p++;
+			len -= tlen - 1;
+			
+		} while (size >= tlen);
+
+		if(learn) {
+			table[i].len = len;
+			if(table[i].sample) learn_symbol(table[i].sym, len);
+		}
+	}
+}
+
+static struct token *find_best_token(void)
+{
+	struct token *ptr,*best,*head;
+	int bestprofit;
+
+	bestprofit=-10000;
+ 
+	/* failsafe: if the "good" list is empty search from the "bad" list */
+	if(good_head.right == &good_head) head = &bad_head;
+	else head = &good_head;
+
+	ptr = head->right;
+	best = NULL;
+	while (ptr != head) { 
+		if (ptr->profit > bestprofit) {
+			bestprofit = ptr->profit;
+			best = ptr;
+		}
+		ptr = ptr->right;
+	}
+
+	return best;
+}
+
+static void optimize_result(void)
+{
+	struct token *best;
+	int i;
+
+	/* using the '\0' symbol last allows compress_symbols to use standard
+	   fast string functions
+	*/
+	for (i = 255; i >= 0; i--) {
+		if (!best_table_len[i]) {
+			best = find_best_token();
+
+			best_table_len[i] = best->len;
+			memcpy(best_table[i], best->data, best_table_len[i]);
+			/* zero terminate the token so that we can use strstr
+			   in compress_symbols */
+			best_table[i][best_table_len[i]]='\0';
+
+			compress_symbols(best_table[i], best_table_len[i], i);
+		}
+	}
+}
+
+static void insert_real_symbols_in_table(void)
+{
+	int i, j, c;
+
+	memset(best_table, 0, sizeof(best_table));
+	memset(best_table_len, 0, sizeof(best_table_len));
+
+	for (i = 0; i < cnt; i++) {
+		if (table[i].valid) {
+			for (j = 0; j < table[i].len; j++) {
+				c = table[i].sym[j];
+				best_table[c][0]=c;
+				best_table_len[c]=1;
+			}
+		}
+	}
+}
+
+static void optimize_token_table(void)
+{
+	memset(hash_table, 0, sizeof(hash_table));
+
+	good_head.left = &good_head;
+	good_head.right = &good_head;
+
+	bad_head.left = &bad_head;
+	bad_head.right = &bad_head;
+
+	build_initial_tok_table();
+
+	insert_real_symbols_in_table();
+
+	optimize_result();
+}
+
+
 int
 main(int argc, char **argv)
 {
@@ -171,6 +554,7 @@ main(int argc, char **argv)
 		usage();
 
 	read_map(stdin);
+	optimize_token_table();
 	write_src();
 
 	return 0;

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25  4:04 [PATCH] kallsyms data size reduction / lookup speedup pmarques
@ 2004-08-25 17:39 ` Matt Mackall
  2004-08-25 18:46   ` Paulo Marques
  2004-08-25 20:51 ` Sam Ravnborg
  1 sibling, 1 reply; 18+ messages in thread
From: Matt Mackall @ 2004-08-25 17:39 UTC (permalink / raw)
  To: pmarques; +Cc: linux-kernel, bcasavan

On Wed, Aug 25, 2004 at 05:04:46AM +0100, pmarques@grupopie.com wrote:
> 
> As always, comments, suggestions, flames will be greatly appreciated :)

Please post patches inline so they're easier to comment on.
Attachments are a nuisance.

Am I correct that this is completely replacing stem compression with
your substring dictionary approach?

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25 17:39 ` Matt Mackall
@ 2004-08-25 18:46   ` Paulo Marques
  2004-08-25 18:58     ` Matt Mackall
  0 siblings, 1 reply; 18+ messages in thread
From: Paulo Marques @ 2004-08-25 18:46 UTC (permalink / raw)
  To: Matt Mackall; +Cc: linux-kernel, bcasavan

Matt Mackall wrote:
> On Wed, Aug 25, 2004 at 05:04:46AM +0100, pmarques@grupopie.com wrote:
> 
>>As always, comments, suggestions, flames will be greatly appreciated :)
> 
> 
> Please post patches inline so they're easier to comment on.
> Attachments are a nuisance.

Sorry about that. I've had problems in the past with my email client 
word wrapping patches, so to be sure the patch goes untouched I sent it 
this way.

Since I've changed email client since then, next time I'll try inlining 
again.

> Am I correct that this is completely replacing stem compression with
> your substring dictionary approach?

Yes, you are correct.

Right now I'm working on making the proc interface more eficient by 
removing all the seq_file stuff, that was needed because of the O(n) 
lookup time we had previously.

Not using seq_file with stem decompression would make a simple "cat 
/proc/kallsyms" to be O(n^2).

Bets regards,

-- 
Paulo Marques - www.grupopie.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25 18:46   ` Paulo Marques
@ 2004-08-25 18:58     ` Matt Mackall
  2004-08-25 19:09       ` Paulo Marques
  0 siblings, 1 reply; 18+ messages in thread
From: Matt Mackall @ 2004-08-25 18:58 UTC (permalink / raw)
  To: Paulo Marques; +Cc: linux-kernel, bcasavan

On Wed, Aug 25, 2004 at 07:46:53PM +0100, Paulo Marques wrote:
> Matt Mackall wrote:
> >On Wed, Aug 25, 2004 at 05:04:46AM +0100, pmarques@grupopie.com wrote:
> >
> >>As always, comments, suggestions, flames will be greatly appreciated :)
> >
> >
> >Please post patches inline so they're easier to comment on.
> >Attachments are a nuisance.
> 
> Sorry about that. I've had problems in the past with my email client 
> word wrapping patches, so to be sure the patch goes untouched I sent it 
> this way.
> 
> Since I've changed email client since then, next time I'll try inlining 
> again.
> 
> >Am I correct that this is completely replacing stem compression with
> >your substring dictionary approach?
> 
> Yes, you are correct.
> 
> Right now I'm working on making the proc interface more eficient by 
> removing all the seq_file stuff, that was needed because of the O(n) 
> lookup time we had previously.

FYI, killing the seq_file stuff will likely prove unpopular. So you'll
want to do that in a separate patch. If it doesn't affect the way
you're handling compression, please repost your compression patch. I
have a few comments, but otherwise I think we should move forward with it.
 
--
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25 18:58     ` Matt Mackall
@ 2004-08-25 19:09       ` Paulo Marques
  2004-08-25 19:29         ` viro
  2004-08-25 21:12         ` Matt Mackall
  0 siblings, 2 replies; 18+ messages in thread
From: Paulo Marques @ 2004-08-25 19:09 UTC (permalink / raw)
  To: Matt Mackall; +Cc: linux-kernel, bcasavan

Matt Mackall wrote:
> ...
> 
> FYI, killing the seq_file stuff will likely prove unpopular. So you'll
> want to do that in a separate patch. If it doesn't affect the way
> you're handling compression, please repost your compression patch. I
> have a few comments, but otherwise I think we should move forward with it.

I'm still not sure that the seq_file is the culprit, but doing
a 10000 symbol decompression in a user space application takes
about 340us, whereas doing a "time cat /proc/kallsyms > /dev/null"
gives approx. 0.2s! (this is all on a Pentium4 2.8GHz)

*If* the seq_file is the culprit, then I don't think removing
it (or improving it) will be unpopular.

Anyway, it doesn't affect the compression at all, so here is an
inline version of the patch for your reading/patching pleasure :)

-- 
Paulo Marques - www.grupopie.com


Signed-off-by: Paulo Marques <pmarques@grupopie.com>

  kernel/kallsyms.c  |  143 +++++++++++------
  scripts/kallsyms.c |  432 ++++++++++++++++++++++++++++++++++++++++++++++++++---
  2 files changed, 499 insertions(+), 76 deletions(-)

diff -uprN -X dontdiff linux-2.6.9-rc1/kernel/kallsyms.c linux-2.6.9-rc1-kall/kernel/kallsyms.c
--- linux-2.6.9-rc1/kernel/kallsyms.c	2004-08-24 23:16:39.000000000 +0100
+++ linux-2.6.9-rc1-kall/kernel/kallsyms.c	2004-08-25 03:59:27.000000000 +0100
@@ -5,6 +5,12 @@
   * module loader:
   *   Copyright 2002 Rusty Russell <rusty@rustcorp.com.au> IBM Corporation
   * Stem compression by Andi Kleen.
+ *
+ * ChangeLog:
+ *
+ * (25/Aug/2004) Paulo Marques
+ *      Changed the compression method from stem compression to "table lookup"
+ *      compression
   */
  #include <linux/kallsyms.h>
  #include <linux/module.h>
@@ -17,7 +23,12 @@
  /* These will be re-linked against their real values during the second link stage */
  extern unsigned long kallsyms_addresses[] __attribute__((weak));
  extern unsigned long kallsyms_num_syms __attribute__((weak));
-extern char kallsyms_names[] __attribute__((weak));
+extern u8 kallsyms_names[] __attribute__((weak));
+
+extern u8 kallsyms_token_table[] __attribute__((weak));
+extern u16 kallsyms_token_index[] __attribute__((weak));
+
+extern unsigned long kallsyms_markers[] __attribute__((weak));

  /* Defined by the linker script. */
  extern char _stext[], _etext[], _sinittext[], _einittext[];
@@ -37,21 +48,56 @@ static inline int is_kernel_text(unsigne
  	return 0;
  }

+static unsigned int kallsyms_expand_symbol(unsigned int off, char *result)
+{
+	int len, tlen;
+	u8 *tptr, *data;
+
+	data = &kallsyms_names[off];
+	
+	len=*data++;
+	off += len + 1;
+	while(len) {
+		tptr=&kallsyms_token_table[kallsyms_token_index[*data]];
+		data++;
+		len--;
+
+		tlen=*tptr++;
+		while(tlen) {
+			*result++=*tptr++;
+			tlen--;
+		}
+	}
+
+	*result = 0;
+
+	return off;
+}
+
+static unsigned int get_symbol_offset(unsigned long pos)
+{
+	u8 *name;
+	int i;
+
+	name = &kallsyms_names[ kallsyms_markers[pos>>8] ];
+	for(i = 0; i < (pos&0xFF); i++)
+		name = name + (*name) + 1;
+
+	return name - kallsyms_names;
+}
+
  /* Lookup the address for this symbol. Returns 0 if not found. */
  unsigned long kallsyms_lookup_name(const char *name)
  {
  	char namebuf[KSYM_NAME_LEN+1];
  	unsigned long i;
-	char *knames;
+	unsigned int off;

-	for (i = 0, knames = kallsyms_names; i < kallsyms_num_syms; i++) {
-		unsigned prefix = *knames++;
+	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
+		off = kallsyms_expand_symbol(off, namebuf);

-		strlcpy(namebuf + prefix, knames, KSYM_NAME_LEN - prefix);
  		if (strcmp(namebuf, name) == 0)
  			return kallsyms_addresses[i];
-
-		knames += strlen(knames) + 1;
  	}
  	return module_kallsyms_lookup_name(name);
  }
@@ -62,7 +108,7 @@ const char *kallsyms_lookup(unsigned lon
  			    unsigned long *offset,
  			    char **modname, char *namebuf)
  {
-	unsigned long i, best = 0;
+	unsigned long i, low, high, mid;

  	/* This kernel should never had been booted. */
  	BUG_ON(!kallsyms_addresses);
@@ -71,40 +117,41 @@ const char *kallsyms_lookup(unsigned lon
  	namebuf[0] = 0;

  	if (is_kernel_text(addr) || is_kernel_inittext(addr)) {
-		unsigned long symbol_end;
-		char *name = kallsyms_names;
+		unsigned long symbol_end=0;

-		/* They're sorted, we could be clever here, but who cares? */
-		for (i = 0; i < kallsyms_num_syms; i++) {
-			if (kallsyms_addresses[i] > kallsyms_addresses[best] &&
-			    kallsyms_addresses[i] <= addr)
-				best = i;
+		/* do a binary search on the sorted kallsyms_addresses array */
+		low = 0;
+		high = kallsyms_num_syms;
+		while (high-low > 1) {
+			mid = (low + high) / 2;
+			if (kallsyms_addresses[mid] <= addr) low = mid;
+			else high = mid;
  		}
+		while (low && kallsyms_addresses[low-1] == kallsyms_addresses[low])
+			--low;

  		/* Grab name */
-		for (i = 0; i <= best; i++) {
-			unsigned prefix = *name++;
-			strncpy(namebuf + prefix, name, KSYM_NAME_LEN - prefix);
-			name += strlen(name) + 1;
-		}
-
-		/* At worst, symbol ends at end of section. */
-		if (is_kernel_inittext(addr))
-			symbol_end = (unsigned long)_einittext;
-		else
-			symbol_end = (unsigned long)_etext;
+		kallsyms_expand_symbol(get_symbol_offset(low), namebuf);

  		/* Search for next non-aliased symbol */
-		for (i = best+1; i < kallsyms_num_syms; i++) {
-			if (kallsyms_addresses[i] > kallsyms_addresses[best]) {
+		for (i = low + 1; i < kallsyms_num_syms; i++) {
+			if (kallsyms_addresses[i] > kallsyms_addresses[low]) {
  				symbol_end = kallsyms_addresses[i];
  				break;
  			}
  		}

-		*symbolsize = symbol_end - kallsyms_addresses[best];
+		if (!symbol_end) {
+			/* At worst, symbol ends at end of section. */
+			if (is_kernel_inittext(addr))
+				symbol_end = (unsigned long)_einittext;
+			else
+				symbol_end = (unsigned long)_etext;
+		}
+
+		*symbolsize = symbol_end - kallsyms_addresses[low];
  		*modname = NULL;
-		*offset = addr - kallsyms_addresses[best];
+		*offset = addr - kallsyms_addresses[low];
  		return namebuf;
  	}

@@ -168,15 +215,10 @@ static int get_ksymbol_mod(struct kallsy
  /* Returns space to next name. */
  static unsigned long get_ksymbol_core(struct kallsym_iter *iter)
  {
-	unsigned stemlen, off = iter->nameoff;
+	unsigned off = iter->nameoff;
+
+	off = kallsyms_expand_symbol(off, iter->name);

-	/* First char of each symbol name indicates prefix length
-	   shared with previous name (stem compression). */
-	stemlen = kallsyms_names[off++];
-
-	strlcpy(iter->name+stemlen, kallsyms_names + off,
-		KSYM_NAME_LEN+1-stemlen);
-	off += strlen(kallsyms_names + off) + 1;
  	iter->owner = NULL;
  	iter->value = kallsyms_addresses[iter->pos];
  	if (is_kernel_text(iter->value) || is_kernel_inittext(iter->value))
@@ -188,11 +230,11 @@ static unsigned long get_ksymbol_core(st
  	return off - iter->nameoff;
  }

-static void reset_iter(struct kallsym_iter *iter)
+static void reset_iter(struct kallsym_iter *iter, loff_t new_pos)
  {
  	iter->name[0] = '\0';
-	iter->nameoff = 0;
-	iter->pos = 0;
+	iter->nameoff = get_symbol_offset(new_pos);
+	iter->pos = new_pos;
  }

  /* Returns false if pos at or past end of file. */
@@ -204,16 +246,13 @@ static int update_iter(struct kallsym_it
  		return get_ksymbol_mod(iter);
  	}
  	
-	/* If we're past the desired position, reset to start. */
-	if (pos < iter->pos)
-		reset_iter(iter);
-
-	/* We need to iterate through the previous symbols: can be slow */
-	for (; iter->pos != pos; iter->pos++) {
-		iter->nameoff += get_ksymbol_core(iter);
-		cond_resched();
-	}
-	get_ksymbol_core(iter);
+	/* If we're not on the desired position, reset to new position. */
+	if (pos != iter->pos)
+		reset_iter(iter, pos);
+
+	iter->nameoff += get_ksymbol_core(iter);
+	iter->pos++;
+
  	return 1;
  }

@@ -274,7 +313,7 @@ static int kallsyms_open(struct inode *i
  	iter = kmalloc(sizeof(*iter), GFP_KERNEL);
  	if (!iter)
  		return -ENOMEM;
-	reset_iter(iter);
+	reset_iter(iter, 0);

  	ret = seq_open(file, &kallsyms_op);
  	if (ret == 0)
diff -uprN -X dontdiff linux-2.6.9-rc1/scripts/kallsyms.c linux-2.6.9-rc1-kall/scripts/kallsyms.c
--- linux-2.6.9-rc1/scripts/kallsyms.c	2004-08-24 23:16:39.000000000 +0100
+++ linux-2.6.9-rc1-kall/scripts/kallsyms.c	2004-08-25 03:30:30.000000000 +0100
@@ -6,6 +6,13 @@
   * of the GNU General Public License, incorporated herein by reference.
   *
   * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols] > symbols.S
+ *
+ * ChangeLog:
+ *
+ * (25/Aug/2004) Paulo Marques
+ *      Changed the compression method from stem compression to "table lookup"
+ *      compression
+ *
   */

  #include <stdio.h>
@@ -13,10 +20,30 @@
  #include <string.h>
  #include <ctype.h>

+/* compression tunning settings */
+#define MAX_TOK_SIZE		11
+#define KSYM_NAME_LEN		127
+
+/* we use only a subset of the complete symbol table to gather the token count,
+   to speed up compression, at the expense of a little compression ratio
+*/
+#define WORKING_SET		1024
+#define GOOD_BAD_THRESHOLD	10
+
+#define HASH_BITS		18
+#define HASH_TABLE_SIZE		(1 << HASH_BITS)
+#define HASH_MASK		(HASH_TABLE_SIZE - 1)
+#define HASH_BASE_OFFSET	2166136261U
+#define HASH_FOLD(a)		((a)&(HASH_MASK))
+
+
  struct sym_entry {
  	unsigned long long addr;
  	char type;
-	char *sym;
+	char sample;
+	char valid;
+	unsigned char len;
+	unsigned char *sym;
  };


@@ -25,6 +52,24 @@ static int size, cnt;
  static unsigned long long _stext, _etext, _sinittext, _einittext;
  static int all_symbols = 0;

+
+struct token {
+	unsigned char data[MAX_TOK_SIZE];
+	unsigned char len;
+	int profit;
+	struct token *next;
+	struct token *right;
+	struct token *left;
+	struct token *smaller;
+	};
+
+struct token bad_head, good_head;
+struct token *hash_table[HASH_TABLE_SIZE];
+
+unsigned char best_table[256][MAX_TOK_SIZE+1];
+unsigned char best_table_len[256];
+
+
  static void
  usage(void)
  {
@@ -60,6 +105,7 @@ read_symbol(FILE *in, struct sym_entry *
  		return -1;

  	s->sym = strdup(str);
+	s->len = strlen(str);
  	return 0;
  }

@@ -108,11 +154,42 @@ read_map(FILE *in)
  	}
  }

+static void output_label(char *label)
+{
+	printf(".globl %s\n",label);
+	printf("\tALGN\n");
+	printf("%s:\n",label);
+}
+
+static int expand_symbol(unsigned char *data, int len, char *result)
+{
+	int c, rlen, total=0;
+
+	while (len) {
+		c = *data;
+		if (best_table[c][0]==c && best_table_len[c]==1) {
+			*result++ = c;
+			total++;
+		} else {
+			rlen = expand_symbol(best_table[c], best_table_len[c], result);
+			total += rlen;
+			result += rlen;
+		}
+		data++;
+		len--;
+	}
+	*result=0;
+
+	return total;
+}
+
  static void
  write_src(void)
  {
-	int i, valid = 0;
-	char *prev;
+	int i, k, off, valid;
+	unsigned int best_idx[256];
+	unsigned int *markers;
+	char buf[KSYM_NAME_LEN+1];

  	printf("#include <asm/types.h>\n");
  	printf("#if BITS_PER_LONG == 64\n");
@@ -125,43 +202,349 @@ write_src(void)

  	printf(".data\n");

-	printf(".globl kallsyms_addresses\n");
-	printf("\tALGN\n");
-	printf("kallsyms_addresses:\n");
+	output_label("kallsyms_addresses");
+	valid = 0;
  	for (i = 0; i < cnt; i++) {
-		if (!symbol_valid(&table[i]))
-			continue;
-
-		printf("\tPTR\t%#llx\n", table[i].addr);
-		valid++;
+		if (table[i].valid) {
+			printf("\tPTR\t%#llx\n", table[i].addr);
+			valid++;
+		}
  	}
  	printf("\n");

-	printf(".globl kallsyms_num_syms\n");
-	printf("\tALGN\n");
-	printf("kallsyms_num_syms:\n");
+	output_label("kallsyms_num_syms");
  	printf("\tPTR\t%d\n", valid);
  	printf("\n");

-	printf(".globl kallsyms_names\n");
-	printf("\tALGN\n");
-	printf("kallsyms_names:\n");
-	prev = "";
+	markers = (unsigned int *) malloc(sizeof(unsigned int)*((valid + 255) / 256));
+	
+	output_label("kallsyms_names");
+	valid = 0;
+	off = 0;
  	for (i = 0; i < cnt; i++) {
-		int k;

-		if (!symbol_valid(&table[i]))
+		if (!table[i].valid)
  			continue;

-		for (k = 0; table[i].sym[k] && table[i].sym[k] == prev[k]; ++k)
-			;
+		if ((valid & 0xFF) == 0)
+			markers[valid >> 8] = off;
+
+		printf("\t.byte 0x%02x", table[i].len);
+		for (k = 0; k < table[i].len; k++)
+			printf(", 0x%02x", table[i].sym[k]);
+		printf("\n");

-		printf("\t.byte 0x%02x\n\t.asciz\t\"%s\"\n", k, table[i].sym + k);
-		prev = table[i].sym;
+		off += table[i].len + 1;
+		valid++;
  	}
  	printf("\n");
+
+	output_label("kallsyms_markers");
+	for (i = 0; i < ((valid + 255) >> 8); i++)
+		printf("\tPTR\t%d\n", markers[i]);
+	printf("\n");
+
+	output_label("kallsyms_token_table");
+	off = 0;
+	for (i = 0; i < 256; i++) {
+		best_idx[i] = off;
+		expand_symbol(best_table[i],best_table_len[i],buf);
+		k = strlen(buf);
+		printf("\t.byte 0x%02x\n\t.ascii\t\"%s\"\n", k, buf);
+		off += k + 1;
+	}
+	printf("\n");
+
+	output_label("kallsyms_token_index");
+	for (i = 0; i < 256; i++)
+		printf("\t.word\t%d\n", best_idx[i]);
+	printf("\n");
+}
+
+
+/* table lookup compression functions */
+
+static inline unsigned int rehash_token(unsigned int hash, unsigned char data)
+{
+	return ((hash * 16777619) ^ data);
+}
+
+static unsigned int hash_token(unsigned char *data, int len)
+{
+	unsigned int hash=HASH_BASE_OFFSET;
+	int i;
+
+	for (i = 0; i < len; i++)
+		hash = rehash_token(hash, data[i]);
+
+	return HASH_FOLD(hash);
+}
+
+static struct token *find_token_hash(unsigned char *data, int len, unsigned int hash)
+{
+	struct token *ptr;
+
+	ptr = hash_table[hash];
+
+	while (ptr) {
+		if ((ptr->len == len) && (memcmp(ptr->data, data, len) == 0))
+			return ptr;
+		ptr=ptr->next;
+	}
+
+	return NULL;
  }

+static inline void insert_token_in_group(struct token *head, struct token *ptr)
+{
+	ptr->right = head->right;
+	ptr->right->left = ptr;
+	head->right = ptr;
+	ptr->left = head;
+}
+
+static inline void remove_token_from_group(struct token *ptr)
+{
+	ptr->left->right = ptr->right;
+	ptr->right->left = ptr->left;
+}
+
+static void learn_token(unsigned char *data, int len)
+{
+	struct token *ptr,*last_ptr;
+	int i, newprofit;
+	unsigned int hash = HASH_BASE_OFFSET;
+	unsigned int hashes[MAX_TOK_SIZE + 1];
+
+	if (len > MAX_TOK_SIZE)
+		len = MAX_TOK_SIZE;
+
+	hash = rehash_token(hash, data[0]);
+	for (i = 2; i <= len; i++) {
+		hash = rehash_token(hash, data[i-1]);
+		hashes[i] = HASH_FOLD(hash);
+	}
+
+	last_ptr = NULL;
+	ptr = NULL;
+
+	for (i = len; i >= 2; i--) {
+		hash = hashes[i];
+
+		if (!ptr) ptr = find_token_hash(data, i, hash);
+
+		if (!ptr) {
+			ptr = (struct token *) malloc(sizeof(*ptr));
+			memcpy(ptr->data, data, i);
+			ptr->len = i;
+			ptr->profit = -GOOD_BAD_THRESHOLD;
+			ptr->next = hash_table[hash];
+			hash_table[hash] = ptr;
+
+			insert_token_in_group(&bad_head, ptr);
+
+			ptr->smaller = NULL;
+		} else {
+			newprofit = ptr->profit + (ptr->len - 1);
+			if((ptr->profit < 0) && (newprofit >= 0)) {
+				remove_token_from_group(ptr);
+				insert_token_in_group(&good_head,ptr);
+			}
+		ptr->profit = newprofit;
+		}
+
+		if (last_ptr) last_ptr->smaller = ptr;
+		last_ptr = ptr;
+
+		ptr = ptr->smaller;
+	}
+}
+
+static void forget_token(unsigned char *data, int len)
+{
+	struct token *ptr;
+	int i, newprofit;
+	unsigned int hash=0;
+
+	if (len > MAX_TOK_SIZE) len = MAX_TOK_SIZE;
+
+	hash = hash_token(data, len);
+	ptr = find_token_hash(data, len, hash);
+
+	for (i = len; i >= 2; i--) {
+
+		newprofit = ptr->profit - (ptr->len - 1);
+		if ((ptr->profit >= 0) && (newprofit < 0)) {
+			remove_token_from_group(ptr);
+			insert_token_in_group(&bad_head, ptr);
+		}
+		ptr->profit=newprofit;
+
+		ptr=ptr->smaller;
+	}
+}
+
+static void learn_symbol(unsigned char *symbol, int len)
+{
+	int i;
+
+	for (i = 0; i < len - 1; i++)
+		learn_token(symbol + i, len - i);
+}
+
+static void forget_symbol(unsigned char *symbol, int len)
+{
+	int i;
+
+	for (i = 0; i < len - 1; i++)
+		forget_token(symbol + i, len - i);
+}
+
+static void build_initial_tok_table(void)
+{
+	int i, use_it, valid;
+
+	valid = 0;
+	for (i = 0; i < cnt; i++) {
+		table[i].valid = symbol_valid(&table[i]);
+		if (table[i].valid) valid++;
+	}
+
+	use_it = 0;
+	for (i = 0; i < cnt; i++) {
+		table[i].sample = 0;
+		if (table[i].valid) {
+			use_it += WORKING_SET;
+			if (use_it >= valid) {
+				table[i].sample = 1;
+				use_it -= valid;
+			}
+		}
+		if (table[i].sample)
+			learn_symbol(table[i].sym, table[i].len);
+	}
+}
+
+static void compress_symbols(unsigned char *str, int tlen, int idx)
+{
+	int i, len, learn, size;
+	unsigned char *p;
+
+	for (i = 0; i < cnt; i++) {
+		if (!table[i].valid) continue;
+		
+		len = table[i].len;
+		learn = 0;
+		p = table[i].sym;
+
+		do {
+			p = (unsigned char *) strstr((char *) p, (char *) str);
+			if (!p) break;
+
+			if (!learn) {
+				if (table[i].sample)
+					forget_symbol(table[i].sym, len);
+				learn = 1;
+			}
+
+			*p = idx;
+			size = (len - (p - table[i].sym)) - tlen + 1;
+			memmove(p + 1, p + tlen, size);
+			p++;
+			len -= tlen - 1;
+			
+		} while (size >= tlen);
+
+		if(learn) {
+			table[i].len = len;
+			if(table[i].sample) learn_symbol(table[i].sym, len);
+		}
+	}
+}
+
+static struct token *find_best_token(void)
+{
+	struct token *ptr,*best,*head;
+	int bestprofit;
+
+	bestprofit=-10000;
+
+	/* failsafe: if the "good" list is empty search from the "bad" list */
+	if(good_head.right == &good_head) head = &bad_head;
+	else head = &good_head;
+
+	ptr = head->right;
+	best = NULL;
+	while (ptr != head) {
+		if (ptr->profit > bestprofit) {
+			bestprofit = ptr->profit;
+			best = ptr;
+		}
+		ptr = ptr->right;
+	}
+
+	return best;
+}
+
+static void optimize_result(void)
+{
+	struct token *best;
+	int i;
+
+	/* using the '\0' symbol last allows compress_symbols to use standard
+	   fast string functions
+	*/
+	for (i = 255; i >= 0; i--) {
+		if (!best_table_len[i]) {
+			best = find_best_token();
+
+			best_table_len[i] = best->len;
+			memcpy(best_table[i], best->data, best_table_len[i]);
+			/* zero terminate the token so that we can use strstr
+			   in compress_symbols */
+			best_table[i][best_table_len[i]]='\0';
+
+			compress_symbols(best_table[i], best_table_len[i], i);
+		}
+	}
+}
+
+static void insert_real_symbols_in_table(void)
+{
+	int i, j, c;
+
+	memset(best_table, 0, sizeof(best_table));
+	memset(best_table_len, 0, sizeof(best_table_len));
+
+	for (i = 0; i < cnt; i++) {
+		if (table[i].valid) {
+			for (j = 0; j < table[i].len; j++) {
+				c = table[i].sym[j];
+				best_table[c][0]=c;
+				best_table_len[c]=1;
+			}
+		}
+	}
+}
+
+static void optimize_token_table(void)
+{
+	memset(hash_table, 0, sizeof(hash_table));
+
+	good_head.left = &good_head;
+	good_head.right = &good_head;
+
+	bad_head.left = &bad_head;
+	bad_head.right = &bad_head;
+
+	build_initial_tok_table();
+
+	insert_real_symbols_in_table();
+
+	optimize_result();
+}
+
+
  int
  main(int argc, char **argv)
  {
@@ -171,6 +554,7 @@ main(int argc, char **argv)
  		usage();

  	read_map(stdin);
+	optimize_token_table();
  	write_src();

  	return 0;

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25 19:09       ` Paulo Marques
@ 2004-08-25 19:29         ` viro
  2004-08-25 23:40           ` Paulo Marques
  2004-08-25 21:12         ` Matt Mackall
  1 sibling, 1 reply; 18+ messages in thread
From: viro @ 2004-08-25 19:29 UTC (permalink / raw)
  To: Paulo Marques; +Cc: Matt Mackall, linux-kernel, bcasavan

On Wed, Aug 25, 2004 at 08:09:33PM +0100, Paulo Marques wrote:
> Matt Mackall wrote:
> >...
> >
> >FYI, killing the seq_file stuff will likely prove unpopular. So you'll
> >want to do that in a separate patch. If it doesn't affect the way
> >you're handling compression, please repost your compression patch. I
> >have a few comments, but otherwise I think we should move forward with it.
> 
> I'm still not sure that the seq_file is the culprit, but doing
> a 10000 symbol decompression in a user space application takes
> about 340us, whereas doing a "time cat /proc/kallsyms > /dev/null"
> gives approx. 0.2s! (this is all on a Pentium4 2.8GHz)
> 
> *If* the seq_file is the culprit, then I don't think removing
> it (or improving it) will be unpopular.

If it really spends that much in seq_file, I bet anything that it got
*very* dumb iterator.  Which should be fixable...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25  4:04 [PATCH] kallsyms data size reduction / lookup speedup pmarques
  2004-08-25 17:39 ` Matt Mackall
@ 2004-08-25 20:51 ` Sam Ravnborg
  2004-08-25 21:25   ` Randy.Dunlap
  2004-08-26  0:27   ` Paulo Marques
  1 sibling, 2 replies; 18+ messages in thread
From: Sam Ravnborg @ 2004-08-25 20:51 UTC (permalink / raw)
  To: pmarques; +Cc: linux-kernel, bcasavan

On Wed, Aug 25, 2004 at 05:04:46AM +0100, pmarques@grupopie.com wrote:
> 
> This patch is an improvement over my first kallsyms speedup patch posted about 2
> weeks ago.

My origianl comment still hold.
Decoupling the compression and decompression part is not good.
Better keep them close to each other.

Why not put all symbols in an __init section, compress them during kernel boot
and then the original section get discarded.

After a quick browse of the code.
- Use spaces around '=' etc.

	Sam


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25 19:09       ` Paulo Marques
  2004-08-25 19:29         ` viro
@ 2004-08-25 21:12         ` Matt Mackall
  2004-08-25 23:50           ` Paulo Marques
  1 sibling, 1 reply; 18+ messages in thread
From: Matt Mackall @ 2004-08-25 21:12 UTC (permalink / raw)
  To: Paulo Marques; +Cc: linux-kernel, bcasavan

On Wed, Aug 25, 2004 at 08:09:33PM +0100, Paulo Marques wrote:
> 
> diff -uprN -X dontdiff linux-2.6.9-rc1/kernel/kallsyms.c 
> linux-2.6.9-rc1-kall/kernel/kallsyms.c
> --- linux-2.6.9-rc1/kernel/kallsyms.c	2004-08-24 23:16:39.000000000 +0100
> +++ linux-2.6.9-rc1-kall/kernel/kallsyms.c	2004-08-25 
> 03:59:27.000000000 +0100
> @@ -5,6 +5,12 @@
>   * module loader:
>   *   Copyright 2002 Rusty Russell <rusty@rustcorp.com.au> IBM Corporation
>   * Stem compression by Andi Kleen.

Might as well kill this dangling reference to stem compression.

> 
> +static unsigned int kallsyms_expand_symbol(unsigned int off, char *result)
> +{
> +	int len, tlen;
> +	u8 *tptr, *data;
> +
> +	data = &kallsyms_names[off];
> +	
> +	len=*data++;
> +	off += len + 1;
> +	while(len) {
> +		tptr=&kallsyms_token_table[kallsyms_token_index[*data]];
> +		data++;
> +		len--;
> +
> +		tlen=*tptr++;
> +		while(tlen) {
> +			*result++=*tptr++;

Whitespace, please. Vertical and horizontal. Also, this style with
pointer manipulation tends to be frowned on - it's compact but
needlessly obscure. Using a couple index variables and for loops would
make this quite a bit more readable.

> +static unsigned int get_symbol_offset(unsigned long pos)
> +{
> +	u8 *name;
> +	int i;
> +
> +	name = &kallsyms_names[ kallsyms_markers[pos>>8] ];
> +	for(i = 0; i < (pos&0xFF); i++)
> +		name = name + (*name) + 1;

That's a bit magic looking. A brief description of the data structure in this
vicinity would be appreciated. 

> --- linux-2.6.9-rc1/scripts/kallsyms.c	2004-08-24 23:16:39.000000000 +0100
> +++ linux-2.6.9-rc1-kall/scripts/kallsyms.c	2004-08-25 
> 03:30:30.000000000 +0100
> @@ -6,6 +6,13 @@
>   * of the GNU General Public License, incorporated herein by reference.
>   *
>   * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols] > symbols.S
> + *
> + * ChangeLog:
> + *
> + * (25/Aug/2004) Paulo Marques
> + *      Changed the compression method from stem compression to "table 
> lookup"
> + *      compression
> + *
>   */

Throw an address in there so we know where to direct complaints. Also
a description of "table compression" belongs here.

Generally looks decent, could use a bit more in the way of comments.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25 20:51 ` Sam Ravnborg
@ 2004-08-25 21:25   ` Randy.Dunlap
  2004-08-26  0:27   ` Paulo Marques
  1 sibling, 0 replies; 18+ messages in thread
From: Randy.Dunlap @ 2004-08-25 21:25 UTC (permalink / raw)
  To: Sam Ravnborg; +Cc: pmarques, linux-kernel, bcasavan

On Wed, 25 Aug 2004 22:51:13 +0200 Sam Ravnborg wrote:

| On Wed, Aug 25, 2004 at 05:04:46AM +0100, pmarques@grupopie.com wrote:
| > 
| > This patch is an improvement over my first kallsyms speedup patch posted about 2
| > weeks ago.
| 
| My origianl comment still hold.
| Decoupling the compression and decompression part is not good.
| Better keep them close to each other.
| 
| Why not put all symbols in an __init section, compress them during kernel boot
| and then the original section get discarded.
| 
| After a quick browse of the code.
| - Use spaces around '=' etc.

I just want to emphasize Sam's last comment.  Find and use that
'spacebar' thingy a lot more.  Make the code human-readable, not
just machine-readable.

--
~Randy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25 19:29         ` viro
@ 2004-08-25 23:40           ` Paulo Marques
  2004-08-25 23:43             ` viro
  0 siblings, 1 reply; 18+ messages in thread
From: Paulo Marques @ 2004-08-25 23:40 UTC (permalink / raw)
  To: viro; +Cc: Matt Mackall, linux-kernel, bcasavan

viro@parcelfarce.linux.theplanet.co.uk wrote:
> On Wed, Aug 25, 2004 at 08:09:33PM +0100, Paulo Marques wrote:
> 
>>Matt Mackall wrote:
>>
>>>...
>>>
>>>FYI, killing the seq_file stuff will likely prove unpopular. So you'll
>>>want to do that in a separate patch. If it doesn't affect the way
>>>you're handling compression, please repost your compression patch. I
>>>have a few comments, but otherwise I think we should move forward with it.
>>
>>I'm still not sure that the seq_file is the culprit, but doing
>>a 10000 symbol decompression in a user space application takes
>>about 340us, whereas doing a "time cat /proc/kallsyms > /dev/null"
>>gives approx. 0.2s! (this is all on a Pentium4 2.8GHz)
>>
>>*If* the seq_file is the culprit, then I don't think removing
>>it (or improving it) will be unpopular.
> 
> 
> If it really spends that much in seq_file, I bet anything that it got
> *very* dumb iterator.  Which should be fixable...

That is why I kept a big *If* in that sentence. I'm quite new to all
this, and I'm still reading a lot of source code.

If the culprit is in fact seq_file, and seq_file can be improved in a
way that works for everyone (not only kallsyms), then I also agree
that is is the way to go. But hunting this down might prove that the
problem is somewhere else. It is just too soon to draw conclusions.

-- 
Paulo Marques - www.grupopie.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25 23:40           ` Paulo Marques
@ 2004-08-25 23:43             ` viro
  2004-08-26  9:59               ` Andrew Morton
  0 siblings, 1 reply; 18+ messages in thread
From: viro @ 2004-08-25 23:43 UTC (permalink / raw)
  To: Paulo Marques; +Cc: Matt Mackall, linux-kernel, bcasavan

On Thu, Aug 26, 2004 at 12:40:30AM +0100, Paulo Marques wrote:
> That is why I kept a big *If* in that sentence. I'm quite new to all
> this, and I'm still reading a lot of source code.
> 
> If the culprit is in fact seq_file, and seq_file can be improved in a
> way that works for everyone (not only kallsyms), then I also agree
> that is is the way to go. But hunting this down might prove that the
> problem is somewhere else. It is just too soon to draw conclusions.

readprofile(1) ought to narrow it down with that kind of timing difference...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25 21:12         ` Matt Mackall
@ 2004-08-25 23:50           ` Paulo Marques
  0 siblings, 0 replies; 18+ messages in thread
From: Paulo Marques @ 2004-08-25 23:50 UTC (permalink / raw)
  To: Matt Mackall; +Cc: linux-kernel, bcasavan

Matt Mackall wrote:
> On Wed, Aug 25, 2004 at 08:09:33PM +0100, Paulo Marques wrote:
> 
>>diff -uprN -X dontdiff linux-2.6.9-rc1/kernel/kallsyms.c 
>>linux-2.6.9-rc1-kall/kernel/kallsyms.c
>>--- linux-2.6.9-rc1/kernel/kallsyms.c	2004-08-24 23:16:39.000000000 +0100
>>+++ linux-2.6.9-rc1-kall/kernel/kallsyms.c	2004-08-25 
>>03:59:27.000000000 +0100
>>@@ -5,6 +5,12 @@
>>  * module loader:
>>  *   Copyright 2002 Rusty Russell <rusty@rustcorp.com.au> IBM Corporation
>>  * Stem compression by Andi Kleen.
> 
> 
> Might as well kill this dangling reference to stem compression.

Will do.

>>+static unsigned int kallsyms_expand_symbol(unsigned int off, char *result)
>>+{
>>+	int len, tlen;
>>+	u8 *tptr, *data;
>>+
>>+	data = &kallsyms_names[off];
>>+	
>>+	len=*data++;
>>+	off += len + 1;
>>+	while(len) {
>>+		tptr=&kallsyms_token_table[kallsyms_token_index[*data]];
>>+		data++;
>>+		len--;
>>+
>>+		tlen=*tptr++;
>>+		while(tlen) {
>>+			*result++=*tptr++;
> 
> 
> Whitespace, please. Vertical and horizontal. Also, this style with
> pointer manipulation tends to be frowned on - it's compact but
> needlessly obscure. Using a couple index variables and for loops would
> make this quite a bit more readable.

Yikes! I didn't release this code. It escaped leaving a trail of blood
in its wake :)  (Klingon programmer)

Consider this fixed, and some comments thrown in, while I'm at it.

>>+static unsigned int get_symbol_offset(unsigned long pos)
>>+{
>>+	u8 *name;
>>+	int i;
>>+
>>+	name = &kallsyms_names[ kallsyms_markers[pos>>8] ];
>>+	for(i = 0; i < (pos&0xFF); i++)
>>+		name = name + (*name) + 1;
> 
> 
> That's a bit magic looking. A brief description of the data structure in this
> vicinity would be appreciated. 

Ok.

>>--- linux-2.6.9-rc1/scripts/kallsyms.c	2004-08-24 23:16:39.000000000 +0100
>>+++ linux-2.6.9-rc1-kall/scripts/kallsyms.c	2004-08-25 
>>03:30:30.000000000 +0100
>>@@ -6,6 +6,13 @@
>>  * of the GNU General Public License, incorporated herein by reference.
>>  *
>>  * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols] > symbols.S
>>+ *
>>+ * ChangeLog:
>>+ *
>>+ * (25/Aug/2004) Paulo Marques
>>+ *      Changed the compression method from stem compression to "table 
>>lookup"
>>+ *      compression
>>+ *
>>  */
> 
> 
> Throw an address in there so we know where to direct complaints. Also
> a description of "table compression" belongs here.

Ok.

> Generally looks decent, could use a bit more in the way of comments.

The next patch will come with all this fixed. I was actually afraid of
putting in too many comments, that I ended up putting hardly any
comments.

Thanks for all the sugestions,

-- 
Paulo Marques - www.grupopie.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25 20:51 ` Sam Ravnborg
  2004-08-25 21:25   ` Randy.Dunlap
@ 2004-08-26  0:27   ` Paulo Marques
  2004-08-26 11:01     ` Paulo Marques
  1 sibling, 1 reply; 18+ messages in thread
From: Paulo Marques @ 2004-08-26  0:27 UTC (permalink / raw)
  To: Sam Ravnborg; +Cc: linux-kernel, bcasavan

Sam Ravnborg wrote:
> On Wed, Aug 25, 2004 at 05:04:46AM +0100, pmarques@grupopie.com wrote:
> 
>>This patch is an improvement over my first kallsyms speedup patch posted about 2
>>weeks ago.
> 
> 
> My origianl comment still hold.
> Decoupling the compression and decompression part is not good.
> Better keep them close to each other.

In this case the decompression step is so simple that I don't think this 
is really a problem.
At least the source code is close to each other.

> Why not put all symbols in an __init section, compress them during kernel boot
> and then the original section get discarded.

In that case we should definitely change the algorithm as the 
compression step is quite hard. It was designed to run only at compile 
time, so code size / speed trade-offs would all tend to increase the 
code size, and even as it is it would slow down the boot sequence.

Anyway, even if the code and data is discarded as they would be in an 
__init section, the compression code would still be in the kernel image. 
This can reduce the areas of application of this code (boot from a 
floppy, embedded systems, etc.)

It _seems_ silly to have code in there that works over the same data 
every time it boots and produces the exact same result. I must say 
however that I realize that I'm a newcomer and I might not be seeing the 
hole picture. So, if it is the general consensus that this is the way to 
go, then fine by me :)

> After a quick browse of the code.
> - Use spaces around '=' etc.

I'm really sorry about this. I've tried to not let these escape, but 
some of them (many of them, in fact) did. I promise I'll be more 
thorough next time.

-- 
Paulo Marques - www.grupopie.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-25 23:43             ` viro
@ 2004-08-26  9:59               ` Andrew Morton
  2004-08-26 10:26                 ` Paulo Marques
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2004-08-26  9:59 UTC (permalink / raw)
  To: viro; +Cc: pmarques, mpm, linux-kernel, bcasavan

viro@parcelfarce.linux.theplanet.co.uk wrote:
>
> On Thu, Aug 26, 2004 at 12:40:30AM +0100, Paulo Marques wrote:
>  > That is why I kept a big *If* in that sentence. I'm quite new to all
>  > this, and I'm still reading a lot of source code.
>  > 
>  > If the culprit is in fact seq_file, and seq_file can be improved in a
>  > way that works for everyone (not only kallsyms), then I also agree
>  > that is is the way to go. But hunting this down might prove that the
>  > problem is somewhere else. It is just too soon to draw conclusions.
> 
>  readprofile(1) ought to narrow it down with that kind of timing difference...



c014696c do_anonymous_page                             4   0.0133
c0133550 kallsyms_expand_symbol                       17   0.1545
c026b78f __copy_user_intel                            31   0.2039
c026a5ec vsnprintf                                    39   0.0283
c026a318 number                                       40   0.0552
c011d102 write_profile                                79   0.7182
c0131f39 is_exported                                3805  17.0628
c0104028 default_idle                               4254  86.8163
00000000 total                                      8325   0.0025

It's all in the O(n) is_exported().  Rusty's fault ;)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-26  9:59               ` Andrew Morton
@ 2004-08-26 10:26                 ` Paulo Marques
  2004-08-30 18:21                   ` Sam Ravnborg
  0 siblings, 1 reply; 18+ messages in thread
From: Paulo Marques @ 2004-08-26 10:26 UTC (permalink / raw)
  To: Andrew Morton; +Cc: viro, mpm, linux-kernel, bcasavan

Andrew Morton wrote:
> viro@parcelfarce.linux.theplanet.co.uk wrote:
> 
>>On Thu, Aug 26, 2004 at 12:40:30AM +0100, Paulo Marques wrote:
>> > That is why I kept a big *If* in that sentence. I'm quite new to all
>> > this, and I'm still reading a lot of source code.
>> > 
>> > If the culprit is in fact seq_file, and seq_file can be improved in a
>> > way that works for everyone (not only kallsyms), then I also agree
>> > that is is the way to go. But hunting this down might prove that the
>> > problem is somewhere else. It is just too soon to draw conclusions.
>>
>> readprofile(1) ought to narrow it down with that kind of timing difference...
> 
> 
> 
> 
> c014696c do_anonymous_page                             4   0.0133
> c0133550 kallsyms_expand_symbol                       17   0.1545
> c026b78f __copy_user_intel                            31   0.2039
> c026a5ec vsnprintf                                    39   0.0283
> c026a318 number                                       40   0.0552
> c011d102 write_profile                                79   0.7182
> c0131f39 is_exported                                3805  17.0628
> c0104028 default_idle                               4254  86.8163
> 00000000 total                                      8325   0.0025
> 
> It's all in the O(n) is_exported().  Rusty's fault ;)

The is_exported function is only called for kernel symbols, and not 
module symbols AFAICT.

If there is a way to know at compile time the exported symbols, then 
scripts/kallsyms might generate a bitmap along with all the other data 
it generates so that checking is_exported would become O(1).

For a tipical scenario of less than 16k symbols, the bitmap would take 
2kB. If we don't want to spend this 2kB we can do something a little 
more ugly:

The format of the compressed stream is basicaly:

[1 byte <name length>][N-bytes name]

repeated for every symbol. Since KSYM_NAME_LEN is defined to be 127, we 
still have an extra bit in the length to keep the "is_exported" 
information there. The format would then be:

[1 byte <7:is_exported bit>:<6..0:name length>][N-bytes name]

Of course, this limits future changes to KSYM_NAME_LEN :(

If the information is hard to get at compile time, then we could do a 
single loop the first time the information is needed, and then use that 
for subsequent calls.

But with this solved we could have sub-milisecond "cat /proc/kallsyms" :)

-- 
Paulo Marques - www.grupopie.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-26  0:27   ` Paulo Marques
@ 2004-08-26 11:01     ` Paulo Marques
  0 siblings, 0 replies; 18+ messages in thread
From: Paulo Marques @ 2004-08-26 11:01 UTC (permalink / raw)
  To: Paulo Marques; +Cc: Sam Ravnborg, linux-kernel, bcasavan

Paulo Marques wrote:
> ....
> however that I realize that I'm a newcomer and I might not be seeing the 
> hole picture. So, if it is the general consensus that this is the way to 
   ^^^^
*whole* picture, of course. Not a picture of a hole ;)

-- 
Paulo Marques - www.grupopie.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-26 10:26                 ` Paulo Marques
@ 2004-08-30 18:21                   ` Sam Ravnborg
  2004-08-30 18:38                     ` Paulo Marques
  0 siblings, 1 reply; 18+ messages in thread
From: Sam Ravnborg @ 2004-08-30 18:21 UTC (permalink / raw)
  To: Paulo Marques; +Cc: Andrew Morton, viro, mpm, linux-kernel, bcasavan

On Thu, Aug 26, 2004 at 11:26:33AM +0100, Paulo Marques wrote:
> If there is a way to know at compile time the exported symbols, then 
> scripts/kallsyms might generate a bitmap along with all the other data 
> it generates so that checking is_exported would become O(1).

If there exist a symbol named __ksymtab_{symbol} then you know it is exported.

	Sam

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] kallsyms data size reduction / lookup speedup
  2004-08-30 18:21                   ` Sam Ravnborg
@ 2004-08-30 18:38                     ` Paulo Marques
  0 siblings, 0 replies; 18+ messages in thread
From: Paulo Marques @ 2004-08-30 18:38 UTC (permalink / raw)
  To: Sam Ravnborg; +Cc: Andrew Morton, viro, mpm, linux-kernel, bcasavan

Sam Ravnborg wrote:
> On Thu, Aug 26, 2004 at 11:26:33AM +0100, Paulo Marques wrote:
> 
>>If there is a way to know at compile time the exported symbols, then 
>>scripts/kallsyms might generate a bitmap along with all the other data 
>>it generates so that checking is_exported would become O(1).
> 
> 
> If there exist a symbol named __ksymtab_{symbol} then you know it is exported.

Thanks for the sugestion.

I already looked at EXPORT_SYMBOL in module.h and reached the same 
conclusion, and wrote a patch that did just that.

The problem with that patch was that symbols exported with 
EXPORT_SYMBOL_GPL would show up as exported whereas in the previous 
version they didn't :(

I asked Rusty Russell in private if the old behaviour was the desired 
behaviour or if it would be better to uppercase the symbols that are 
uppercased in System.map (i.e. exported vs public), and he replied that 
the old behaviour was done on purpose and should be kept (at least that 
is what I understood, not being a native speaker and all).

So I'm writting a new version right now that checks that the 
__ksymtab_{symbol} is on the __ksymtab section and not on the 
__ksymtab_gpl section to keep the old behaviour.

Anyway, the patch really speeds up a "time cat /proc/kallsyms" from 
0.25s to 0.00s (I really need to do my own benchmark. "time" doesn't 
have enough resolution :)

I'll probably post it tonight.

-- 
Paulo Marques - www.grupopie.com

To err is human, but to really foul things up requires a computer.
Farmers' Almanac, 1978

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2004-08-30 18:51 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-08-25  4:04 [PATCH] kallsyms data size reduction / lookup speedup pmarques
2004-08-25 17:39 ` Matt Mackall
2004-08-25 18:46   ` Paulo Marques
2004-08-25 18:58     ` Matt Mackall
2004-08-25 19:09       ` Paulo Marques
2004-08-25 19:29         ` viro
2004-08-25 23:40           ` Paulo Marques
2004-08-25 23:43             ` viro
2004-08-26  9:59               ` Andrew Morton
2004-08-26 10:26                 ` Paulo Marques
2004-08-30 18:21                   ` Sam Ravnborg
2004-08-30 18:38                     ` Paulo Marques
2004-08-25 21:12         ` Matt Mackall
2004-08-25 23:50           ` Paulo Marques
2004-08-25 20:51 ` Sam Ravnborg
2004-08-25 21:25   ` Randy.Dunlap
2004-08-26  0:27   ` Paulo Marques
2004-08-26 11:01     ` Paulo Marques

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox