[PATCH 0/2] Fix 'already_tokenized()' performance problem

linux-sparse.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/2] Fix 'already_tokenized()' performance problem
@ 2011-04-19 18:12 Linus Torvalds
  2011-04-19 18:17 ` [PATCH 1/2] Add new streams to a hash-list based on their names Linus Torvalds
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Linus Torvalds @ 2011-04-19 18:12 UTC (permalink / raw)
  To: Christopher Li, linux-sparse


This series of two trivial patches makes 'already_tokenized()' go away 
from profiles of a kernel build with C=2, shaving about 3-4% off the time 
for me.

This came up in a discussion on another tech board, and people were 
complaining that I hadn't optimized the pre-processor sufficiently.

(Depending on compiler, usually the profile will show 'strcmp()' taking 
all the time - it depends on whether the strcmp gets inlined or not)

Linus Torvalds (2):
  Add new streams to a hash-list based on their names
  Teach 'already_tokenized()' to use the stream name hash table

 pre-process.c |    8 +++++---
 token.h       |    3 ++-
 tokenize.c    |   25 ++++++++++++++++++++++++-
 3 files changed, 31 insertions(+), 5 deletions(-)

-- 
1.7.5.rc1.1.gc3f61.dirty


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] Add new streams to a hash-list based on their names
  2011-04-19 18:12 [PATCH 0/2] Fix 'already_tokenized()' performance problem Linus Torvalds
@ 2011-04-19 18:17 ` Linus Torvalds
  2011-04-19 22:01   ` Christopher Li
  2011-04-19 18:19 ` PATCH 2/2] Teach 'already_tokenized()' to use the stream name hash table Linus Torvalds
  2011-04-19 21:52 ` [PATCH 0/2] Fix 'already_tokenized()' performance problem Christopher Li
  2 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2011-04-19 18:17 UTC (permalink / raw)
  To: Christopher Li, linux-sparse


This trivially creates a small (currently 64-entry) hash table for
stream numbers, so that you can look up a stream by its name.

Nothing currently uses it, but the next commit will teach
"already_tokenized()" to look up the stream by name, allowing us to
avoid the "walk over each stream we've seen" loop.  The silly string
compare in that loop can take a lot of CPU time.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

The hash itself is soem random thing made up to look something like the 
Linux kernel path component hash. It seems "good enough".

NOTE! The linked list of "struct stream" is based on indexes, since that's 
the only way to look up a stream - they move around when we re-allocate 
them using realloc(), and you can't actually use a regular linked list of 
"struct stream *". 

 token.h    |    3 ++-
 tokenize.c |   25 ++++++++++++++++++++++++-
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/token.h b/token.h
index a7ec77e05b15..cd2923318446 100644
--- a/token.h
+++ b/token.h
@@ -40,7 +40,7 @@ struct stream {
 
 	/* Use these to check for "already parsed" */
 	enum constantfile constant;
-	int dirty;
+	int dirty, next_stream;
 	struct ident *protect;
 	struct token *ifndef;
 	struct token *top_if;
@@ -49,6 +49,7 @@ struct stream {
 extern int input_stream_nr;
 extern struct stream *input_streams;
 extern unsigned int tabstop;
+extern int *hash_stream(const char *name);
 
 struct ident {
 	struct ident *next;	/* Hash chain of identifiers */
diff --git a/tokenize.c b/tokenize.c
index 272974b3b844..d4f05e563770 100644
--- a/tokenize.c
+++ b/tokenize.c
@@ -14,6 +14,7 @@
 #include <string.h>
 #include <ctype.h>
 #include <unistd.h>
+#include <stdint.h>
 
 #include "lib.h"
 #include "allocate.h"
@@ -179,9 +180,28 @@ const char *show_token(const struct token *token)
 	}
 }
 
+#define HASHED_INPUT_BITS (6)
+#define HASHED_INPUT (1 << HASHED_INPUT_BITS)
+#define HASH_PRIME 0x9e370001UL
+
+static int input_stream_hashes[HASHED_INPUT] = { [0 ... HASHED_INPUT-1] = -1 };
+
+int *hash_stream(const char *name)
+{
+	uint32_t hash = 0;
+	unsigned char c;
+
+	while ((c = *name++) != 0)
+		hash = (hash + (c << 4) + (c >> 4)) * 11;
+
+	hash *= HASH_PRIME;
+	hash >>= 32 - HASHED_INPUT_BITS;
+	return input_stream_hashes + hash;
+}
+
 int init_stream(const char *name, int fd, const char **next_path)
 {
-	int stream = input_stream_nr;
+	int stream = input_stream_nr, *hash;
 	struct stream *current;
 
 	if (stream >= input_streams_allocated) {
@@ -199,6 +219,9 @@ int init_stream(const char *name, int fd, const char **next_path)
 	current->path = NULL;
 	current->constant = CONSTANT_FILE_MAYBE;
 	input_stream_nr = stream+1;
+	hash = hash_stream(name);
+	current->next_stream = *hash;
+	*hash = stream;
 	return stream;
 }
 
-- 
1.7.5.rc1.1.gc3f61.dirty


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* PATCH 2/2] Teach 'already_tokenized()' to use the stream name hash table
  2011-04-19 18:12 [PATCH 0/2] Fix 'already_tokenized()' performance problem Linus Torvalds
  2011-04-19 18:17 ` [PATCH 1/2] Add new streams to a hash-list based on their names Linus Torvalds
@ 2011-04-19 18:19 ` Linus Torvalds
  2011-04-19 21:52 ` [PATCH 0/2] Fix 'already_tokenized()' performance problem Christopher Li
  2 siblings, 0 replies; 7+ messages in thread
From: Linus Torvalds @ 2011-04-19 18:19 UTC (permalink / raw)
  To: Christopher Li, linux-sparse


This replaces the "loop over all streams" with a simple hash lookup.  It
makes the cost of checking for already tokenized streams basically go
away (it could be up to 5% of CPU time, almost entirely due to the
"strcmp()" of the name).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

That "up to 5%" is probably debatable, and will depend on just how many 
includes you have etc etc. And on the CPU and library issues. But 3-4% is 
definitely the case for my kernel C=2 build on my current machine.

So it's real, and worth it.

 pre-process.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/pre-process.c b/pre-process.c
index 603cc00c999f..6d12f173146d 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -655,10 +655,12 @@ static const char *token_name_sequence(struct token *token, int endop, struct to
 
 static int already_tokenized(const char *path)
 {
-	int i;
-	struct stream *s = input_streams;
+	int stream, next;
+
+	for (stream = *hash_stream(path); stream >= 0 ; stream = next) {
+		struct stream *s = input_streams + stream;
 
-	for (i = input_stream_nr; --i >= 0; s++) {
+		next = s->next_stream;
 		if (s->constant != CONSTANT_FILE_YES)
 			continue;
 		if (strcmp(path, s->name))
-- 
1.7.5.rc1.1.gc3f61.dirty


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] Fix 'already_tokenized()' performance problem
  2011-04-19 18:12 [PATCH 0/2] Fix 'already_tokenized()' performance problem Linus Torvalds
  2011-04-19 18:17 ` [PATCH 1/2] Add new streams to a hash-list based on their names Linus Torvalds
  2011-04-19 18:19 ` PATCH 2/2] Teach 'already_tokenized()' to use the stream name hash table Linus Torvalds
@ 2011-04-19 21:52 ` Christopher Li
  2011-04-19 22:09   ` Linus Torvalds
  2 siblings, 1 reply; 7+ messages in thread
From: Christopher Li @ 2011-04-19 21:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-sparse

Very nice patch. Simple and clean.

Applied and pushed.


On Tue, Apr 19, 2011 at 11:12 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> This series of two trivial patches makes 'already_tokenized()' go away
> from profiles of a kernel build with C=2, shaving about 3-4% off the time
> for me.

I am suppressed that you can get 3-4% from just comparing the stream names.
Last time I try to optimized that nextchar() and nextchar_slow()
because Al complains
the tab character force the slow path, I haven't seen some thing like
1% difference.
So I leave it alone.

Thanks for the patch.

Chris

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] Add new streams to a hash-list based on their names
  2011-04-19 18:17 ` [PATCH 1/2] Add new streams to a hash-list based on their names Linus Torvalds
@ 2011-04-19 22:01   ` Christopher Li
  2011-04-19 22:13     ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Christopher Li @ 2011-04-19 22:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-sparse

On Tue, Apr 19, 2011 at 11:17 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> +       while ((c = *name++) != 0)
> +               hash = (hash + (c << 4) + (c >> 4)) * 11;
> +
> +       hash *= HASH_PRIME;
> +       hash >>= 32 - HASHED_INPUT_BITS;

Just curious about this hash function, obviously it is not the same as the one
used for hashing idents. Does it have any history or you just code it up?

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] Fix 'already_tokenized()' performance problem
  2011-04-19 21:52 ` [PATCH 0/2] Fix 'already_tokenized()' performance problem Christopher Li
@ 2011-04-19 22:09   ` Linus Torvalds
  0 siblings, 0 replies; 7+ messages in thread
From: Linus Torvalds @ 2011-04-19 22:09 UTC (permalink / raw)
  To: Christopher Li; +Cc: linux-sparse

On Tue, Apr 19, 2011 at 2:52 PM, Christopher Li <sparse@chrisli.org> wrote:
>
> I am suppressed that you can get 3-4% from just comparing the stream names.

I was too. But the kernel sources have gotten more complicated, and
"input_stream_number" can get very high. Having 200 streams is common,
and 400 streams is not unusual for some more complex kernel files that
include a lot of files.

So what happened was that a single '#include" at the end of such a
situation would compare the pathname against hundreds of longish
strings. So it ends up O(n**2) in number of #includes - each strcmp is
cheap, but there's a _lot_ of them.

I don't remember these kinds of numbers from years ago when I did more
profiling, but I suspect the kernel tree has gotten way more
include-happy too.

                     Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] Add new streams to a hash-list based on their names
  2011-04-19 22:01   ` Christopher Li
@ 2011-04-19 22:13     ` Linus Torvalds
  0 siblings, 0 replies; 7+ messages in thread
From: Linus Torvalds @ 2011-04-19 22:13 UTC (permalink / raw)
  To: Christopher Li; +Cc: linux-sparse

On Tue, Apr 19, 2011 at 3:01 PM, Christopher Li <sparse@chrisli.org> wrote:
> On Tue, Apr 19, 2011 at 11:17 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> +       while ((c = *name++) != 0)
>> +               hash = (hash + (c << 4) + (c >> 4)) * 11;
>> +
>> +       hash *= HASH_PRIME;
>> +       hash >>= 32 - HASHED_INPUT_BITS;
>
> Just curious about this hash function, obviously it is not the same as the one
> used for hashing idents. Does it have any history or you just code it up?

It resembles the kernel pathname hash function, but without the fancy
"do it in 64 bits on 64-bit architectures".

And the exact shifting etc at the end is different (the kernel sizes
the hash table based on memory size etc).

But it might be a good idea to share the hash function with the idents
- there is absolutely nothing magical about the hashing I picked. The
kernel path component hash function has been "sufficiently good", but
it's nothing magical or special.

                         Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-04-19 22:14 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-04-19 18:12 [PATCH 0/2] Fix 'already_tokenized()' performance problem Linus Torvalds
2011-04-19 18:17 ` [PATCH 1/2] Add new streams to a hash-list based on their names Linus Torvalds
2011-04-19 22:01   ` Christopher Li
2011-04-19 22:13     ` Linus Torvalds
2011-04-19 18:19 ` PATCH 2/2] Teach 'already_tokenized()' to use the stream name hash table Linus Torvalds
2011-04-19 21:52 ` [PATCH 0/2] Fix 'already_tokenized()' performance problem Christopher Li
2011-04-19 22:09   ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).