public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] fix PID hash sizing
@ 2004-08-22  4:44 Nick Piggin
  2004-08-22  4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
  2004-08-22 12:32 ` [PATCH 1/2] fix PID hash sizing William Lee Irwin III
  0 siblings, 2 replies; 10+ messages in thread
From: Nick Piggin @ 2004-08-22  4:44 UTC (permalink / raw)
  To: Andrew Morton, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 89 bytes --]

I see PID hash sizing problems on an Opteron.
I thought this got fixed a while ago? Hm.


[-- Attachment #2: pid-hash-alloc.patch --]
[-- Type: text/x-patch, Size: 2180 bytes --]



Export nr_kernel_pages, nr_all_pages. Use nr_kernel_pages when sizing
the PID hash. This fixes a sizing problem I'm seeing with the x86-64 kernel
on an Opteron.


---

 linux-2.6-npiggin/include/linux/bootmem.h |    3 +++
 linux-2.6-npiggin/kernel/pid.c            |    2 +-
 linux-2.6-npiggin/mm/page_alloc.c         |    4 ++--
 3 files changed, 6 insertions(+), 3 deletions(-)

diff -puN kernel/pid.c~pid-hash-alloc kernel/pid.c
--- linux-2.6/kernel/pid.c~pid-hash-alloc	2004-08-22 14:28:46.000000000 +1000
+++ linux-2.6-npiggin/kernel/pid.c	2004-08-22 14:34:13.000000000 +1000
@@ -276,7 +276,7 @@ int kgdb_pid_init_done; /* so we don't c
 void __init pidhash_init(void)
 {
 	int i, j, pidhash_size;
-	unsigned long megabytes = max_pfn >> (20 - PAGE_SHIFT);
+	unsigned long megabytes = nr_kernel_pages >> (20 - PAGE_SHIFT);
 
 	pidhash_shift = max(4, fls(megabytes * 4));
 	pidhash_shift = min(12, pidhash_shift);
diff -puN include/linux/bootmem.h~pid-hash-alloc include/linux/bootmem.h
--- linux-2.6/include/linux/bootmem.h~pid-hash-alloc	2004-08-22 14:28:46.000000000 +1000
+++ linux-2.6-npiggin/include/linux/bootmem.h	2004-08-22 14:28:46.000000000 +1000
@@ -67,6 +67,9 @@ extern void * __init __alloc_bootmem_nod
 	__alloc_bootmem_node((pgdat), (x), PAGE_SIZE, 0)
 #endif /* !CONFIG_HAVE_ARCH_BOOTMEM_NODE */
 
+extern unsigned long __initdata nr_kernel_pages;
+extern unsigned long __initdata nr_all_pages;
+
 extern void *__init alloc_large_system_hash(const char *tablename,
 					    unsigned long bucketsize,
 					    unsigned long numentries,
diff -puN mm/page_alloc.c~pid-hash-alloc mm/page_alloc.c
--- linux-2.6/mm/page_alloc.c~pid-hash-alloc	2004-08-22 14:29:02.000000000 +1000
+++ linux-2.6-npiggin/mm/page_alloc.c	2004-08-22 13:41:11.000000000 +1000
@@ -55,8 +55,8 @@ EXPORT_SYMBOL(zone_table);
 static char *zone_names[MAX_NR_ZONES] = { "DMA", "Normal", "HighMem" };
 int min_free_kbytes = 1024;
 
-static unsigned long __initdata nr_kernel_pages;
-static unsigned long __initdata nr_all_pages;
+unsigned long __initdata nr_kernel_pages;
+unsigned long __initdata nr_all_pages;
 
 /*
  * Temporary debugging check for pages not lying within a given zone.

_

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/2] use hlist for pid hash
  2004-08-22  4:44 [PATCH 1/2] fix PID hash sizing Nick Piggin
@ 2004-08-22  4:46 ` Nick Piggin
  2004-08-22  5:00   ` David S. Miller
                     ` (2 more replies)
  2004-08-22 12:32 ` [PATCH 1/2] fix PID hash sizing William Lee Irwin III
  1 sibling, 3 replies; 10+ messages in thread
From: Nick Piggin @ 2004-08-22  4:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 96 bytes --]

Any reason why this shouldn't be done? Anyone know of a decent test that
stresses the pid hash?

[-- Attachment #2: pid-use-hlist.patch --]
[-- Type: text/x-patch, Size: 3475 bytes --]



Use hlists for the PID hashes. This halves the memory footprint of these
hashes. No benchmarks, but I think this is a worthy improvement because
the hashes are something that would be likely to have significant portions
loaded into the cache of every CPU on some workloads.

This comes at the "expense" of
	1. reintroducing the memory  prefetch into the hash traversal loop;
	2. adding new pids to the head of the list instead of the tail. I
	   suspect that if this was a big problem then the hash isn't sized
	   well or could benefit from moving hot entries to the head.

Also, account for all the pid hashes when reporting hash memory usage.


---

 linux-2.6-npiggin/include/linux/pid.h |    2 +-
 linux-2.6-npiggin/kernel/pid.c        |   19 ++++++++++---------
 2 files changed, 11 insertions(+), 10 deletions(-)

diff -puN kernel/pid.c~pid-use-hlist kernel/pid.c
--- linux-2.6/kernel/pid.c~pid-use-hlist	2004-08-22 14:42:56.000000000 +1000
+++ linux-2.6-npiggin/kernel/pid.c	2004-08-22 14:42:56.000000000 +1000
@@ -27,7 +27,7 @@
 #include <linux/hash.h>
 
 #define pid_hashfn(nr) hash_long((unsigned long)nr, pidhash_shift)
-static struct list_head *pid_hash[PIDTYPE_MAX];
+static struct hlist_head *pid_hash[PIDTYPE_MAX];
 static int pidhash_shift;
 
 int pid_max = PID_MAX_DEFAULT;
@@ -150,11 +150,11 @@ failure:
 
 fastcall struct pid *find_pid(enum pid_type type, int nr)
 {
-	struct list_head *elem, *bucket = &pid_hash[type][pid_hashfn(nr)];
+	struct hlist_node *elem;
 	struct pid *pid;
 
-	__list_for_each(elem, bucket) {
-		pid = list_entry(elem, struct pid, hash_chain);
+	hlist_for_each_entry(pid, elem,
+			&pid_hash[type][pid_hashfn(nr)], hash_chain) {
 		if (pid->nr == nr)
 			return pid;
 	}
@@ -181,7 +181,8 @@ int fastcall attach_pid(task_t *task, en
 		INIT_LIST_HEAD(&pid->task_list);
 		pid->task = task;
 		get_task_struct(task);
-		list_add(&pid->hash_chain, &pid_hash[type][pid_hashfn(nr)]);
+		hlist_add_head(&pid->hash_chain,
+				&pid_hash[type][pid_hashfn(nr)]);
 	}
 	list_add_tail(&task->pids[type].pid_chain, &pid->task_list);
 	task->pids[type].pidptr = pid;
@@ -200,7 +201,7 @@ static inline int __detach_pid(task_t *t
 		return 0;
 
 	nr = pid->nr;
-	list_del(&pid->hash_chain);
+	hlist_del(&pid->hash_chain);
 	put_task_struct(pid->task);
 
 	return nr;
@@ -282,9 +283,9 @@ void __init pidhash_init(void)
 	pidhash_shift = min(12, pidhash_shift);
 	pidhash_size = 1 << pidhash_shift;
 
-	printk("PID hash table entries: %d (order %d: %Zd bytes)\n",
+	printk("PID hash table entries: %d (order: %d, %Zd bytes)\n",
 		pidhash_size, pidhash_shift,
-		pidhash_size * sizeof(struct list_head));
+		PIDTYPE_MAX * pidhash_size * sizeof(struct hlist_head));
 
 	for (i = 0; i < PIDTYPE_MAX; i++) {
 		pid_hash[i] = alloc_bootmem(pidhash_size *
@@ -292,7 +293,7 @@ void __init pidhash_init(void)
 		if (!pid_hash[i])
 			panic("Could not alloc pidhash!\n");
 		for (j = 0; j < pidhash_size; j++)
-			INIT_LIST_HEAD(&pid_hash[i][j]);
+			INIT_HLIST_HEAD(&pid_hash[i][j]);
 	}
 #ifdef CONFIG_KGDB
 	kgdb_pid_init_done++;
diff -puN include/linux/pid.h~pid-use-hlist include/linux/pid.h
--- linux-2.6/include/linux/pid.h~pid-use-hlist	2004-08-22 14:42:56.000000000 +1000
+++ linux-2.6-npiggin/include/linux/pid.h	2004-08-22 14:42:56.000000000 +1000
@@ -16,7 +16,7 @@ struct pid
 	atomic_t count;
 	struct task_struct *task;
 	struct list_head task_list;
-	struct list_head hash_chain;
+	struct hlist_node hash_chain;
 };
 
 struct pid_link

_

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] use hlist for pid hash
  2004-08-22  4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
@ 2004-08-22  5:00   ` David S. Miller
  2004-08-22  5:31     ` Nick Piggin
  2004-08-22  5:25   ` Ryan Cumming
  2004-09-06 11:35   ` [PATCH] Allocate correct amount of memory " Anton Blanchard
  2 siblings, 1 reply; 10+ messages in thread
From: David S. Miller @ 2004-08-22  5:00 UTC (permalink / raw)
  To: Nick Piggin; +Cc: akpm, linux-kernel

On Sun, 22 Aug 2004 14:46:38 +1000
Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> Any reason why this shouldn't be done? Anyone know of a decent test that
> stresses the pid hash?

I can't think of any way in which this could decrease
performance.  I highly recommend this patch :-)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] use hlist for pid hash
  2004-08-22  4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
  2004-08-22  5:00   ` David S. Miller
@ 2004-08-22  5:25   ` Ryan Cumming
  2004-08-22  5:26     ` Nick Piggin
  2004-09-06 11:35   ` [PATCH] Allocate correct amount of memory " Anton Blanchard
  2 siblings, 1 reply; 10+ messages in thread
From: Ryan Cumming @ 2004-08-22  5:25 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux kernel mailing list

[-- Attachment #1: Type: text/plain, Size: 513 bytes --]

On Saturday 21 August 2004 21:46, you wrote:
> This comes at the "expense" of
> 1. reintroducing the memory  prefetch into the hash traversal loop;
> 2. adding new pids to the head of the list instead of the tail. I
>    suspect that if this was a big problem then the hash isn't sized
>    well or could benefit from moving hot entries to the head.

It looks like the current code is already adding PIDs to the head: list_add() 
adds to the head of a list and list_add_tail() adds to the tail.

-Ryan

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] use hlist for pid hash
  2004-08-22  5:25   ` Ryan Cumming
@ 2004-08-22  5:26     ` Nick Piggin
  0 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2004-08-22  5:26 UTC (permalink / raw)
  To: Ryan Cumming; +Cc: linux kernel mailing list

Ryan Cumming wrote:
> On Saturday 21 August 2004 21:46, you wrote:
> 
>>This comes at the "expense" of
>>1. reintroducing the memory  prefetch into the hash traversal loop;
>>2. adding new pids to the head of the list instead of the tail. I
>>   suspect that if this was a big problem then the hash isn't sized
>>   well or could benefit from moving hot entries to the head.
> 
> 
> It looks like the current code is already adding PIDs to the head: list_add() 
> adds to the head of a list and list_add_tail() adds to the tail.
> 

Hmm, so it is. I must have been looking at the task_list list when I
was thinking that. Well, all the better.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] use hlist for pid hash
  2004-08-22  5:00   ` David S. Miller
@ 2004-08-22  5:31     ` Nick Piggin
  0 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2004-08-22  5:31 UTC (permalink / raw)
  To: David S. Miller; +Cc: akpm, linux-kernel, Ingo Molnar, William Lee Irwin III

[-- Attachment #1: Type: text/plain, Size: 986 bytes --]

David S. Miller wrote:
> On Sun, 22 Aug 2004 14:46:38 +1000
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
> 
>>Any reason why this shouldn't be done? Anyone know of a decent test that
>>stresses the pid hash?
> 
> 
> I can't think of any way in which this could decrease
> performance.  I highly recommend this patch :-)
> 

Oh good :)

The only thing that could hurt is that the hash list traversal in
find_pid now does prefetch.

This would be trivial to remove by introducing another list.h entity
similar to __list_for_each for hlists, however I would have thought
that if anything, the prefetch might help a tiny bit on the odd
workloads where the number of tasks is much greater than the number
of hash entries. Ingo? WLI?

I expect you could get a good bit of overlap on the prefetch if the
next hash pointer isn't in the same cacheline as ->nr... although,
on a 64-bit architecture with a 32byte cacheline size, this is
guaranteed to be the case. Why not put them together?

[-- Attachment #2: pid-search-cacheline.patch --]
[-- Type: text/x-patch, Size: 647 bytes --]




---

 linux-2.6-npiggin/include/linux/pid.h |    3 ++-
 1 files changed, 2 insertions(+), 1 deletion(-)

diff -puN include/linux/pid.h~pid-search-cacheline include/linux/pid.h
--- linux-2.6/include/linux/pid.h~pid-search-cacheline	2004-08-22 15:14:33.000000000 +1000
+++ linux-2.6-npiggin/include/linux/pid.h	2004-08-22 15:16:33.000000000 +1000
@@ -12,11 +12,12 @@ enum pid_type
 
 struct pid
 {
+	/* Try to keep hash_chain in the same cacheline as nr for find_pid */
+	struct hlist_node hash_chain;
 	int nr;
 	atomic_t count;
 	struct task_struct *task;
 	struct list_head task_list;
-	struct hlist_node hash_chain;
 };
 
 struct pid_link

_

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] fix PID hash sizing
  2004-08-22  4:44 [PATCH 1/2] fix PID hash sizing Nick Piggin
  2004-08-22  4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
@ 2004-08-22 12:32 ` William Lee Irwin III
  2004-08-23  0:16   ` Nick Piggin
  1 sibling, 1 reply; 10+ messages in thread
From: William Lee Irwin III @ 2004-08-22 12:32 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, linux-kernel

On Sun, Aug 22, 2004 at 02:44:46PM +1000, Nick Piggin wrote:
> I see PID hash sizing problems on an Opteron.
> I thought this got fixed a while ago? Hm.
> Export nr_kernel_pages, nr_all_pages. Use nr_kernel_pages when sizing
> the PID hash. This fixes a sizing problem I'm seeing with the x86-64 kernel
> on an Opteron.

Please describe the the pid hash sizing problem.


-- wli

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] fix PID hash sizing
  2004-08-22 12:32 ` [PATCH 1/2] fix PID hash sizing William Lee Irwin III
@ 2004-08-23  0:16   ` Nick Piggin
  0 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2004-08-23  0:16 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Andrew Morton, linux-kernel

William Lee Irwin III wrote:

>On Sun, Aug 22, 2004 at 02:44:46PM +1000, Nick Piggin wrote:
>
>>I see PID hash sizing problems on an Opteron.
>>I thought this got fixed a while ago? Hm.
>>Export nr_kernel_pages, nr_all_pages. Use nr_kernel_pages when sizing
>>the PID hash. This fixes a sizing problem I'm seeing with the x86-64 kernel
>>on an Opteron.
>>
>
>Please describe the the pid hash sizing problem.
>
>
>

max_pfn isn't set up at that time yet, I'm guessing. I didn't look
into it further than that.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] Allocate correct amount of memory for pid hash
  2004-08-22  4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
  2004-08-22  5:00   ` David S. Miller
  2004-08-22  5:25   ` Ryan Cumming
@ 2004-09-06 11:35   ` Anton Blanchard
  2004-09-06 11:41     ` Nick Piggin
  2 siblings, 1 reply; 10+ messages in thread
From: Anton Blanchard @ 2004-09-06 11:35 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, linux-kernel


Hi Nick,

> Use hlists for the PID hashes. This halves the memory footprint of these
> hashes. No benchmarks, but I think this is a worthy improvement because
> the hashes are something that would be likely to have significant portions
> loaded into the cache of every CPU on some workloads.
> 
> This comes at the "expense" of
> 	1. reintroducing the memory  prefetch into the hash traversal loop;
> 	2. adding new pids to the head of the list instead of the tail. I
> 	   suspect that if this was a big problem then the hash isn't sized
> 	   well or could benefit from moving hot entries to the head.
> 
> Also, account for all the pid hashes when reporting hash memory usage.

It looks like we are now allocating twice as much memory as required.
How does this look?

Signed-off-by: Anton Blanchard <anton@samba.org>

diff -puN kernel/pid.c~fix_freemem_reporting kernel/pid.c
--- foobar2/kernel/pid.c~fix_freemem_reporting	2004-09-06 21:17:34.185012321 +1000
+++ foobar2-anton/kernel/pid.c	2004-09-06 21:25:29.494818586 +1000
@@ -278,7 +278,7 @@ void __init pidhash_init(void)
 
 	for (i = 0; i < PIDTYPE_MAX; i++) {
 		pid_hash[i] = alloc_bootmem(pidhash_size *
-					sizeof(struct list_head));
+					sizeof(struct hlist_head));
 		if (!pid_hash[i])
 			panic("Could not alloc pidhash!\n");
 		for (j = 0; j < pidhash_size; j++)
_

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Allocate correct amount of memory for pid hash
  2004-09-06 11:35   ` [PATCH] Allocate correct amount of memory " Anton Blanchard
@ 2004-09-06 11:41     ` Nick Piggin
  0 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2004-09-06 11:41 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: Andrew Morton, linux-kernel

Anton Blanchard wrote:

>>Also, account for all the pid hashes when reporting hash memory usage.
> 
> 
> It looks like we are now allocating twice as much memory as required.
> How does this look?
> 

Fine. Good catch thanks Anton.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2004-09-06 11:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-08-22  4:44 [PATCH 1/2] fix PID hash sizing Nick Piggin
2004-08-22  4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
2004-08-22  5:00   ` David S. Miller
2004-08-22  5:31     ` Nick Piggin
2004-08-22  5:25   ` Ryan Cumming
2004-08-22  5:26     ` Nick Piggin
2004-09-06 11:35   ` [PATCH] Allocate correct amount of memory " Anton Blanchard
2004-09-06 11:41     ` Nick Piggin
2004-08-22 12:32 ` [PATCH 1/2] fix PID hash sizing William Lee Irwin III
2004-08-23  0:16   ` Nick Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox