* [PATCH 1/2] fix PID hash sizing
@ 2004-08-22 4:44 Nick Piggin
2004-08-22 4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
2004-08-22 12:32 ` [PATCH 1/2] fix PID hash sizing William Lee Irwin III
0 siblings, 2 replies; 10+ messages in thread
From: Nick Piggin @ 2004-08-22 4:44 UTC (permalink / raw)
To: Andrew Morton, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 89 bytes --]
I see PID hash sizing problems on an Opteron.
I thought this got fixed a while ago? Hm.
[-- Attachment #2: pid-hash-alloc.patch --]
[-- Type: text/x-patch, Size: 2180 bytes --]
Export nr_kernel_pages, nr_all_pages. Use nr_kernel_pages when sizing
the PID hash. This fixes a sizing problem I'm seeing with the x86-64 kernel
on an Opteron.
---
linux-2.6-npiggin/include/linux/bootmem.h | 3 +++
linux-2.6-npiggin/kernel/pid.c | 2 +-
linux-2.6-npiggin/mm/page_alloc.c | 4 ++--
3 files changed, 6 insertions(+), 3 deletions(-)
diff -puN kernel/pid.c~pid-hash-alloc kernel/pid.c
--- linux-2.6/kernel/pid.c~pid-hash-alloc 2004-08-22 14:28:46.000000000 +1000
+++ linux-2.6-npiggin/kernel/pid.c 2004-08-22 14:34:13.000000000 +1000
@@ -276,7 +276,7 @@ int kgdb_pid_init_done; /* so we don't c
void __init pidhash_init(void)
{
int i, j, pidhash_size;
- unsigned long megabytes = max_pfn >> (20 - PAGE_SHIFT);
+ unsigned long megabytes = nr_kernel_pages >> (20 - PAGE_SHIFT);
pidhash_shift = max(4, fls(megabytes * 4));
pidhash_shift = min(12, pidhash_shift);
diff -puN include/linux/bootmem.h~pid-hash-alloc include/linux/bootmem.h
--- linux-2.6/include/linux/bootmem.h~pid-hash-alloc 2004-08-22 14:28:46.000000000 +1000
+++ linux-2.6-npiggin/include/linux/bootmem.h 2004-08-22 14:28:46.000000000 +1000
@@ -67,6 +67,9 @@ extern void * __init __alloc_bootmem_nod
__alloc_bootmem_node((pgdat), (x), PAGE_SIZE, 0)
#endif /* !CONFIG_HAVE_ARCH_BOOTMEM_NODE */
+extern unsigned long __initdata nr_kernel_pages;
+extern unsigned long __initdata nr_all_pages;
+
extern void *__init alloc_large_system_hash(const char *tablename,
unsigned long bucketsize,
unsigned long numentries,
diff -puN mm/page_alloc.c~pid-hash-alloc mm/page_alloc.c
--- linux-2.6/mm/page_alloc.c~pid-hash-alloc 2004-08-22 14:29:02.000000000 +1000
+++ linux-2.6-npiggin/mm/page_alloc.c 2004-08-22 13:41:11.000000000 +1000
@@ -55,8 +55,8 @@ EXPORT_SYMBOL(zone_table);
static char *zone_names[MAX_NR_ZONES] = { "DMA", "Normal", "HighMem" };
int min_free_kbytes = 1024;
-static unsigned long __initdata nr_kernel_pages;
-static unsigned long __initdata nr_all_pages;
+unsigned long __initdata nr_kernel_pages;
+unsigned long __initdata nr_all_pages;
/*
* Temporary debugging check for pages not lying within a given zone.
_
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/2] use hlist for pid hash
2004-08-22 4:44 [PATCH 1/2] fix PID hash sizing Nick Piggin
@ 2004-08-22 4:46 ` Nick Piggin
2004-08-22 5:00 ` David S. Miller
` (2 more replies)
2004-08-22 12:32 ` [PATCH 1/2] fix PID hash sizing William Lee Irwin III
1 sibling, 3 replies; 10+ messages in thread
From: Nick Piggin @ 2004-08-22 4:46 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 96 bytes --]
Any reason why this shouldn't be done? Anyone know of a decent test that
stresses the pid hash?
[-- Attachment #2: pid-use-hlist.patch --]
[-- Type: text/x-patch, Size: 3475 bytes --]
Use hlists for the PID hashes. This halves the memory footprint of these
hashes. No benchmarks, but I think this is a worthy improvement because
the hashes are something that would be likely to have significant portions
loaded into the cache of every CPU on some workloads.
This comes at the "expense" of
1. reintroducing the memory prefetch into the hash traversal loop;
2. adding new pids to the head of the list instead of the tail. I
suspect that if this was a big problem then the hash isn't sized
well or could benefit from moving hot entries to the head.
Also, account for all the pid hashes when reporting hash memory usage.
---
linux-2.6-npiggin/include/linux/pid.h | 2 +-
linux-2.6-npiggin/kernel/pid.c | 19 ++++++++++---------
2 files changed, 11 insertions(+), 10 deletions(-)
diff -puN kernel/pid.c~pid-use-hlist kernel/pid.c
--- linux-2.6/kernel/pid.c~pid-use-hlist 2004-08-22 14:42:56.000000000 +1000
+++ linux-2.6-npiggin/kernel/pid.c 2004-08-22 14:42:56.000000000 +1000
@@ -27,7 +27,7 @@
#include <linux/hash.h>
#define pid_hashfn(nr) hash_long((unsigned long)nr, pidhash_shift)
-static struct list_head *pid_hash[PIDTYPE_MAX];
+static struct hlist_head *pid_hash[PIDTYPE_MAX];
static int pidhash_shift;
int pid_max = PID_MAX_DEFAULT;
@@ -150,11 +150,11 @@ failure:
fastcall struct pid *find_pid(enum pid_type type, int nr)
{
- struct list_head *elem, *bucket = &pid_hash[type][pid_hashfn(nr)];
+ struct hlist_node *elem;
struct pid *pid;
- __list_for_each(elem, bucket) {
- pid = list_entry(elem, struct pid, hash_chain);
+ hlist_for_each_entry(pid, elem,
+ &pid_hash[type][pid_hashfn(nr)], hash_chain) {
if (pid->nr == nr)
return pid;
}
@@ -181,7 +181,8 @@ int fastcall attach_pid(task_t *task, en
INIT_LIST_HEAD(&pid->task_list);
pid->task = task;
get_task_struct(task);
- list_add(&pid->hash_chain, &pid_hash[type][pid_hashfn(nr)]);
+ hlist_add_head(&pid->hash_chain,
+ &pid_hash[type][pid_hashfn(nr)]);
}
list_add_tail(&task->pids[type].pid_chain, &pid->task_list);
task->pids[type].pidptr = pid;
@@ -200,7 +201,7 @@ static inline int __detach_pid(task_t *t
return 0;
nr = pid->nr;
- list_del(&pid->hash_chain);
+ hlist_del(&pid->hash_chain);
put_task_struct(pid->task);
return nr;
@@ -282,9 +283,9 @@ void __init pidhash_init(void)
pidhash_shift = min(12, pidhash_shift);
pidhash_size = 1 << pidhash_shift;
- printk("PID hash table entries: %d (order %d: %Zd bytes)\n",
+ printk("PID hash table entries: %d (order: %d, %Zd bytes)\n",
pidhash_size, pidhash_shift,
- pidhash_size * sizeof(struct list_head));
+ PIDTYPE_MAX * pidhash_size * sizeof(struct hlist_head));
for (i = 0; i < PIDTYPE_MAX; i++) {
pid_hash[i] = alloc_bootmem(pidhash_size *
@@ -292,7 +293,7 @@ void __init pidhash_init(void)
if (!pid_hash[i])
panic("Could not alloc pidhash!\n");
for (j = 0; j < pidhash_size; j++)
- INIT_LIST_HEAD(&pid_hash[i][j]);
+ INIT_HLIST_HEAD(&pid_hash[i][j]);
}
#ifdef CONFIG_KGDB
kgdb_pid_init_done++;
diff -puN include/linux/pid.h~pid-use-hlist include/linux/pid.h
--- linux-2.6/include/linux/pid.h~pid-use-hlist 2004-08-22 14:42:56.000000000 +1000
+++ linux-2.6-npiggin/include/linux/pid.h 2004-08-22 14:42:56.000000000 +1000
@@ -16,7 +16,7 @@ struct pid
atomic_t count;
struct task_struct *task;
struct list_head task_list;
- struct list_head hash_chain;
+ struct hlist_node hash_chain;
};
struct pid_link
_
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] use hlist for pid hash
2004-08-22 4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
@ 2004-08-22 5:00 ` David S. Miller
2004-08-22 5:31 ` Nick Piggin
2004-08-22 5:25 ` Ryan Cumming
2004-09-06 11:35 ` [PATCH] Allocate correct amount of memory " Anton Blanchard
2 siblings, 1 reply; 10+ messages in thread
From: David S. Miller @ 2004-08-22 5:00 UTC (permalink / raw)
To: Nick Piggin; +Cc: akpm, linux-kernel
On Sun, 22 Aug 2004 14:46:38 +1000
Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> Any reason why this shouldn't be done? Anyone know of a decent test that
> stresses the pid hash?
I can't think of any way in which this could decrease
performance. I highly recommend this patch :-)
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] use hlist for pid hash
2004-08-22 4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
2004-08-22 5:00 ` David S. Miller
@ 2004-08-22 5:25 ` Ryan Cumming
2004-08-22 5:26 ` Nick Piggin
2004-09-06 11:35 ` [PATCH] Allocate correct amount of memory " Anton Blanchard
2 siblings, 1 reply; 10+ messages in thread
From: Ryan Cumming @ 2004-08-22 5:25 UTC (permalink / raw)
To: Nick Piggin; +Cc: linux kernel mailing list
[-- Attachment #1: Type: text/plain, Size: 513 bytes --]
On Saturday 21 August 2004 21:46, you wrote:
> This comes at the "expense" of
> 1. reintroducing the memory prefetch into the hash traversal loop;
> 2. adding new pids to the head of the list instead of the tail. I
> suspect that if this was a big problem then the hash isn't sized
> well or could benefit from moving hot entries to the head.
It looks like the current code is already adding PIDs to the head: list_add()
adds to the head of a list and list_add_tail() adds to the tail.
-Ryan
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] use hlist for pid hash
2004-08-22 5:25 ` Ryan Cumming
@ 2004-08-22 5:26 ` Nick Piggin
0 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2004-08-22 5:26 UTC (permalink / raw)
To: Ryan Cumming; +Cc: linux kernel mailing list
Ryan Cumming wrote:
> On Saturday 21 August 2004 21:46, you wrote:
>
>>This comes at the "expense" of
>>1. reintroducing the memory prefetch into the hash traversal loop;
>>2. adding new pids to the head of the list instead of the tail. I
>> suspect that if this was a big problem then the hash isn't sized
>> well or could benefit from moving hot entries to the head.
>
>
> It looks like the current code is already adding PIDs to the head: list_add()
> adds to the head of a list and list_add_tail() adds to the tail.
>
Hmm, so it is. I must have been looking at the task_list list when I
was thinking that. Well, all the better.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] use hlist for pid hash
2004-08-22 5:00 ` David S. Miller
@ 2004-08-22 5:31 ` Nick Piggin
0 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2004-08-22 5:31 UTC (permalink / raw)
To: David S. Miller; +Cc: akpm, linux-kernel, Ingo Molnar, William Lee Irwin III
[-- Attachment #1: Type: text/plain, Size: 986 bytes --]
David S. Miller wrote:
> On Sun, 22 Aug 2004 14:46:38 +1000
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
>
>>Any reason why this shouldn't be done? Anyone know of a decent test that
>>stresses the pid hash?
>
>
> I can't think of any way in which this could decrease
> performance. I highly recommend this patch :-)
>
Oh good :)
The only thing that could hurt is that the hash list traversal in
find_pid now does prefetch.
This would be trivial to remove by introducing another list.h entity
similar to __list_for_each for hlists, however I would have thought
that if anything, the prefetch might help a tiny bit on the odd
workloads where the number of tasks is much greater than the number
of hash entries. Ingo? WLI?
I expect you could get a good bit of overlap on the prefetch if the
next hash pointer isn't in the same cacheline as ->nr... although,
on a 64-bit architecture with a 32byte cacheline size, this is
guaranteed to be the case. Why not put them together?
[-- Attachment #2: pid-search-cacheline.patch --]
[-- Type: text/x-patch, Size: 647 bytes --]
---
linux-2.6-npiggin/include/linux/pid.h | 3 ++-
1 files changed, 2 insertions(+), 1 deletion(-)
diff -puN include/linux/pid.h~pid-search-cacheline include/linux/pid.h
--- linux-2.6/include/linux/pid.h~pid-search-cacheline 2004-08-22 15:14:33.000000000 +1000
+++ linux-2.6-npiggin/include/linux/pid.h 2004-08-22 15:16:33.000000000 +1000
@@ -12,11 +12,12 @@ enum pid_type
struct pid
{
+ /* Try to keep hash_chain in the same cacheline as nr for find_pid */
+ struct hlist_node hash_chain;
int nr;
atomic_t count;
struct task_struct *task;
struct list_head task_list;
- struct hlist_node hash_chain;
};
struct pid_link
_
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] fix PID hash sizing
2004-08-22 4:44 [PATCH 1/2] fix PID hash sizing Nick Piggin
2004-08-22 4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
@ 2004-08-22 12:32 ` William Lee Irwin III
2004-08-23 0:16 ` Nick Piggin
1 sibling, 1 reply; 10+ messages in thread
From: William Lee Irwin III @ 2004-08-22 12:32 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, linux-kernel
On Sun, Aug 22, 2004 at 02:44:46PM +1000, Nick Piggin wrote:
> I see PID hash sizing problems on an Opteron.
> I thought this got fixed a while ago? Hm.
> Export nr_kernel_pages, nr_all_pages. Use nr_kernel_pages when sizing
> the PID hash. This fixes a sizing problem I'm seeing with the x86-64 kernel
> on an Opteron.
Please describe the the pid hash sizing problem.
-- wli
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] fix PID hash sizing
2004-08-22 12:32 ` [PATCH 1/2] fix PID hash sizing William Lee Irwin III
@ 2004-08-23 0:16 ` Nick Piggin
0 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2004-08-23 0:16 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Andrew Morton, linux-kernel
William Lee Irwin III wrote:
>On Sun, Aug 22, 2004 at 02:44:46PM +1000, Nick Piggin wrote:
>
>>I see PID hash sizing problems on an Opteron.
>>I thought this got fixed a while ago? Hm.
>>Export nr_kernel_pages, nr_all_pages. Use nr_kernel_pages when sizing
>>the PID hash. This fixes a sizing problem I'm seeing with the x86-64 kernel
>>on an Opteron.
>>
>
>Please describe the the pid hash sizing problem.
>
>
>
max_pfn isn't set up at that time yet, I'm guessing. I didn't look
into it further than that.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] Allocate correct amount of memory for pid hash
2004-08-22 4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
2004-08-22 5:00 ` David S. Miller
2004-08-22 5:25 ` Ryan Cumming
@ 2004-09-06 11:35 ` Anton Blanchard
2004-09-06 11:41 ` Nick Piggin
2 siblings, 1 reply; 10+ messages in thread
From: Anton Blanchard @ 2004-09-06 11:35 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, linux-kernel
Hi Nick,
> Use hlists for the PID hashes. This halves the memory footprint of these
> hashes. No benchmarks, but I think this is a worthy improvement because
> the hashes are something that would be likely to have significant portions
> loaded into the cache of every CPU on some workloads.
>
> This comes at the "expense" of
> 1. reintroducing the memory prefetch into the hash traversal loop;
> 2. adding new pids to the head of the list instead of the tail. I
> suspect that if this was a big problem then the hash isn't sized
> well or could benefit from moving hot entries to the head.
>
> Also, account for all the pid hashes when reporting hash memory usage.
It looks like we are now allocating twice as much memory as required.
How does this look?
Signed-off-by: Anton Blanchard <anton@samba.org>
diff -puN kernel/pid.c~fix_freemem_reporting kernel/pid.c
--- foobar2/kernel/pid.c~fix_freemem_reporting 2004-09-06 21:17:34.185012321 +1000
+++ foobar2-anton/kernel/pid.c 2004-09-06 21:25:29.494818586 +1000
@@ -278,7 +278,7 @@ void __init pidhash_init(void)
for (i = 0; i < PIDTYPE_MAX; i++) {
pid_hash[i] = alloc_bootmem(pidhash_size *
- sizeof(struct list_head));
+ sizeof(struct hlist_head));
if (!pid_hash[i])
panic("Could not alloc pidhash!\n");
for (j = 0; j < pidhash_size; j++)
_
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Allocate correct amount of memory for pid hash
2004-09-06 11:35 ` [PATCH] Allocate correct amount of memory " Anton Blanchard
@ 2004-09-06 11:41 ` Nick Piggin
0 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2004-09-06 11:41 UTC (permalink / raw)
To: Anton Blanchard; +Cc: Andrew Morton, linux-kernel
Anton Blanchard wrote:
>>Also, account for all the pid hashes when reporting hash memory usage.
>
>
> It looks like we are now allocating twice as much memory as required.
> How does this look?
>
Fine. Good catch thanks Anton.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2004-09-06 11:41 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-08-22 4:44 [PATCH 1/2] fix PID hash sizing Nick Piggin
2004-08-22 4:46 ` [PATCH 2/2] use hlist for pid hash Nick Piggin
2004-08-22 5:00 ` David S. Miller
2004-08-22 5:31 ` Nick Piggin
2004-08-22 5:25 ` Ryan Cumming
2004-08-22 5:26 ` Nick Piggin
2004-09-06 11:35 ` [PATCH] Allocate correct amount of memory " Anton Blanchard
2004-09-06 11:41 ` Nick Piggin
2004-08-22 12:32 ` [PATCH 1/2] fix PID hash sizing William Lee Irwin III
2004-08-23 0:16 ` Nick Piggin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox