* [PATCH] Making i/dhash_entries cmdline work as it used to.
From: Jose R. Santos @ 2004-07-12 17:56 UTC (permalink / raw)
To: dhowells, akpm, linux-kernel, slprat
Hi David,
I was looking at your patch for >MAX_ORDER hash tables, but it seems that
the patch limits the number of entries to what it thinks are good values,
and the i/dhash_entries cmdline options cannot exceed this. Any reason
for this? It seems to limit the usability of the patch on systems where
larger allocations than the ones the kernel calculates are desired.
Also, any particular reason why MAX_SYS_HASH_TABLE_ORDER was set to 14?
I am already seeing the need to go higher on my 64GB setup and was
wondering if this could be bumped up to 19.
I'm sending a patch that gets the cmdline options working as they did before,
where they could override the kernel calculations, and increases
MAX_SYS_HASH_TABLE_ORDER to 19. Only tested on PPC64 at the moment.
Andrew - Could you consider this for inclusion into mainline?
Thanks,
-JRS
# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.1676 -> 1.1677
# mm/page_alloc.c 1.200 -> 1.201
# include/linux/mmzone.h 1.58 -> 1.59
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 04/07/12 jsantos@rx8.austin.ibm.com 1.1677
# - Make the ihash_entries and dhash_entries cmdline options behave like they used to.
# - Change MAX_SYS_HASH_TABLE_ORDER to allow very large hashes on very LARGE memory systems.
# --------------------------------------------
#
diff -Nru a/include/linux/mmzone.h b/include/linux/mmzone.h
--- a/include/linux/mmzone.h Mon Jul 12 12:53:29 2004
+++ b/include/linux/mmzone.h Mon Jul 12 12:53:29 2004
@@ -26,10 +26,10 @@
* permitted by MAX_ORDER, so we allocate with the bootmem allocator, and are
* limited to this size
*/
-#if MAX_ORDER > 14
+#if MAX_ORDER > 19
#define MAX_SYS_HASH_TABLE_ORDER MAX_ORDER
#else
-#define MAX_SYS_HASH_TABLE_ORDER 14
+#define MAX_SYS_HASH_TABLE_ORDER 19
#endif
struct free_area {
diff -Nru a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c Mon Jul 12 12:53:29 2004
+++ b/mm/page_alloc.c Mon Jul 12 12:53:29 2004
@@ -2000,31 +2000,30 @@
unsigned int *_hash_shift,
unsigned int *_hash_mask)
{
- unsigned long mem, max, log2qty, size;
+ unsigned long max, log2qty, size;
void *table;
- /* round applicable memory size up to nearest megabyte */
- mem = consider_highmem ? nr_all_pages : nr_kernel_pages;
- mem += (1UL << (20 - PAGE_SHIFT)) - 1;
- mem >>= 20 - PAGE_SHIFT;
- mem <<= 20 - PAGE_SHIFT;
-
- /* limit to 1 bucket per 2^scale bytes of low memory (rounded up to
- * nearest power of 2 in size) */
- if (scale > PAGE_SHIFT)
- mem >>= (scale - PAGE_SHIFT);
- else
- mem <<= (PAGE_SHIFT - scale);
-
- mem = 1UL << (long_log2(mem) + 1);
+ /* allow the kernel cmdline to have a say */
+ if (!numentries) {
+ /* round applicable memory size up to nearest megabyte */
+ numentries = consider_highmem ? nr_all_pages : nr_kernel_pages;
+ numentries += (1UL << (20 - PAGE_SHIFT)) - 1;
+ numentries >>= 20 - PAGE_SHIFT;
+ numentries <<= 20 - PAGE_SHIFT;
+ /* limit to 1 bucket per 2^scale bytes of low memory */
+ if (scale > PAGE_SHIFT)
+ numentries >>= (scale - PAGE_SHIFT);
+ else
+ numentries <<= (PAGE_SHIFT - scale);
+ }
+ /* round up to nearest power of 2 in size */
+ numentries = 1UL << (long_log2(numentries) + 1);
+
/* limit allocation size */
max = (1UL << (PAGE_SHIFT + MAX_SYS_HASH_TABLE_ORDER)) / bucketsize;
- if (max > mem)
- max = mem;
- /* allow the kernel cmdline to have a say */
- if (!numentries || numentries > max)
+ if (numentries > max)
numentries = max;
log2qty = long_log2(numentries);
* Re: [PATCH] Making i/dhash_entries cmdline work as it used to.
From: Jose R. Santos @ 2004-07-13 2:37 UTC (permalink / raw)
To: dhowells, akpm, linux-kernel, slpratt
* Jose R. Santos <jrsantos@austin.ibm.com> [2004-07-12 12:56:05 -0500]:
> Also, any particular reason why MAX_SYS_HASH_TABLE_ORDER was set to 14?
> I am already seeing the need to go higher on my 64GB setup and was
> wondering if this could be bumped up to 19.
Actually, it doesn't look like we need MAX_SYS_HASH_TABLE_ORDER at all, so
I'm resending the patch, which now limits the max size of a hash table to
1/16 of total memory pages. This would keep people from doing dangerous
things when using the hash_entries options.
-JRS
* Re: [PATCH] Making i/dhash_entries cmdline work as it used to.
From: Jose R. Santos @ 2004-07-13 2:56 UTC (permalink / raw)
To: dhowells, akpm, linux-kernel, slpratt
[-- Attachment #1: Type: text/plain, Size: 370 bytes --]
* Jose R. Santos <jrsantos@austin.ibm.com> [2004-07-12 21:37:21 -0500]:
> Actually, it doesn't look like we need MAX_SYS_HASH_TABLE_ORDER at all, so
> I'm resending the patch, which now limits the max size of a hash table to
> 1/16 of total memory pages. This would keep people from doing dangerous
> things when using the hash_entries options.
Forgot to attach the patch. :)
-JRS
[-- Attachment #2: hash_entries_cmdline_first2.patch --]
[-- Type: text/plain, Size: 3177 bytes --]
# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.1678 -> 1.1679
# mm/page_alloc.c 1.202 -> 1.203
# include/linux/mmzone.h 1.60 -> 1.61
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 04/07/12 jsantos@rx8.austin.ibm.com 1.1679
# - Make the ihash_entries and dhash_entries cmdline options behave like they used to.
# - Remove MAX_SYS_HASH_TABLE_ORDER. Limit the max size to 1/16 the total number of pages.
# --------------------------------------------
#
diff -Nru a/include/linux/mmzone.h b/include/linux/mmzone.h
--- a/include/linux/mmzone.h Mon Jul 12 21:17:10 2004
+++ b/include/linux/mmzone.h Mon Jul 12 21:17:10 2004
@@ -20,18 +20,6 @@
#define MAX_ORDER CONFIG_FORCE_MAX_ZONEORDER
#endif
-/*
- * system hash table size limits
- * - on large memory machines, we may want to allocate a bigger hash than that
- * permitted by MAX_ORDER, so we allocate with the bootmem allocator, and are
- * limited to this size
- */
-#if MAX_ORDER > 14
-#define MAX_SYS_HASH_TABLE_ORDER MAX_ORDER
-#else
-#define MAX_SYS_HASH_TABLE_ORDER 14
-#endif
-
struct free_area {
struct list_head free_list;
unsigned long *map;
diff -Nru a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c Mon Jul 12 21:17:10 2004
+++ b/mm/page_alloc.c Mon Jul 12 21:17:10 2004
@@ -2000,31 +2000,30 @@
unsigned int *_hash_shift,
unsigned int *_hash_mask)
{
- unsigned long mem, max, log2qty, size;
+ unsigned long max, log2qty, size;
void *table;
- /* round applicable memory size up to nearest megabyte */
- mem = consider_highmem ? nr_all_pages : nr_kernel_pages;
- mem += (1UL << (20 - PAGE_SHIFT)) - 1;
- mem >>= 20 - PAGE_SHIFT;
- mem <<= 20 - PAGE_SHIFT;
-
- /* limit to 1 bucket per 2^scale bytes of low memory (rounded up to
- * nearest power of 2 in size) */
- if (scale > PAGE_SHIFT)
- mem >>= (scale - PAGE_SHIFT);
- else
- mem <<= (PAGE_SHIFT - scale);
-
- mem = 1UL << (long_log2(mem) + 1);
+ /* allow the kernel cmdline to have a say */
+ if (!numentries) {
+ /* round applicable memory size up to nearest megabyte */
+ numentries = consider_highmem ? nr_all_pages : nr_kernel_pages;
+ numentries += (1UL << (20 - PAGE_SHIFT)) - 1;
+ numentries >>= 20 - PAGE_SHIFT;
+ numentries <<= 20 - PAGE_SHIFT;
- /* limit allocation size */
- max = (1UL << (PAGE_SHIFT + MAX_SYS_HASH_TABLE_ORDER)) / bucketsize;
- if (max > mem)
- max = mem;
+ /* limit to 1 bucket per 2^scale bytes of low memory */
+ if (scale > PAGE_SHIFT)
+ numentries >>= (scale - PAGE_SHIFT);
+ else
+ numentries <<= (PAGE_SHIFT - scale);
+ }
+ /* rounded up to nearest power of 2 in size */
+ numentries = 1UL << (long_log2(numentries) + 1);
+
+ /* limit allocation size to 1/16 total memory */
+ max = ((nr_all_pages << PAGE_SHIFT)/16) / bucketsize;
- /* allow the kernel cmdline to have a say */
- if (!numentries || numentries > max)
+ if (numentries > max)
numentries = max;
log2qty = long_log2(numentries);
* Re: [PATCH] Making i/dhash_entries cmdline work as it used to.
From: Dave Hansen @ 2004-07-13 6:25 UTC (permalink / raw)
To: Jose R. Santos
Cc: dhowells, Andrew Morton, Linux Kernel Mailing List, slpratt
On Mon, 2004-07-12 at 19:37, Jose R. Santos wrote:
> * Jose R. Santos <jrsantos@austin.ibm.com> [2004-07-12 12:56:05 -0500]:
> > Also, any particular reason why MAX_SYS_HASH_TABLE_ORDER was set to 14?
> > I am already seeing the need to go higher on my 64GB setup and was
> > wondering if this could be bumped up to 19.
>
> Actually, it doesn't look like we need MAX_SYS_HASH_TABLE_ORDER at all, so
> I'm resending the patch, which now limits the max size of a hash table to
> 1/16 of total memory pages. This would keep people from doing dangerous
> things when using the hash_entries options.
Do you worry that you're just expanding the hash table as long as it
gives benefit on your one, single benchmark? There have to be plenty of
other workloads that could benefit from an extra 1/16th more memory.
Not every workload is as dcache heavy as SFS.
-- Dave
* Re: [PATCH] Making i/dhash_entries cmdline work as it used to.
From: David Howells @ 2004-07-13 10:29 UTC (permalink / raw)
To: Jose R. Santos; +Cc: akpm, linux-kernel, slprat
Jose R. Santos <jrsantos@austin.ibm.com> wrote:
> Also, any particular reason why MAX_SYS_HASH_TABLE_ORDER was set to 14?
> I am already seeing the need to go higher on my 64GB setup and was
> wondering if this could be bumped up to 19.
Yes. IBM did some testing and found that was about optimal. No significant
gain was found with anything greater.
> I'm sending a patch that gets the cmdline options working as they did before,
> where they could override the kernel calculations, and increases
> MAX_SYS_HASH_TABLE_ORDER to 19. Only tested on PPC64 at the moment.
You need to be careful increasing the maximum order - you have to remember
that this affects several tables (well, at least two at the moment), and so
the effect is multiplied.
It may be reasonable to let the kernel cmdline override the maximum number of
buckets calculated on the scaling factor provided to the function (effectively
number of buckets per unit memory), but consider that the number of objects
that can be allocated and linked into the table is in effect governed by such
a factor.
David
* Re: [PATCH] Making i/dhash_entries cmdline work as it used to.
From: David Howells @ 2004-07-13 12:33 UTC (permalink / raw)
To: Jose R. Santos; +Cc: akpm, linux-kernel, slpratt
> Actually, it doesn't look like we need MAX_SYS_HASH_TABLE_ORDER at all, so
> I'm resending the patch, which now limits the max size of a hash table to
> 1/16 of total memory pages. This would keep people from doing dangerous
> things when using the hash_entries options.
That's an enormous limit. Consider the fact that you will have a multiplicity
of such hash tables, each potentially eating 1/16th of your total memory
(remember, the bootmem allocator's only real limit is how big a chunk of
memory it can allocate in one go).
Do you have numbers to show that committing an eighth of your memory (8GB if
you have 64GB - two hash tables at 4GB apiece) to hash tables is worth it?
It is almost certainly not.
For instance, the dcache:
(*) Assuming 64GB of memory, of which 1/16th allocated to the dcache
hash table and 1/16th allocated to the inode hash table.
(*) dcache hash buckets are 16 bytes apiece on PPC64.
(*) 4GB hash table gives you 256 Meg hash buckets.
(gdb) p ((4ULL<<30)/16) / 1024 / 1024
$1 = 256
(*) The dentry struct on PPC64 is approx 256 bytes in size.
(*) The remaining 56GB could provide 224 Meg dentries if allocated to
nothing else.
(gdb) p (56ULL<<30) / 256 / 1024 / 1024
$2 = 224
(*) So you'd end up with more buckets in the hash table than you could
possibly allocate dentry structures.
David
* Re: [PATCH] Making i/dhash_entries cmdline work as it used to.
From: Jose R. Santos @ 2004-07-13 12:54 UTC (permalink / raw)
To: David Howells; +Cc: Jose R. Santos, akpm, linux-kernel, slpratt
On 07/13/04 07:33:40, David Howells wrote:
> That's an enormous limit. Consider the fact that you will have a multiplicity
> of such hash tables, each potentially eating 1/16th of your total memory
> (remember, the bootmem allocator's only real limit is how big a chunk of
> memory it can allocate in one go).
>
> Do you have numbers to show that committing an eighth of your memory (8GB if
> you have 64GB - two hash tables at 4GB apiece) to hash tables is almost
> certainly not worth it.
I do not use all that memory; this is just the absolute limit that you can
allocate. The point is not to limit it to ORDER 14 just because that's where
the most gains were seen on one 64GB setup. The idea is to give people some
room to play when they use the cmdline, if for some reason they need to go
that high.
-JRS
* Re: [PATCH] Making i/dhash_entries cmdline work as it used to.
From: Jose R. Santos @ 2004-07-13 12:59 UTC (permalink / raw)
To: Dave Hansen
Cc: Jose R. Santos, dhowells, Andrew Morton,
Linux Kernel Mailing List, slpratt
On 07/13/04 01:25:10, Dave Hansen wrote:
> Do you worry that you're just expanding the hash table as long as it
> gives benefit on your one, single benchmark? There have to be plenty of
> other workloads that could benefit from an extra 1/16th more memory.
> Not every workload is as dcache heavy as SFS.
This does not change the normal behavior of the kernel. It only
has an effect when you use the cmdline options to increase it. I only
use 1/256 of my memory for hashes, but there might be someone else
crazier than me who wants more. Of course, the 1/16 number could be
changed to something a lot smaller, but it should give people some room
to play with.
-JRS
* Re: [PATCH] Making i/dhash_entries cmdline work as it used to.
From: Jose R. Santos @ 2004-07-13 13:17 UTC (permalink / raw)
To: David Howells; +Cc: Jose R. Santos, akpm, linux-kernel, slprat
On 07/13/04 05:29:13, David Howells wrote:
>
> Jose R. Santos <jrsantos@austin.ibm.com> wrote:
> > Also, any particular reason why MAX_SYS_HASH_TABLE_ORDER was set to 14?
> > I am already seeing the need to go higher on my 64GB setup and was
> > wondering if this could be bumped up to 19.
>
> Yes. IBM did some testing and found that was about optimal. No significant
> gain was found with anything greater.
On a single setup. What about people who want to use Linux on a 128-way with
over a terabyte of memory? Certainly ORDER 14 might be too small for them.
> > I'm sending a patch that gets the cmdline options working as they did before,
> > where they could override the kernel calculations, and increases
> > MAX_SYS_HASH_TABLE_ORDER to 19. Only tested on PPC64 at the moment.
>
> You need to be careful increasing the maximum order - you have to remember
> that this affects several tables (well, at least two at the moment), and so
> the effect is multiplied.
Only if I use the cmdline option. With no command line arguments, this
allocates the kernel's sanely calculated defaults.
> It may be reasonable to let the kernel cmdline override the maximum number of
> buckets calculated on the scaling factor provided to the function (effectively
> number of buckets per unit memory), but consider that the number of objects
> that can be allocated and linked into the table is in effect governed by such
> a factor.
It seems easier to specify the number of buckets you want than to specify a
scaling factor, which some people may have problems figuring out.
-JRS