From: wli@holomorphy.com
To: "Richard B. Johnson" <root@chaos.analogic.com>
Cc: Andrea Arcangeli <andrea@suse.de>,
wli@parcelfarce.linux.theplanet.co.uk,
linux-kernel@vger.kernel.org, riel@surriel.com,
hch@infradead.org, phillips@bonn-fries.net
Subject: Re: 2.4.19pre2aa1
Date: Wed, 13 Mar 2002 22:10:22 +0000 [thread overview]
Message-ID: <20020313221022.H14628@holomorphy.com> (raw)
In-Reply-To: <20020313021838.G14628@holomorphy.com> <Pine.LNX.3.95.1020313134921.28928A-100000@chaos.analogic.com>
In-Reply-To: <Pine.LNX.3.95.1020313134921.28928A-100000@chaos.analogic.com>; from root@chaos.analogic.com on Wed, Mar 13, 2002 at 02:06:48PM -0500
On Wed, Mar 13, 2002 at 02:06:48PM -0500, Richard B. Johnson wrote:
> [SNIPPED..]
Nice move. No one will have the foggiest idea without hunting for
my prior message whether your comments on what I said are accurate.
On Wed, Mar 13, 2002 at 02:06:48PM -0500, Richard B. Johnson wrote:
> You might want to look at www.eece.unm.edu/faculty/heileman/hash/hash.html
> rather than assuming everyone is uninformed. Source-code is provided
> for several hashing functions as well as source-code for tests. This
> is a relatively old reference although it addresses both the chaos
> and fractal methods discussed here, plus chaotic probe strategies
> in address hashing.
You've presented a paper that attempts to establish a connection
between chaos theory and hashing by open addressing. I'm not convinced
chaos theory is all that, but some of their measurement techniques
appear useful, and I'll probably try using them, despite the fact Linux
appears to favor hashing by separate chaining.
This URL has not convinced me you are informed, and I don't really want
to be convinced whether you're informed or not. Your contributions to
these threads have been far less than enlightening or helpful thus far.
On Wed, Mar 13, 2002 at 02:06:48PM -0500, Richard B. Johnson wrote:
> A fast random number generator is essential to produce meaningful
> tests within a reasonable period of time. It is also used within one
> of the hashing functions to 'guess' at the direction of displacement
> when an insertion slot is not immediately located, as well as the
> displacement mechanism for several chaotic methods discussed.
I don't buy this, and the reason why is that hashing is obviously
sensitive to the input distribution. To defeat any hash function,
you need only produce a distribution concentrated on a set hashing
to the same number. If your distribution is literally uniform, then
you'll never do better (by one measure) than least residue mod hash
table size. You need to produce a realistic set of test keys. The key
distributions you actually want to hash will be neither of the above.
(Though one should verify it doesn't do poorly on uniform input, it's
little more than a sanity check.)
On Wed, Mar 13, 2002 at 02:06:48PM -0500, Richard B. Johnson wrote:
> Using your own hash-function as a template for tests of the same
> hash-function, as you propose, is unlikely to provide meaningful
> results.
I'm not proposing that. I have no idea why you think I am.
If I may summarize, if you're going to use random number generation
to simulate test inputs, verify the distribution of the test inputs
you generate actually resembles the thing you're trying to simulate.
It's far more productive to get samples of kernel virtual addresses
of the objects whose addresses you're hashing, or inode numbers from
real filesystems, or filenames, or whatever, than to just feed it a
stream of numbers spewed by some random number generator no one's
heard of, and that's not going to change in the least. For a Monte
Carlo simulation of its usage in the kernel, you're going to need to
actually figure out how to simulate those things' distributions.
Hmm, I wouldn't be suprised to get a bunch of "YHBT" messages after
this but the chaos theory paper does have some useful stuff for
analyzing hash table performance. Distributions' entropies and Lyapunov
exponents of hash functions should be easy enough to compute.
Bill
next prev parent reply other threads:[~2002-03-13 22:10 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-03-12 4:19 2.4.19pre2aa1 wli
2002-03-12 5:31 ` 2.4.19pre2aa1 wli
2002-03-12 6:36 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-12 6:06 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-12 10:46 ` 2.4.19pre2aa1 Rik van Riel
2002-03-12 11:47 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-12 11:48 ` 2.4.19pre2aa1 wli
2002-03-12 12:21 ` 2.4.19pre2aa1 Daniel Phillips
2002-03-12 14:25 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-12 14:32 ` 2.4.19pre2aa1 Daniel Phillips
2002-03-15 17:20 ` 2.4.19pre2aa1 Horst von Brand
2002-03-15 16:43 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-12 11:29 ` 2.4.19pre2aa1 wli
2002-03-12 12:56 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-12 13:20 ` 2.4.19pre2aa1 Rik van Riel
2002-03-12 13:33 ` 2.4.19pre2aa1 Richard B. Johnson
2002-03-12 14:17 ` 2.4.19pre2aa1 wli
2002-03-12 14:30 ` 2.4.19pre2aa1 Richard B. Johnson
2002-03-13 2:18 ` 2.4.19pre2aa1 wli
2002-03-13 19:06 ` 2.4.19pre2aa1 Richard B. Johnson
2002-03-13 22:10 ` wli [this message]
2002-03-14 12:18 ` 2.4.19pre2aa1 Richard B. Johnson
2002-03-14 12:47 ` 2.4.19pre2aa1 Daniel Phillips
2002-03-14 13:59 ` 2.4.19pre2aa1 Rik van Riel
2002-03-14 14:02 ` 2.4.19pre2aa1 Daniel Phillips
2002-03-12 14:14 ` 2.4.19pre2aa1 wli
2002-03-12 15:04 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-12 23:31 ` 2.4.19pre2aa1 wli
2002-03-13 0:09 ` 2.4.19pre2aa1 Andrew Morton
2002-03-13 1:06 ` 2.4.19pre2aa1 wli
2002-03-13 1:24 ` 2.4.19pre2aa1 Andrew Morton
2002-03-13 7:37 ` 2.4.19pre2aa1 Daniel Phillips
2002-03-13 7:30 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-13 7:55 ` 2.4.19pre2aa1 Andrew Morton
2002-03-13 8:06 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-13 10:57 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-13 13:51 ` 2.4.19pre2aa1 Rik van Riel
2002-03-13 14:03 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-13 19:19 ` 2.4.19pre2aa1 Andrew Morton
2002-03-13 8:12 ` 2.4.19pre2aa1 Daniel Phillips
-- strict thread matches above, loose matches on Subject: below --
2002-03-08 2:40 2.4.19pre2aa1 rwhron
2002-03-07 8:21 2.4.19pre2aa1 Andrea Arcangeli
2002-03-07 10:49 ` 2.4.19pre2aa1 William Lee Irwin III
2002-03-07 11:27 ` 2.4.19pre2aa1 Daniel Phillips
2002-03-07 11:47 ` 2.4.19pre2aa1 William Lee Irwin III
2002-03-07 11:46 ` 2.4.19pre2aa1 Daniel Phillips
2002-03-07 17:03 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-07 20:18 ` 2.4.19pre2aa1 William Lee Irwin III
2002-03-07 20:38 ` 2.4.19pre2aa1 Richard B. Johnson
2002-03-08 0:22 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-08 0:26 ` 2.4.19pre2aa1 Rik van Riel
2002-03-08 0:11 ` 2.4.19pre2aa1 Andrea Arcangeli
2002-03-07 11:34 ` 2.4.19pre2aa1 Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20020313221022.H14628@holomorphy.com \
--to=wli@holomorphy.com \
--cc=andrea@suse.de \
--cc=hch@infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=phillips@bonn-fries.net \
--cc=riel@surriel.com \
--cc=root@chaos.analogic.com \
--cc=wli@parcelfarce.linux.theplanet.co.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox