* Dcache hash distrubition patches
@ 2003-03-10 17:45 Martin J. Bligh
2003-03-10 17:52 ` Andi Kleen
0 siblings, 1 reply; 8+ messages in thread
From: Martin J. Bligh @ 2003-03-10 17:45 UTC (permalink / raw)
To: linux-kernel; +Cc: Andi Kleen
Thanks to Andi for the distribution count patches. I did some quick
numbers with this, just by running "find / | xargs ls -l > /dev/null"
to populate the cache, then dumped the numbers out.
larry:~# grep dentry /proc/slabinfo
dentry_cache 880531 884880 160 36870 36870 1 : 248 124
count length
444569 0
381295 1
163442 2
46937 3
10042 4
1747 5
253 6
30 7
5 8
Conclusion: the hash distribution (for this simple test) looks fine
to me.
M.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Dcache hash distrubition patches
2003-03-10 17:45 Dcache hash distrubition patches Martin J. Bligh
@ 2003-03-10 17:52 ` Andi Kleen
2003-03-11 7:41 ` Martin J. Bligh
0 siblings, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2003-03-10 17:52 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: linux-kernel, Andi Kleen
> Conclusion: the hash distribution (for this simple test) looks fine
> to me.
Yes, because of the overkill size of the hash table. With a 100K + entry
table you can make near every hash function look good ;)
Try to reduce it to a smaller number of buckets and see what happens.
-Andi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Dcache hash distrubition patches
2003-03-10 17:52 ` Andi Kleen
@ 2003-03-11 7:41 ` Martin J. Bligh
2003-03-11 7:47 ` William Lee Irwin III
2003-03-11 15:23 ` Andi Kleen
0 siblings, 2 replies; 8+ messages in thread
From: Martin J. Bligh @ 2003-03-11 7:41 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel
>> Conclusion: the hash distribution (for this simple test) looks fine
>> to me.
>
> Yes, because of the overkill size of the hash table. With a 100K + entry
> table you can make near every hash function look good ;)
>
> Try to reduce it to a smaller number of buckets and see what happens.
OK, after I've stopped being an idiot, and misreading the code, I have
some numbers. They still look pretty good to me. I shrunk us from
1,048,576 buckets to 65536, and loaded 1,150,000 entries in there.
5 3
9 4
44 5
113 6
243 7
519 8
1059 9
1613 10
2458 11
3506 12
4515 13
5349 14
6071 15
6328 16
6369 17
5862 18
5228 19
4305 20
3546 21
2613 22
1981 23
1382 24
928 25
602 26
368 27
230 28
115 29
75 30
45 31
16 32
14 33
3 34
4 35
2 36
It's not perfect, but not bad either. Some mathematician can go calculate
just how imperfect it is over random distribution, but it looks OK to me ;-)
M
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Dcache hash distrubition patches
2003-03-11 7:41 ` Martin J. Bligh
@ 2003-03-11 7:47 ` William Lee Irwin III
2003-03-11 15:23 ` Andi Kleen
1 sibling, 0 replies; 8+ messages in thread
From: William Lee Irwin III @ 2003-03-11 7:47 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Andi Kleen, linux-kernel
On Mon, Mar 10, 2003 at 11:41:07PM -0800, Martin J. Bligh wrote:
> It's not perfect, but not bad either. Some mathematician can go calculate
> just how imperfect it is over random distribution, but it looks OK to me ;-)
Okay, the standards are a bit more stringent for hash functions. You
basically want the total number of collisions to be minimized.
That said, standard statistical things are useful for first-pass
discrimination.
This looks basically okay, but if it's really just even with respect
to cache pressure the need for a different data structure in the
eventual future is clear. I'd need a baseline (pre-hlist) dcache
snapshot to compare against to get a better idea.
-- wli
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Dcache hash distrubition patches
2003-03-11 7:41 ` Martin J. Bligh
2003-03-11 7:47 ` William Lee Irwin III
@ 2003-03-11 15:23 ` Andi Kleen
2003-03-11 15:31 ` Martin J. Bligh
1 sibling, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2003-03-11 15:23 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Andi Kleen, linux-kernel
On Tue, Mar 11, 2003 at 08:41:07AM +0100, Martin J. Bligh wrote:
> some numbers. They still look pretty good to me. I shrunk us from
> 1,048,576 buckets to 65536, and loaded 1,150,000 entries in there.
Interesting would be to find the sweet spot with the smallest hash table
size that still performs well. Not sure if find / is a good workload
for that though.
Also same for inode hash (but I don't have statistics for that right now)
-Andi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Dcache hash distrubition patches
2003-03-11 15:23 ` Andi Kleen
@ 2003-03-11 15:31 ` Martin J. Bligh
2003-03-11 16:27 ` Andi Kleen
0 siblings, 1 reply; 8+ messages in thread
From: Martin J. Bligh @ 2003-03-11 15:31 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel
>> some numbers. They still look pretty good to me. I shrunk us from
>> 1,048,576 buckets to 65536, and loaded 1,150,000 entries in there.
>
> Interesting would be to find the sweet spot with the smallest hash table
> size that still performs well. Not sure if find / is a good workload
> for that though.
Difficult, as it depends how many files are in the working set of the
machine, really. Right now it eats 4Mb of lowmem ... 1M entries seems
to be about 150Mb of slab for dentries, which is probably as much as
anyone wants ... but it's nice to keep those hash chains short ;-)
I can try 1Mb or something I suppose ... what's the purpose here,
to keep the cachelines of the bucket heads warm? Not sure it's worth
the tradeoff, as we have to touch another line for each element we
walk?
I take it you're happy enough with the current hash function distribution?
> Also same for inode hash (but I don't have statistics for that right now)
I could hack something up ... but 1 machine ain't going to cut it. I
suspect I'd have a much smaller inode hash, as I tend to have masses
of kernel trees, mostly hardlinked to each other.
M.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Dcache hash distrubition patches
2003-03-11 15:31 ` Martin J. Bligh
@ 2003-03-11 16:27 ` Andi Kleen
2003-03-11 17:29 ` Anton Blanchard
0 siblings, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2003-03-11 16:27 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Andi Kleen, linux-kernel
On Tue, Mar 11, 2003 at 04:31:23PM +0100, Martin J. Bligh wrote:
> I can try 1Mb or something I suppose ... what's the purpose here,
> to keep the cachelines of the bucket heads warm? Not sure it's worth
> the tradeoff, as we have to touch another line for each element we
> walk?
Use less cache for the hash table overall.
Use less lowmem.
Ideally there would be no tradeoff if you can still get reasonable
hash chain length with smaller tables (= overall win)
>
> I take it you're happy enough with the current hash function distribution?
At least I don't know how to improve it.
> > Also same for inode hash (but I don't have statistics for that right now)
>
> I could hack something up ... but 1 machine ain't going to cut it. I
> suspect I'd have a much smaller inode hash, as I tend to have masses
> of kernel trees, mostly hardlinked to each other.
I doubt inode cache is very critical, except perhaps in NFS server loads
(but even the nfs server has an own frontend cache)
Normally the dcache should bear most of the load.
-Andi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Dcache hash distrubition patches
2003-03-11 16:27 ` Andi Kleen
@ 2003-03-11 17:29 ` Anton Blanchard
0 siblings, 0 replies; 8+ messages in thread
From: Anton Blanchard @ 2003-03-11 17:29 UTC (permalink / raw)
To: Andi Kleen; +Cc: Martin J. Bligh, linux-kernel
> I doubt inode cache is very critical, except perhaps in NFS server loads
> (but even the nfs server has an own frontend cache)
> Normally the dcache should bear most of the load.
I did some analysis of the hashes ages ago. From memory the regular
grouping of inodes caused this unbalanced distribution:
http://samba.org/~anton/linux/hashes/1/icache.png
Anton
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2003-03-11 17:21 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-03-10 17:45 Dcache hash distrubition patches Martin J. Bligh
2003-03-10 17:52 ` Andi Kleen
2003-03-11 7:41 ` Martin J. Bligh
2003-03-11 7:47 ` William Lee Irwin III
2003-03-11 15:23 ` Andi Kleen
2003-03-11 15:31 ` Martin J. Bligh
2003-03-11 16:27 ` Andi Kleen
2003-03-11 17:29 ` Anton Blanchard
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox