From mboxrd@z Thu Jan 1 00:00:00 1970 From: Josef Bacik Subject: Re: Name hashing function causing a perf regression Date: Fri, 12 Sep 2014 17:25:50 -0400 Message-ID: <541364DE.5030104@fb.com> References: <540F5562.1050505@fb.com> <8761gs64dx.fsf@tassilo.jf.intel.com> <54134EFA.2030101@fb.com> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit Cc: Andi Kleen , linux-fsdevel , Al Viro , Christoph Hellwig , Chris Mason To: Linus Torvalds Return-path: Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:38993 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752106AbaILV0a (ORCPT ); Fri, 12 Sep 2014 17:26:30 -0400 In-Reply-To: Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On 09/12/2014 04:39 PM, Linus Torvalds wrote: > On Fri, Sep 12, 2014 at 12:52 PM, Josef Bacik wrote: >> >> So that looks much better, not perfect but hlist_for_each through 5 entries >> isn't going to kill us, I'll build a kernel with this and get back shortly >> with real numbers. > > So one thing to look out for is that the fold_hash() is only effective > for 64-bit architectures, and I'm pretty sure that we end up having a > very similar issue on 32-bit architectures that have > > - two 32-bit words with the fairly simplisting mixing (basically > "first word * 9 + second word") > > This is probably ok'ish, in that you end up still having a 32-bit > number with *most* of the bits being relevant, but we should think > about this too. > > - the real problem is likely the "d_hash()" function that narrows the > 32-bit hash down to d_hash_mask bits, and drops a lot of bits as it > does so (ie d_hash_mask is usually on the order of 12 bits, so it > really only uses 24 bits of the 32-bit hash value, and 12 bits is > particularly bad anyway, since it's a multiple of three which is what > the '9' in the first simplistic mixing used too. > > So it *might* be better to fix this all at d_hash() instead, and make > that use "hash_u32(hash, d_hash_bits)" instead. And maybe keep the > fold_hash() stage fairly stupid. > > Anyway, what I'm trying to say is that your numbers are certainly > *much* better than the hash bucket list used to be, but we should > probably tweak it a bit more. I'm not sure you really need to go to a > real hash function like murmur, but it would also be a good idea to at > least look at making some initial value be randomized so that people > also can't quite as easily perhaps try to hit worst-case situations > where they control the hash and can make really long buckets (the fact > that we mix in the parent pointer and many filesystems have some > directory size limit might make that harder anyway, but..) > Ok I have a good direction to take this in, so far the best results have been to change fold_hash() to hash_64(hash, 32) and change d_hash to do this static int *d_hash(int *table, unsigned long parent, unsigned int hash) { hash += (unsigned long) parent / L1_CACHE_BYTES; hash += hash_32(hash, d_hash_shift); return table + (hash & d_hash_mask); } which has gotten us 890k buckets. I'll futz with it some more on Monday and send out a patch with whatever gives me the best numbers. Thanks for the help, hashing is not my forte, Josef