Date: Mon, 11 Oct 2021 16:20:35 -0400
From: "Theodore Ts'o"
To: Avi Deitcher
Cc: linux-ext4@vger.kernel.org
Subject: Re: algorithm for half-md4 used in htree directories
References: <3A493D20-568A-4D63-A575-5DEEBFAAF41A@dilger.ca>

On Mon, Oct 11, 2021 at 08:30:36AM -0700, Avi Deitcher wrote:
> Does someone know how this is constructed and used?
>
> On Mon, Oct 4, 2021 at 12:57 AM Avi Deitcher wrote:
> >
> > Hi Andreas,
> >
> > I had looked in __ext4fs_dirhash().
> > Yes, it does reference the seed -
> > and create a default if none is there at the filesystem level - but it
> > doesn't appear to use it, in that function. hinfo is populated in the
> > function - hash, minor-hash, seed - but it never uses the seed to
> > manipulate the hash.

The seed is used to initialize the buf array, so long as the seed is
not all zeros.  If it is all zeros, then the default seed is used
instead (it is set right above this bit of code):

        if (hinfo->seed) {
                for (i = 0; i < 4; i++) {
                        if (hinfo->seed[i]) {
                                memcpy(buf, hinfo->seed, sizeof(buf));
                                break;
                        }
                }
        }

The legacy hash doesn't use the seed, yes.  But the other hash types
(hash_version) mix the filename into buf, in different ways depending
on the hash type.  For example, for half md4:

        case DX_HASH_HALF_MD4:
                p = name;
                while (len > 0) {
                        (*str2hashbuf)(p, len, in, 8);
                        half_md4_transform(buf, in);
                                           ^^^
                        len -= 32;
                        p += 32;
                }
                minor_hash = buf[2];
                hash = buf[1];
                break;

When the hash seed is different, the initial state of the buf array
will be different, and this influences the resulting hash.

Cheers,

                                        - Ted
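
To see the effect of the seed in isolation, here is a minimal
user-space sketch of the flow described above: pick the initial buf
state from the seed (or the default), mix the name in 32 bytes at a
time, then take hash = buf[1] and minor_hash = buf[2].  Everything in
it is an assumption made for illustration: toy_transform() is a
stand-in and is NOT the real half_md4_transform(), pack_chunk() is a
simplified stand-in for str2hashbuf, sketch_dirhash() is a
hypothetical name, and the default values are the standard MD4
initial constants, assumed here to match the kernel's default seed.
The numbers it produces will not match what ext4 stores on disk.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed default initial state: the standard MD4 initial constants. */
static const uint32_t default_seed[4] = {
    0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476
};

static uint32_t rol32(uint32_t x, unsigned s)
{
    return (x << s) | (x >> (32 - s));
}

/* Toy stand-in for half_md4_transform(); NOT the real transform. */
static void toy_transform(uint32_t buf[4], const uint32_t in[8])
{
    for (int i = 0; i < 8; i++)
        buf[i & 3] = rol32(buf[i & 3] + in[i], 7) * 0x9E3779B1u;
}

/* Simplified stand-in for str2hashbuf: pack up to 32 bytes of the
 * name into eight 32-bit words, zero padded. */
static void pack_chunk(const unsigned char *p, size_t len, uint32_t in[8])
{
    unsigned char tmp[32] = {0};

    assert(len <= 32);
    memcpy(tmp, p, len);
    for (int i = 0; i < 8; i++)
        in[i] = (uint32_t)tmp[4 * i] |
                ((uint32_t)tmp[4 * i + 1] << 8) |
                ((uint32_t)tmp[4 * i + 2] << 16) |
                ((uint32_t)tmp[4 * i + 3] << 24);
}

/* Hypothetical helper mirroring the structure quoted in the mail. */
static uint32_t sketch_dirhash(const char *name, const uint32_t *seed,
                               uint32_t *minor_hash)
{
    uint32_t buf[4], in[8];
    const unsigned char *p = (const unsigned char *)name;
    size_t len = strlen(name);

    /* Start from the default state ... */
    memcpy(buf, default_seed, sizeof(buf));

    /* ... and override it only if the seed has a non-zero word. */
    if (seed) {
        for (int i = 0; i < 4; i++) {
            if (seed[i]) {
                memcpy(buf, seed, sizeof(buf));
                break;
            }
        }
    }

    /* Mix the name into buf, one chunk of up to 32 bytes at a time. */
    while (len > 0) {
        size_t n = len < 32 ? len : 32;

        pack_chunk(p, n, in);
        toy_transform(buf, in);
        p += n;
        len -= n;
    }

    if (minor_hash)
        *minor_hash = buf[2];
    return buf[1];
}
```

Calling sketch_dirhash("hello.txt", NULL, &minor) and then calling it
again with an all-zero seed gives the same pair of values, since both
fall back to the default state.  Passing a seed such as {1, 2, 3, 4}
starts buf from a different state, so the same name hashes to a
different value.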