public inbox for linux-kernel@vger.kernel.org
From: "Caleb D.S. Brzezinski" <calebdsb@protonmail.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: hirofumi@mail.parknet.co.jp, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 2/3] fat: add the msdos_format_name() filename cache
Date: Sun, 29 Aug 2021 17:11:56 +0000
Message-ID: <87o89gw4yy.fsf@protonmail.com>
In-Reply-To: <YSujmt9vman41ecj@zeniv-ca.linux.org.uk>

Hi Al,

"Al Viro" <viro@zeniv.linux.org.uk> writes:

> On Sun, Aug 29, 2021 at 02:25:29PM +0000, Caleb D.S. Brzezinski wrote:
>> Implement the main msdos_format_name() filename cache. If used as a
>> module, all memory allocated for the cache is freed when the module is
>> de-registered.
>>
>> Signed-off-by: Caleb D.S. Brzezinski <calebdsb@protonmail.com>
>> ---
>>  fs/fat/namei_msdos.c | 35 +++++++++++++++++++++++++++++++++++
>>  1 file changed, 35 insertions(+)
>>
>> diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
>> index 7561674b1..f9d4f63c3 100644
>> --- a/fs/fat/namei_msdos.c
>> +++ b/fs/fat/namei_msdos.c
>> @@ -124,6 +124,16 @@ static int msdos_format_name(const unsigned char *name, int len,
>>  	unsigned char *walk;
>>  	unsigned char c;
>>  	int space;
>> +	u64 hash;
>> +	struct msdos_name_node *node;
>> +
>> +	/* check if the name is already in the cache */
>> +
>> +	hash = msdos_fname_hash(name);
>> +	if (find_fname_in_cache(res, hash))
>> +		return 0;

> Huh?  How could that possibly work, seeing that
> 	* your hash function only looks at the first 8 characters

My understanding was that msdos_format_name() considers at most eight
characters of the name it is passed; see:

		while (walk - res < 8)

and

		for (walk = res; len && walk - res < 8; walk++) {

If that's an incorrect understanding, then yes, it definitely wouldn't
work. A larger, more computationally intensive hash function would be
required, which would most likely cancel out the improved lookup from
the cache.

> 	* your find_fname_in_cache() assumes that hash collisions
> are impossible, which is... unlikely, considering the nature of
> that hash function

If the names are limited to 8 characters, then logically any two names
with the exact same sequence of characters would "collide" into the same
formatted name. Again, if I misunderstood the constraints on the
filenames, then yes, that reasoning doesn't hold.

> Out of curiosity, how have you tested that thing?

I've used it on my own FAT32 drives for profiling, and run it through
kmemleak, ksan, some stress tests, etc. for a few weeks. As I said, I
benchmarked it, and it shaved about 0.2 ms off my most common use
case.

Thanks.
Caleb B.

-- 
"Come now, and let us reason together," Says the LORD
    -- Isaiah 1:18a, NASB

