From: "Caleb D.S. Brzezinski" <calebdsb@protonmail.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: hirofumi@mail.parknet.co.jp, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 2/3] fat: add the msdos_format_name() filename cache
Date: Sun, 29 Aug 2021 17:11:56 +0000
Message-ID: <87o89gw4yy.fsf@protonmail.com>
In-Reply-To: <YSujmt9vman41ecj@zeniv-ca.linux.org.uk>

Hi Al,

"Al Viro" <viro@zeniv.linux.org.uk> writes:

> On Sun, Aug 29, 2021 at 02:25:29PM +0000, Caleb D.S. Brzezinski wrote:
>> Implement the main msdos_format_name() filename cache. If used as a
>> module, all memory allocated for the cache is freed when the module is
>> de-registered.
>>
>> Signed-off-by: Caleb D.S. Brzezinski <calebdsb@protonmail.com>
>> ---
>>  fs/fat/namei_msdos.c | 35 +++++++++++++++++++++++++++++++++++
>>  1 file changed, 35 insertions(+)
>>
>> diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
>> index 7561674b1..f9d4f63c3 100644
>> --- a/fs/fat/namei_msdos.c
>> +++ b/fs/fat/namei_msdos.c
>> @@ -124,6 +124,16 @@ static int msdos_format_name(const unsigned char *name, int len,
>>  	unsigned char *walk;
>>  	unsigned char c;
>>  	int space;
>> +	u64 hash;
>> +	struct msdos_name_node *node;
>> +
>> +	/* check if the name is already in the cache */
>> +
>> +	hash = msdos_fname_hash(name);
>> +	if (find_fname_in_cache(res, hash))
>> +		return 0;

> Huh?  How could that possibly work, seeing that
> 	* your hash function only looks at the first 8 characters

My understanding was that the maximum length of the name considered when
passed to msdos_format_name() was eight characters; see:

		while (walk - res < 8)

and

		for (walk = res; len && walk - res < 8; walk++) {

If that's an incorrect understanding, then yes, it definitely wouldn't
work. A larger, more computationally intensive hash function would be
required, which would most likely cancel out the improved lookup from
the cache.
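
A minimal user-space sketch of the case in question, assuming a
hypothetical stand-in hash: first8_hash() below is FNV-1a over at most
the first eight bytes, and is not the msdos_fname_hash() from patch 1/3,
which is not shown in this message. msdos_format_name() produces an 8.3
name, so the bytes after the dot also reach the result, and any hash that
stops at byte eight cannot tell two such inputs apart:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for msdos_fname_hash(): 64-bit FNV-1a over at
 * most the first 8 bytes of a NUL-terminated name. Illustration only.
 */
static uint64_t first8_hash(const char *name)
{
	uint64_t h = 0xcbf29ce484222325ULL;	/* FNV-1a offset basis */
	size_t i;

	for (i = 0; i < 8 && name[i] != '\0'; i++) {
		h ^= (unsigned char)name[i];
		h *= 0x100000001b3ULL;		/* FNV-1a prime */
	}
	return h;
}
```

With this definition, first8_hash("LONGNAME.TXT") and
first8_hash("LONGNAME.DOC") are equal, because the two inputs share their
first eight bytes, yet the two names format to different 11-byte results.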

> 	* your find_fname_in_cache() assumes that hash collisions
> are impossible, which is... unlikely, considering the nature of
> that hash function

If names are limited to eight characters, then logically any two names
with exactly the same characters would "collide" into the same formatted
name. Again, if I misunderstood the constraints on the filenames, then
yes, this is unnecessary.
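
For comparison, a sketch of a lookup that does not rely on the hash being
collision-free: it keeps the raw input next to the formatted result and
compares the full key on a hash match. Everything here (struct name_slot,
cache_store(), cache_lookup(), the table size and the 12-byte key limit)
is hypothetical and written for user space; it is not the
find_fname_in_cache() from this series.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FMT_LEN     11	/* formatted 8.3 name, no dot */
#define KEY_MAX     12	/* longest raw key kept, e.g. "12345678.123" */
#define CACHE_SLOTS 64

struct name_slot {
	int used;
	uint64_t hash;
	size_t len;
	unsigned char key[KEY_MAX];	/* raw name as passed in */
	unsigned char val[FMT_LEN];	/* cached formatted name */
};

static struct name_slot cache[CACHE_SLOTS];

/* Remember the formatted result for a raw name; overwrites the slot. */
static void cache_store(const unsigned char *key, size_t len, uint64_t hash,
			const unsigned char *val)
{
	struct name_slot *s = &cache[hash % CACHE_SLOTS];

	if (len > KEY_MAX)
		return;		/* too long to verify later, so do not cache */
	s->used = 1;
	s->hash = hash;
	s->len = len;
	memcpy(s->key, key, len);
	memcpy(s->val, val, FMT_LEN);
}

/* Return 1 and fill out[] only if the stored raw name matches exactly. */
static int cache_lookup(const unsigned char *key, size_t len, uint64_t hash,
			unsigned char *out)
{
	const struct name_slot *s = &cache[hash % CACHE_SLOTS];

	if (!s->used || s->hash != hash || s->len != len ||
	    memcmp(s->key, key, len) != 0)
		return 0;	/* empty slot, or a different name: miss */
	memcpy(out, s->val, FMT_LEN);
	return 1;
}
```

The extra memcmp() costs a few bytes of comparison per hit, but two names
that happen to share a hash value then produce a miss instead of the wrong
formatted name.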

> Out of curiosity, how have you tested that thing?

I've used it on my own FAT32 drives for profiling and run it through
kmemleak, ksan, some stress tests, etc. for a few weeks. As I said, I
benchmarked it, and it shaved about 0.2 ms off my most common use
case.

Thanks.
Caleb B.

-- 
"Come now, and let us reason together," Says the LORD
    -- Isaiah 1:18a, NASB

