public inbox for linux-kernel@vger.kernel.org
From: "Jason A. Donenfeld" <Jason@zx2c4.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel test robot <oliver.sang@intel.com>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	Masahiro Yamada <masahiroy@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org,
	Theodore Ts'o <tytso@mit.edu>,
	Christian Brauner <brauner@kernel.org>
Subject: Re: [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail
Date: Fri, 30 Dec 2022 03:08:14 +0100
Message-ID: <Y65IDkjj9kYcoLiW@zx2c4.com>
In-Reply-To: <CAHk-=wieOBgQ-7aoihBzywKqxiO7o7hc6gd_csn69ChcxR1FuQ@mail.gmail.com>

On Thu, Dec 29, 2022 at 10:55:05AM -0800, Linus Torvalds wrote:
> Also, I'm surprised this hasn't been an issue earlier - 'char' has
> always been unsigned on arm (among other architectures), so if this
> test started failing now on x86-64 due to -funsigned-char, it has
> presumably been failing on arm the whole time.

That's the curious part, indeed...

> Oh, I think I see one potential problem in ext4:
> 
> ext4_xattr_hash_entry() is hot garbage. Lookie here:
> 
>         while (name_len--) {
>                 hash = (hash << NAME_HASH_SHIFT) ^
>                        (hash >> (8*sizeof(hash) - NAME_HASH_SHIFT)) ^
>                        *name++;
>         }
> 
> so that hash will now depend on the sign of that 'char *name' pointer.
> 
> If that hash ever has any long-term meaning (ie saved on disk or
> exposed some other way), that would be problematic.

Note that ext4 has lots of sign-specific code for hashing. Only some of
it can be removed now, since compatibility with old file systems must be
preserved. What I have in mind is the code that starts in super.c:

                i = le32_to_cpu(es->s_flags);
                if (i & EXT2_FLAGS_UNSIGNED_HASH)
                        sbi->s_hash_unsigned = 3;
                else if ((i & EXT2_FLAGS_SIGNED_HASH) == 0) {
#ifdef __CHAR_UNSIGNED__
                        if (!sb_rdonly(sb))
                                es->s_flags |=
                                        cpu_to_le32(EXT2_FLAGS_UNSIGNED_HASH);
                        sbi->s_hash_unsigned = 3;
#else
                        if (!sb_rdonly(sb))
                                es->s_flags |=
                                        cpu_to_le32(EXT2_FLAGS_SIGNED_HASH);
#endif
                }

The #else branch of that #ifdef can now go away, since char is always
unsigned. And then maybe the whole expression can be simplified.
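
To make that concrete, here is a rough userspace sketch of what the
simplified selection might look like once the signed branch is gone.
This is only a model, not a proposed patch: pick_hash_unsigned() is a
hypothetical helper, the flag values are assumed from memory, and the
le32 conversions and superblock plumbing are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values, for illustration only. */
#define EXT2_FLAGS_SIGNED_HASH   0x0001u
#define EXT2_FLAGS_UNSIGNED_HASH 0x0002u

/*
 * Hypothetical model of the mount-time choice when char is always
 * unsigned. Returns the value that would be stored in s_hash_unsigned
 * (3 for unsigned hashing, 0 for signed) and, on a writable mount of a
 * file system with neither flag set, records the unsigned choice.
 */
static int pick_hash_unsigned(uint32_t *s_flags, bool rdonly)
{
	if (*s_flags & EXT2_FLAGS_UNSIGNED_HASH)
		return 3;
	if (*s_flags & EXT2_FLAGS_SIGNED_HASH)
		return 0;	/* old file system: keep signed hashing */
	if (!rdonly)
		*s_flags |= EXT2_FLAGS_UNSIGNED_HASH;
	return 3;
}
```

A file system that already carries the signed flag still gets 0 here,
which is the compatibility case that has to stay.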

These actually wind up being used in namei.c:

                hinfo->hash_version += EXT4_SB(dir->i_sb)->s_hash_unsigned;

That in turn sets the hash version that's selected in hash.c:

        switch (hinfo->hash_version) {
        case DX_HASH_LEGACY_UNSIGNED:
                hash = dx_hack_hash_unsigned(name, len);
                break;
        case DX_HASH_LEGACY:
                hash = dx_hack_hash_signed(name, len);
                break;
        case DX_HASH_HALF_MD4_UNSIGNED:
                str2hashbuf = str2hashbuf_unsigned;
                fallthrough;
        case DX_HASH_HALF_MD4:
                p = name;
		[...]

And so on. dx_hack_hash_unsigned() and dx_hack_hash_signed() are the
same function, except that one uses `unsigned char` and the other uses
`signed char`. It's unfortunate that these exist, but they're now part
of the on-disk format, so they have to stick around (along with other
warts like "halfmd4").
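
For anyone wondering why s_hash_unsigned is 3 rather than 1: as far as
I recall, the unsigned variants sit exactly three slots after their
signed counterparts in the hash version numbering, so adding 3 maps one
onto the other. A small sketch of that mapping, with the enum values
and the range guard written from memory (treat them as assumptions),
and effective_hash_version() as a made-up name:

```c
/* Hash version numbering, assumed here for illustration. */
enum dx_hash_version {
	DX_HASH_LEGACY            = 0,
	DX_HASH_HALF_MD4          = 1,
	DX_HASH_TEA               = 2,
	DX_HASH_LEGACY_UNSIGNED   = 3,
	DX_HASH_HALF_MD4_UNSIGNED = 4,
	DX_HASH_TEA_UNSIGNED      = 5,
};

/*
 * Hypothetical helper: turn the on-disk hash version into the one that
 * is actually dispatched on. s_hash_unsigned is 0 or 3. Only the three
 * base versions have an unsigned twin, so anything else passes through.
 */
static int effective_hash_version(int on_disk_version, int s_hash_unsigned)
{
	if (on_disk_version <= DX_HASH_TEA)
		return on_disk_version + s_hash_unsigned;
	return on_disk_version;
}
```

With s_hash_unsigned == 3, DX_HASH_HALF_MD4 becomes
DX_HASH_HALF_MD4_UNSIGNED; with 0 it is left alone.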

But at least for new file systems, things should be unified. Anyway, it
looks like for *these* hashes, the ext4 developers did consider the
signedness issue.

Sounds like maybe it was left out of ext4_xattr_hash_entry(), which does
indeed look like it's part of the on-disk representation:

static int
ext4_xattr_inode_verify_hashes(struct inode *ea_inode,
                               struct ext4_xattr_entry *entry, void *buffer,
                               size_t size)
{
        u32 hash;

        /* Verify stored hash matches calculated hash. */
        hash = ext4_xattr_inode_hash(EXT4_SB(ea_inode->i_sb), buffer, size);
        if (hash != ext4_xattr_inode_get_hash(ea_inode))
                return -EFSCORRUPTED;

        if (entry) {
                __le32 e_hash, tmp_data;

                /* Verify entry hash. */
                tmp_data = cpu_to_le32(hash);
                e_hash = ext4_xattr_hash_entry(entry->e_name, entry->e_name_len,
                                               &tmp_data, 1);
                if (e_hash != entry->e_hash)
                        return -EFSCORRUPTED;
        }
        return 0;
}

So if ext4_xattr_hash_entry() is indeed broken with respect to
heisensignness, then this stuff has always been broken, and it's
probably good that this is being unearthed...
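
To show what the signedness does to the loop Linus quoted, here is a
standalone sketch with the pointer type made explicit. It is not the
kernel function: the two helper names are made up, NAME_HASH_SHIFT is
assumed to be 5, and the value-hashing half of ext4_xattr_hash_entry()
is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define NAME_HASH_SHIFT 5	/* assumed value, for illustration */

/* Name hash as it behaves when the name bytes are read as signed char. */
static uint32_t name_hash_signed(const char *name, size_t name_len)
{
	const signed char *p = (const signed char *)name;
	uint32_t hash = 0;

	while (name_len--) {
		hash = (hash << NAME_HASH_SHIFT) ^
		       (hash >> (8 * sizeof(hash) - NAME_HASH_SHIFT)) ^
		       (uint32_t)*p++;	/* bytes >= 0x80 sign-extend */
	}
	return hash;
}

/* Same loop, but the name bytes are read as unsigned char. */
static uint32_t name_hash_unsigned(const char *name, size_t name_len)
{
	const unsigned char *p = (const unsigned char *)name;
	uint32_t hash = 0;

	while (name_len--) {
		hash = (hash << NAME_HASH_SHIFT) ^
		       (hash >> (8 * sizeof(hash) - NAME_HASH_SHIFT)) ^
		       (uint32_t)*p++;	/* bytes stay in 0..255 */
	}
	return hash;
}
```

For pure ASCII names the two agree. For a name containing a byte with
the high bit set, say the single byte 0xe9, the unsigned variant yields
0x000000e9 while the signed one yields 0xffffffe9, so a hash stored by
one flavor would not verify under the other.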

Jason
