All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jason A. Donenfeld" <Jason@zx2c4.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel test robot <oliver.sang@intel.com>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	Masahiro Yamada <masahiroy@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org,
	Theodore Ts'o <tytso@mit.edu>,
	Christian Brauner <brauner@kernel.org>
Subject: Re: [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail
Date: Fri, 30 Dec 2022 03:08:14 +0100	[thread overview]
Message-ID: <Y65IDkjj9kYcoLiW@zx2c4.com> (raw)
In-Reply-To: <CAHk-=wieOBgQ-7aoihBzywKqxiO7o7hc6gd_csn69ChcxR1FuQ@mail.gmail.com>

On Thu, Dec 29, 2022 at 10:55:05AM -0800, Linus Torvalds wrote:
> Also, I'm surprised this hasn't been an issue earlier - 'char' has
> always been unsigned on arm (among other architectures), so if this
> test started failing now on x86-64 due to -funsigned-char, it has
> presumably been failing on arm the whole time.

That's the curious part, indeed...

> Oh, I think I see one potential problem in ext4:
> 
> ext4_xattr_hash_entry() is hot garbage. Lookie here:
> 
>         while (name_len--) {
>                 hash = (hash << NAME_HASH_SHIFT) ^
>                        (hash >> (8*sizeof(hash) - NAME_HASH_SHIFT)) ^
>                        *name++;
>         }
> 
> so that hash will now depend on the sign of that 'char *name' pointer.
> 
> If that hash ever has any long-term meaning (ie saved on disk or
> exposed some other way), that would be problematic.

Note that ext4 has lots of sign-specific code for hashing. Only some of
it can now be removed, since compatibility with old file systems must be
preserved. But what I mean is the code that begins in super.c:

                i = le32_to_cpu(es->s_flags);
                if (i & EXT2_FLAGS_UNSIGNED_HASH)
                        sbi->s_hash_unsigned = 3;
                else if ((i & EXT2_FLAGS_SIGNED_HASH) == 0) {
#ifdef __CHAR_UNSIGNED__
                        if (!sb_rdonly(sb))
                                es->s_flags |=
                                        cpu_to_le32(EXT2_FLAGS_UNSIGNED_HASH);
                        sbi->s_hash_unsigned = 3;
#else
                        if (!sb_rdonly(sb))
                                es->s_flags |=
                                        cpu_to_le32(EXT2_FLAGS_SIGNED_HASH);
#endif
                }

The second part of that #else can now go away. And then maybe the whole
expression can be simplified.

These actually wind up being used in namei.c:

                hinfo->hash_version += EXT4_SB(dir->i_sb)->s_hash_unsigned;

, which then sets the hash version that's selected in hash.c:

        switch (hinfo->hash_version) {
        case DX_HASH_LEGACY_UNSIGNED:
                hash = dx_hack_hash_unsigned(name, len);
                break;
        case DX_HASH_LEGACY:
                hash = dx_hack_hash_signed(name, len);
                break;
        case DX_HASH_HALF_MD4_UNSIGNED:
                str2hashbuf = str2hashbuf_unsigned;
                fallthrough;
        case DX_HASH_HALF_MD4:
                p = name;
		[...]

And so on. dx_hack_hash_unsigned() and dx_hack_hash_signed() are the
same functions, except one uses `unsigned char` and the other uses
`signed char`. It's unfortunate these exist, but now it's part of the
on-disk format, so they have to stick around (along with other warts
like "halfmd4").

But at least for new file systems, things should be unified. Anyway, it
looks like for *these* hashes, the ext4 developers did consider the
signedness issue.

Sounds like maybe it was left out of ext4_xattr_hash_entry(), which does
indeed look like it's part of the on-disk representation:

static int
ext4_xattr_inode_verify_hashes(struct inode *ea_inode,
                               struct ext4_xattr_entry *entry, void *buffer,
                               size_t size)
{
        u32 hash;

        /* Verify stored hash matches calculated hash. */
        hash = ext4_xattr_inode_hash(EXT4_SB(ea_inode->i_sb), buffer, size);
        if (hash != ext4_xattr_inode_get_hash(ea_inode))
                return -EFSCORRUPTED;

        if (entry) {
                __le32 e_hash, tmp_data;

                /* Verify entry hash. */
                tmp_data = cpu_to_le32(hash);
                e_hash = ext4_xattr_hash_entry(entry->e_name, entry->e_name_len,
                                               &tmp_data, 1);
                if (e_hash != entry->e_hash)
                        return -EFSCORRUPTED;
        }
        return 0;
}

So if ext4_xattr_hash_entry() is indeed broken with respect to
heisensignness, then this stuff has always been broken, and it's
probably good that this is being unearthed...

Jason

  reply	other threads:[~2022-12-30  2:08 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-29  8:49 [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail kernel test robot
2022-12-29 18:55 ` Linus Torvalds
2022-12-30  2:08   ` Jason A. Donenfeld [this message]
2022-12-30 15:36   ` Christian Brauner
2022-12-30 10:14 ` [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail #forregzbot Thorsten Leemhuis
2023-01-19 14:29   ` Linux kernel regression tracking (#update)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y65IDkjj9kYcoLiW@zx2c4.com \
    --to=jason@zx2c4.com \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=brauner@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=keescook@chromium.org \
    --cc=linux-kbuild@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@intel.com \
    --cc=masahiroy@kernel.org \
    --cc=oe-lkp@lists.linux.dev \
    --cc=oliver.sang@intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.