Linux kbuild/kconfig development
 help / color / mirror / Atom feed
* Re: [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail
       [not found] <202212291509.704a11c9-oliver.sang@intel.com>
@ 2022-12-29 18:55 ` Linus Torvalds
  2022-12-30  2:08   ` Jason A. Donenfeld
  2022-12-30 15:36   ` Christian Brauner
  2022-12-30 10:14 ` [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail #forregzbot Thorsten Leemhuis
  1 sibling, 2 replies; 4+ messages in thread
From: Linus Torvalds @ 2022-12-29 18:55 UTC (permalink / raw)
  To: kernel test robot
  Cc: Jason A. Donenfeld, oe-lkp, lkp, Masahiro Yamada, Kees Cook,
	Andrew Morton, Andy Shevchenko, Greg Kroah-Hartman, linux-kernel,
	linux-kbuild, Theodore Ts'o, Christian Brauner

On Thu, Dec 29, 2022 at 12:49 AM kernel test robot
<oliver.sang@intel.com> wrote:
>
> generic/454       _check_generic_filesystem: filesystem on /dev/sda4 is inconsistent

The commentary on that test is:

  Create xattrs with multiple keys that all appear the same
  (in unicode, anyway) but point to different values.  In theory all
  Linux filesystems should allow this (filenames are a sequence of
  arbitrary bytes) even if the user implications are horrifying.

and looking at the script it seems to indeed just do setfattr and
getfattr with some unusual data (ie high bit set).

Adding Ted, since this is apparently all on ext4. I guess it could be
the vfs layer too, but it really doesn't tend to look very much at the
xattr data, so.. Adding Christian Brauner anyway, since he's been
working in this area for other reasons.

Ted, Christian - I cut down the report mercilessly. It's not really
all that interesting, apart from the basic information of "xfstest
generic/454 started failing consistently on ext4 at commit
3bc753c06dd0 ('kbuild: treat char as always unsigned')".

If you think you need more, see

    https://lore.kernel.org/all/202212291509.704a11c9-oliver.sang@intel.com/

Also, I'm surprised this hasn't been an issue earlier - 'char' has
always been unsigned on arm (among other architectures), so if this
test started failing now on x86-64 due to -funsigned-char, it has
presumably been failing on arm the whole time.

I assume it's something that compares a 'char *name' by value, but the
ones I looked at (eg xattr_find_entry() used strlen()/memcmp() which
should be all good).

Oh, I think I see one potential problem in ext4:

ext4_xattr_hash_entry() is hot garbage. Lookie here:

        while (name_len--) {
                hash = (hash << NAME_HASH_SHIFT) ^
                       (hash >> (8*sizeof(hash) - NAME_HASH_SHIFT)) ^
                       *name++;
        }

so that hash will now depend on the sign of that 'char *name' pointer.

If that hash ever has any long-term meaning (ie saved on disk or
exposed some other way), that would be problematic.

Of course, if it's just an ephemeral thing only used on that one
machine, then it isn't a problem - having different hashes on
different machines (and different boots) is a non-issue. But
considering that it then does

        return cpu_to_le32(hash);

at the end does seem to imply to me that it all really *tries* to be
architecture-neutral, and has just failed horrendously.

Because that does look like code that might get confused by the sign of a char.

There might be other cases, I only looked very quickly for these kinds
of problems.

          Linus

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail
  2022-12-29 18:55 ` [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail Linus Torvalds
@ 2022-12-30  2:08   ` Jason A. Donenfeld
  2022-12-30 15:36   ` Christian Brauner
  1 sibling, 0 replies; 4+ messages in thread
From: Jason A. Donenfeld @ 2022-12-30  2:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: kernel test robot, oe-lkp, lkp, Masahiro Yamada, Kees Cook,
	Andrew Morton, Andy Shevchenko, Greg Kroah-Hartman, linux-kernel,
	linux-kbuild, Theodore Ts'o, Christian Brauner

On Thu, Dec 29, 2022 at 10:55:05AM -0800, Linus Torvalds wrote:
> Also, I'm surprised this hasn't been an issue earlier - 'char' has
> always been unsigned on arm (among other architectures), so if this
> test started failing now on x86-64 due to -funsigned-char, it has
> presumably been failing on arm the whole time.

That's the curious part, indeed...

> Oh, I think I see one potential problem in ext4:
> 
> ext4_xattr_hash_entry() is hot garbage. Lookie here:
> 
>         while (name_len--) {
>                 hash = (hash << NAME_HASH_SHIFT) ^
>                        (hash >> (8*sizeof(hash) - NAME_HASH_SHIFT)) ^
>                        *name++;
>         }
> 
> so that hash will now depend on the sign of that 'char *name' pointer.
> 
> If that hash ever has any long-term meaning (ie saved on disk or
> exposed some other way), that would be problematic.

Note that ext4 has lots of sign-specific code for hashing. Only some of
it can now be removed, since compatibility with old file systems must be
preserved. But what I mean is the code that begins in super.c:

                i = le32_to_cpu(es->s_flags);
                if (i & EXT2_FLAGS_UNSIGNED_HASH)
                        sbi->s_hash_unsigned = 3;
                else if ((i & EXT2_FLAGS_SIGNED_HASH) == 0) {
#ifdef __CHAR_UNSIGNED__
                        if (!sb_rdonly(sb))
                                es->s_flags |=
                                        cpu_to_le32(EXT2_FLAGS_UNSIGNED_HASH);
                        sbi->s_hash_unsigned = 3;
#else
                        if (!sb_rdonly(sb))
                                es->s_flags |=
                                        cpu_to_le32(EXT2_FLAGS_SIGNED_HASH);
#endif
                }

The second part of that #else can now go away. And then maybe the whole
expression can be simplified.

These actually wind up being used in namei.c:

                hinfo->hash_version += EXT4_SB(dir->i_sb)->s_hash_unsigned;

, which then sets the hash version that's selected in hash.c:

        switch (hinfo->hash_version) {
        case DX_HASH_LEGACY_UNSIGNED:
                hash = dx_hack_hash_unsigned(name, len);
                break;
        case DX_HASH_LEGACY:
                hash = dx_hack_hash_signed(name, len);
                break;
        case DX_HASH_HALF_MD4_UNSIGNED:
                str2hashbuf = str2hashbuf_unsigned;
                fallthrough;
        case DX_HASH_HALF_MD4:
                p = name;
		[...]

And so on. dx_hack_hash_unsigned() and dx_hack_hash_signed() are the
same functions, except one uses `unsigned char` and the other uses
`signed char`. It's unfortunate these exist, but now it's part of the
on-disk format, so they have to stick around (along with other warts
like "halfmd4").

But at least for new file systems, things should be unified. Anyway, it
looks like for *these* hashes, the ext4 developers did consider the
signedness issue.

Sounds like maybe it was left out of ext4_xattr_hash_entry(), which does
indeed look like it's part of the on-disk representation:

static int
ext4_xattr_inode_verify_hashes(struct inode *ea_inode,
                               struct ext4_xattr_entry *entry, void *buffer,
                               size_t size)
{
        u32 hash;

        /* Verify stored hash matches calculated hash. */
        hash = ext4_xattr_inode_hash(EXT4_SB(ea_inode->i_sb), buffer, size);
        if (hash != ext4_xattr_inode_get_hash(ea_inode))
                return -EFSCORRUPTED;

        if (entry) {
                __le32 e_hash, tmp_data;

                /* Verify entry hash. */
                tmp_data = cpu_to_le32(hash);
                e_hash = ext4_xattr_hash_entry(entry->e_name, entry->e_name_len,
                                               &tmp_data, 1);
                if (e_hash != entry->e_hash)
                        return -EFSCORRUPTED;
        }
        return 0;
}

So if ext4_xattr_hash_entry() is indeed broken with respect to
heisensignness, then this stuff has always been broken, and it's
probably good that this is being unearthed...

Jason

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail #forregzbot
       [not found] <202212291509.704a11c9-oliver.sang@intel.com>
  2022-12-29 18:55 ` [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail Linus Torvalds
@ 2022-12-30 10:14 ` Thorsten Leemhuis
  1 sibling, 0 replies; 4+ messages in thread
From: Thorsten Leemhuis @ 2022-12-30 10:14 UTC (permalink / raw)
  To: regressions@lists.linux.dev; +Cc: linux-kernel, linux-kbuild

[Note: this mail contains only information for Linux kernel regression
tracking. Mails like these contain '#forregzbot' in the subject to make
then easy to spot and filter out. The author also tried to remove most
or all individuals from the list of recipients to spare them the hassle.]

On 29.12.22 09:49, kernel test robot wrote:
> 
> please be noted we felt quite strange why this change could cause
>   xfstests.generic.454.fail
> but even by rebuild and rerun, the issue seems persistent while keeping clean
> on parent
> 
> daf4218bf8dd9750 3bc753c06dd02a3517c9b498e38
> ---------------- ---------------------------
>        fail:runs  %reproduction    fail:runs
>            |             |             |
>            :10         100%          10:10    xfstests.generic.454.fail
> 
> so we just report out FYI to seek any insights about this.
> 
> 
> Greeting,
> 
> FYI, we noticed xfstests.generic.454.fail due to commit (built with gcc-11):
> 
> commit: 3bc753c06dd02a3517c9b498e3846ebfc94ac3ee ("kbuild: treat char as always unsigned")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> [test failed on linux-next/master c76083fac3bae1a87ae3d005b5cb1cbc761e31d5]
> 
> in testcase: xfstests
> version: xfstests-x86_64-fb6575e-1_20221226
> with following parameters:
> 
> 	disk: 4HDD
> 	fs: ext4
> 	test: generic-group-22
> 

Thanks for the report. To be sure below issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced 3bc753c06dd02a
#regzbot title ext4/acls: xfstests.generic.454 suddenly fails for the
kernel test robot
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail
  2022-12-29 18:55 ` [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail Linus Torvalds
  2022-12-30  2:08   ` Jason A. Donenfeld
@ 2022-12-30 15:36   ` Christian Brauner
  1 sibling, 0 replies; 4+ messages in thread
From: Christian Brauner @ 2022-12-30 15:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: kernel test robot, Jason A. Donenfeld, oe-lkp, lkp,
	Masahiro Yamada, Kees Cook, Andrew Morton, Andy Shevchenko,
	Greg Kroah-Hartman, linux-kernel, linux-kbuild, Theodore Ts'o

On Thu, Dec 29, 2022 at 10:55:05AM -0800, Linus Torvalds wrote:
> On Thu, Dec 29, 2022 at 12:49 AM kernel test robot
> <oliver.sang@intel.com> wrote:
> >
> > generic/454       _check_generic_filesystem: filesystem on /dev/sda4 is inconsistent
> 
> The commentary on that test is:
> 
>   Create xattrs with multiple keys that all appear the same
>   (in unicode, anyway) but point to different values.  In theory all
>   Linux filesystems should allow this (filenames are a sequence of
>   arbitrary bytes) even if the user implications are horrifying.
> 
> and looking at the script it seems to indeed just do setfattr and
> getfattr with some unusual data (ie high bit set).
> 
> Adding Ted, since this is apparently all on ext4. I guess it could be
> the vfs layer too, but it really doesn't tend to look very much at the
> xattr data, so.. Adding Christian Brauner anyway, since he's been
> working in this area for other reasons.

The test uses the user.* xattr namespace which should be unaffected by
the xattr changes we did the last few cycles.

> 
> Ted, Christian - I cut down the report mercilessly. It's not really
> all that interesting, apart from the basic information of "xfstest
> generic/454 started failing consistently on ext4 at commit
> 3bc753c06dd0 ('kbuild: treat char as always unsigned')".
> 
> If you think you need more, see
> 
>     https://lore.kernel.org/all/202212291509.704a11c9-oliver.sang@intel.com/
> 
> Also, I'm surprised this hasn't been an issue earlier - 'char' has
> always been unsigned on arm (among other architectures), so if this
> test started failing now on x86-64 due to -funsigned-char, it has
> presumably been failing on arm the whole time.
> 
> I assume it's something that compares a 'char *name' by value, but the
> ones I looked at (eg xattr_find_entry() used strlen()/memcmp() which
> should be all good).
> 
> Oh, I think I see one potential problem in ext4:
> 
> ext4_xattr_hash_entry() is hot garbage. Lookie here:
> 
>         while (name_len--) {
>                 hash = (hash << NAME_HASH_SHIFT) ^
>                        (hash >> (8*sizeof(hash) - NAME_HASH_SHIFT)) ^
>                        *name++;
>         }
> 
> so that hash will now depend on the sign of that 'char *name' pointer.
> 
> If that hash ever has any long-term meaning (ie saved on disk or
> exposed some other way), that would be problematic.

If the xattrs aren't storable in the inode then they are stored in a
separate block. The consist of a header and after that is an array of
struct ext4_xattr_entry entries.

Each of the entries store a hash value e_hash which is a hash of the
xattr name and xattr value.

The aforementioned header contains another h_hash field which seems to
be a hash of all e_hash fields of all xattrs.

For in-inode/ea-inode xattrs a hash for xattr name and value is stored
on disk as well but there's no header. The i_atime field contains a
checksum of only the value that is stored in the inode.

IOW, the bug might also depend on how the xattrs are stored. For
example, what xattrs might be stored in the inode depends on the inode
size that is chosen when the filesystem is created. But if I'm not
mistaken, it also depends on the block size.
So if the block size chosen for x86 differs from arm that might have an
impact on how the xattrs are stored and thus make the bug more or less
likely to appear?...

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-12-30 15:37 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <202212291509.704a11c9-oliver.sang@intel.com>
2022-12-29 18:55 ` [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail Linus Torvalds
2022-12-30  2:08   ` Jason A. Donenfeld
2022-12-30 15:36   ` Christian Brauner
2022-12-30 10:14 ` [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail #forregzbot Thorsten Leemhuis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox