* Recent Linux kernel commit breaks Gnulib test suite.
From: Collin Funk @ 2025-06-05  4:03 UTC
  To: bug-gnulib
  Cc: linux-fsdevel, linux-kernel, selinux, Christian Brauner,
	Stephen Smalley

Hi,

Using the following testdir:

    $ git clone https://git.savannah.gnu.org/git/gnulib.git && cd gnulib
    $ ./gnulib-tool --create-testdir --dir testdir1 --single-configure `./gnulib-tool --list | grep acl`

I see the following result:

    $ cd testdir1 && ./configure && make check
    [...]
    FAIL: test-copy-acl.sh
    [...]
    FAIL: test-file-has-acl.sh

This occurs with these two kernels:

    $ uname -r
    6.14.9-300.fc42.x86_64
    $ uname -r
    6.14.8-300.fc42.x86_64

But with this kernel:

    $ uname -r
    6.14.6-300.fc42.x86_64

The result is:

    $ cd testdir1 && ./configure && make check
    [...]
    PASS: test-copy-acl.sh
    [...]
    PASS: test-file-has-acl.sh

Here is the test-suite.log from 6.14.9-300.fc42.x86_64:

    FAIL: test-copy-acl.sh
    ======================
    
    /home/collin/.local/src/gnulib/testdir1/gltests/test-copy-acl: preserving permissions for 'tmpfile2': Numerical result out of range
    FAIL test-copy-acl.sh (exit status: 1)
    
    FAIL: test-file-has-acl.sh
    ==========================
    
    file_has_acl("tmpfile0") returned no, expected yes
    FAIL test-file-has-acl.sh (exit status: 1)

To investigate further, I created the testdir again after applying the
following diff:

    diff --git a/tests/test-copy-acl.sh b/tests/test-copy-acl.sh
    index 061755f124..f9457e884f 100755
    --- a/tests/test-copy-acl.sh
    +++ b/tests/test-copy-acl.sh
    @@ -209,7 +209,7 @@ cd "$builddir" ||
       {
         echo "Simple contents" > "$2"
         chmod 600 "$2"
    -    ${CHECKER} "$builddir"/test-copy-acl${EXEEXT} "$1" "$2" || exit 1
    +    ${CHECKER} strace "$builddir"/test-copy-acl${EXEEXT} "$1" "$2" || exit 1
         ${CHECKER} "$builddir"/test-sameacls${EXEEXT} "$1" "$2" || exit 1
         func_test_same_acls                           "$1" "$2" || exit 1
       }

Then running the test from inside testdir1/gltests:

    $ ./test-copy-acl.sh
    [...]
    access("/etc/selinux/config", F_OK)     = 0
    openat(AT_FDCWD, "tmpfile0", O_RDONLY)  = 3
    fstat(3, {st_mode=S_IFREG|0610, st_size=16, ...}) = 0
    openat(AT_FDCWD, "tmpfile2", O_WRONLY)  = 4
    fchmod(4, 0610)                         = 0
    flistxattr(3, NULL, 0)                  = 17
    flistxattr(3, 0x7ffda3f6c900, 17)       = -1 ERANGE (Numerical result out of range)
    write(2, "/home/collin/.local/src/gnulib/t"..., 63/home/collin/.local/src/gnulib/testdir1/gltests/test-copy-acl: ) = 63
    write(2, "preserving permissions for 'tmpf"..., 37preserving permissions for 'tmpfile2') = 37
    write(2, ": Numerical result out of range", 31: Numerical result out of range) = 31
    write(2, "\n", 1
    )                       = 1
    exit_group(1)                           = ?
    +++ exited with 1 +++

So, 'flistxattr(3, NULL, 0)' reports that a 17-byte buffer is needed, and
we then call 'flistxattr(3, 0x7ffda3f6c900, 17)' with a buffer of exactly
that size. The second call should not fail with ERANGE, since the buffer
is as large as the kernel said it had to be.
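
For readers who want to see the two-call pattern in isolation, here is a
minimal sketch in C. It is not the libattr code: `list_names_with`,
`list_names`, and `fake_short_probe` are hypothetical names introduced
only for illustration, and the retry-on-ERANGE loop is a defensive
variation, not something the trace above does. The fake models a kernel
that reports 17 bytes on the size probe but needs more to fill the
buffer; the 41-byte figure and the two attribute names are assumptions.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Signature shared by flistxattr(2) and the test double below. */
typedef ssize_t (*list_fn) (int fd, char *buf, size_t size);

/* Hypothetical helper: probe for the size, allocate, fetch.  If the
   fetch fails with ERANGE (probe too small, or the list grew in
   between), retry with a larger buffer a bounded number of times.
   On success, stores a malloc'd buffer in *out and returns its used
   length; on failure returns -1 with errno set.  */
ssize_t
list_names_with (list_fn list, int fd, char **out)
{
  ssize_t want = list (fd, NULL, 0);    /* size probe */
  if (want < 0)
    return -1;

  size_t cap = (size_t) want + 1;
  for (int tries = 0; tries < 8; tries++)
    {
      char *buf = malloc (cap);
      if (buf == NULL)
        return -1;
      ssize_t got = list (fd, buf, cap);
      if (got >= 0)
        {
          *out = buf;
          return got;
        }
      int saved = errno;
      free (buf);
      if (saved != ERANGE)
        {
          errno = saved;
          return -1;
        }
      cap *= 2;                         /* try again with more room */
    }
  errno = ERANGE;
  return -1;
}

/* Thin wrapper that uses the real system call. */
ssize_t
list_names (int fd, char **out)
{
  return list_names_with (flistxattr, fd, out);
}

/* ASSUMPTION: these two names and the resulting 41 bytes are a guess
   at what tmpfile0 carries; the trace only shows the 17.  */
static const char fake_names[] = "system.posix_acl_access\0security.selinux";

/* Test double: under-reports the size on the probe, as in the trace. */
ssize_t
fake_short_probe (int fd, char *buf, size_t size)
{
  (void) fd;
  if (size == 0)
    return 17;
  if (size < sizeof fake_names)
    {
      errno = ERANGE;
      return -1;
    }
  memcpy (buf, fake_names, sizeof fake_names);
  return sizeof fake_names;
}
```

Calling `list_names_with (fake_short_probe, fd, &names)` goes through
buffers of 18 and 36 bytes before the 72-byte attempt succeeds. A caller
that trusts the probe and tries only once, as in the trace, stops at the
first ERANGE.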

To confirm, I replaced 'strace' with 'gdb --args'. Here is the result:

    (gdb) b qcopy_acl 
    Breakpoint 1 at 0x400a10: file qcopy-acl.c, line 84.
    (gdb) run
    Starting program: /home/collin/.local/src/gnulib/testdir1/gltests/test-copy-acl tmpfile0 tmpfile2
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib64/libthread_db.so.1".
    
    Breakpoint 1, qcopy_acl (src_name=src_name@entry=0x7fffffffd7c3 "tmpfile0", source_desc=source_desc@entry=3, 
        dst_name=dst_name@entry=0x7fffffffd7cc "tmpfile2", dest_desc=dest_desc@entry=4, mode=mode@entry=392) at qcopy-acl.c:84
    84	  ret = chmod_or_fchmod (dst_name, dest_desc, mode);
    (gdb) n
    90	  if (ret == 0)
    (gdb) n
    92	      ret = source_desc <= 0 || dest_desc <= 0
    (gdb) s
    attr_copy_fd (src_path=src_path@entry=0x7fffffffd7c3 "tmpfile0", src_fd=src_fd@entry=3, dst_path=dst_path@entry=0x7fffffffd7cc "tmpfile2", 
        dst_fd=dst_fd@entry=4, check=check@entry=0x4009b0 <is_attr_permissions>, ctx=ctx@entry=0x0) at libattr/attr_copy_fd.c:73
    73		if (check == NULL)
    (gdb) n
    76		size = flistxattr (src_fd, NULL, 0);
    (gdb) n
    77		if (size < 0) {
    (gdb) print size
    $1 = 17
    (gdb) n
    86		names = (char *) my_alloc (size+1);
    (gdb) n
    92		size = flistxattr (src_fd, names, size);
    (gdb) print errno
    $2 = 0
    (gdb) n
    93		if (size < 0) {
    (gdb) print size
    $3 = -1
    (gdb) print errno
    $4 = 34

After confirming with the Fedora Kernel tags [1], I am fairly confident
that it was caused by this commit [2].

But I am not familiar enough with ACLs, SELinux, or the Kernel to know
the fix.

Adding the lists where this was discussed and some of the signers to CC,
since they will know better than me.

Collin

[1] https://gitlab.com/cki-project/kernel-ark
[2] https://github.com/torvalds/linux/commit/8b0ba61df5a1c44e2b3cf683831a4fc5e24ea99d


* Re: Recent Linux kernel commit breaks Gnulib test suite.
From: Paul Eggert @ 2025-06-05  7:10 UTC
  To: Collin Funk
  Cc: linux-fsdevel, linux-kernel, selinux, Christian Brauner,
	Stephen Smalley, bug-gnulib

I saw that bug last month, and filed a Fedora bug report here:

https://bugzilla.redhat.com/show_bug.cgi?id=2369561

The kernel bug breaks libattr's attr_copy_fd and attr_copy_file
functions, among other things.


* Re: Recent Linux kernel commit breaks Gnulib test suite.
From: Stephen Smalley @ 2025-06-05 13:46 UTC
  To: Collin Funk
  Cc: bug-gnulib, linux-fsdevel, linux-kernel, selinux,
	Christian Brauner, Paul Eggert, Paul Moore

On Thu, Jun 5, 2025 at 12:03 AM Collin Funk <collin.funk1@gmail.com> wrote:
> After confirming with the Fedora Kernel tags [1], I am fairly confident
> that it was caused by this commit [2].
>
> But I am not familiar enough with ACLs, SELinux, or the Kernel to know
> the fix.
>
> Adding the lists where this was discussed and some of the signers to CC,
> since they will know better than me.

Thank you for the bug report. Looks like the security xattr handling
is somehow replacing the overall length with just the length of the
security.selinux xattr rather than adding it to the length of the acl
xattr. Will check to see if this is already fixed on vfs.fixes; if
not, will look into a fix although it wasn't immediately obvious to me
why this is happening. There is also another patch related to this
pending that is supposed to go through the LSM tree which might fix
it.
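
The arithmetic behind that diagnosis can be sketched as a toy model in
C. This is not kernel code; `entry_len`, `probe_accumulating`, and
`probe_overwriting` are hypothetical names, and the model assumes the
file carries exactly the two attributes system.posix_acl_access and
security.selinux. Under that assumption, a probe that overwrites the
running total gives 17, the value seen in the strace output, while a
probe that accumulates gives 41.

```c
#include <string.h>
#include <sys/types.h>

/* ASSUMPTION: the only two attribute names present on the file. */
static const char ACL_NAME[] = "system.posix_acl_access";
static const char SEL_NAME[] = "security.selinux";

/* Bytes one name occupies in a listxattr buffer: the name plus its
   terminating NUL.  */
size_t
entry_len (const char *name)
{
  return strlen (name) + 1;
}

/* Expected behaviour: every source of names adds to the total. */
ssize_t
probe_accumulating (void)
{
  ssize_t total = 0;
  total += entry_len (ACL_NAME);
  total += entry_len (SEL_NAME);
  return total;
}

/* Suspected faulty behaviour: the security step assigns its own
   length instead of adding it, discarding the ACL entry's share.  */
ssize_t
probe_overwriting (void)
{
  ssize_t total = 0;
  total += entry_len (ACL_NAME);
  total = entry_len (SEL_NAME);
  return total;
}
```

With a 17-byte buffer the full 41-byte list cannot fit, which would
explain the ERANGE from the second flistxattr call.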



* Re: Recent Linux kernel commit breaks Gnulib test suite.
From: Stephen Smalley @ 2025-06-05 16:53 UTC
  To: Collin Funk
  Cc: bug-gnulib, linux-fsdevel, linux-kernel, selinux,
	Christian Brauner, Paul Eggert, Paul Moore

On Thu, Jun 5, 2025 at 9:46 AM Stephen Smalley
<stephen.smalley.work@gmail.com> wrote:
> Thank you for the bug report. Looks like the security xattr handling
> is somehow replacing the overall length with just the length of the
> security.selinux xattr rather than adding it to the length of the acl
> xattr. Will check to see if this is already fixed on vfs.fixes; if
> not, will look into a fix although it wasn't immediately obvious to me
> why this is happening. There is also another patch related to this
> pending that is supposed to go through the LSM tree which might fix
> it.

Sorry, mea culpa; should be fixed by
https://lore.kernel.org/selinux/20250605164852.2016-1-stephen.smalley.work@gmail.com/


