Linux Kernel Selftest development
* selftests: net/af_unix test_unix_oob [FAILED]
@ 2023-08-07 19:44 Mirsad Todorovac
  2023-08-07 20:46 ` Kuniyuki Iwashima
  0 siblings, 1 reply; 6+ messages in thread
From: Mirsad Todorovac @ 2023-08-07 19:44 UTC
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, Kuniyuki Iwashima, Florian Westphal,
	Mirsad Goran Todorovac, Alexander Mikhalitsyn, linux-kernel,
	linux-kselftest

Hi all,

On a 6.5-rc5 kernel built from the vanilla torvalds tree, running on Ubuntu 22.04 LTS (Jammy Jellyfish)
on a self-assembled Ryzen 7950 box, the test_unix_oob selftest unexpectedly fails:

# selftests: net/af_unix: test_unix_oob
# Test 2 failed, sigurg 23 len 63 OOB %

It is this code:

         /* Test 2:
          * Verify that the first OOB is over written by
          * the 2nd one and the first OOB is returned as
          * part of the read, and sigurg is received.
          */
         wait_for_data(pfd, POLLIN | POLLPRI);
         len = 0;
         while (len < 70)
                 len = recv(pfd, buf, 1024, MSG_PEEK);
         len = read_data(pfd, buf, 1024);
         read_oob(pfd, &oob);
         if (!signal_recvd || len != 127 || oob != '#') {
                 fprintf(stderr, "Test 2 failed, sigurg %d len %d OOB %c\n",
                 signal_recvd, len, oob);
                 die(1);
         }

In 6.5-rc4, this test was OK, so it might mean we have a regression?

marvin@defiant:~/linux/kernel/linux_torvalds$ grep test_unix_oob ../kselftest-6.5-rc4-1.log
/net/af_unix/test_unix_oob
# selftests: net/af_unix: test_unix_oob
ok 2 selftests: net/af_unix: test_unix_oob
marvin@defiant:~/linux/kernel/linux_torvalds$

Hope this helps.

NOTE: the kernel is built from the vanilla torvalds tree; it is marked "dirty" only because the selftests were modified.

Kind regards,
Mirsad Todorovac

[-- Attachment #2: config-6.5.0-rc5-debug-dirty.xz --]
[-- Type: application/x-xz, Size: 57752 bytes --]

* Re: selftests: net/af_unix test_unix_oob [FAILED]
  2023-08-07 19:44 selftests: net/af_unix test_unix_oob [FAILED] Mirsad Todorovac
@ 2023-08-07 20:46 ` Kuniyuki Iwashima
  2023-08-07 23:09   ` Mirsad Todorovac
  0 siblings, 1 reply; 6+ messages in thread
From: Kuniyuki Iwashima @ 2023-08-07 20:46 UTC
  To: mirsad.todorovac
  Cc: alexander, davem, edumazet, fw, kuba, kuniyu, linux-kernel,
	linux-kselftest, netdev, pabeni, shuah

From: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Date: Mon, 7 Aug 2023 21:44:41 +0200
> In 6.5-rc4, this test was OK, so it might mean we have a regression?

Thanks for reporting.

I confirmed the test doesn't fail on net-next at least, but it's based
on v6.5-rc4.

  ---8<---
  [root@localhost ~]# ./test_unix_oob 
  [root@localhost ~]# echo $?
  0
  [root@localhost ~]# uname -r
  6.5.0-rc4-01192-g66244337512f
  ---8<---

I'll check 6.5-rc5 later.


* Re: selftests: net/af_unix test_unix_oob [FAILED]
  2023-08-07 20:46 ` Kuniyuki Iwashima
@ 2023-08-07 23:09   ` Mirsad Todorovac
  2023-08-08  8:53     ` Mirsad Todorovac
  0 siblings, 1 reply; 6+ messages in thread
From: Mirsad Todorovac @ 2023-08-07 23:09 UTC
  To: Kuniyuki Iwashima
  Cc: alexander, davem, edumazet, fw, kuba, linux-kernel,
	linux-kselftest, netdev, pabeni, shuah

On 8/7/23 22:46, Kuniyuki Iwashima wrote:
> Thanks for reporting.
> 
> I confirmed the test doesn't fail on net-next at least, but it's based
> on v6.5-rc4.
> 
>    ---8<---
>    [root@localhost ~]# ./test_unix_oob
>    [root@localhost ~]# echo $?
>    0
>    [root@localhost ~]# uname -r
>    6.5.0-rc4-01192-g66244337512f
>    ---8<---
> 
> I'll check 6.5-rc5 later.

Hi, Kuniyuki,

There is a new development: I could reproduce the Test 2 failure as far back as 6.0-rc1.
The catch is that the failure shows up only sporadically, which suggests a race.

I am currently attempting a bisect.

Kind regards,
Mirsad

* Re: selftests: net/af_unix test_unix_oob [FAILED]
  2023-08-07 23:09   ` Mirsad Todorovac
@ 2023-08-08  8:53     ` Mirsad Todorovac
  2023-08-14  8:54       ` Mirsad Todorovac
  0 siblings, 1 reply; 6+ messages in thread
From: Mirsad Todorovac @ 2023-08-08  8:53 UTC
  To: Kuniyuki Iwashima
  Cc: alexander, davem, edumazet, fw, kuba, linux-kernel,
	linux-kselftest, netdev, pabeni, shuah

On 8/8/23 01:09, Mirsad Todorovac wrote:
> I am currently attempting a bisect.

The bisect showed that the condition already existed in the 5.11 torvalds tree.

It seems to depend on the chosen config (I used the merged configs from selftests/*/config), but it
is also present in the Ubuntu production build:

marvin@defiant:~$ cd linux/kernel/linux_torvalds
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
Test 2 failed, sigurg 23 len 63 OOB %
marvin@defiant:~/linux/kernel/linux_torvalds$ uname -rms
Linux 6.4.8-060408-generic x86_64
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
Test 1 failed sigurg 0 len 63
marvin@defiant:~/linux/kernel/linux_torvalds$

It happens only on rare occasions, so it seems to be a hard-to-spot race.

A normal test run executes test_unix_oob only once, so it would not notice this except by accident, which is how the problem came to my attention.
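
A minimal sketch of a repeat-run helper for sporadic failures like this one. It assumes the test binary exits with a non-zero status on failure, as the die(1) call in the quoted Test 2 code suggests; run_repeat is a hypothetical helper name, not part of kselftest.

```shell
#!/bin/sh
# run_repeat N CMD [ARGS...]
# Run CMD exactly N times and print "failures/N" on stdout.
# Assumption: CMD signals a failed run through a non-zero exit status.
run_repeat() {
    n=$1
    shift
    fails=0
    i=0
    while [ "$i" -lt "$n" ]; do
        # The command's own stderr stays visible; only the status is counted.
        "$@" || fails=$((fails + 1))
        i=$((i + 1))
    done
    echo "$fails/$n"
}
```

Calling `run_repeat 1000 tools/testing/selftests/net/af_unix/test_unix_oob` would print the number of failed runs and the total as `failures/total`. Note that the bash loops above use `{0..1000}`, which expands to 1001 iterations rather than 1000.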

However, the problem seems to be config-driven rather than kernel-version-driven.

marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..100000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 1 Inline failed, sigurg 0 len 63
Test 1 Inline failed, sigurg 0 len 63
Test 1 Inline failed, sigurg 0 len 63
Test 2 Inline failed, len 63 atmark 1
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 failed, sigurg 23 len 63 OOB %
marvin@defiant:~/linux/kernel/linux_torvalds$ uname -rms
Linux 6.5.0-060500rc4-generic x86_64
marvin@defiant:~/linux/kernel/linux_torvalds$
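
To see which sub-tests fail most often, the messages can be grouped by test name. A minimal sketch, assuming every failure line contains the word "failed" right after the test name, as in the output above; tally_failures is a hypothetical helper name.

```shell
#!/bin/sh
# tally_failures: read test output on stdin and print one
# "name: count" line per failing sub-test, sorted by name.
tally_failures() {
    LC_ALL=C awk '
        / failed/ {
            sub(/ failed.*/, "")    # keep only the text before " failed"
            count[$0]++
        }
        END {
            for (name in count)
                print name ": " count[name]
        }
    ' | LC_ALL=C sort
}
```

The quoted Test 2 code prints its message with fprintf(stderr, ...), so the loop output presumably needs `2>&1` before the pipe, for example `for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob; done 2>&1 | tally_failures`.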

At times, I was able to reproduce it with certain configs, but now something odd is happening.

I will keep investigating.

Kind regards,
Mirsad

* Re: selftests: net/af_unix test_unix_oob [FAILED]
  2023-08-08  8:53     ` Mirsad Todorovac
@ 2023-08-14  8:54       ` Mirsad Todorovac
  2023-08-20 10:34         ` selftests: net/af_unix test_unix_oob [FAILED][NEW] Mirsad Todorovac
  0 siblings, 1 reply; 6+ messages in thread
From: Mirsad Todorovac @ 2023-08-14  8:54 UTC
  To: Kuniyuki Iwashima
  Cc: alexander, davem, edumazet, fw, kuba, linux-kernel,
	linux-kselftest, netdev, pabeni, shuah

On 8/8/23 10:53, Mirsad Todorovac wrote:
> At times, I was able to reproduce it with certain configs, but now something odd is happening.
> 
> I will keep investigating.

Please note that the bug persists in 6.5-rc6:

marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..100000}; do !!; done
for a in {0..100000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
Test 2 failed, sigurg 23 len 63 OOB %
Test 2 Inline failed, len 63 atmark 1
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 failed, sigurg 23 len 63 OOB %
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3 Inline failed, sigurg 23 len 63 data x
Test 1 Inline failed, sigurg 0 len 63
Test 1 Inline failed, sigurg 0 len 63
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 1 Inline failed, sigurg 0 len 63
Test 2 failed, sigurg 23 len 63 OOB %
Test 1 Inline failed, sigurg 0 len 63
Test 2 failed, sigurg 23 len 63 OOB %
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
marvin@defiant:~/linux/kernel/linux_torvalds$

The bug can be triggered as a non-privileged user, but it is not clear whether it can be exploited to elevate privileges.

Best regards,
Mirsad Todorovac

* Re: selftests: net/af_unix test_unix_oob [FAILED][NEW]
  2023-08-14  8:54       ` Mirsad Todorovac
@ 2023-08-20 10:34         ` Mirsad Todorovac
  0 siblings, 0 replies; 6+ messages in thread
From: Mirsad Todorovac @ 2023-08-20 10:34 UTC
  To: Kuniyuki Iwashima
  Cc: alexander, davem, edumazet, fw, kuba, linux-kernel,
	linux-kselftest, netdev, pabeni, shuah

On 8/14/23 10:54, Mirsad Todorovac wrote:
> The bug can be triggered as a non-privileged user, but it is not clear whether it can be exploited to elevate privileges.

Hi again,

I have tried selftests/net/af_unix/test_unix_oob again, and got:

marvin@defiant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
Test 2 failed, sigurg 23 len 63 OOB %
Test 2 failed, sigurg 23 len 63 OOB %
Test 2 Inline failed, len 63 atmark 1
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 Inline failed, len 63 atmark 1
Test 1 Inline failed, sigurg 0 len 63
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 Inline failed, len 63 atmark 1
Test 2 Inline failed, len 63 atmark 1
Test 1 Inline failed, sigurg 0 len 63
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 1 Inline failed, sigurg 0 len 63
Test 2 failed, sigurg 23 len 63 OOB %
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 Inline failed, len 63 atmark 1
Test 2 Inline failed, len 63 atmark 1
Test 1 Inline failed, sigurg 0 len 63
Test 2 Inline failed, len 63 atmark 1
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 failed, sigurg 23 len 63 OOB %
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 Inline failed, len 63 atmark 1
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 1 Inline failed, sigurg 0 len 63
Test 2 Inline failed, len 63 atmark 1
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 failed, sigurg 23 len 63 OOB %
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 1 Inline failed, sigurg 0 len 63
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 1 Inline failed, sigurg 0 len 63
Test 3 Inline failed, sigurg 23 len 63 data x
Test 1 Inline failed, sigurg 0 len 63
Test 1 Inline failed, sigurg 0 len 63
Test 2 Inline failed, len 63 atmark 1
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 Inline failed, len 63 atmark 1
Test 3 Inline failed, sigurg 23 len 63 data x
Test 1 Inline failed, sigurg 0 len 63
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 failed, sigurg 23 len 63 OOB %
Test 2 Inline failed, len 63 atmark 1
Test 2 failed, sigurg 23 len 63 OOB %
Test 1 Inline failed, sigurg 0 len 63
Test 1 Inline failed, sigurg 0 len 63
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 Inline failed, len 63 atmark 1
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 1 Inline failed, sigurg 0 len 63
Test 2 failed, sigurg 23 len 63 OOB %
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 Inline failed, len 63 atmark 1
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 Inline failed, len 63 atmark 1
Test 2 Inline failed, len 63 atmark 1
Test 2 Inline failed, len 63 atmark 1
Test 2 Inline failed, len 63 atmark 1
Test 2 failed, sigurg 23 len 63 OOB %
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 Inline failed, len 63 atmark 1
Test 2 failed, sigurg 23 len 63 OOB %
Test 1 Inline failed, sigurg 0 len 63
Test 1 Inline failed, sigurg 0 len 63
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 1 Inline failed, sigurg 0 len 63
Test 2 failed, sigurg 23 len 63 OOB %
Test 2 Inline failed, len 63 atmark 1
Test 2 failed, sigurg 23 len 63 OOB %
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 Inline failed, len 63 atmark 1
Test 2 Inline failed, len 63 atmark 1
Test 2 failed, sigurg 23 len 63 OOB %
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 Inline failed, len 63 atmark 1
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 failed, sigurg 23 len 63 OOB %
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3 Inline failed, sigurg 23 len 63 data x
Test 1 Inline failed, sigurg 0 len 63
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 1 Inline failed, sigurg 0 len 63
Test 2 Inline failed, len 63 atmark 1
Test 3 Inline failed, sigurg 23 len 63 data x
Test 3 Inline failed, sigurg 23 len 63 data x
Test 1 Inline failed, sigurg 0 len 63
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 Inline failed, len 63 atmark 1
Test 2 Inline failed, len 63 atmark 1
Test 2 Inline failed, len 63 atmark 1
Test 2 failed, sigurg 23 len 63 OOB %
Test 2 Inline failed, len 63 atmark 1
Test 1 Inline failed, sigurg 0 len 63
Test 2 Inline failed, len 63 atmark 1
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 Inline failed, len 63 atmark 1
Test 2 failed, sigurg 23 len 63 OOB %
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 Inline failed, len 63 atmark 1
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 2 failed, sigurg 23 len 63 OOB %
Test 2 failed, sigurg 23 len 63 OOB %
marvin@defiant:~/linux/kernel/linux_torvalds$

The kernel is 6.5.0-rc6-net-cfg-kcsan-00038-g16931859a650, built from the vanilla torvalds tree on Ubuntu 22.04.
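
Output this long is easier to read as a tally of distinct failure messages. Below is a minimal sketch of one way to do that with standard tools. The helper name tally_failures and the file name oob.log are hypothetical and not part of the selftests; it assumes the test output was saved to a file first.

```shell
# Hypothetical helper: count how often each distinct failure message occurs.
# Reads log text on stdin and prints "count message" lines, most frequent first.
tally_failures() {
    # Keep only lines such as "Test 2 failed, ..." or "Test 3.1 Inline failed, ...",
    # which also drops shell prompts and other noise, then count the duplicates.
    grep -E '^Test [0-9.]+( Inline)? failed' | sort | uniq -c | sort -rn
}

# Example usage, assuming the output was saved to the hypothetical file oob.log:
#   tally_failures < oob.log
```

Comparing the tallies from a good and a bad kernel should show at a glance which checks changed behaviour.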

Best regards,
Mirsad Todorovac

[-- Attachment #2: config-6.5.0-rc6-net-cfg-kcsan-00038-g16931859a650.xz --]
[-- Type: application/x-xz, Size: 57688 bytes --]


end of thread, other threads:[~2023-08-20 10:37 UTC | newest]

Thread overview: 6+ messages
2023-08-07 19:44 selftests: net/af_unix test_unix_oob [FAILED] Mirsad Todorovac
2023-08-07 20:46 ` Kuniyuki Iwashima
2023-08-07 23:09   ` Mirsad Todorovac
2023-08-08  8:53     ` Mirsad Todorovac
2023-08-14  8:54       ` Mirsad Todorovac
2023-08-20 10:34         ` selftests: net/af_unix test_unix_oob [FAILED][NEW] Mirsad Todorovac
