Problems with csum_partial with misaligned buffers on sh4 platform

* Problems with csum_partial with misaligned buffers on sh4 platform
@ 2024-02-10 15:12 Guenter Roeck
  2024-02-10 20:12 ` John Paul Adrian Glaubitz
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Guenter Roeck @ 2024-02-10 15:12 UTC
  To: Yoshinori Sato
  Cc: Rich Felker, John Paul Adrian Glaubitz, linux-sh, linux-kernel

Hi,

when running checksum unit tests on sh4 qemu emulations, I get the following
errors.

    KTAP version 1
    # Subtest: checksum
    # module: checksum_kunit
    1..5
    # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
    Expected ( u64)result == ( u64)expec, but
        ( u64)result == 53378 (0xd082)
        ( u64)expec == 33488 (0x82d0)
    not ok 1 test_csum_fixed_random_inputs
    # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
    Expected ( u64)result == ( u64)expec, but
        ( u64)result == 65281 (0xff01)
        ( u64)expec == 65280 (0xff00)
    not ok 2 test_csum_all_carry_inputs
    # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:573
    Expected ( u64)result == ( u64)expec, but
        ( u64)result == 65535 (0xffff)
        ( u64)expec == 65534 (0xfffe)
    not ok 3 test_csum_no_carry_inputs
    ok 4 test_ip_fast_csum
    ok 5 test_csum_ipv6_magic
# checksum: pass:2 fail:3 skip:0 total:5

The above is from a little-endian system. On a big-endian system,
the test results are as follows.

    KTAP version 1
    # Subtest: checksum
    # module: checksum_kunit
    1..5
    # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
    Expected ( u64)result == ( u64)expec, but
        ( u64)result == 33488 (0x82d0)
        ( u64)expec == 53378 (0xd082)
    not ok 1 test_csum_fixed_random_inputs
    # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
    Expected ( u64)result == ( u64)expec, but
        ( u64)result == 65281 (0xff01)
        ( u64)expec == 255 (0xff)
    not ok 2 test_csum_all_carry_inputs
    # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:565
    Expected ( u64)result == ( u64)expec, but
        ( u64)result == 1020 (0x3fc)
        ( u64)expec == 0 (0x0)
    not ok 3 test_csum_no_carry_inputs
    # test_ip_fast_csum: ASSERTION FAILED at lib/checksum_kunit.c:589
    Expected ( u64)expected == ( u64)csum_result, but
        ( u64)expected == 55939 (0xda83)
        ( u64)csum_result == 33754 (0x83da)
    not ok 4 test_ip_fast_csum
    # test_csum_ipv6_magic: ASSERTION FAILED at lib/checksum_kunit.c:617
    Expected ( u64)expected_csum_ipv6_magic[i] == ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum), but
        ( u64)expected_csum_ipv6_magic[i] == 6356 (0x18d4)
        ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum) == 43586 (0xaa42)
    not ok 5 test_csum_ipv6_magic
# checksum: pass:0 fail:5 skip:0 total:5

Note that test_ip_fast_csum and test_csum_ipv6_magic fail on all big endian
systems due to a bug in the test code, unrelated to this problem.

Analysis shows that the errors are seen only if the buffer is misaligned.
Looking into arch/sh/lib/checksum.S, I found commit cadc4e1a2b4d2 ("sh:
Handle calling csum_partial with misaligned data") which seemed to be
related. Reverting that commit fixes the problem.
This suggests that something may be wrong with that commit. Alternatively,
of course, it may be possible that something is wrong with the qemu
emulation, but that seems unlikely.
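
One observation on the first little-endian failure: the reported and expected
values (0xd082 and 0x82d0) are byte swaps of each other. In ones' complement
checksum arithmetic, shifting the data by one byte swaps the two bytes of the
folded sum, so a result like this is at least consistent with a rotation step
that is skipped or misapplied when the buffer starts at an odd address. That is
only a guess, and nothing in this report confirms it. The sketch below is a
byte-at-a-time reference in C, not the sh assembly; the names ref_csum, fold16,
swap16 and csum_selfcheck are made up for illustration, and the accumulator is
only sized for short buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Swap the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/*
 * Hypothetical reference sum: bytes are paired in little-endian order and
 * read one at a time, so the alignment of p never matters. The 32-bit
 * accumulator is only safe for buffers shorter than 128 KiB.
 */
static uint16_t ref_csum(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)p[i] | ((uint32_t)p[i + 1] << 8);
	if (i < len)
		sum += p[i];	/* trailing odd byte is the low half */
	return fold16(sum);
}

/* Shifting the data by one byte swaps the bytes of the folded sum. */
void csum_selfcheck(void)
{
	static const uint8_t data[] = { 0x12, 0x34, 0x56, 0x78, 0x9a };
	static const uint8_t shifted[] = { 0x00, 0x12, 0x34, 0x56, 0x78, 0x9a };

	assert(ref_csum(data, sizeof(data)) == 0xad02);
	assert(ref_csum(shifted, sizeof(shifted)) == 0x02ad);
	assert(swap16(ref_csum(shifted, sizeof(shifted))) ==
	       ref_csum(data, sizeof(data)));
}
```

An implementation that rounds an odd start address down to a word boundary has
to undo exactly this swap before returning, which is why a swapped result
stands out.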

Thanks,
Guenter

* Re: Problems with csum_partial with misaligned buffers on sh4 platform
  2024-02-10 15:12 Problems with csum_partial with misaligned buffers on sh4 platform Guenter Roeck
@ 2024-02-10 20:12 ` John Paul Adrian Glaubitz
  2024-02-10 21:54   ` Guenter Roeck
  2024-02-11 14:35 ` Yoshinori Sato
  2024-03-11 17:04 ` Guenter Roeck
  2 siblings, 1 reply; 11+ messages in thread
From: John Paul Adrian Glaubitz @ 2024-02-10 20:12 UTC
  To: Guenter Roeck, Yoshinori Sato; +Cc: Rich Felker, linux-sh, linux-kernel

Hi Guenter,

On Sat, 2024-02-10 at 07:12 -0800, Guenter Roeck wrote:
> when running checksum unit tests on sh4 qemu emulations, I get the following
> errors.
> 
> [...]

I have not run these tests before. Can you tell me how these are run,
so I can verify these reproduce on real hardware?

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913

* Re: Problems with csum_partial with misaligned buffers on sh4 platform
  2024-02-10 20:12 ` John Paul Adrian Glaubitz
@ 2024-02-10 21:54   ` Guenter Roeck
  2024-02-11  3:41     ` D. Jeff Dionne
  2024-02-11  9:53     ` Geert Uytterhoeven
  0 siblings, 2 replies; 11+ messages in thread
From: Guenter Roeck @ 2024-02-10 21:54 UTC
  To: John Paul Adrian Glaubitz, Yoshinori Sato
  Cc: Rich Felker, linux-sh, linux-kernel

Hi Adrian,

On 2/10/24 12:12, John Paul Adrian Glaubitz wrote:
> Hi Guenter,
> 
> On Sat, 2024-02-10 at 07:12 -0800, Guenter Roeck wrote:
>> when running checksum unit tests on sh4 qemu emulations, I get the following
>> errors.
>>
>> [...]
> 
> I have not run these tests before. Can you tell me how these are run,
> so I can verify these reproduce on real hardware?
> 

Enabling CONFIG_KUNIT and CONFIG_CHECKSUM_KUNIT on top of a working
configuration should do the trick. Both can be built as modules,
so presumably one can build and load them separately. I have not tried
that, though - I always build them into the kernel and boot the resulting
image.

Hope this helps,
Guenter


* Re: Problems with csum_partial with misaligned buffers on sh4 platform
  2024-02-10 21:54   ` Guenter Roeck
@ 2024-02-11  3:41     ` D. Jeff Dionne
  2024-02-11 16:12       ` Rich Felker
  2024-02-11  9:53     ` Geert Uytterhoeven
  1 sibling, 1 reply; 11+ messages in thread
From: D. Jeff Dionne @ 2024-02-11  3:41 UTC
  To: Guenter Roeck
  Cc: John Paul Adrian Glaubitz, Yoshinori Sato, Rich Felker, linux-sh,
	linux-kernel

I remember there being problems with alignment on SH targets in the
network stack.  IIRC, WireGuard triggered it in actual use; it seems to
me it had to do with skb alignment.

Rich Felker may remember more, but I don’t think we implemented a (complete) solution.

Cheers,
J.

* Re: Problems with csum_partial with misaligned buffers on sh4 platform
  2024-02-11  3:41     ` D. Jeff Dionne
@ 2024-02-11 16:12       ` Rich Felker
  0 siblings, 0 replies; 11+ messages in thread
From: Rich Felker @ 2024-02-11 16:12 UTC
  To: D. Jeff Dionne
  Cc: Guenter Roeck, John Paul Adrian Glaubitz, Yoshinori Sato,
	linux-sh, linux-kernel

On Sun, Feb 11, 2024 at 12:41:16PM +0900, D. Jeff Dionne wrote:
> I remember there being problems with alignment on SH targets in the
> network stack. IIRC, wireguard triggered it in actual use, seems to
> me it had to do with skb alignment.
> 
> Rich Felker may remember more, but I don’t think we implemented a
> (complete) solution.

What I recall was that some of the tunneling encapsulation code
ultimately did its zerocopy by arranging for either the inner or outer
headers to be misaligned (due to the historical badness of ethernet),
and thereby blowing up on archs without misaligned access support
(ours read/wrote bogus data, probably ignoring the low address bits or
something, on misaligned addresses). We never solved it; the code that
later worked was doing the encapsulation in userspace without the
kernel's misaligned zerocopy stuff.

The right solution would be to make the affected accesses happen
through custom int16/int32 types with attribute packed applied to
them, so that on archs with misaligned access, the code would not
change at all, but on archs without it, the codegen would do
everything byte-by-byte and reassemble. But this would probably be an
invasive change that would make the maintainers of the network stack
unhappy...
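
As a rough illustration of the packed-type idea (a minimal sketch, not a
proposed patch): one common way to get that effect with GCC or Clang is to wrap
the integer in a packed struct and load through a pointer to it. The names
unaligned_u16, unaligned_u32, load_u16, load_u32 and unaligned_selfcheck are
invented for this example, and the packed and may_alias attributes are compiler
extensions rather than ISO C. The kernel's get_unaligned() helpers are similar
in spirit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical wrapper types. "packed" drops the alignment requirement to 1,
 * so the compiler must emit code that is safe at any byte address;
 * "may_alias" lets the types overlay arbitrary buffers.
 */
struct unaligned_u16 { uint16_t v; } __attribute__((packed, may_alias));
struct unaligned_u32 { uint32_t v; } __attribute__((packed, may_alias));

/* Load a 16-bit value from a possibly misaligned address. */
static inline uint16_t load_u16(const void *p)
{
	return ((const struct unaligned_u16 *)p)->v;
}

/* Load a 32-bit value from a possibly misaligned address. */
static inline uint32_t load_u32(const void *p)
{
	return ((const struct unaligned_u32 *)p)->v;
}

/* Compare the packed loads against memcpy(), which is always alignment-safe. */
void unaligned_selfcheck(void)
{
	uint8_t buf[8] = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77 };
	uint16_t ref16;
	uint32_t ref32;

	memcpy(&ref32, buf + 1, sizeof(ref32));
	memcpy(&ref16, buf + 3, sizeof(ref16));
	assert(load_u32(buf + 1) == ref32);
	assert(load_u16(buf + 3) == ref16);
}
```

On a target with hardware misaligned access the compiler can still emit a
single load for these helpers; on one without it, it falls back to byte loads
and reassembly, which is the behaviour described above.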

Rich

* Re: Problems with csum_partial with misaligned buffers on sh4 platform
  2024-02-10 21:54   ` Guenter Roeck
  2024-02-11  3:41     ` D. Jeff Dionne
@ 2024-02-11  9:53     ` Geert Uytterhoeven
  1 sibling, 0 replies; 11+ messages in thread
From: Geert Uytterhoeven @ 2024-02-11  9:53 UTC
  To: Guenter Roeck
  Cc: John Paul Adrian Glaubitz, Yoshinori Sato, Rich Felker, linux-sh,
	linux-kernel

Hi Günter,

On Sat, Feb 10, 2024 at 10:59 PM Guenter Roeck <linux@roeck-us.net> wrote:
> On 2/10/24 12:12, John Paul Adrian Glaubitz wrote:
> > I have not run these tests before. Can you tell me how these are run,
> > so I can verify these reproduce on real hardware?
>
> Enabling CONFIG_KUNIT and CONFIG_CHECKSUM_KUNIT on top of a working
> configuration should do the trick. Both can be built as module,
> so presumably one can build and load them separately. I have not tried
> that, though - I always build them into the kernel and boot the resulting
> image.

Yes, you can build and load them as modules separately; that's what
I do on m68k (and yes, the checksum test fails on m68k, as it is
big endian).

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

* Re: Problems with csum_partial with misaligned buffers on sh4 platform
  2024-02-10 15:12 Problems with csum_partial with misaligned buffers on sh4 platform Guenter Roeck
  2024-02-10 20:12 ` John Paul Adrian Glaubitz
@ 2024-02-11 14:35 ` Yoshinori Sato
  2024-03-11 17:04 ` Guenter Roeck
  2 siblings, 0 replies; 11+ messages in thread
From: Yoshinori Sato @ 2024-02-11 14:35 UTC
  To: Guenter Roeck
  Cc: Rich Felker, John Paul Adrian Glaubitz, linux-sh, linux-kernel

On Sun, 11 Feb 2024 00:12:39 +0900,
Guenter Roeck wrote:
> 
> [...]
> 
> Analysis shows that the errors are seen only if the buffer is misaligned.
> Looking into arch/sh/lib/checksum.S, I found commit cadc4e1a2b4d2 ("sh:
> Handle calling csum_partial with misaligned data") which seemed to be
> related. Reverting that commit fixes the problem.
> This suggests that something may be wrong with that commit. Alternatively,
> of course, it may be possible that something is wrong with the qemu
> emulation, but that seems unlikely.

I checked that part of the code, and it only uses basic instructions.
If there is a problem with these instructions, other problems should occur,
but I have never seen such a phenomenon.
So I think the culprit is in that commit, not qemu.

I think it's better to use GENERIC_CSUM since the previous code is also
not very efficient.

> Thanks,
> Guenter

-- 
Yosinori Sato

* Re: Problems with csum_partial with misaligned buffers on sh4 platform
  2024-02-10 15:12 Problems with csum_partial with misaligned buffers on sh4 platform Guenter Roeck
  2024-02-10 20:12 ` John Paul Adrian Glaubitz
  2024-02-11 14:35 ` Yoshinori Sato
@ 2024-03-11 17:04 ` Guenter Roeck
  2024-03-18 15:04   ` Linux regression tracking (Thorsten Leemhuis)
  2 siblings, 1 reply; 11+ messages in thread
From: Guenter Roeck @ 2024-03-11 17:04 UTC
  To: Yoshinori Sato
  Cc: Rich Felker, John Paul Adrian Glaubitz, linux-sh, linux-kernel,
	regressions

On Sat, Feb 10, 2024 at 07:12:39AM -0800, Guenter Roeck wrote:
> Hi,
> 
> when running checksum unit tests on sh4 qemu emulations, I get the following
> errors.
> 

Adding to regression tracker.

#regzbot ^introduced cadc4e1a2b4d2
#regzbot title Problems with csum_partial with misaligned buffers on sh4 platform
#regzbot ignore-activity

> [...]

* Re: Problems with csum_partial with misaligned buffers on sh4 platform
  2024-03-11 17:04 ` Guenter Roeck
@ 2024-03-18 15:04   ` Linux regression tracking (Thorsten Leemhuis)
  2024-03-18 15:32     ` Guenter Roeck
  0 siblings, 1 reply; 11+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-03-18 15:04 UTC
  To: Guenter Roeck, Yoshinori Sato
  Cc: Rich Felker, John Paul Adrian Glaubitz, linux-sh, linux-kernel,
	regressions

On 11.03.24 18:04, Guenter Roeck wrote:
> On Sat, Feb 10, 2024 at 07:12:39AM -0800, Guenter Roeck wrote:
>>
>> when running checksum unit tests on sh4 qemu emulations, I get the following
>> errors.
> 
> Adding to regression tracker.
> 
> #regzbot ^introduced cadc4e1a2b4d2

Hmmm, thx for that, but well, I'm a bit torn here. That commit afaics
is from v3.0-rc1, and Linus iirc at least once said something along
the lines of "a regression only reported after a long time at some
point becomes just a bug". I'd say that applies here, which is why
I'm wondering if tracking this really is worth it.

Ciao, Thorsten


>>     KTAP version 1
>>     # Subtest: checksum
>>     # module: checksum_kunit
>>     1..5
>>     # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
>>     Expected ( u64)result == ( u64)expec, but
>>         ( u64)result == 53378 (0xd082)
>>         ( u64)expec == 33488 (0x82d0)
>>     not ok 1 test_csum_fixed_random_inputs
>>     # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
>>     Expected ( u64)result == ( u64)expec, but
>>         ( u64)result == 65281 (0xff01)
>>         ( u64)expec == 65280 (0xff00)
>>     not ok 2 test_csum_all_carry_inputs
>>     # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:573
>>     Expected ( u64)result == ( u64)expec, but
>>         ( u64)result == 65535 (0xffff)
>>         ( u64)expec == 65534 (0xfffe)
>>     not ok 3 test_csum_no_carry_inputs
>>     ok 4 test_ip_fast_csum
>>     ok 5 test_csum_ipv6_magic
>> # checksum: pass:2 fail:3 skip:0 total:5
>>
>> The above is with from a little endian system. On a big endian system,
>> the test result is as follows.
>>
>>     KTAP version 1
>>     # Subtest: checksum
>>     # module: checksum_kunit
>>     1..5
>>     # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
>>     Expected ( u64)result == ( u64)expec, but
>>         ( u64)result == 33488 (0x82d0)
>>         ( u64)expec == 53378 (0xd082)
>>     not ok 1 test_csum_fixed_random_inputs
>>     # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
>>     Expected ( u64)result == ( u64)expec, but
>>         ( u64)result == 65281 (0xff01)
>>         ( u64)expec == 255 (0xff)
>>     not ok 2 test_csum_all_carry_inputs
>>     # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:565
>>     Expected ( u64)result == ( u64)expec, but
>>         ( u64)result == 1020 (0x3fc)
>>         ( u64)expec == 0 (0x0)
>>     not ok 3 test_csum_no_carry_inputs
>>     # test_ip_fast_csum: ASSERTION FAILED at lib/checksum_kunit.c:589
>>     Expected ( u64)expected == ( u64)csum_result, but
>>         ( u64)expected == 55939 (0xda83)
>>         ( u64)csum_result == 33754 (0x83da)
>>     not ok 4 test_ip_fast_csum
>>     # test_csum_ipv6_magic: ASSERTION FAILED at lib/checksum_kunit.c:617
>>     Expected ( u64)expected_csum_ipv6_magic[i] == ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum), but
>>         ( u64)expected_csum_ipv6_magic[i] == 6356 (0x18d4)
>>         ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum) == 43586 (0xaa42)
>>     not ok 5 test_csum_ipv6_magic
>> # checksum: pass:0 fail:5 skip:0 total:5
>>
>> Note that test_ip_fast_csum and test_csum_ipv6_magic fail on all big endian
>> systems due to a bug in the test code, unrelated to this problem.
>>
>> Analysis shows that the errors are seen only if the buffer is misaligned.
>> Looking into arch/sh/lib/checksum.S, I found commit cadc4e1a2b4d2 ("sh:
>> Handle calling csum_partial with misaligned data") which seemed to be
>> related. Reverting that commit fixes the problem.
>> This suggests that something may be wrong with that commit. Alternatively,
>> of course, it may be possible that something is wrong with the qemu
>> emulation, but that seems unlikely.
>>
>> Thanks,
>> Guenter
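
For readers following the analysis above: the first failure reports 0xd082 where 0x82d0 was expected, i.e. the two bytes are exchanged. That pattern is what one would see if a word-based Internet checksum started on an odd address and skipped the final byte swap. Whether that is what actually goes wrong in arch/sh/lib/checksum.S is not established here. The following is only a minimal user-space sketch in C, not the sh code; all function names are hypothetical and little-endian lane order is assumed for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a wide accumulator to 16 bits with end-around carry. */
static uint16_t fold16(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Reference: pair bytes by buffer offset, never by address. */
static uint16_t csum_ref(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint64_t)buf[i] | ((uint64_t)buf[i + 1] << 8);
	if (i < len)
		sum += buf[i];
	return fold16(sum);
}

/*
 * Hypothetical address-based variant: bytes are paired the way an
 * aligned 16-bit load would pair them. On an odd start address the
 * first byte lands in the high lane, so every lane is exchanged and
 * the folded result has to be byte-swapped at the end.
 */
static uint16_t csum_by_address(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;
	int odd = (int)((uintptr_t)buf & 1);
	size_t i = 0;
	uint16_t r;

	if (odd && len > 0) {
		sum += (uint64_t)buf[0] << 8;	/* leading byte, high lane */
		i = 1;
	}
	for (; i + 1 < len; i += 2)
		sum += (uint64_t)buf[i] | ((uint64_t)buf[i + 1] << 8);
	if (i < len)
		sum += buf[i];			/* trailing byte, low lane */

	r = fold16(sum);
	if (odd)
		r = (uint16_t)((r << 8) | (r >> 8));	/* undo lane exchange */
	return r;
}

/* Count the lengths 0..max_len for which the two routines disagree. */
static int csum_mismatches(const uint8_t *buf, size_t max_len)
{
	int bad = 0;
	size_t len;

	for (len = 0; len <= max_len; len++)
		if (csum_ref(buf, len) != csum_by_address(buf, len))
			bad++;
	return bad;
}
```

Calling csum_mismatches() on an even and on an odd pointer into the same buffer should return 0 in both cases. Dropping the final swap in csum_by_address() makes the odd case return byte-exchanged sums.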


* Re: Problems with csum_partial with misaligned buffers on sh4 platform
  2024-03-18 15:04   ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-03-18 15:32     ` Guenter Roeck
  2024-03-18 15:58       ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 11+ messages in thread
From: Guenter Roeck @ 2024-03-18 15:32 UTC (permalink / raw)
  To: Linux regressions mailing list, Yoshinori Sato
  Cc: Rich Felker, John Paul Adrian Glaubitz, linux-sh, linux-kernel

On 3/18/24 08:04, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 11.03.24 18:04, Guenter Roeck wrote:
>> On Sat, Feb 10, 2024 at 07:12:39AM -0800, Guenter Roeck wrote:
>>>
>>> when running checksum unit tests on sh4 qemu emulations, I get the following
>>> errors.
>>
>> Adding to regression tracker.
>>
>> #regzbot ^introduced cadc4e1a2b4d2
> 
> Hmmm, thx for that, but well, I'm a bit torn here. That
> commit afaics is from v3.0-rc1 and Linus iirc at least once said
> something along the lines of "a regression only reported after a long
> time at some point becomes just a bug". I'd say that applies here,
> which is why I'm wondering if tracking this really is worth it.
> 

Not my call to make. I'll keep in mind to not add "bugs" to the regression
tracker in the future. Feel free to drop.

For my understanding, what is "a long time" ?

Thanks,
Guenter



* Re: Problems with csum_partial with misaligned buffers on sh4 platform
  2024-03-18 15:32     ` Guenter Roeck
@ 2024-03-18 15:58       ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 0 replies; 11+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-03-18 15:58 UTC (permalink / raw)
  To: Guenter Roeck, Linux regressions mailing list, Yoshinori Sato
  Cc: Rich Felker, John Paul Adrian Glaubitz, linux-sh, linux-kernel

On 18.03.24 16:32, Guenter Roeck wrote:
> On 3/18/24 08:04, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 11.03.24 18:04, Guenter Roeck wrote:
>>> On Sat, Feb 10, 2024 at 07:12:39AM -0800, Guenter Roeck wrote:
>>>>
>>>> when running checksum unit tests on sh4 qemu emulations, I get the
>>>> following
>>>> errors.
>>>
>>> Adding to regression tracker.
>>>
>>> #regzbot ^introduced cadc4e1a2b4d2
>>
>> Hmmm, thx for that, but well, I'm a bit torn here. That
>> commit afaics is from v3.0-rc1 and Linus iirc at least once said
>> something along the lines of "a regression only reported after a long
>> time at some point becomes just a bug". I'd say that applies here,
>> which is why I'm wondering if tracking this really is worth it.
> 
> Not my call to make. I'll keep in mind to not add "bugs" to the regression
> tracker in the future.

From my side there is no need for you to keep that in mind, as "somebody
added this regression to the tracking" might be something that will
occasionally make a developer finally fix the problem -- which is why I
waited a few days with today's reply. :-D

> Feel free to drop.

Let me do that:

#regzbot inconclusive: really old regression

> For my understanding, what is "a long time" ?

That is a good question, and I guess the answer, like so often in kernel
land, depends on the regression in question. :-/ Also note that the
"iirc" really was meant literally, as I might misremember. I just checked
and found two related quotes, but the situations are somewhat different:

https://lore.kernel.org/all/CAHk-=wis_qQy4oDNynNKi5b7Qhosmxtoj1jxo5wmB6SRUwQUBQ@mail.gmail.com/
"""
And yes, I do consider "regression in an earlier release" to be a
regression that needs fixing.

There's obviously a time limit: if that "regression in an earlier
release" was a year or more ago, and just took forever for people to
notice, and it had semantic changes that now mean that fixing the
regression could cause a _new_ regression, then that can cause me to
go "Oh, now the new semantics are what we have to live with".
"""

And also:
https://lore.kernel.org/all/CAHk-=wiVi7mSrsMP=fLXQrXK_UimybW=ziLOwSzFTtoXUacWVQ@mail.gmail.com/
"""
And obviously, if users take years to even notice that something
broke, or if we have sane ways to work around the breakage that
doesn't make for too much trouble for users (ie "ok, there are a
handful of users, and they can use a kernel command line to work
around it" kind of things) we've also been a bit less strict.
"""

Ciao, Thorsten


Thread overview: 11+ messages
2024-02-10 15:12 Problems with csum_partial with misaligned buffers on sh4 platform Guenter Roeck
2024-02-10 20:12 ` John Paul Adrian Glaubitz
2024-02-10 21:54   ` Guenter Roeck
2024-02-11  3:41     ` D. Jeff Dionne
2024-02-11 16:12       ` Rich Felker
2024-02-11  9:53     ` Geert Uytterhoeven
2024-02-11 14:35 ` Yoshinori Sato
2024-03-11 17:04 ` Guenter Roeck
2024-03-18 15:04   ` Linux regression tracking (Thorsten Leemhuis)
2024-03-18 15:32     ` Guenter Roeck
2024-03-18 15:58       ` Linux regression tracking (Thorsten Leemhuis)
