* Re: [PATCH v11] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests
From: Christophe Leroy @ 2024-03-01 7:17 UTC
To: Charlie Jenkins, Guenter Roeck, David Laight, Palmer Dabbelt, Andrew Morton, Helge Deller, James E.J. Bottomley, Parisc List, Arnd Bergmann, Geert Uytterhoeven, Russell King
Cc: linux-kernel@vger.kernel.org, Palmer Dabbelt, Linux ARM, netdev@vger.kernel.org

+CC netdev ARM Russell

On 29/02/2024 at 23:46, Charlie Jenkins wrote:
> The test cases for ip_fast_csum and csum_ipv6_magic were not properly
> aligning the IP header, which was causing failures on architectures
> that do not support misaligned accesses like some ARM platforms. To
> solve this, align the data along (14 + NET_IP_ALIGN) bytes which is the
> standard alignment of an IP header and must be supported by the
> architecture.

In your description, please provide more details on the platforms that
have a problem and on what exactly the problem is (failed calculation,
slowness, kernel Oops, panic, ...) on each platform.

And please copy the maintainers and lists of the platforms you are
specifically addressing with this change. And as this is network
related, the netdev list should have been copied as well.

I still think that your patch is not the right approach; it looks like
you are ignoring all the discussion. Below is a quote of what Geert said,
and I fully agree with it:

    IMHO the tests should validate the expected functionality. If a test
    fails, either functionality is missing or behaves wrong, or the test
    is wrong.

    What is the point of writing tests for a core functionality like network
    checksumming that do not match the expected functionality?

So we all agree that there is something to fix, because today's test
does odd-address accesses, which is unexpected for those functions, but
2-byte alignments should be supported and hence tested by the test.
Limiting the test to a 16-byte alignment greatly reduces the usefulness
of the test.

Christophe

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [PATCH v11] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests
From: Charlie Jenkins @ 2024-03-01 17:09 UTC
To: Christophe Leroy
Cc: Guenter Roeck, David Laight, Palmer Dabbelt, Andrew Morton, Helge Deller, James E.J. Bottomley, Parisc List, Arnd Bergmann, Geert Uytterhoeven, Russell King, linux-kernel@vger.kernel.org, Palmer Dabbelt, Linux ARM, netdev@vger.kernel.org

On Fri, Mar 01, 2024 at 07:17:38AM +0000, Christophe Leroy wrote:
> +CC netdev ARM Russell
>
> On 29/02/2024 at 23:46, Charlie Jenkins wrote:
> > The test cases for ip_fast_csum and csum_ipv6_magic were not properly
> > aligning the IP header, which was causing failures on architectures
> > that do not support misaligned accesses like some ARM platforms. To
> > solve this, align the data along (14 + NET_IP_ALIGN) bytes which is the
> > standard alignment of an IP header and must be supported by the
> > architecture.
>
> In your description, please provide more details on the platforms that
> have a problem and on what exactly the problem is (failed calculation,
> slowness, kernel Oops, panic, ...) on each platform.
>
> And please copy the maintainers and lists of the platforms you are
> specifically addressing with this change. And as this is network
> related, the netdev list should have been copied as well.
>
> I still think that your patch is not the right approach; it looks like
> you are ignoring all the discussion. Below is a quote of what Geert said,
> and I fully agree with it:
>
>     IMHO the tests should validate the expected functionality. If a test
>     fails, either functionality is missing or behaves wrong, or the test
>     is wrong.
>
>     What is the point of writing tests for a core functionality like network
>     checksumming that do not match the expected functionality?
>
> So we all agree that there is something to fix, because today's test
> does odd-address accesses, which is unexpected for those functions, but
> 2-byte alignments should be supported and hence tested by the test.
> Limiting the test to a 16-byte alignment greatly reduces the usefulness
> of the test.
>

Maybe I am lost in the conversation. This isn't limited to 16-byte
alignment? It aligns along 14 + NET_IP_ALIGN. That is 16 on some
platforms and 14 on platforms where unaligned accesses are desired.
These functions are expected to be called with this offset. Testing with
any other alignment is not the expected behavior. These tests are
testing the expected functionality.

- Charlie

> Christophe
* RE: [PATCH v11] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests
From: David Laight @ 2024-03-01 17:24 UTC
To: 'Charlie Jenkins', Christophe Leroy
Cc: Guenter Roeck, Palmer Dabbelt, Andrew Morton, Helge Deller, James E.J. Bottomley, Parisc List, Arnd Bergmann, Geert Uytterhoeven, Russell King, linux-kernel@vger.kernel.org, Palmer Dabbelt, Linux ARM, netdev@vger.kernel.org

From: Charlie Jenkins
> Sent: 01 March 2024 17:09
>
> On Fri, Mar 01, 2024 at 07:17:38AM +0000, Christophe Leroy wrote:
> > [...]
>
> Maybe I am lost in the conversation. This isn't limited to 16-byte
> alignment? It aligns along 14 + NET_IP_ALIGN. That is 16 on some
> platforms and 14 on platforms where unaligned accesses are desired.
> These functions are expected to be called with this offset. Testing with
> any other alignment is not the expected behavior. These tests are
> testing the expected functionality.

Aligned received frames can have a 4-byte VLAN header (or two) removed.
So the alignment of the IP header is either 4n or 4n+2.
If the CPU faults on misaligned accesses, you really want the alignment
to be 4n.

You pretty much never want to trap and fix up a misaligned access.
Especially in the network stack.
I suspect it is better to do a realignment copy of the entire frame.
At some point the data will be copied again, although you may want
a CBU (crystal ball unit) to decide whether to align on an 8n
or 8n+4 boundary to optimise a later copy.

CPUs that support misaligned transfers just make coders sloppy :-)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
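A minimal sketch of the alignment arithmetic in the message above, assuming a receive buffer whose start is 4-byte aligned and a standard 14-byte Ethernet header. The helper names are hypothetical; ETH_HLEN and VLAN_HLEN mirror the kernel constants of the same name, and the first argument stands in for NET_IP_ALIGN. It illustrates that a 4-byte VLAN tag does not change the IP header's alignment modulo 4, so the header lands on 4n with a 2-byte pad and on 4n+2 without one.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN  14	/* Ethernet header length in bytes */
#define VLAN_HLEN 4	/* one 802.1Q tag */

/* Offset of the IP header from the (4-byte aligned) start of the buffer. */
static size_t ip_header_offset(size_t net_ip_align, unsigned int vlan_tags)
{
	return net_ip_align + ETH_HLEN + (size_t)vlan_tags * VLAN_HLEN;
}

/* 0 means the IP header is at 4n, 2 means it is at 4n+2. */
static unsigned int ip_header_misalignment(size_t net_ip_align,
					   unsigned int vlan_tags)
{
	return (unsigned int)(ip_header_offset(net_ip_align, vlan_tags) % 4);
}

/* Worked cases; call from a test harness. */
static void ip_align_examples(void)
{
	assert(ip_header_misalignment(2, 0) == 0);	/* 2 + 14 = 16 */
	assert(ip_header_misalignment(0, 0) == 2);	/* 14 */
	assert(ip_header_misalignment(0, 1) == 2);	/* 14 + 4 = 18 */
	assert(ip_header_misalignment(2, 2) == 0);	/* 2 + 14 + 8 = 24 */
}
```

Whether the tags are still in the frame or have already been stripped, the residue modulo 4 is decided by the pad alone, which is why only the 4n and 4n+2 cases arise.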
* Re: [PATCH v11] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests
From: Charlie Jenkins @ 2024-03-01 17:30 UTC
To: David Laight
Cc: Christophe Leroy, Guenter Roeck, Palmer Dabbelt, Andrew Morton, Helge Deller, James E.J. Bottomley, Parisc List, Arnd Bergmann, Geert Uytterhoeven, Russell King, linux-kernel@vger.kernel.org, Palmer Dabbelt, Linux ARM, netdev@vger.kernel.org

On Fri, Mar 01, 2024 at 05:24:39PM +0000, David Laight wrote:
> From: Charlie Jenkins
> > Sent: 01 March 2024 17:09
> >
> > [...]
>
> Aligned received frames can have a 4-byte VLAN header (or two) removed.
> So the alignment of the IP header is either 4n or 4n+2.
> If the CPU faults on misaligned accesses, you really want the alignment
> to be 4n.
>
> You pretty much never want to trap and fix up a misaligned access.
> Especially in the network stack.
> I suspect it is better to do a realignment copy of the entire frame.
> At some point the data will be copied again, although you may want
> a CBU (crystal ball unit) to decide whether to align on an 8n
> or 8n+4 boundary to optimise a later copy.
>
> CPUs that support misaligned transfers just make coders sloppy :-)
>
> 	David
>

Can you elaborate on how exactly you suggest the tests be changed to
accommodate what you are saying here? I don't understand how what I have
proposed doesn't represent the use case of these functions.

- Charlie
[parent not found: <62b69aaf-7633-4bd8-aefe-5ba47147dba7@roeck-us.net>]
[parent not found: <f422742a-4c86-4cb0-a4f7-a62f0310eb23@csgroup.eu>]
[parent not found: <6df98c91-26b1-497a-9202-18bf86c0130d@roeck-us.net>]
* Re: [PATCH v11] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests
From: Christophe Leroy @ 2024-03-04 11:39 UTC
To: Guenter Roeck, Russell King
Cc: linux-kernel@vger.kernel.org, Palmer Dabbelt, David Laight, Charlie Jenkins, James E.J. Bottomley, Helge Deller, Palmer Dabbelt, Geert Uytterhoeven, Arnd Bergmann, Andrew Morton, Parisc List, Linux ARM

Hi Russell and Guenter,

On 03/03/2024 at 16:26, Guenter Roeck wrote:
> On 3/3/24 02:20, Christophe Leroy wrote:
>>
>>
>> On 01/03/2024 at 19:32, Guenter Roeck wrote:
>>> This leaves the mps2-an385:mps2_defconfig crash, which is avoided by
>>> this patch.
>>> My understanding, which may be wrong, is that arm images with thumb
>>> instructions do not support unaligned accesses (maybe I should say
>>> do not support unaligned accesses with the mps2-an385 qemu emulation;
>>> I did not test with real hardware, after all).

...

>>
>> Can you tell how to proceed ?
>>
>
> You can't run it directly. mps2-an385 is one of the platforms where
> the qemu maintainers insisted that qemu shall not initialize the CPU.
> You have to provide a shim such as
> https://github.com/groeck/linux-build-test/blob/master/rootfs/arm/mps2-boot.axf
> as bios. You also have to provide the dtb file.
>
> On top of that, you would need a customized version of qemu which
> actually reads the command line, the bios file, and the dtb. See
> https://github.com/groeck/linux-build-test/tree/master/qemu
> branch v8.2.1-local or v8.1.5-local.
>

Many thanks for your guidance.

So, I ran the test, and here is what I can say:

ip_fast_csum() works whatever the alignment is.

csum_ipv6_magic() is the problem, with unaligned IPv6 source or
destination addresses:

[    0.503757] KTAP version 1
[    0.503854] 1..1
[    0.504156] KTAP version 1
[    0.504251] # Subtest: checksum
[    0.504563] # module: checksum_kunit
[    0.504730] 1..5
[    0.546418] ok 1 test_csum_fixed_random_inputs
[    0.627853] ok 2 test_csum_all_carry_inputs
[    0.704918] ok 3 test_csum_no_carry_inputs
[    0.705845] ok 4 test_ip_fast_csum
[    0.706320]
[    0.706320] Unhandled exception: IPSR = 00000006 LR = fffffff1
[    0.706796] CPU: 0 PID: 28 Comm: kunit_try_catch Tainted: G N 6.8.0-rc1-00609-g9c0b7a2e25f0 #649
[    0.707177] Hardware name: Generic DT based system
[    0.707400] PC is at __csum_ipv6_magic+0x8/0xb4
[    0.708170] LR is at test_csum_ipv6_magic+0x3d/0xa4
[    0.708415] pc : [<211b0da8>] lr : [<210e3bf5>] psr: 0100020b
[    0.708692] sp : 2153debc ip : 46c7f0d2 fp : 00000000
[    0.708919] r10: 00000000 r9 : 2141dc48 r8 : 211e0e20
[    0.709148] r7 : 00003085 r6 : 00000001 r5 : 2141dd24 r4 : 211e0c2e
[    0.709422] r3 : 2c000000 r2 : 1ac7f0d2 r1 : 211e0c19 r0 : 211e0c09
[    0.709704] xPSR: 0100020b

I don't know much about the ARM instruction set, but it seems that the
ldr instruction used in ip_fast_csum() doesn't mind unaligned accesses
while the ldmia instruction used in csum_ipv6_magic() does. Or is this
incorrect behaviour in QEMU?

If I change the test as follows to only use word-aligned IPv6 addresses,
it works:

diff --git a/lib/checksum_kunit.c b/lib/checksum_kunit.c
index 225bb7701460..4d86fc8ccd78 100644
--- a/lib/checksum_kunit.c
+++ b/lib/checksum_kunit.c
@@ -607,7 +607,7 @@ static void test_csum_ipv6_magic(struct kunit *test)
 	const int csum_offset = sizeof(struct in6_addr) + sizeof(struct in6_addr) +
 				sizeof(int) + sizeof(char);
 
-	for (int i = 0; i < NUM_IPv6_TESTS; i++) {
+	for (int i = 0; i < NUM_IPv6_TESTS; i += 4) {
 		saddr = (const struct in6_addr *)(random_buf + i);
 		daddr = (const struct in6_addr *)(random_buf + i +
 						  daddr_offset);

If I change csum_ipv6_magic() as follows to use the ldr instruction
instead of ldmia, it also works without any change to the test:

diff --git a/arch/arm/lib/csumipv6.S b/arch/arm/lib/csumipv6.S
index 3559d515144c..a312d0836b95 100644
--- a/arch/arm/lib/csumipv6.S
+++ b/arch/arm/lib/csumipv6.S
@@ -12,12 +12,18 @@
 ENTRY(__csum_ipv6_magic)
 		str	lr, [sp, #-4]!
 		adds	ip, r2, r3
-		ldmia	r1, {r1 - r3, lr}
+		ldr	r2, [r1], #4
+		ldr	r3, [r1], #4
+		ldr	lr, [r1], #4
+		ldr	r1, [r1]
 		adcs	ip, ip, r1
 		adcs	ip, ip, r2
 		adcs	ip, ip, r3
 		adcs	ip, ip, lr
-		ldmia	r0, {r0 - r3}
+		ldr	r1, [r0], #4
+		ldr	r2, [r0], #4
+		ldr	r3, [r0], #4
+		ldr	r0, [r0]
 		adcs	r0, ip, r0
 		adcs	r0, r0, r1
 		adcs	r0, r0, r2

So now we are back to the initial question: should checksumming on
unaligned addresses be supported or not?

Russell, I understand from your previous answers that half-word
alignment should be supported. In that case, should the ARM version of
csum_ipv6_magic() be modified? If so, can you propose the most
optimised fix?

If not, then the test has to be fixed to only use word-aligned IPv6
addresses.

Thanks
Christophe
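For comparison with the two diffs above, here is a minimal userspace sketch of an IPv6 pseudo-header checksum that is independent of address alignment because it only ever performs byte loads. It is a hypothetical reference, not the kernel's csum_ipv6_magic(): the function names, the argument types and the host-order return value are assumptions made for illustration, and it is far slower than the assembly being discussed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one 16-bit big-endian word bytewise, so any address is acceptable. */
static uint32_t load_be16(const uint8_t *p)
{
	return ((uint32_t)p[0] << 8) | p[1];
}

/* Hypothetical reference: one's-complement sum over the IPv6 pseudo-header. */
static uint16_t ipv6_pseudo_csum(const uint8_t *saddr, const uint8_t *daddr,
				 uint32_t len, uint8_t proto)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < 16; i += 2) {
		sum += load_be16(saddr + i);
		sum += load_be16(daddr + i);
	}
	sum += (len >> 16) + (len & 0xffff) + proto;

	/* Fold the carries back in twice, then take the one's complement. */
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* The result must not depend on where the addresses sit in memory. */
static void ipv6_pseudo_csum_examples(void)
{
	uint8_t buf[64] = { 0 };
	uint16_t ref;

	for (size_t i = 0; i < 32; i++)
		buf[i] = (uint8_t)(i * 7 + 1);
	ref = ipv6_pseudo_csum(buf, buf + 16, 40, 6);

	for (size_t off = 1; off < 4; off++) {
		memmove(buf + off, buf + off - 1, 32);	/* shift data by one byte */
		assert(ipv6_pseudo_csum(buf + off, buf + off + 16, 40, 6) == ref);
	}
}
```

A test that steps through every byte offset, as the unmodified KUnit loop does, passes against a reference like this one on any architecture; the open question in the thread is whether the optimised per-architecture versions have to as well.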
* Re: [PATCH v11] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests
From: Arnd Bergmann @ 2024-03-04 13:39 UTC
To: Christophe Leroy, Guenter Roeck, Russell King
Cc: linux-kernel@vger.kernel.org, Palmer Dabbelt, David Laight, Charlie Jenkins, James E.J. Bottomley, Helge Deller, Palmer Dabbelt, Geert Uytterhoeven, Andrew Morton, Parisc List, Linux ARM

On Mon, Mar 4, 2024, at 12:39, Christophe Leroy wrote:
> On 03/03/2024 at 16:26, Guenter Roeck wrote:
>> On 3/3/24 02:20, Christophe Leroy wrote:
>
> I don't know much about the ARM instruction set, but it seems that the
> ldr instruction used in ip_fast_csum() doesn't mind unaligned accesses
> while the ldmia instruction used in csum_ipv6_magic() does. Or is this
> incorrect behaviour in QEMU?

Correct. On ARMv6 and newer, accessing normal unaligned memory
with ldr/str does not trap, and that covers most unaligned
accesses.

Some of the cases that don't allow unaligned access include:

- ARMv4/ARMv5 cannot access unaligned memory with the same
  instructions. Apparently the same is true for ARMv7-M.

- multi-word accesses (ldrd/strd and ldm/stm) require 32-bit
  alignment. These are generated for most 64-bit variables
  and some arrays

- unaligned access on MMIO registers (__iomem pointers)
  always traps

- atomic access (ldrex/strex) requires aligned data

- The C standard disallows casting to a type with larger
  alignment requirements, and gcc is known to produce
  code that doesn't work with this (and other) undefined
  behavior.

> If I change the test as follows to only use word-aligned IPv6 addresses,
> it works:
>
> diff --git a/lib/checksum_kunit.c b/lib/checksum_kunit.c
> index 225bb7701460..4d86fc8ccd78 100644
> --- a/lib/checksum_kunit.c
> +++ b/lib/checksum_kunit.c
> @@ -607,7 +607,7 @@ static void test_csum_ipv6_magic(struct kunit *test)
>  	const int csum_offset = sizeof(struct in6_addr) + sizeof(struct in6_addr) +
>  				sizeof(int) + sizeof(char);
>
> -	for (int i = 0; i < NUM_IPv6_TESTS; i++) {
> +	for (int i = 0; i < NUM_IPv6_TESTS; i += 4) {
>  		saddr = (const struct in6_addr *)(random_buf + i);
>  		daddr = (const struct in6_addr *)(random_buf + i +
>  						  daddr_offset);
>
>
> If I change csum_ipv6_magic() as follows to use the ldr instruction
> instead of ldmia, it also works without any change to the test:
>
> diff --git a/arch/arm/lib/csumipv6.S b/arch/arm/lib/csumipv6.S
> index 3559d515144c..a312d0836b95 100644
> --- a/arch/arm/lib/csumipv6.S
> +++ b/arch/arm/lib/csumipv6.S
> @@ -12,12 +12,18 @@
>  ENTRY(__csum_ipv6_magic)
>  		str	lr, [sp, #-4]!
>  		adds	ip, r2, r3
> -		ldmia	r1, {r1 - r3, lr}
> +		ldr	r2, [r1], #4
> +		ldr	r3, [r1], #4
> +		ldr	lr, [r1], #4
> +		ldr	r1, [r1]
>
> So now we are back to the initial question: should checksumming on
> unaligned addresses be supported or not?
>
> Russell, I understand from your previous answers that half-word
> alignment should be supported. In that case, should the ARM version of
> csum_ipv6_magic() be modified? If so, can you propose the most
> optimised fix?

The csumipv6.S code predates ARMv6 and is indeed suboptimal on
v6/v7 processors with unaligned IPv6 headers. Your workaround
looks like it should be much better, but it would at the same
time make the ARMv5 case much more expensive because it traps
four times instead of just once.

> If not, then the test has to be fixed to only use word-aligned IPv6
> addresses.

Because of the gcc issue I mentioned, net/ipv6/ip6_checksum.c
and anything else that accesses misaligned IPv6 headers may need
to be changed as well. Marking in6_addr as '__packed __aligned(2)'
should be sufficient for that. This will prevent gcc from issuing
ldm or ldrd on ARMv6+, as well as from making optimizations based
on the two lower bits of the address being zero on x86 and others.

The downside is that it forces 16-bit loads and stores to be
used on architectures that don't have efficient unaligned
access (armv5, alpha, mips, sparc and xtensa among others)
even when the IP headers are fully aligned.

     Arnd
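A small userspace sketch of what the '__packed __aligned(2)' marking suggested above does to a type, assuming GCC or Clang and a typical ABI where uint32_t is 4-byte aligned. The struct names are hypothetical stand-ins for struct in6_addr; the kernel macros __packed and __aligned(2) expand to the attributes spelled out here.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct in6_addr with its natural (word) alignment. */
struct addr_natural {
	uint32_t w[4];
};

/* Same layout, but the compiler may only assume 2-byte alignment. */
struct addr_aligned2 {
	uint32_t w[4];
} __attribute__((packed, aligned(2)));

static uint64_t sum_words(const struct addr_aligned2 *a)
{
	uint64_t sum = 0;

	for (int i = 0; i < 4; i++)
		sum += a->w[i];	/* loads must be safe at 2-byte alignment */
	return sum;
}

/* The size is unchanged; only the alignment promise is weakened. */
static void aligned2_examples(void)
{
	struct addr_aligned2 a = { { 1, 2, 3, 4 } };

	assert(sizeof(struct addr_aligned2) == sizeof(struct addr_natural));
	assert(_Alignof(struct addr_aligned2) == 2);
	assert(sum_words(&a) == 10);
}
```

Because the type no longer promises word alignment, the compiler has to choose loads that are valid at any even address, which is the trade-off described in the message: safer on ARMv6+, potentially slower where only narrow loads are safe.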
* RE: [PATCH v11] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests
From: David Laight @ 2024-03-05 9:27 UTC
To: 'Arnd Bergmann', Christophe Leroy, Guenter Roeck, Russell King
Cc: linux-kernel@vger.kernel.org, Palmer Dabbelt, Charlie Jenkins, James E.J. Bottomley, Helge Deller, Palmer Dabbelt, Geert Uytterhoeven, Andrew Morton, Parisc List, Linux ARM

From: Arnd Bergmann
> Sent: 04 March 2024 13:40
...
> > If not, then the test has to be fixed to only use word-aligned IPv6
> > addresses.
>
> Because of the gcc issue I mentioned, net/ipv6/ip6_checksum.c
> and anything else that accesses misaligned IPv6 headers may need
> to be changed as well. Marking in6_addr as '__packed __aligned(2)'
> should be sufficient for that. This will prevent gcc from issuing
> ldm or ldrd on ARMv6+, as well as from making optimizations based
> on the two lower bits of the address being zero on x86 and others.

Eh? x86 pretty much doesn't care unless you are using AVX.

> The downside is that it forces 16-bit loads and stores to be
> used on architectures that don't have efficient unaligned
> access (armv5, alpha, mips, sparc and xtensa among others)
> even when the IP headers are fully aligned.

Aren't the later accesses to the header also going to fault?

IIRC there is an skb_pull() call to ensure all the IP header
is in the linear skb fragment?
Perhaps there should be an skb_pull_aligned() that will
ensure the data is 32-bit aligned on systems where
misaligned accesses fault?

There might still need to be something to stop gcc generating
ldm/ldrd, which can fault on systems where a normal register
read wouldn't.

Do any recent ARM CPUs have the StrongARM 'feature' that ldm
always took 16 clocks?

	David
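A rough userspace sketch of the "realignment copy" idea raised in this subthread, under loudly stated assumptions: the buffer start is 4-byte aligned, the frame has enough headroom in front of it, and realign_frame() is a hypothetical helper, not an existing kernel function and not the skb_pull_aligned() that the message only proposes. It moves the whole frame so that the IP header begins on a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Move a frame inside buf so that its IP header starts at a multiple of 4
 * from the buffer start. frame_off is the frame's current offset, l2_len
 * the length of the link-layer header in front of the IP header.
 * Returns the new frame offset.
 */
static size_t realign_frame(uint8_t *buf, size_t frame_off, size_t frame_len,
			    size_t l2_len)
{
	size_t shift = (frame_off + l2_len) % 4;

	assert(shift <= frame_off);	/* enough headroom to move backwards */
	if (shift)
		memmove(buf + frame_off - shift, buf + frame_off, frame_len);
	return frame_off - shift;
}

/* A 20-byte frame at offset 4 with a 14-byte header moves to offset 2. */
static void realign_examples(void)
{
	uint8_t buf[64] = { 0 };
	size_t off = 4, len = 20, l2 = 14, new_off;

	for (size_t i = 0; i < len; i++)
		buf[off + i] = (uint8_t)(i + 1);

	new_off = realign_frame(buf, off, len, l2);	/* (4 + 14) % 4 == 2 */
	assert(new_off == 2);
	assert((new_off + l2) % 4 == 0);
	assert(buf[new_off] == 1 && buf[new_off + len - 1] == 20);
}
```

The copy costs one pass over the frame, in exchange for every later header access being word aligned; whether that pays off depends on how expensive misaligned accesses are on the target CPU.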