* [PATCH] atomic: Specify natural alignment for atomic_t @ 2025-08-25 2:03 Finn Thain [not found] ` <20250825032743.80641-1-ioworker0@gmail.com> 2025-08-26 15:22 ` Eero Tamminen 0 siblings, 2 replies; 15+ messages in thread From: Finn Thain @ 2025-08-25 2:03 UTC (permalink / raw) To: Andrew Morton, Lance Yang Cc: Geert Uytterhoeven, Masami Hiramatsu, Eero Tamminen, Peter Zijlstra, Will Deacon, stable, linux-kernel, linux-m68k Some recent commits incorrectly assumed the natural alignment of locks. That assumption fails on Linux/m68k (and, interestingly, would have failed on Linux/cris also). This leads to spurious warnings from the hang check code. Fix this bug by adding the necessary 'aligned' attribute. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Lance Yang <lance.yang@linux.dev> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Eero Tamminen <oak@helsinkinet.fi> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org Reported-by: Eero Tamminen <oak@helsinkinet.fi> Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc1_0g@mail.gmail.com/ Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker") Signed-off-by: Finn Thain <fthain@linux-m68k.org> --- I tested this on m68k using GCC and it fixed the problem for me. AFAIK, the other architectures naturally align ints already so I'm expecting to see no effect there. --- include/linux/types.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/types.h b/include/linux/types.h index 6dfdb8e8e4c3..cd5b2b0f4b02 100644 --- a/include/linux/types.h +++ b/include/linux/types.h @@ -179,7 +179,7 @@ typedef phys_addr_t resource_size_t; typedef unsigned long irq_hw_number_t; typedef struct { - int counter; + int counter __aligned(sizeof(int)); } atomic_t; #define ATOMIC_INIT(i) { (i) } -- 2.49.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
[parent not found: <20250825032743.80641-1-ioworker0@gmail.com>]
[parent not found: <c8851682-25f1-f594-e30f-5b62e019d37b@linux-m68k.org>]
[parent not found: <96ae7afc-c882-4c3d-9dea-3e2ae2789caf@linux.dev>]
[parent not found: <5a44c60b-650a-1f8a-d5cb-abf9f0716817@linux-m68k.org>]
[parent not found: <4e7e7292-338d-4a57-84ec-ae7427f6ad7c@linux.dev>]
* Re: [PATCH] atomic: Specify natural alignment for atomic_t [not found] ` <4e7e7292-338d-4a57-84ec-ae7427f6ad7c@linux.dev> @ 2025-08-25 10:49 ` Finn Thain 2025-08-25 11:19 ` Lance Yang 0 siblings, 1 reply; 15+ messages in thread From: Finn Thain @ 2025-08-25 10:49 UTC (permalink / raw) To: Lance Yang Cc: akpm, geert, linux-kernel, mhiramat, oak, peterz, stable, will, Lance Yang, linux-m68k [Belated Cc linux-m68k...] On Mon, 25 Aug 2025, Lance Yang wrote: > > On 2025/8/25 14:17, Finn Thain wrote: > > > > On Mon, 25 Aug 2025, Lance Yang wrote: > > > >> > >> What if we squash the runtime check fix into your patch? > > > > Did my patch not solve the problem? > > Hmm... it should solve the problem for natural alignment, which is a > critical fix. > > But it cannot solve the problem of forced misalignment from drivers > using #pragma pack(1). The runtime warning will still trigger in those > cases. > > I built a simple test module on a kernel with your patch applied: > > ``` > #include <linux/module.h> > #include <linux/init.h> > > struct __attribute__((packed)) test_container { > char padding[49]; > struct mutex io_lock; > }; > > static int __init alignment_init(void) > { > struct test_container cont; > pr_info("io_lock address offset mod 4: %lu\n", (unsigned long)&cont.io_lock % 4); > return 0; > } > > static void __exit alignment_exit(void) > { > pr_info("Module unloaded\n"); > } > > module_init(alignment_init); > module_exit(alignment_exit); > MODULE_LICENSE("GPL"); > MODULE_AUTHOR("x"); > MODULE_DESCRIPTION("x"); > ``` > > Result from dmesg: > [Mon Aug 25 15:44:50 2025] io_lock address offset mod 4: 1 > Thanks for sending code to illustrate your point. Unfortunately, I was not able to reproduce your results. Tested on x86, your test module shows no misalignment: [131840.349042] io_lock address offset mod 4: 0 Tested on m68k I also get 0, given the patch at the top of this thread: [ 0.400000] io_lock address offset mod 4: 0 > > As we can see, a packed struct can still force the entire mutex object > to an unaligned address. With an address like this, the WARN_ON_ONCE can > still be triggered. I don't think so. But there is something unexpected going on here -- the output from pahole appears to say io_lock is at offset 49, which seems to contradict the printk() output above. struct test_container { char padding[49]; /* 0 49 */ struct mutex io_lock __attribute__((__aligned__(1))); /* 49 12 */ /* size: 61, cachelines: 1, members: 2 */ /* forced alignments: 1 */ /* last cacheline: 61 bytes */ } __attribute__((__packed__)); ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-08-25 10:49 ` Finn Thain @ 2025-08-25 11:19 ` Lance Yang 2025-08-25 11:36 ` Lance Yang 0 siblings, 1 reply; 15+ messages in thread From: Lance Yang @ 2025-08-25 11:19 UTC (permalink / raw) To: Finn Thain Cc: akpm, geert, linux-kernel, mhiramat, oak, peterz, stable, will, Lance Yang, linux-m68k Thanks for digging deeper! On 2025/8/25 18:49, Finn Thain wrote: > > [Belated Cc linux-m68k...] > > On Mon, 25 Aug 2025, Lance Yang wrote: > >> >> On 2025/8/25 14:17, Finn Thain wrote: >>> >>> On Mon, 25 Aug 2025, Lance Yang wrote: >>> >>>> >>>> What if we squash the runtime check fix into your patch? >>> >>> Did my patch not solve the problem? >> >> Hmm... it should solve the problem for natural alignment, which is a >> critical fix. >> >> But it cannot solve the problem of forced misalignment from drivers >> using #pragma pack(1). The runtime warning will still trigger in those >> cases. >> >> I built a simple test module on a kernel with your patch applied: >> >> ``` >> #include <linux/module.h> >> #include <linux/init.h> >> >> struct __attribute__((packed)) test_container { >> char padding[49]; >> struct mutex io_lock; >> }; >> >> static int __init alignment_init(void) >> { >> struct test_container cont; >> pr_info("io_lock address offset mod 4: %lu\n", (unsigned long)&cont.io_lock % 4); >> return 0; >> } >> >> static void __exit alignment_exit(void) >> { >> pr_info("Module unloaded\n"); >> } >> >> module_init(alignment_init); >> module_exit(alignment_exit); >> MODULE_LICENSE("GPL"); >> MODULE_AUTHOR("x"); >> MODULE_DESCRIPTION("x"); >> ``` >> >> Result from dmesg: >> [Mon Aug 25 15:44:50 2025] io_lock address offset mod 4: 1 >> > > Thanks for sending code to illustrate your point. Unfortunately, I was not > able to reproduce your results. Tested on x86, your test module shows no > misalignment: > > [131840.349042] io_lock address offset mod 4: 0 > > Tested on m68k I also get 0, given the patch at the top of this thread: > > [ 0.400000] io_lock address offset mod 4: 0 > >> >> As we can see, a packed struct can still force the entire mutex object >> to an unaligned address. With an address like this, the WARN_ON_ONCE can >> still be triggered. > > I don't think so. But there is something unexpected going on here -- the > output from pahole appears to say io_lock is at offset 49, which seems to > contradict the printk() output above. Interesting! That contradiction is the key. It seems we're seeing different compiler behaviors. I'm on GCC 14.2.0 (Debian 14.2.0-19), and it appears to be strictly respecting the packed attribute. So let's print something more: ``` #include <linux/module.h> #include <linux/init.h> struct __attribute__((packed)) test_container { char padding[49]; struct mutex io_lock; }; static int __init alignment_init(void) { struct test_container cont; pr_info("Container base address : %px\n", &cont); pr_info("io_lock member address : %px\n", &cont.io_lock); pr_info("io_lock address offset mod 4: %lu\n", (unsigned long)&cont.io_lock % 4); return 0; } static void __exit alignment_exit(void) { pr_info("Module unloaded\n"); } module_init(alignment_init); module_exit(alignment_exit); MODULE_LICENSE("GPL"); MODULE_AUTHOR("x"); MODULE_DESCRIPTION("x"); ``` Result from dmesg: ``` [Mon Aug 25 19:15:33 2025] Container base address : ff1100063570f840 [Mon Aug 25 19:15:33 2025] io_lock member address : ff1100063570f871 [Mon Aug 25 19:15:33 2025] io_lock address offset mod 4: 1 ``` io_lock is exactly base + 49, resulting in misalignment. Seems like your compiler is cleverly re-aligning the whole struct on the stack, but we can't rely on that behavior, as it's not guaranteed across all compilers or versions. wdyt? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-08-25 11:19 ` Lance Yang @ 2025-08-25 11:36 ` Lance Yang 2025-08-27 23:43 ` Finn Thain 0 siblings, 1 reply; 15+ messages in thread From: Lance Yang @ 2025-08-25 11:36 UTC (permalink / raw) To: Finn Thain Cc: akpm, geert, linux-kernel, mhiramat, oak, peterz, stable, will, Lance Yang, linux-m68k On 2025/8/25 19:19, Lance Yang wrote: > Thanks for digging deeper! > > On 2025/8/25 18:49, Finn Thain wrote: >> >> [Belated Cc linux-m68k...] >> >> On Mon, 25 Aug 2025, Lance Yang wrote: >> >>> >>> On 2025/8/25 14:17, Finn Thain wrote: >>>> >>>> On Mon, 25 Aug 2025, Lance Yang wrote: >>>> >>>>> >>>>> What if we squash the runtime check fix into your patch? >>>> >>>> Did my patch not solve the problem? >>> >>> Hmm... it should solve the problem for natural alignment, which is a >>> critical fix. >>> >>> But it cannot solve the problem of forced misalignment from drivers >>> using #pragma pack(1). The runtime warning will still trigger in those >>> cases. >>> >>> I built a simple test module on a kernel with your patch applied: >>> >>> ``` >>> #include <linux/module.h> >>> #include <linux/init.h> >>> >>> struct __attribute__((packed)) test_container { >>> char padding[49]; >>> struct mutex io_lock; >>> }; >>> >>> static int __init alignment_init(void) >>> { >>> struct test_container cont; >>> pr_info("io_lock address offset mod 4: %lu\n", (unsigned >>> long)&cont.io_lock % 4); >>> return 0; >>> } >>> >>> static void __exit alignment_exit(void) >>> { >>> pr_info("Module unloaded\n"); >>> } >>> >>> module_init(alignment_init); >>> module_exit(alignment_exit); >>> MODULE_LICENSE("GPL"); >>> MODULE_AUTHOR("x"); >>> MODULE_DESCRIPTION("x"); >>> ``` >>> >>> Result from dmesg: >>> [Mon Aug 25 15:44:50 2025] io_lock address offset mod 4: 1 >>> >> >> Thanks for sending code to illustrate your point. Unfortunately, I was >> not >> able to reproduce your results. Tested on x86, your test module shows no >> misalignment: >> >> [131840.349042] io_lock address offset mod 4: 0 >> >> Tested on m68k I also get 0, given the patch at the top of this thread: >> >> [ 0.400000] io_lock address offset mod 4: 0 >> >>> >>> As we can see, a packed struct can still force the entire mutex object >>> to an unaligned address. With an address like this, the WARN_ON_ONCE can >>> still be triggered. > >> >> I don't think so. But there is something unexpected going on here -- the >> output from pahole appears to say io_lock is at offset 49, which seems to >> contradict the printk() output above. > > Interesting! That contradiction is the key. It seems we're seeing different > compiler behaviors. > > I'm on GCC 14.2.0 (Debian 14.2.0-19), and it appears to be strictly > respecting > the packed attribute. > > So let's print something more: > > ``` > #include <linux/module.h> > #include <linux/init.h> > > struct __attribute__((packed)) test_container { > char padding[49]; > struct mutex io_lock; > }; > > static int __init alignment_init(void) > { > struct test_container cont; > pr_info("Container base address : %px\n", &cont); > pr_info("io_lock member address : %px\n", &cont.io_lock); > pr_info("io_lock address offset mod 4: %lu\n", (unsigned > long)&cont.io_lock % 4); > return 0; > } > > static void __exit alignment_exit(void) > { > pr_info("Module unloaded\n"); > } > > module_init(alignment_init); > module_exit(alignment_exit); > MODULE_LICENSE("GPL"); > MODULE_AUTHOR("x"); > MODULE_DESCRIPTION("x"); > ``` > > Result from dmesg: > > ``` > [Mon Aug 25 19:15:33 2025] Container base address : ff1100063570f840 > [Mon Aug 25 19:15:33 2025] io_lock member address : ff1100063570f871 > [Mon Aug 25 19:15:33 2025] io_lock address offset mod 4: 1 > ``` > > io_lock is exactly base + 49, resulting in misalignment. > > Seems like your compiler is cleverly re-aligning the whole struct on the > stack, but we can't rely on that behavior, as it's not guaranteed across > all compilers or versions. wdyt? Same here, using a global static variable instead of a local one. The result is consistently misaligned. ``` #include <linux/module.h> #include <linux/init.h> static struct __attribute__((packed)) test_container { char padding[49]; struct mutex io_lock; } cont; static int __init alignment_init(void) { pr_info("Container base address : %px\n", &cont); pr_info("io_lock member address : %px\n", &cont.io_lock); pr_info("io_lock address offset mod 4: %lu\n", (unsigned long)&cont.io_lock % 4); return 0; } static void __exit alignment_exit(void) { pr_info("Module unloaded\n"); } module_init(alignment_init); module_exit(alignment_exit); MODULE_LICENSE("GPL"); MODULE_AUTHOR("x"); MODULE_DESCRIPTION("x"); ``` Result from dmesg: ``` [Mon Aug 25 19:33:28 2025] Container base address : ffffffffc28f0940 [Mon Aug 25 19:33:28 2025] io_lock member address : ffffffffc28f0971 [Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1 ``` ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-08-25 11:36 ` Lance Yang @ 2025-08-27 23:43 ` Finn Thain 2025-08-28 2:05 ` Lance Yang 0 siblings, 1 reply; 15+ messages in thread From: Finn Thain @ 2025-08-27 23:43 UTC (permalink / raw) To: Lance Yang Cc: akpm, geert, linux-kernel, mhiramat, oak, peterz, stable, will, Lance Yang, linux-m68k On Mon, 25 Aug 2025, Lance Yang wrote: > > Same here, using a global static variable instead of a local one. The > result is consistently misaligned. > > ``` > #include <linux/module.h> > #include <linux/init.h> > > static struct __attribute__((packed)) test_container { > char padding[49]; > struct mutex io_lock; > } cont; > > static int __init alignment_init(void) > { > pr_info("Container base address : %px\n", &cont); > pr_info("io_lock member address : %px\n", &cont.io_lock); > pr_info("io_lock address offset mod 4: %lu\n", (unsigned long)&cont.io_lock % 4); > return 0; > } > > static void __exit alignment_exit(void) > { > pr_info("Module unloaded\n"); > } > > module_init(alignment_init); > module_exit(alignment_exit); > MODULE_LICENSE("GPL"); > MODULE_AUTHOR("x"); > MODULE_DESCRIPTION("x"); > ``` > > Result from dmesg: > > ``` > [Mon Aug 25 19:33:28 2025] Container base address : ffffffffc28f0940 > [Mon Aug 25 19:33:28 2025] io_lock member address : ffffffffc28f0971 > [Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1 > ``` > FTR, I was able to reproduce that result (i.e. static storage): [ 0.320000] Container base address : 0055d9d0 [ 0.320000] io_lock member address : 0055da01 [ 0.320000] io_lock address offset mod 4: 1 I think the experiments you sent previously would have demonstrated the same result, except for the unpredictable base address that you sensibly logged in this version. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-08-27 23:43 ` Finn Thain @ 2025-08-28 2:05 ` Lance Yang 2025-09-01 8:45 ` Geert Uytterhoeven 0 siblings, 1 reply; 15+ messages in thread From: Lance Yang @ 2025-08-28 2:05 UTC (permalink / raw) To: Finn Thain Cc: akpm, geert, linux-kernel, mhiramat, oak, peterz, stable, will, Lance Yang, linux-m68k On 2025/8/28 07:43, Finn Thain wrote: > > On Mon, 25 Aug 2025, Lance Yang wrote: > >> >> Same here, using a global static variable instead of a local one. The >> result is consistently misaligned. >> >> ``` >> #include <linux/module.h> >> #include <linux/init.h> >> >> static struct __attribute__((packed)) test_container { >> char padding[49]; >> struct mutex io_lock; >> } cont; >> >> static int __init alignment_init(void) >> { >> pr_info("Container base address : %px\n", &cont); >> pr_info("io_lock member address : %px\n", &cont.io_lock); >> pr_info("io_lock address offset mod 4: %lu\n", (unsigned long)&cont.io_lock % 4); >> return 0; >> } >> >> static void __exit alignment_exit(void) >> { >> pr_info("Module unloaded\n"); >> } >> >> module_init(alignment_init); >> module_exit(alignment_exit); >> MODULE_LICENSE("GPL"); >> MODULE_AUTHOR("x"); >> MODULE_DESCRIPTION("x"); >> ``` >> >> Result from dmesg: >> >> ``` >> [Mon Aug 25 19:33:28 2025] Container base address : ffffffffc28f0940 >> [Mon Aug 25 19:33:28 2025] io_lock member address : ffffffffc28f0971 >> [Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1 >> ``` >> > > FTR, I was able to reproduce that result (i.e. static storage): > > [ 0.320000] Container base address : 0055d9d0 > [ 0.320000] io_lock member address : 0055da01 > [ 0.320000] io_lock address offset mod 4: 1 > > I think the experiments you sent previously would have demonstrated the > same result, except for the unpredictable base address that you sensibly > logged in this version. Thanks for taking the time to reproduce it! This proves the problem can happen in practice (e.g., with packed structs), so we need to ignore the unaligned pointers on the architectures that don't trap for now. Cheers, Lance ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-08-28 2:05 ` Lance Yang @ 2025-09-01 8:45 ` Geert Uytterhoeven 2025-09-02 13:30 ` Lance Yang 2025-09-06 11:43 ` David Laight 0 siblings, 2 replies; 15+ messages in thread From: Geert Uytterhoeven @ 2025-09-01 8:45 UTC (permalink / raw) To: Lance Yang Cc: Finn Thain, akpm, linux-kernel, mhiramat, oak, peterz, stable, will, Lance Yang, linux-m68k Hi Lance, On Thu, 28 Aug 2025 at 04:05, Lance Yang <lance.yang@linux.dev> wrote: > On 2025/8/28 07:43, Finn Thain wrote: > > On Mon, 25 Aug 2025, Lance Yang wrote: > >> Same here, using a global static variable instead of a local one. The > >> result is consistently misaligned. > >> > >> ``` > >> #include <linux/module.h> > >> #include <linux/init.h> > >> > >> static struct __attribute__((packed)) test_container { > >> char padding[49]; > >> struct mutex io_lock; > >> } cont; > >> > >> static int __init alignment_init(void) > >> { > >> pr_info("Container base address : %px\n", &cont); > >> pr_info("io_lock member address : %px\n", &cont.io_lock); > >> pr_info("io_lock address offset mod 4: %lu\n", (unsigned long)&cont.io_lock % 4); > >> return 0; > >> } > >> > >> static void __exit alignment_exit(void) > >> { > >> pr_info("Module unloaded\n"); > >> } > >> > >> module_init(alignment_init); > >> module_exit(alignment_exit); > >> MODULE_LICENSE("GPL"); > >> MODULE_AUTHOR("x"); > >> MODULE_DESCRIPTION("x"); > >> ``` > >> > >> Result from dmesg: > >> > >> ``` > >> [Mon Aug 25 19:33:28 2025] Container base address : ffffffffc28f0940 > >> [Mon Aug 25 19:33:28 2025] io_lock member address : ffffffffc28f0971 > >> [Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1 > >> ``` > >> > > > > FTR, I was able to reproduce that result (i.e. static storage): > > > > [ 0.320000] Container base address : 0055d9d0 > > [ 0.320000] io_lock member address : 0055da01 > > [ 0.320000] io_lock address offset mod 4: 1 > > > > I think the experiments you sent previously would have demonstrated the > > same result, except for the unpredictable base address that you sensibly > > logged in this version. > > Thanks for taking the time to reproduce it! > > This proves the problem can happen in practice (e.g., with packed structs), > so we need to ignore the unaligned pointers on the architectures that don't > trap for now. Putting locks inside a packed struct is definitely a Very Bad Idea and a No Go. Packed structs are meant to describe memory data and MMIO register layouts, and must not contain control data for critical sections. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-09-01 8:45 ` Geert Uytterhoeven @ 2025-09-02 13:30 ` Lance Yang 2025-09-02 14:14 ` Geert Uytterhoeven 2025-09-06 11:43 ` David Laight 1 sibling, 1 reply; 15+ messages in thread From: Lance Yang @ 2025-09-02 13:30 UTC (permalink / raw) To: Geert Uytterhoeven Cc: Finn Thain, akpm, linux-kernel, mhiramat, oak, peterz, stable, will, Lance Yang, linux-m68k Hi Geert, On 2025/9/1 16:45, Geert Uytterhoeven wrote: > Hi Lance, > > On Thu, 28 Aug 2025 at 04:05, Lance Yang <lance.yang@linux.dev> wrote: >> On 2025/8/28 07:43, Finn Thain wrote: >>> On Mon, 25 Aug 2025, Lance Yang wrote: >>>> Same here, using a global static variable instead of a local one. The >>>> result is consistently misaligned. >>>> >>>> ``` >>>> #include <linux/module.h> >>>> #include <linux/init.h> >>>> >>>> static struct __attribute__((packed)) test_container { >>>> char padding[49]; >>>> struct mutex io_lock; >>>> } cont; >>>> >>>> static int __init alignment_init(void) >>>> { >>>> pr_info("Container base address : %px\n", &cont); >>>> pr_info("io_lock member address : %px\n", &cont.io_lock); >>>> pr_info("io_lock address offset mod 4: %lu\n", (unsigned long)&cont.io_lock % 4); >>>> return 0; >>>> } >>>> >>>> static void __exit alignment_exit(void) >>>> { >>>> pr_info("Module unloaded\n"); >>>> } >>>> >>>> module_init(alignment_init); >>>> module_exit(alignment_exit); >>>> MODULE_LICENSE("GPL"); >>>> MODULE_AUTHOR("x"); >>>> MODULE_DESCRIPTION("x"); >>>> ``` >>>> >>>> Result from dmesg: >>>> >>>> ``` >>>> [Mon Aug 25 19:33:28 2025] Container base address : ffffffffc28f0940 >>>> [Mon Aug 25 19:33:28 2025] io_lock member address : ffffffffc28f0971 >>>> [Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1 >>>> ``` >>>> >>> >>> FTR, I was able to reproduce that result (i.e. static storage): >>> >>> [ 0.320000] Container base address : 0055d9d0 >>> [ 0.320000] io_lock member address : 0055da01 >>> [ 0.320000] io_lock address offset mod 4: 1 >>> >>> I think the experiments you sent previously would have demonstrated the >>> same result, except for the unpredictable base address that you sensibly >>> logged in this version. >> >> Thanks for taking the time to reproduce it! >> >> This proves the problem can happen in practice (e.g., with packed structs), >> so we need to ignore the unaligned pointers on the architectures that don't >> trap for now. > > Putting locks inside a packed struct is definitely a Very Bad Idea > and a No Go. Packed structs are meant to describe memory data and Right. That's definitely not how packed structs should be used ;) > MMIO register layouts, and must not contain control data for critical > sections. Unfortunately, this patten was found in an in-tree driver, as reported[1] by kernel test robot, and there might be other undiscovered instances ... [1] https://lore.kernel.org/oe-kbuild-all/202508240539.ARmC1Umu-lkp@intel.com Cheers, Lance ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-09-02 13:30 ` Lance Yang @ 2025-09-02 14:14 ` Geert Uytterhoeven 0 siblings, 0 replies; 15+ messages in thread From: Geert Uytterhoeven @ 2025-09-02 14:14 UTC (permalink / raw) To: Lance Yang Cc: Finn Thain, akpm, linux-kernel, mhiramat, oak, peterz, stable, will, Lance Yang, linux-m68k Hi Lance, On Tue, 2 Sept 2025 at 15:31, Lance Yang <lance.yang@linux.dev> wrote: > On 2025/9/1 16:45, Geert Uytterhoeven wrote: > > On Thu, 28 Aug 2025 at 04:05, Lance Yang <lance.yang@linux.dev> wrote: > >> This proves the problem can happen in practice (e.g., with packed structs), > >> so we need to ignore the unaligned pointers on the architectures that don't > >> trap for now. > > > > Putting locks inside a packed struct is definitely a Very Bad Idea > > and a No Go. Packed structs are meant to describe memory data and > > Right. That's definitely not how packed structs should be used ;) > > > MMIO register layouts, and must not contain control data for critical > > sections. > > Unfortunately, this patten was found in an in-tree driver, as reported[1] > by kernel test robot, and there might be other undiscovered instances ... > > [1] > https://lore.kernel.org/oe-kbuild-all/202508240539.ARmC1Umu-lkp@intel.com That one is completely bogus, and should be removed. Currently it would crash on any platform that does not support unaligned accesses. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-09-01 8:45 ` Geert Uytterhoeven 2025-09-02 13:30 ` Lance Yang @ 2025-09-06 11:43 ` David Laight 1 sibling, 0 replies; 15+ messages in thread From: David Laight @ 2025-09-06 11:43 UTC (permalink / raw) To: Geert Uytterhoeven Cc: Lance Yang, Finn Thain, akpm, linux-kernel, mhiramat, oak, peterz, stable, will, Lance Yang, linux-m68k On Mon, 1 Sep 2025 10:45:46 +0200 Geert Uytterhoeven <geert@linux-m68k.org> wrote: > Hi Lance, > > On Thu, 28 Aug 2025 at 04:05, Lance Yang <lance.yang@linux.dev> wrote: > > On 2025/8/28 07:43, Finn Thain wrote: > > > On Mon, 25 Aug 2025, Lance Yang wrote: > > >> Same here, using a global static variable instead of a local one. The > > >> result is consistently misaligned. > > >> > > >> ``` > > >> #include <linux/module.h> > > >> #include <linux/init.h> > > >> > > >> static struct __attribute__((packed)) test_container { > > >> char padding[49]; > > >> struct mutex io_lock; > > >> } cont; > > >> > > >> static int __init alignment_init(void) > > >> { > > >> pr_info("Container base address : %px\n", &cont); > > >> pr_info("io_lock member address : %px\n", &cont.io_lock); > > >> pr_info("io_lock address offset mod 4: %lu\n", (unsigned long)&cont.io_lock % 4); > > >> return 0; > > >> } > > >> > > >> static void __exit alignment_exit(void) > > >> { > > >> pr_info("Module unloaded\n"); > > >> } > > >> > > >> module_init(alignment_init); > > >> module_exit(alignment_exit); > > >> MODULE_LICENSE("GPL"); > > >> MODULE_AUTHOR("x"); > > >> MODULE_DESCRIPTION("x"); > > >> ``` > > >> > > >> Result from dmesg: > > >> > > >> ``` > > >> [Mon Aug 25 19:33:28 2025] Container base address : ffffffffc28f0940 > > >> [Mon Aug 25 19:33:28 2025] io_lock member address : ffffffffc28f0971 > > >> [Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1 > > >> ``` > > >> > > > > > > FTR, I was able to reproduce that result (i.e. static storage): > > > > > > [ 0.320000] Container base address : 0055d9d0 > > > [ 0.320000] io_lock member address : 0055da01 > > > [ 0.320000] io_lock address offset mod 4: 1 > > > > > > I think the experiments you sent previously would have demonstrated the > > > same result, except for the unpredictable base address that you sensibly > > > logged in this version. > > > > Thanks for taking the time to reproduce it! > > > > This proves the problem can happen in practice (e.g., with packed structs), > > so we need to ignore the unaligned pointers on the architectures that don't > > trap for now. > > Putting locks inside a packed struct is definitely a Very Bad Idea > and a No Go. Packed structs are meant to describe memory data and > MMIO register layouts, and must not contain control data for critical > sections. Even for MMIO register layouts you don't (usually) want 'packed'. You may need to add explicit padding, and an 'error if padded' attribute you be useful. Sometimes you have (eg) a 64bit item on a 32bit boundary, marking the member 'packed' will remove the gap before it - usually what is wanted. In reality pretty much nothing should be 'packed'. David. > > Gr{oetje,eeting}s, > > Geert > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-08-25 2:03 [PATCH] atomic: Specify natural alignment for atomic_t Finn Thain [not found] ` <20250825032743.80641-1-ioworker0@gmail.com> @ 2025-08-26 15:22 ` Eero Tamminen 2025-08-26 17:33 ` Lance Yang 2025-09-01 8:51 ` Geert Uytterhoeven 1 sibling, 2 replies; 15+ messages in thread From: Eero Tamminen @ 2025-08-26 15:22 UTC (permalink / raw) To: Finn Thain, Andrew Morton, Lance Yang Cc: Geert Uytterhoeven, Masami Hiramatsu, Peter Zijlstra, Will Deacon, stable, linux-kernel, linux-m68k Hi Finn & Lance, On 25.8.2025 5.03, Finn Thain wrote: > Some recent commits incorrectly assumed the natural alignment of locks. > That assumption fails on Linux/m68k (and, interestingly, would have failed > on Linux/cris also). This leads to spurious warnings from the hang check > code. Fix this bug by adding the necessary 'aligned' attribute. [...] > Reported-by: Eero Tamminen <oak@helsinkinet.fi> > Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc1_0g@mail.gmail.com/ > Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker") > Signed-off-by: Finn Thain <fthain@linux-m68k.org> > --- > I tested this on m68k using GCC and it fixed the problem for me. AFAIK, > the other architectures naturally align ints already so I'm expecting to > see no effect there. Yes, it fixes both of the issues (warnings & broken console): Tested-by: Eero Tamminen <oak@helsinkinet.fi> (Emulated Atari Falcon) boot up performance with this is within normal variation. On 23.8.2025 10.49, Lance Yang wrote: > Anyway, I've prepared two patches for discussion, either of which should > fix the alignment issue :) > > Patch A[1] adjusts the runtime checks to handle unaligned pointers. > Patch B[2] enforces 4-byte alignment on the core lock structures. > > Both tested on x86-64. > > [1] https://lore.kernel.org/lkml/20250823050036.7748-1-lance.yang@linux.dev > [2] https://lore.kernel.org/lkml/20250823074048.92498-1- > lance.yang@linux.dev Same goes for both of these, except that removing warnings makes minimal kernel boot 1-2% faster than 4-aligning the whole struct. - Eero ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-08-26 15:22 ` Eero Tamminen @ 2025-08-26 17:33 ` Lance Yang 2025-09-01 8:51 ` Geert Uytterhoeven 1 sibling, 0 replies; 15+ messages in thread From: Lance Yang @ 2025-08-26 17:33 UTC (permalink / raw) To: Eero Tamminen Cc: Geert Uytterhoeven, Masami Hiramatsu, Peter Zijlstra, Will Deacon, stable, linux-kernel, linux-m68k, Andrew Morton, Finn Thain Hi Eero, On 2025/8/26 23:22, Eero Tamminen wrote: > Hi Finn & Lance, > > On 25.8.2025 5.03, Finn Thain wrote: >> Some recent commits incorrectly assumed the natural alignment of locks. >> That assumption fails on Linux/m68k (and, interestingly, would have >> failed >> on Linux/cris also). This leads to spurious warnings from the hang check >> code. Fix this bug by adding the necessary 'aligned' attribute. > [...] >> Reported-by: Eero Tamminen <oak@helsinkinet.fi> >> Closes: https://lore.kernel.org/lkml/ >> CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc1_0g@mail.gmail.com/ >> Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded >> blocker") >> Signed-off-by: Finn Thain <fthain@linux-m68k.org> >> --- >> I tested this on m68k using GCC and it fixed the problem for me. AFAIK, >> the other architectures naturally align ints already so I'm expecting to >> see no effect there. > > Yes, it fixes both of the issues (warnings & broken console): > Tested-by: Eero Tamminen <oak@helsinkinet.fi> > > (Emulated Atari Falcon) boot up performance with this is within normal > variation. > > > On 23.8.2025 10.49, Lance Yang wrote: > > Anyway, I've prepared two patches for discussion, either of which should > > fix the alignment issue :) > > > > Patch A[1] adjusts the runtime checks to handle unaligned pointers. > > Patch B[2] enforces 4-byte alignment on the core lock structures. > > > > Both tested on x86-64. > > > > [1] https://lore.kernel.org/lkml/20250823050036.7748-1- > lance.yang@linux.dev > > [2] https://lore.kernel.org/lkml/20250823074048.92498-1- > > lance.yang@linux.dev > > Same goes for both of these, except that removing warnings makes minimal > kernel boot 1-2% faster than 4-aligning the whole struct. Thanks a lot for testing! Cheers, Lance ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-08-26 15:22 ` Eero Tamminen 2025-08-26 17:33 ` Lance Yang @ 2025-09-01 8:51 ` Geert Uytterhoeven 2025-09-01 15:12 ` Eero Tamminen 1 sibling, 1 reply; 15+ messages in thread From: Geert Uytterhoeven @ 2025-09-01 8:51 UTC (permalink / raw) To: Eero Tamminen Cc: Finn Thain, Andrew Morton, Lance Yang, Masami Hiramatsu, Peter Zijlstra, Will Deacon, stable, linux-kernel, linux-m68k Hi Eero, On Tue, 26 Aug 2025 at 17:22, Eero Tamminen <oak@helsinkinet.fi> wrote: > On 25.8.2025 5.03, Finn Thain wrote: > > Some recent commits incorrectly assumed the natural alignment of locks. > > That assumption fails on Linux/m68k (and, interestingly, would have failed > > on Linux/cris also). This leads to spurious warnings from the hang check > > code. Fix this bug by adding the necessary 'aligned' attribute. > [...] > > Reported-by: Eero Tamminen <oak@helsinkinet.fi> > > Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc1_0g@mail.gmail.com/ > > Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker") > > Signed-off-by: Finn Thain <fthain@linux-m68k.org> > > --- > > I tested this on m68k using GCC and it fixed the problem for me. AFAIK, > > the other architectures naturally align ints already so I'm expecting to > > see no effect there. > > Yes, it fixes both of the issues (warnings & broken console): > Tested-by: Eero Tamminen <oak@helsinkinet.fi> > > (Emulated Atari Falcon) boot up performance with this is within normal > variation. > > On 23.8.2025 10.49, Lance Yang wrote: > > Anyway, I've prepared two patches for discussion, either of which should > > fix the alignment issue :) > > > > Patch A[1] adjusts the runtime checks to handle unaligned pointers. > > Patch B[2] enforces 4-byte alignment on the core lock structures. > > > > Both tested on x86-64. > > > > [1] > https://lore.kernel.org/lkml/20250823050036.7748-1-lance.yang@linux.dev > > [2] https://lore.kernel.org/lkml/20250823074048.92498-1- > > lance.yang@linux.dev > > Same goes for both of these, except that removing warnings makes minimal > kernel boot 1-2% faster than 4-aligning the whole struct. That is an interesting outcome! So the gain of naturally-aligning the lock is more than offset by the increased cache pressure due to wasting (a bit?) more memory. Do you know what was the impact on total kernel size? Thanks! Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-09-01 8:51 ` Geert Uytterhoeven @ 2025-09-01 15:12 ` Eero Tamminen 2025-09-06 11:50 ` David Laight 0 siblings, 1 reply; 15+ messages in thread From: Eero Tamminen @ 2025-09-01 15:12 UTC (permalink / raw) To: Geert Uytterhoeven Cc: Finn Thain, Andrew Morton, Lance Yang, Masami Hiramatsu, Peter Zijlstra, Will Deacon, stable, linux-kernel, linux-m68k Hi Geert, On 1.9.2025 11.51, Geert Uytterhoeven wrote: >> On 23.8.2025 10.49, Lance Yang wrote: >> > Anyway, I've prepared two patches for discussion, either of which should >> > fix the alignment issue :) >> > >> > Patch A[1] adjusts the runtime checks to handle unaligned pointers. >> > Patch B[2] enforces 4-byte alignment on the core lock structures. >> > >> > Both tested on x86-64. >> > >> > [1] >> https://lore.kernel.org/lkml/20250823050036.7748-1-lance.yang@linux.dev >> > [2] https://lore.kernel.org/lkml/20250823074048.92498-1- >> > lance.yang@linux.dev >> >> Same goes for both of these, except that removing warnings makes minimal >> kernel boot 1-2% faster than 4-aligning the whole struct. Note that above result was from (emulated) 68030 Falcon, i.e. something that has really small caches (256-byte i-/d-cache), *and* a kernel config using CONFIG_CC_OPTIMIZE_FOR_SIZE=y (with GCC 12.2). > That is an interesting outcome! So the gain of naturally-aligning the > lock is more than offset by the increased cache pressure due to wasting > (a bit?) more memory. Another reason could be those extra inlined warning checks in: ----------------------------------------------------- $ git grep -e hung_task_set_blocker -e hung_task_clear_blocker kernel/ kernel/locking/mutex.c: hung_task_set_blocker(lock, BLOCKER_TYPE_MUTEX); kernel/locking/mutex.c: hung_task_clear_blocker(); kernel/locking/rwsem.c: hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_READER); kernel/locking/rwsem.c: hung_task_clear_blocker(); kernel/locking/rwsem.c: hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_WRITER); kernel/locking/rwsem.c: hung_task_clear_blocker(); kernel/locking/semaphore.c: hung_task_set_blocker(sem, BLOCKER_TYPE_SEM); kernel/locking/semaphore.c: hung_task_clear_blocker(); ----------------------------------------------------- > Do you know what was the impact on total kernel size? As expected, kernel code size is smaller with the static inlined warn checks removed: ----------------------------------------------------- $ size vmlinux-m68k-6.16-fix1 vmlinux-m68k-6.16-fix2 text data bss dec hex filename 3088520 953532 84224 4126276 3ef644 vmlinux-m68k-6.16-fix1 [1] 3088730 953564 84192 4126486 3ef716 vmlinux-m68k-6.16-fix2 [2] ----------------------------------------------------- But could aligning of structs have caused 32 bytes moving from BSS to DATA section? - Eero PS. I profiled these 3 kernels on emulated Falcon. According to (Hatari) profiler, main difference in the kernel with the warnings removed, is it doing less than half of the calls to NCR5380_read() / atari_scsi_reg_read(), compared to the other 2 versions. These additional 2x calls in the other two versions, seem to mostly come through chain originating from process_scheduled_works(), NCR5380_poll_politely*() functions and bus probing. After quick look at the WARN_ON_ONCE()s and SCSI code, I have no idea how having those checks being inlined to locking functions, or not, would cause a difference like that. I've tried patching & building kernels again, and repeating profiling, but result is same. While Hatari call (graph) tracking might have some issue (due to kernel stack return address manipulation), I don't see how there could be a problem with the profiler instruction counts. Kernel code at given address does not change during boot in monolithic kernel, (emulator) profiler tracks _every_ executed instruction/address, and it's clearly correct function: ------------------------------------ # disassembly with profile data: <instructions percentage>% (<sum of instructions>, <sum of cycles>, <sum of i-cache misses>, <sum of d-cache hits>) ... atari_scsi_falcon_reg_read: $001dd826 link.w a6,#$0 0.43% (414942, 1578432, 44701, 0) $001dd82a move.w sr,d1 0.43% (414942, 224, 8, 0) $001dd82c ori.w #$700,sr 0.43% (414942, 414368, 44705, 0) $001dd830 move.l $8(a6),d0 0.43% (414942, 357922, 44705, 414911) $001dd834 addi.l #$88,d0 0.43% (414942, 1014804, 133917, 0) $001dd83a move.w d0,$8606.w 0.43% (414942, 3618352, 89169, 0) $001dd83e move.w $8604.w,d0 0.43% (414942, 3620646, 89162, 0) $001dd842 move.w d1,sr 0.43% (414942, 2148, 142, 0) $001dd844 unlk a6 0.43% (414942, 436, 0, 414893) $001dd846 rts 0.43% (414942, 1073934, 134123, 414942) atari_scsi_falcon_reg_write: $001dd848 link.w a6,#$0 0.00% (81, 484, 29, 0) $001dd84c move.l $c(a6),d0 0.00% (81, 326, 29, 73) ... ------------------------------------ Maybe those WARN_ON_ONCE() checks just happen to slow down something marginally so that things get interrupted & re-started more for the SCSI code? PPS. emulated machine has no SCSI drives, only one IDE drive (with 4MB Busybox partition): ---------------------------------------------------- scsi host0: Atari native SCSI, irq 15, io_port 0x0, base 0x0, can_queue 1, cmd_per_lun 2, sg_tablesize 1, this_id 7, flags { } atari-falcon-ide atari-falcon-ide: Atari Falcon and Q40/Q60 PATA controller scsi host1: pata_falcon ata1: PATA max PIO4 cmd fff00000 ctl fff00038 data fff00000 no IRQ, using PIO polling ... ata1: found unknown device (class 0) ata1.00: ATA-7: Hatari IDE disk 4M, 1.0, max UDMA/100 ata1.00: 8192 sectors, multi 16: LBA48 ata1.00: configured for PIO ... scsi 1:0:0:0: Direct-Access ATA Hatari IDE disk 1.0 PQ: 0 ANSI: 5 sd 1:0:0:0: [sda] 8192 512-byte logical blocks: (4.19 MB/4.00 MiB) sd 1:0:0:0: [sda] Write Protect is off sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 1:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA sd 1:0:0:0: [sda] Preferred minimum I/O size 512 bytes sd 1:0:0:0: [sda] Attached SCSI disk VFS: Mounted root (ext2 filesystem) readonly on device 8:0. --------------------------------------------------- ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] atomic: Specify natural alignment for atomic_t 2025-09-01 15:12 ` Eero Tamminen @ 2025-09-06 11:50 ` David Laight 0 siblings, 0 replies; 15+ messages in thread From: David Laight @ 2025-09-06 11:50 UTC (permalink / raw) To: Eero Tamminen Cc: Geert Uytterhoeven, Finn Thain, Andrew Morton, Lance Yang, Masami Hiramatsu, Peter Zijlstra, Will Deacon, stable, linux-kernel, linux-m68k On Mon, 1 Sep 2025 18:12:53 +0300 Eero Tamminen <oak@helsinkinet.fi> wrote: > Hi Geert, > > On 1.9.2025 11.51, Geert Uytterhoeven wrote: > >> On 23.8.2025 10.49, Lance Yang wrote: > >> > Anyway, I've prepared two patches for discussion, either of which should > >> > fix the alignment issue :) > >> > > >> > Patch A[1] adjusts the runtime checks to handle unaligned pointers. > >> > Patch B[2] enforces 4-byte alignment on the core lock structures. > >> > > >> > Both tested on x86-64. > >> > > >> > [1] > >> https://lore.kernel.org/lkml/20250823050036.7748-1-lance.yang@linux.dev > >> > [2] https://lore.kernel.org/lkml/20250823074048.92498-1- > >> > lance.yang@linux.dev > >> > >> Same goes for both of these, except that removing warnings makes minimal > >> kernel boot 1-2% faster than 4-aligning the whole struct. > > Note that above result was from (emulated) 68030 Falcon, i.e. something > that has really small caches (256-byte i-/d-cache), *and* a kernel > config using CONFIG_CC_OPTIMIZE_FOR_SIZE=y (with GCC 12.2). If you are emulating it on x86 the misaligned memory accesses are likely to be zero cost. On a real '030 I'd expect them to be implemented as two memory accesses. I also doubt (but a guess) that the emulator even attempts to emulate the '030 caches. If they are like the '020 ones the i-cache really only helps short loops. It is more likely that the cost of WARN_ON_ONCE() is far more than you might expect. Especially since it will affect register allocation in the function(s). David > > > > That is an interesting outcome! So the gain of naturally-aligning the > > lock is more than offset by the increased cache pressure due to wasting > > (a bit?) more memory. > > Another reason could be those extra inlined warning checks in: > ----------------------------------------------------- > $ git grep -e hung_task_set_blocker -e hung_task_clear_blocker kernel/ > kernel/locking/mutex.c: hung_task_set_blocker(lock, BLOCKER_TYPE_MUTEX); > kernel/locking/mutex.c: hung_task_clear_blocker(); > kernel/locking/rwsem.c: hung_task_set_blocker(sem, > BLOCKER_TYPE_RWSEM_READER); > kernel/locking/rwsem.c: hung_task_clear_blocker(); > kernel/locking/rwsem.c: hung_task_set_blocker(sem, > BLOCKER_TYPE_RWSEM_WRITER); > kernel/locking/rwsem.c: hung_task_clear_blocker(); > kernel/locking/semaphore.c: hung_task_set_blocker(sem, > BLOCKER_TYPE_SEM); > kernel/locking/semaphore.c: hung_task_clear_blocker(); > ----------------------------------------------------- > > > > Do you know what was the impact on total kernel size? > > As expected, kernel code size is smaller with the static inlined warn > checks removed: > ----------------------------------------------------- > $ size vmlinux-m68k-6.16-fix1 vmlinux-m68k-6.16-fix2 > text data bss dec hex filename > 3088520 953532 84224 4126276 3ef644 vmlinux-m68k-6.16-fix1 [1] > 3088730 953564 84192 4126486 3ef716 vmlinux-m68k-6.16-fix2 [2] > ----------------------------------------------------- > > But could aligning of structs have caused 32 bytes moving from BSS to > DATA section? > > > - Eero > > PS. I profiled these 3 kernels on emulated Falcon. According to (Hatari) > profiler, main difference in the kernel with the warnings removed, is it > doing less than half of the calls to NCR5380_read() / > atari_scsi_reg_read(), compared to the other 2 versions. > > These additional 2x calls in the other two versions, seem to mostly come > through chain originating from process_scheduled_works(), > NCR5380_poll_politely*() functions and bus probing. > > After quick look at the WARN_ON_ONCE()s and SCSI code, I have no idea > how having those checks being inlined to locking functions, or not, > would cause a difference like that. I've tried patching & building > kernels again, and repeating profiling, but result is same. > > While Hatari call (graph) tracking might have some issue (due to kernel > stack return address manipulation), I don't see how there could be a > problem with the profiler instruction counts. Kernel code at given > address does not change during boot in monolithic kernel, (emulator) > profiler tracks _every_ executed instruction/address, and it's clearly > correct function: > ------------------------------------ > # disassembly with profile data: <instructions percentage>% (<sum of > instructions>, <sum of cycles>, <sum of i-cache misses>, <sum of d-cache > hits>) > ... > atari_scsi_falcon_reg_read: > $001dd826 link.w a6,#$0 0.43% (414942, 1578432, 44701, 0) > $001dd82a move.w sr,d1 0.43% (414942, 224, 8, 0) > $001dd82c ori.w #$700,sr 0.43% (414942, 414368, 44705, 0) > $001dd830 move.l $8(a6),d0 0.43% (414942, 357922, 44705, 414911) > $001dd834 addi.l #$88,d0 0.43% (414942, 1014804, 133917, 0) > $001dd83a move.w d0,$8606.w 0.43% (414942, 3618352, 89169, 0) > $001dd83e move.w $8604.w,d0 0.43% (414942, 3620646, 89162, 0) > $001dd842 move.w d1,sr 0.43% (414942, 2148, 142, 0) > $001dd844 unlk a6 0.43% (414942, 436, 0, 414893) > $001dd846 rts 0.43% (414942, 1073934, 134123, 414942) > atari_scsi_falcon_reg_write: > $001dd848 link.w a6,#$0 0.00% (81, 484, 29, 0) > $001dd84c move.l $c(a6),d0 0.00% (81, 326, 29, 73) > ... > ------------------------------------ > > Maybe those WARN_ON_ONCE() checks just happen to slow down something > marginally so that things get interrupted & re-started more for the SCSI > code? > > PPS. emulated machine has no SCSI drives, only one IDE drive (with 4MB > Busybox partition): > ---------------------------------------------------- > scsi host0: Atari native SCSI, irq 15, io_port 0x0, base 0x0, can_queue > 1, cmd_per_lun 2, sg_tablesize 1, this_id 7, flags { } > atari-falcon-ide atari-falcon-ide: Atari Falcon and Q40/Q60 PATA controller > scsi host1: pata_falcon > ata1: PATA max PIO4 cmd fff00000 ctl fff00038 data fff00000 no IRQ, > using PIO polling > ... > ata1: found unknown device (class 0) > ata1.00: ATA-7: Hatari IDE disk 4M, 1.0, max UDMA/100 > ata1.00: 8192 sectors, multi 16: LBA48 > ata1.00: configured for PIO > ... > scsi 1:0:0:0: Direct-Access ATA Hatari IDE disk 1.0 PQ: 0 ANSI: 5 > sd 1:0:0:0: [sda] 8192 512-byte logical blocks: (4.19 MB/4.00 MiB) > sd 1:0:0:0: [sda] Write Protect is off > sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00 > sd 1:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't > support DPO or FUA > sd 1:0:0:0: [sda] Preferred minimum I/O size 512 bytes > sd 1:0:0:0: [sda] Attached SCSI disk > VFS: Mounted root (ext2 filesystem) readonly on device 8:0. > --------------------------------------------------- > ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-09-06 11:51 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-08-25 2:03 [PATCH] atomic: Specify natural alignment for atomic_t Finn Thain [not found] ` <20250825032743.80641-1-ioworker0@gmail.com> [not found] ` <c8851682-25f1-f594-e30f-5b62e019d37b@linux-m68k.org> [not found] ` <96ae7afc-c882-4c3d-9dea-3e2ae2789caf@linux.dev> [not found] ` <5a44c60b-650a-1f8a-d5cb-abf9f0716817@linux-m68k.org> [not found] ` <4e7e7292-338d-4a57-84ec-ae7427f6ad7c@linux.dev> 2025-08-25 10:49 ` Finn Thain 2025-08-25 11:19 ` Lance Yang 2025-08-25 11:36 ` Lance Yang 2025-08-27 23:43 ` Finn Thain 2025-08-28 2:05 ` Lance Yang 2025-09-01 8:45 ` Geert Uytterhoeven 2025-09-02 13:30 ` Lance Yang 2025-09-02 14:14 ` Geert Uytterhoeven 2025-09-06 11:43 ` David Laight 2025-08-26 15:22 ` Eero Tamminen 2025-08-26 17:33 ` Lance Yang 2025-09-01 8:51 ` Geert Uytterhoeven 2025-09-01 15:12 ` Eero Tamminen 2025-09-06 11:50 ` David Laight
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).