linuxppc-dev.lists.ozlabs.org archive mirror
From: Erhard Furtner <erhard_f@mailbox.org>
To: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: KASAN debug kernel fails to boot at early stage when CONFIG_SMP=y is set (kernel 6.5-rc5, PowerMac G4 3,6)
Date: Fri, 1 Sep 2023 00:44:17 +0200
Message-ID: <20230901004417.631dc019@yea>
In-Reply-To: <1085cc49-b5e8-0aa5-dc97-ec4a100463b5@csgroup.eu>

On Thu, 31 Aug 2023 05:32:46 +0000
Christophe Leroy <christophe.leroy@csgroup.eu> wrote:

> Ok so there is some corrupted memory somewhere.
> 
> Can you try what happens when you remove the call to kasan_init() at the 
> start of setup_arch() in arch/powerpc/kernel/setup-common.c

Ok, so I left the other patches in place + btext_map() instead of btext_unmap() at the end of MMU_init() + Michael's patch, and additionally commented out kasan_init() as stated above. The outcome is rather interesting! Now I deterministically get this output on the OF console at boot, regardless of whether it's a cold boot or a warm boot:

via-pmu: Server Mode is disabled
PMU driver v2 initialized for Core99, firmware: 0c
ioremap() called early from pmac_nvram_init+0x208/0x7c0. Use early_ioremap() instead
nvram: Checking bank 0...
nvram: gen0=3234, gen1=3235
nvram: Active bank is: 1
nvram: OF partition at 0x410
nvram: XP partition at 0x1020
nvram: NR partition at 0x1120
Top of RAM: 0x80000000, Total RAM: 0x80000000
Memory hole size: 0MB
Zone ranges:
  DMA      [mem 0x0000000000000000-0x000000002fffffff]
  Normal   empty
  HighMem  [mem 0x0000000030000000-0x000000007fffffff]
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x0000000000000000-0x000000007fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
percpu: Embedded 14 pages/cpu s24608 r8192 d24544 u57344
pcpu-alloc: s24608 r8192 d24544 u57344 alloc=14*4096
pcpu-alloc: [0] 0 
Kernel command line: ro root=/dev/sda5 nr_cpus=1 zswap.max_pool_percent=16 slub_debug=FZP page_poison=1 netconsole=6666@192.168.178.8/eth0,6666@192.168.178.3/70:85:C2:30:EC:01 init=/usr/lib/systemd/systemd 
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
Built 1 zonelists, mobility grouping on.  Total pages: 522560
mem auto-init: stack:all(pattern), heap alloc:off, heap free:off
stackdepot: allocating hash table via alloc_large_system_hash
stackdepot hash table entries: 1048576 (order: 10, 4194304 bytes, linear)
==================================================================
BUG: KASAN: stack-out-of-bounds in __kernel_poison_pages+0x6c/0xd0
Write of size 4896 at addr c17a7000 by task swapper/0

CPU: 0 PID: 0 Comm: swapper Tainted: G                T 6.5.0-rc7-PMacG4-dirty #7
Hardware name: PowerMac3,6 7455 0x80010303 PowerMac
Call Trace:
[c1717ce0] [c0f4ec40] dump_stack_lvl+0x60/0xa4 (unreliable)
[c1717d00] [c0368380] print_report+0x154/0x548
[c1717d50] [c036813c] kasan_report+0xd0/0x160
[c1717db0] [c0369bb4] kasan_check_range+0x1c8/0x308
[c1717dc0] [c036ae88] memset+0x34/0x90
[c1717de0] [c035b6e0] __kernel_poison_pages+0x6c/0xd0
[c1717e00] [c03355e4] __free_pages_ok+0x418/0x500
[c1717e60] [c14372c8] memblock_free_all+0x268/0x400
[c1717f20] [c14103fc] mem_init+0x8c/0x274
[c1717f60] [c1431cd0] mm_core_init+0x240/0x4e0
[c1717fc0] [c1404694] start_kernel+0x150/0x2d8
[c1717f00] [000035d0] 0x35d0

The buggy address belongs to the physical page:
page:(ptrval) refcount:0 mapcount:0 mapping:00000000 index:0x0 pfn:0x17a7
flags: 0x0(zone=0)
page_type: 0xffffffff()
raw: 00000000 eee15380 eee15380 00000000 00000000 00000000 ffffffff 00000000
raw: 00000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 c17a7d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 c17a7d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>c17a7e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
                                                     ^
 c17a7e80: f1 f1 04 f2 04 f2 00 f3 f3 f3 00 00 00 00 00 00
 c17a7f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
Disabling lock debugging due to kernel taint
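
For reference, the shadow dump in the report above can be decoded by hand. The following is only a minimal sketch, assuming generic KASAN (one shadow byte per 8 bytes of memory, f1/f2/f3 being the stack left/mid/right redzone markers); the names ROWS, GRANULE and first_poisoned() are made up for illustration and are not kernel identifiers. It locates the first redzone byte that the reported write starting at c17a7000 would run into:

```python
# Shadow bytes copied from the "Memory state around the buggy address" dump.
# Assumption: generic KASAN, where one shadow byte describes 8 bytes of memory,
# so each 16-byte row of the dump covers 128 bytes.
GRANULE = 8

# Stack redzone markers used by generic KASAN.
MEANING = {
    0xF1: "stack left redzone",
    0xF2: "stack mid redzone",
    0xF3: "stack right redzone",
}

# Row base address -> shadow bytes, as printed in the report.
ROWS = {
    0xC17A7E00: "00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1",
    0xC17A7E80: "f1 f1 04 f2 04 f2 00 f3 f3 f3 00 00 00 00 00 00",
}

ACCESS_START = 0xC17A7000  # "Write of size 4896 at addr c17a7000"
ACCESS_SIZE = 4896


def first_poisoned(rows):
    """Return (address, shadow byte) of the first poisoned (>= 0x80) shadow byte."""
    for base in sorted(rows):
        for i, tok in enumerate(rows[base].split()):
            val = int(tok, 16)
            if val >= 0x80:
                return base + i * GRANULE, val
    return None


addr, val = first_poisoned(ROWS)
print(f"first poisoned granule: {addr:#x} ({MEANING[val]})")
print(f"offset into the write:  {addr - ACCESS_START:#x} of {ACCESS_SIZE:#x}")
```

Under these assumptions the first redzone byte is at c17a7e70, i.e. 0xe70 bytes into the write, which is inside the first page starting at c17a7000.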

> I'd also be curious to know what happens when CONFIG_DEBUG_SPINLOCK is 
> disabled.

Disabling CONFIG_DEBUG_SPINLOCK does not change the output above. ^^

> Another question which I'm not sure I asked already: Is it a new problem 
> you have got with recent kernels or is it just that you never tried such 
> a config with older kernels ?

I wanted to revisit https://bugzilla.kernel.org/show_bug.cgi?id=216041 and verify whether it was resolved. KASAN worked on my G4 around 2019-2021: I reported some bugs with it around that time and you fixed some of them. ;) For example kernel bugzilla #205099, #216190 and #205885.

But it always seemed flaky on the G4 and had its problems. So I can't tell whether this specific issue was there back then or if it's new. At least bug #216190 was also about KASAN and SMP issues.

> Also, when you say you need to start with another SMP kernel first and 
> then you don't have the problem anymore until the next cold reboot, do 
> you mean you have some old kernel with KASAN that works, or is it a 
> kernel without KASAN that you have to start first ?

The latter. I first start a non-KASAN SMP kernel and afterwards reboot into a KASAN kernel. But now, with kasan_init() commented out at the start of setup_arch() in arch/powerpc/kernel/setup-common.c, this does not work anymore. I get the dmesg above all the time, on both cold and warm boots.

Regards,
Erhard
