linux-arm-kernel.lists.infradead.org archive mirror
* [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
@ 2024-08-30  8:35 syzbot
  2024-08-30  9:52 ` Will Deacon
  0 siblings, 1 reply; 14+ messages in thread
From: syzbot @ 2024-08-30  8:35 UTC
  To: catalin.marinas, linux-arm-kernel, linux-kernel, syzkaller-bugs,
	will

Hello,

syzbot found the following issue on:

HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/384ffdcca292/non_bootable_disk-33faa93b.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/9093742fcee9/vmlinux-33faa93b.xz
kernel image: https://storage.googleapis.com/syzbot-assets/b1f599907931/Image-33faa93b.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+908886656a02769af987@syzkaller.appspotmail.com

Booting Linux on physical CPU 0x0000000000 [0x000f0510]
Linux version 6.11.0-rc5-syzkaller-g33faa93bc856 (syzkaller@syzkaller) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #0 SMP PREEMPT now
random: crng init done
Machine model: linux,dummy-virt
efi: UEFI not found.
NUMA: No NUMA configuration found
NUMA: Faking a node at [mem 0x0000000040000000-0x00000000bfffffff]
NUMA: NODE_DATA [mem 0xbfc1d340-0xbfc20fff]
Zone ranges:
  DMA      [mem 0x0000000040000000-0x00000000bfffffff]
  DMA32    empty
  Normal   empty
  Device   empty
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x0000000040000000-0x00000000bfffffff]
Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
cma: Reserved 32 MiB at 0x00000000bba00000 on node -1
psci: probing for conduit method from DT.
psci: PSCIv1.1 detected in firmware.
psci: Using standard PSCI v0.2 function IDs
psci: Trusted OS migration not required
psci: SMC Calling Convention v1.0
==================================================================
BUG: KASAN: invalid-access in smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
BUG: KASAN: invalid-access in setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
Write of size 4 at addr 03ff800086867e00 by task swapper/0
Pointer tag: [03], memory tag: [fe]

CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc5-syzkaller-g33faa93bc856 #0
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace+0x204/0x3b8 arch/arm64/kernel/stacktrace.c:317
 show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
 __dump_stack lib/dump_stack.c:93 [inline]
 dump_stack_lvl+0x260/0x3b4 lib/dump_stack.c:119
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x118/0x5ac mm/kasan/report.c:488
 kasan_report+0xc8/0x108 mm/kasan/report.c:601
 kasan_check_range+0x94/0xb8 mm/kasan/sw_tags.c:84
 __hwasan_store4_noabort+0x20/0x2c mm/kasan/sw_tags.c:149
 smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
 setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
 start_kernel+0xe0/0xff0 init/main.c:926
 __primary_switched+0x84/0x8c arch/arm64/kernel/head.S:243

The buggy address belongs to stack of task swapper/0

Memory state around the buggy address:
 ffff800086867c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff800086867d00: 00 fe fe 00 00 00 fe fe fe fe fe fe fe fe fe fe
>ffff800086867e00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
                   ^
 ffff800086867f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
 ffff800086868000: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
==================================================================
percpu: Embedded 35 pages/cpu s104840 r8192 d30328 u143360
Detected PIPT I-cache on CPU0
CPU features: detected: GIC system register CPU interface
CPU features: detected: HCRX_EL2 register
CPU features: detected: 52-bit Virtual Addressing (LPA2)
CPU features: detected: Virtualization Host Extensions
CPU features: detected: Spectre-v4
alternatives: applying boot alternatives
kasan: KernelAddressSanitizer initialized (sw-tags, stacktrace=on)
Kernel command line: root=/dev/vda console=ttyAMA0 
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes, linear)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
Fallback order for Node 0: 0 
Built 1 zonelists, mobility grouping on.  Total pages: 524288
Policy zone: DMA
mem auto-init: stack:all(zero), heap alloc:on, heap free:off
stackdepot: allocating hash table via alloc_large_system_hash
stackdepot hash table entries: 1048576 (order: 12, 16777216 bytes, linear)
software IO TLB: SWIOTLB bounce buffer size adjusted to 2MB
software IO TLB: area num 1.
software IO TLB: mapped [mem 0x00000000b1a29000-0x00000000b1c29000] (2MB)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
allocated 4194304 bytes of page_ext
trace event string verifier disabled
Running RCU self tests
Running RCU synchronous self tests
rcu: Preemptible hierarchical RCU implementation.
rcu: 	RCU lockdep checking is enabled.
rcu: 	RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=1.
rcu: 	RCU callback double-/use-after-free debug is enabled.
rcu: 	RCU debug extended QS entry/exit.
	Trampoline variant of Tasks RCU enabled.
	Tracing variant of Tasks RCU enabled.
rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
Running RCU synchronous self tests
RCU Tasks: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1.
RCU Tasks Trace: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1.
NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
GICv3: GIC: Using split EOI/Deactivate mode
GICv3: 256 SPIs implemented
GICv3: 0 Extended SPIs implemented
Root IRQ handler: gic_handle_irq
GICv3: GICv3 features: 16 PPIs
GICv3: GICv4 features: 
GICv3: GICD_CTRL.DS=1, SCR_EL3.FIQ=0
GICv3: CPU0: found redistributor 0 region 0:0x00000000080a0000
ITS [mem 0x08080000-0x0809ffff]
ITS@0x0000000008080000: Single VMOVP capable
ITS@0x0000000008080000: allocated 8192 Devices @4a230000 (indirect, esz 8, psz 64K, shr 1)
ITS@0x0000000008080000: allocated 8192 Interrupt Collections @4a240000 (flat, esz 8, psz 64K, shr 1)
ITS@0x0000000008080000: allocated 8192 Virtual CPUs @4a250000 (indirect, esz 8, psz 64K, shr 1)
GICv3: using LPI property table @0x000000004a260000
ITS: Allocated DevID ffff as GICv4 proxy device (2 slots)
ITS: Enabling GICv4 support
GICv3: CPU0: using allocated LPI pending table @0x000000004a270000
rcu: srcu_init: Setting srcu_struct sizes based on contention.
arch_timer: cp15 timer(s) running at 62.50MHz (phys).
clocksource: arch_sys_counter: mask: 0x1ffffffffffffff max_cycles: 0x1cd42e208c, max_idle_ns: 881590405314 ns
sched_clock: 57 bits at 63MHz, resolution 16ns, wraps every 4398046511096ns
Console: colour dummy device 80x25
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:  8
... MAX_LOCK_DEPTH:          48
... MAX_LOCKDEP_KEYS:        8192
... CLASSHASH_SIZE:          4096
... MAX_LOCKDEP_ENTRIES:     131072
... MAX_LOCKDEP_CHAINS:      65536
... CHAINHASH_SIZE:          32768
 memory used by lock dependency info: 11817 kB
 memory used for stack traces: 8320 kB
 per task-struct memory footprint: 1920 bytes
Calibrating delay loop (skipped), value calculated using timer frequency.. 125.00 BogoMIPS (lpj=625000)
pid_max: default: 32768 minimum: 301
LSM: initializing lsm=lockdown,capability,landlock,yama,safesetid,tomoyo,selinux,ima,evm
landlock: Up and running.
Yama: becoming mindful.
TOMOYO Linux initialized
SELinux:  Initializing.
Mount-cache hash table entries: 4096 (order: 3, 32768 bytes, linear)
Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes, linear)
Running RCU synchronous self tests
Running RCU synchronous self tests


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup



* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-08-30  8:35 [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch syzbot
@ 2024-08-30  9:52 ` Will Deacon
  2024-08-31 17:52   ` Marc Zyngier
  0 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2024-08-30  9:52 UTC
  To: syzbot; +Cc: catalin.marinas, linux-arm-kernel, linux-kernel, syzkaller-bugs,
	maz

On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
> git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme

+Marc, as this is his branch.

> console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
> dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> userspace arch: arm64
> 
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/384ffdcca292/non_bootable_disk-33faa93b.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/9093742fcee9/vmlinux-33faa93b.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/b1f599907931/Image-33faa93b.gz.xz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+908886656a02769af987@syzkaller.appspotmail.com
> 
> Booting Linux on physical CPU 0x0000000000 [0x000f0510]
> Linux version 6.11.0-rc5-syzkaller-g33faa93bc856 (syzkaller@syzkaller) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #0 SMP PREEMPT now
> random: crng init done
> Machine model: linux,dummy-virt
> efi: UEFI not found.
> NUMA: No NUMA configuration found
> NUMA: Faking a node at [mem 0x0000000040000000-0x00000000bfffffff]
> NUMA: NODE_DATA [mem 0xbfc1d340-0xbfc20fff]
> Zone ranges:
>   DMA      [mem 0x0000000040000000-0x00000000bfffffff]
>   DMA32    empty
>   Normal   empty
>   Device   empty
> Movable zone start for each node
> Early memory node ranges
>   node   0: [mem 0x0000000040000000-0x00000000bfffffff]
> Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
> cma: Reserved 32 MiB at 0x00000000bba00000 on node -1
> psci: probing for conduit method from DT.
> psci: PSCIv1.1 detected in firmware.
> psci: Using standard PSCI v0.2 function IDs
> psci: Trusted OS migration not required
> psci: SMC Calling Convention v1.0
> ==================================================================
> BUG: KASAN: invalid-access in smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
> BUG: KASAN: invalid-access in setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
> Write of size 4 at addr 03ff800086867e00 by task swapper/0
> Pointer tag: [03], memory tag: [fe]
> 
> CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc5-syzkaller-g33faa93bc856 #0
> Hardware name: linux,dummy-virt (DT)
> Call trace:
>  dump_backtrace+0x204/0x3b8 arch/arm64/kernel/stacktrace.c:317
>  show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
>  __dump_stack lib/dump_stack.c:93 [inline]
>  dump_stack_lvl+0x260/0x3b4 lib/dump_stack.c:119
>  print_address_description mm/kasan/report.c:377 [inline]
>  print_report+0x118/0x5ac mm/kasan/report.c:488
>  kasan_report+0xc8/0x108 mm/kasan/report.c:601
>  kasan_check_range+0x94/0xb8 mm/kasan/sw_tags.c:84
>  __hwasan_store4_noabort+0x20/0x2c mm/kasan/sw_tags.c:149
>  smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
>  setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
>  start_kernel+0xe0/0xff0 init/main.c:926
>  __primary_switched+0x84/0x8c arch/arm64/kernel/head.S:243
> 
> The buggy address belongs to stack of task swapper/0
> 
> Memory state around the buggy address:
>  ffff800086867c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  ffff800086867d00: 00 fe fe 00 00 00 fe fe fe fe fe fe fe fe fe fe
> >ffff800086867e00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
>                    ^
>  ffff800086867f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
>  ffff800086868000: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> ==================================================================

I can't spot the issue here. We have a couple of fixed-length
(4 element) arrays on the stack and they're indexed by a simple loop
counter that runs from 0-3.
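
For readers decoding the report above: with software tag-based KASAN on
arm64 the tag is carried in the top byte of the pointer, and the report
prints the address with that tag still in place. A minimal Python sketch
of the arithmetic follows. The helper names are illustrative only and are
not kernel APIs.

```python
# Illustrative helpers (hypothetical names, not kernel code): split a
# software-tagged arm64 kernel pointer into its tag and untagged address.

TAG_SHIFT = 56            # the tag occupies the top byte of a 64-bit pointer
KERNEL_TOP_BYTE = 0xFF    # top byte of an untagged kernel address


def pointer_tag(addr: int) -> int:
    """Return the top byte of a 64-bit pointer."""
    return (addr >> TAG_SHIFT) & 0xFF


def untagged_kernel_address(addr: int) -> int:
    """Replace the tag byte with 0xff to get the plain kernel address."""
    low_bits = addr & ((1 << TAG_SHIFT) - 1)
    return (KERNEL_TOP_BYTE << TAG_SHIFT) | low_bits


reported = 0x03FF800086867E00  # "Write of size 4 at addr ..." from the report
print(hex(pointer_tag(reported)))
print(hex(untagged_kernel_address(reported)))
```

The untagged value is the address of the row marked with `>` in the
memory-state dump, and the pointer tag 03 differs from the memory tag fe
shown there, which is the mismatch being reported.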

Will



* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-08-30  9:52 ` Will Deacon
@ 2024-08-31 17:52   ` Marc Zyngier
  2024-09-02 10:03     ` Aleksandr Nogikh
                       ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Marc Zyngier @ 2024-08-31 17:52 UTC
  To: Will Deacon
  Cc: syzbot, catalin.marinas, linux-arm-kernel, linux-kernel,
	syzkaller-bugs

On Fri, 30 Aug 2024 10:52:54 +0100,
Will Deacon <will@kernel.org> wrote:
> 
> On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following issue on:
> > 
> > HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
> > git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
> 
> +Marc, as this is his branch.
>
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
> > dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
> > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > userspace arch: arm64

As it turns out, this isn't specific to this branch. I can reproduce
it with this config on a vanilla 6.10 as a KVM guest. Even worse,
compiling with clang results in an unbootable kernel (without any
output at all).

Mind you, the binary is absolutely massive (130MB with gcc, 156MB with
clang), and I wouldn't be surprised if we were hitting some kind of
odd limit.

> 
> I can't spot the issue here. We have a couple of fixed-length
> (4 element) arrays on the stack and they're indexed by a simple loop
> counter that runs from 0-3.

Having trimmed the config to the extreme, I can only trigger the
warning with CONFIG_KASAN_SW_TAGS (CONFIG_KASAN_GENERIC does not
scream). Same thing if I use gcc 14.2.0.

However, compiling with clang 14 (Debian clang version 14.0.6) does
*not* result in a screaming kernel, even with KASAN_SW_TAGS.

So I can see two possibilities here:

- either gcc is incompatible with KASAN_SW_TAGS and the generic
  version is the only one that works

- or we have a compiler bug on our hands.

Frankly, I can't believe the latter, as the code is so daft that I
can't imagine gcc getting it *that* wrong.
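
As background for the comparison above: in the software tag-based mode
each shadow byte holds the tag for one granule of memory, and an access
is flagged when the pointer tag and that memory tag differ. The sketch
below is a Python model, not kernel code. It parses the "Memory state"
rows from the report and looks up the tag covering the faulting address,
assuming a 16-byte granule and 16 shadow bytes per row (consistent with
the row addresses stepping by 0x100). Treating a pointer tag of 0xff as
matching everything is likewise an assumption of this model.

```python
GRANULE = 16        # bytes covered by one shadow byte (assumed, see above)
MATCH_ALL = 0xFF    # pointer tag assumed to be exempt from checking

DUMP = """\
 ffff800086867c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff800086867d00: 00 fe fe 00 00 00 fe fe fe fe fe fe fe fe fe fe
>ffff800086867e00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
 ffff800086867f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
 ffff800086868000: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
"""


def parse_dump(text):
    """Map each row's base address to its list of shadow bytes."""
    rows = {}
    for line in text.splitlines():
        base, _, tags = line.lstrip(" >").partition(":")
        rows[int(base, 16)] = [int(t, 16) for t in tags.split()]
    return rows


def memory_tag(rows, addr):
    """Return the shadow byte covering addr, or None if outside the dump."""
    for base, tags in rows.items():
        index = (addr - base) // GRANULE
        if 0 <= index < len(tags):
            return tags[index]
    return None


def access_allowed(ptr_tag, mem_tag):
    """Model of the tag comparison: a mismatch is reported as invalid."""
    return ptr_tag == MATCH_ALL or ptr_tag == mem_tag


rows = parse_dump(DUMP)
mem = memory_tag(rows, 0xFFFF800086867E00)
print(hex(mem), access_allowed(0x03, mem))
```

For the reported address this yields memory tag fe against pointer tag
03, so the model flags the access just as the report does. It says
nothing about which side (the tag in the pointer or the tag in shadow
memory) is the wrong one.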

Who knows enough about KASAN to dig into this?

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-08-31 17:52   ` Marc Zyngier
@ 2024-09-02 10:03     ` Aleksandr Nogikh
  2024-09-03 15:39       ` Alexander Potapenko
  2024-09-04 18:26     ` Mark Rutland
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Aleksandr Nogikh @ 2024-09-02 10:03 UTC
  To: Marc Zyngier, kasan-dev
  Cc: Will Deacon, syzbot, catalin.marinas, linux-arm-kernel,
	linux-kernel, syzkaller-bugs

+kasan-dev

On Sat, Aug 31, 2024 at 7:53 PM 'Marc Zyngier' via syzkaller-bugs
<syzkaller-bugs@googlegroups.com> wrote:
>
> On Fri, 30 Aug 2024 10:52:54 +0100,
> Will Deacon <will@kernel.org> wrote:
> >
> > On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
> > > git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
> >
> > +Marc, as this is his branch.
> >
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
> > > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > > userspace arch: arm64
>
> As it turns out, this isn't specific to this branch. I can reproduce
> it with this config on a vanilla 6.10 as a KVM guest. Even worse,
> compiling with clang results in an unbootable kernel (without any
> output at all).
>
> Mind you, the binary is absolutely massive (130MB with gcc, 156MB with
> clang), and I wouldn't be surprised if we were hitting some kind of
> odd limit.
>
> >
> > I can't spot the issue here. We have a couple of fixed-length
> > (4 element) arrays on the stack and they're indexed by a simple loop
> > counter that runs from 0-3.
>
> Having trimmed the config to the extreme, I can only trigger the
> warning with CONFIG_KASAN_SW_TAGS (CONFIG_KASAN_GENERIC does not
> scream). Same thing if I use gcc 14.2.0.
>
> However, compiling with clang 14 (Debian clang version 14.0.6) does
> *not* result in a screaming kernel, even with KASAN_SW_TAGS.
>
> So I can see two possibilities here:
>
> - either gcc is incompatible with KASAN_SW_TAGS and the generic
>   version is the only one that works
>
> - or we have a compiler bug on our hands.
>
> Frankly, I can't believe the latter, as the code is so daft that I
> can't imagine gcc getting it *that* wrong.
>
> Who knows enough about KASAN to dig into this?
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.
>



* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-09-02 10:03     ` Aleksandr Nogikh
@ 2024-09-03 15:39       ` Alexander Potapenko
  2024-09-03 16:05         ` Marc Zyngier
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Potapenko @ 2024-09-03 15:39 UTC
  To: samuel.holland, Andrey Konovalov
  Cc: Marc Zyngier, Aleksandr Nogikh, kasan-dev, Will Deacon, syzbot,
	catalin.marinas, linux-arm-kernel, linux-kernel, syzkaller-bugs

On Mon, Sep 2, 2024 at 12:03 PM 'Aleksandr Nogikh' via kasan-dev
<kasan-dev@googlegroups.com> wrote:
>
> +kasan-dev
>
> On Sat, Aug 31, 2024 at 7:53 PM 'Marc Zyngier' via syzkaller-bugs
> <syzkaller-bugs@googlegroups.com> wrote:
> >
> > On Fri, 30 Aug 2024 10:52:54 +0100,
> > Will Deacon <will@kernel.org> wrote:
> > >
> > > On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
> > > > git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
> > >
> > > +Marc, as this is his branch.
> > >
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
> > > > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > > > userspace arch: arm64
> >
> > As it turns out, this isn't specific to this branch. I can reproduce
> > it with this config on a vanilla 6.10 as a KVM guest. Even worse,
> > compiling with clang results in an unbootable kernel (without any
> > output at all).
> >
> > Mind you, the binary is absolutely massive (130MB with gcc, 156MB with
> > clang), and I wouldn't be surprised if we were hitting some kind of
> > odd limit.
> >
> > >
> > > I can't spot the issue here. We have a couple of fixed-length
> > > (4 element) arrays on the stack and they're indexed by a simple loop
> > > counter that runs from 0-3.
> >
> > Having trimmed the config to the extreme, I can only trigger the
> > warning with CONFIG_KASAN_SW_TAGS (CONFIG_KASAN_GENERIC does not
> > scream). Same thing if I use gcc 14.2.0.
> >
> > However, compiling with clang 14 (Debian clang version 14.0.6) does
> > *not* result in a screaming kernel, even with KASAN_SW_TAGS.
> >
> > So I can see two possibilities here:
> >
> > - either gcc is incompatible with KASAN_SW_TAGS and the generic
> >   version is the only one that works
> >
> > - or we have a compiler bug on our hands.
> >
> > > Frankly, I can't believe the latter, as the code is so daft that I
> > can't imagine gcc getting it *that* wrong.
> >
> > Who knows enough about KASAN to dig into this?

This looks related to Samuel's "arm64: Fix KASAN random tag seed
initialization" patch that landed in August.

I am a bit surprised the bug is reported before the
"KernelAddressSanitizer initialized" banner is printed - I thought we
shouldn't be reporting anything until the tool is fully initialized.



* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-09-03 15:39       ` Alexander Potapenko
@ 2024-09-03 16:05         ` Marc Zyngier
  2024-09-03 16:43           ` Samuel Holland
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Zyngier @ 2024-09-03 16:05 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: samuel.holland, Andrey Konovalov, Aleksandr Nogikh, kasan-dev,
	Will Deacon, syzbot, catalin.marinas, linux-arm-kernel,
	linux-kernel, syzkaller-bugs

On Tue, 03 Sep 2024 16:39:28 +0100,
Alexander Potapenko <glider@google.com> wrote:
> 
> On Mon, Sep 2, 2024 at 12:03 PM 'Aleksandr Nogikh' via kasan-dev
> <kasan-dev@googlegroups.com> wrote:
> >
> > +kasan-dev
> >
> > On Sat, Aug 31, 2024 at 7:53 PM 'Marc Zyngier' via syzkaller-bugs
> > <syzkaller-bugs@googlegroups.com> wrote:
> > >
> > > On Fri, 30 Aug 2024 10:52:54 +0100,
> > > Will Deacon <will@kernel.org> wrote:
> > > >
> > > > On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot found the following issue on:
> > > > >
> > > > > HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
> > > > > git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
> > > >
> > > > +Marc, as this is his branch.
> > > >
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
> > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
> > > > > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > > > > userspace arch: arm64
> > >
> > > As it turns out, this isn't specific to this branch. I can reproduce
> > > it with this config on a vanilla 6.10 as a KVM guest. Even worse,
> > > compiling with clang results in an unbootable kernel (without any
> > > output at all).
> > >
> > > Mind you, the binary is absolutely massive (130MB with gcc, 156MB with
> > > clang), and I wouldn't be surprised if we were hitting some kind of
> > > odd limit.
> > >
> > > > >
> > > > > Downloadable assets:
> > > > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/384ffdcca292/non_bootable_disk-33faa93b.raw.xz
> > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/9093742fcee9/vmlinux-33faa93b.xz
> > > > > kernel image: https://storage.googleapis.com/syzbot-assets/b1f599907931/Image-33faa93b.gz.xz
> > > > >
> > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > Reported-by: syzbot+908886656a02769af987@syzkaller.appspotmail.com
> > > > >
> > > > > Booting Linux on physical CPU 0x0000000000 [0x000f0510]
> > > > > Linux version 6.11.0-rc5-syzkaller-g33faa93bc856 (syzkaller@syzkaller) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #0 SMP PREEMPT now
> > > > > random: crng init done
> > > > > Machine model: linux,dummy-virt
> > > > > efi: UEFI not found.
> > > > > NUMA: No NUMA configuration found
> > > > > NUMA: Faking a node at [mem 0x0000000040000000-0x00000000bfffffff]
> > > > > NUMA: NODE_DATA [mem 0xbfc1d340-0xbfc20fff]
> > > > > Zone ranges:
> > > > >   DMA      [mem 0x0000000040000000-0x00000000bfffffff]
> > > > >   DMA32    empty
> > > > >   Normal   empty
> > > > >   Device   empty
> > > > > Movable zone start for each node
> > > > > Early memory node ranges
> > > > >   node   0: [mem 0x0000000040000000-0x00000000bfffffff]
> > > > > Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
> > > > > cma: Reserved 32 MiB at 0x00000000bba00000 on node -1
> > > > > psci: probing for conduit method from DT.
> > > > > psci: PSCIv1.1 detected in firmware.
> > > > > psci: Using standard PSCI v0.2 function IDs
> > > > > psci: Trusted OS migration not required
> > > > > psci: SMC Calling Convention v1.0
> > > > > ==================================================================
> > > > > BUG: KASAN: invalid-access in smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
> > > > > BUG: KASAN: invalid-access in setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
> > > > > Write of size 4 at addr 03ff800086867e00 by task swapper/0
> > > > > Pointer tag: [03], memory tag: [fe]
> > > > >
> > > > > CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc5-syzkaller-g33faa93bc856 #0
> > > > > Hardware name: linux,dummy-virt (DT)
> > > > > Call trace:
> > > > >  dump_backtrace+0x204/0x3b8 arch/arm64/kernel/stacktrace.c:317
> > > > >  show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
> > > > >  __dump_stack lib/dump_stack.c:93 [inline]
> > > > >  dump_stack_lvl+0x260/0x3b4 lib/dump_stack.c:119
> > > > >  print_address_description mm/kasan/report.c:377 [inline]
> > > > >  print_report+0x118/0x5ac mm/kasan/report.c:488
> > > > >  kasan_report+0xc8/0x108 mm/kasan/report.c:601
> > > > >  kasan_check_range+0x94/0xb8 mm/kasan/sw_tags.c:84
> > > > >  __hwasan_store4_noabort+0x20/0x2c mm/kasan/sw_tags.c:149
> > > > >  smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
> > > > >  setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
> > > > >  start_kernel+0xe0/0xff0 init/main.c:926
> > > > >  __primary_switched+0x84/0x8c arch/arm64/kernel/head.S:243
> > > > >
> > > > > The buggy address belongs to stack of task swapper/0
> > > > >
> > > > > Memory state around the buggy address:
> > > > >  ffff800086867c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > >  ffff800086867d00: 00 fe fe 00 00 00 fe fe fe fe fe fe fe fe fe fe
> > > > > >ffff800086867e00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > > > >                    ^
> > > > >  ffff800086867f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > > > >  ffff800086868000: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > > > > ==================================================================
> > > >
> > > > I can't spot the issue here. We have a couple of fixed-length
> > > > (4 element) arrays on the stack and they're indexed by a simple loop
> > > > counter that runs from 0-3.
> > >
> > > Having trimmed the config to the extreme, I can only trigger the
> > > warning with CONFIG_KASAN_SW_TAGS (CONFIG_KASAN_GENERIC does not
> > > scream). Same thing if I use gcc 14.2.0.
> > >
> > > However, compiling with clang 14 (Debian clang version 14.0.6) does
> > > *not* result in a screaming kernel, even with KASAN_SW_TAGS.
> > >
> > > So I can see two possibilities here:
> > >
> > > - either gcc is incompatible with KASAN_SW_TAGS and the generic
> > >   version is the only one that works
> > >
> > > - or we have a compiler bug on our hands.
> > >
> > > > Frankly, I can't believe the latter, as the code is so daft that I
> > > can't imagine gcc getting it *that* wrong.
> > >
> > > Who knows enough about KASAN to dig into this?
> 
> This looks related to Samuel's "arm64: Fix KASAN random tag seed
> initialization" patch that landed in August.

f75c235565f9 arm64: Fix KASAN random tag seed initialization

$ git describe --contains f75c235565f9 --match=v\*
v6.11-rc4~15^2

So while this is in -rc4, -rc6 still has the same issue (with GCC --
clang is OK).

> I am a bit surprised the bug is reported before the
> "KernelAddressSanitizer initialized" banner is printed - I thought we
> shouldn't be reporting anything until the tool is fully initialized.

Especially if this can report false positives...

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-09-03 16:05         ` Marc Zyngier
@ 2024-09-03 16:43           ` Samuel Holland
  2024-09-04 15:31             ` Alexander Potapenko
  0 siblings, 1 reply; 14+ messages in thread
From: Samuel Holland @ 2024-09-03 16:43 UTC (permalink / raw)
  To: Marc Zyngier, Alexander Potapenko
  Cc: Andrey Konovalov, Aleksandr Nogikh, kasan-dev, Will Deacon,
	syzbot, catalin.marinas, linux-arm-kernel, linux-kernel,
	syzkaller-bugs

On 2024-09-03 11:05 AM, Marc Zyngier wrote:
> On Tue, 03 Sep 2024 16:39:28 +0100,
> Alexander Potapenko <glider@google.com> wrote:
>>
>> On Mon, Sep 2, 2024 at 12:03 PM 'Aleksandr Nogikh' via kasan-dev
>> <kasan-dev@googlegroups.com> wrote:
>>>
>>> +kasan-dev
>>>
>>> On Sat, Aug 31, 2024 at 7:53 PM 'Marc Zyngier' via syzkaller-bugs
>>> <syzkaller-bugs@googlegroups.com> wrote:
>>>>
>>>> On Fri, 30 Aug 2024 10:52:54 +0100,
>>>> Will Deacon <will@kernel.org> wrote:
>>>>>
>>>>> On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
>>>>>> Hello,
>>>>>>
>>>>>> syzbot found the following issue on:
>>>>>>
>>>>>> HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
>>>>>> git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
>>>>>
>>>>> +Marc, as this is his branch.
>>>>>
>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
>>>>>> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>>>>>> userspace arch: arm64
>>>>
>>>> As it turns out, this isn't specific to this branch. I can reproduce
>>>> it with this config on a vanilla 6.10 as a KVM guest. Even worse,
>>>> compiling with clang results in an unbootable kernel (without any
>>>> output at all).
>>>>
>>>> Mind you, the binary is absolutely massive (130MB with gcc, 156MB with
>>>> clang), and I wouldn't be surprised if we were hitting some kind of
>>>> odd limit.
>>>>
>>>>>>
>>>>>> Downloadable assets:
>>>>>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/384ffdcca292/non_bootable_disk-33faa93b.raw.xz
>>>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/9093742fcee9/vmlinux-33faa93b.xz
>>>>>> kernel image: https://storage.googleapis.com/syzbot-assets/b1f599907931/Image-33faa93b.gz.xz
>>>>>>
>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>>> Reported-by: syzbot+908886656a02769af987@syzkaller.appspotmail.com
>>>>>>
>>>>>> Booting Linux on physical CPU 0x0000000000 [0x000f0510]
>>>>>> Linux version 6.11.0-rc5-syzkaller-g33faa93bc856 (syzkaller@syzkaller) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #0 SMP PREEMPT now
>>>>>> random: crng init done
>>>>>> Machine model: linux,dummy-virt
>>>>>> efi: UEFI not found.
>>>>>> NUMA: No NUMA configuration found
>>>>>> NUMA: Faking a node at [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>> NUMA: NODE_DATA [mem 0xbfc1d340-0xbfc20fff]
>>>>>> Zone ranges:
>>>>>>   DMA      [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>>   DMA32    empty
>>>>>>   Normal   empty
>>>>>>   Device   empty
>>>>>> Movable zone start for each node
>>>>>> Early memory node ranges
>>>>>>   node   0: [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>> Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>> cma: Reserved 32 MiB at 0x00000000bba00000 on node -1
>>>>>> psci: probing for conduit method from DT.
>>>>>> psci: PSCIv1.1 detected in firmware.
>>>>>> psci: Using standard PSCI v0.2 function IDs
>>>>>> psci: Trusted OS migration not required
>>>>>> psci: SMC Calling Convention v1.0
>>>>>> ==================================================================
>>>>>> BUG: KASAN: invalid-access in smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
>>>>>> BUG: KASAN: invalid-access in setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
>>>>>> Write of size 4 at addr 03ff800086867e00 by task swapper/0
>>>>>> Pointer tag: [03], memory tag: [fe]
>>>>>>
>>>>>> CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc5-syzkaller-g33faa93bc856 #0
>>>>>> Hardware name: linux,dummy-virt (DT)
>>>>>> Call trace:
>>>>>>  dump_backtrace+0x204/0x3b8 arch/arm64/kernel/stacktrace.c:317
>>>>>>  show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
>>>>>>  __dump_stack lib/dump_stack.c:93 [inline]
>>>>>>  dump_stack_lvl+0x260/0x3b4 lib/dump_stack.c:119
>>>>>>  print_address_description mm/kasan/report.c:377 [inline]
>>>>>>  print_report+0x118/0x5ac mm/kasan/report.c:488
>>>>>>  kasan_report+0xc8/0x108 mm/kasan/report.c:601
>>>>>>  kasan_check_range+0x94/0xb8 mm/kasan/sw_tags.c:84
>>>>>>  __hwasan_store4_noabort+0x20/0x2c mm/kasan/sw_tags.c:149
>>>>>>  smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
>>>>>>  setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
>>>>>>  start_kernel+0xe0/0xff0 init/main.c:926
>>>>>>  __primary_switched+0x84/0x8c arch/arm64/kernel/head.S:243
>>>>>>
>>>>>> The buggy address belongs to stack of task swapper/0
>>>>>>
>>>>>> Memory state around the buggy address:
>>>>>>  ffff800086867c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>>>  ffff800086867d00: 00 fe fe 00 00 00 fe fe fe fe fe fe fe fe fe fe
>>>>>> >ffff800086867e00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
>>>>>>                    ^
>>>>>>  ffff800086867f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
>>>>>>  ffff800086868000: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
>>>>>> ==================================================================
>>>>>
>>>>> I can't spot the issue here. We have a couple of fixed-length
>>>>> (4 element) arrays on the stack and they're indexed by a simple loop
>>>>> counter that runs from 0-3.
>>>>
>>>> Having trimmed the config to the extreme, I can only trigger the
>>>> warning with CONFIG_KASAN_SW_TAGS (CONFIG_KASAN_GENERIC does not
>>>> scream). Same thing if I use gcc 14.2.0.
>>>>
>>>> However, compiling with clang 14 (Debian clang version 14.0.6) does
>>>> *not* result in a screaming kernel, even with KASAN_SW_TAGS.
>>>>
>>>> So I can see two possibilities here:
>>>>
>>>> - either gcc is incompatible with KASAN_SW_TAGS and the generic
>>>>   version is the only one that works
>>>>
>>>> - or we have a compiler bug on our hands.
>>>>
>>>> Frankly, I can't believe the latter, as the code is so daft that I
>>>> can't imagine gcc getting it *that* wrong.
>>>>
>>>> Who knows enough about KASAN to dig into this?
>>
>> This looks related to Samuel's "arm64: Fix KASAN random tag seed
>> initialization" patch that landed in August.
> 
> f75c235565f9 arm64: Fix KASAN random tag seed initialization
> 
> $ git describe --contains f75c235565f9 --match=v\*
> v6.11-rc4~15^2
> 
> So while this is in -rc4, -rc6 still has the same issue (with GCC --
> clang is OK).

I wouldn't expect it to be related to my patch. smp_build_mpidr_hash() gets
called before kasan_init_sw_tags() both before and after applying my patch.

Since the variable in question is a stack variable, the random tag is generated
by GCC, not the kernel function.

Since smp_build_mpidr_hash() is inlined into setup_arch(), which also calls
kasan_init(), maybe the issue is that GCC tries to allocate the local variable
and write the tag to shadow memory before kasan_init() actually sets up the
shadow memory?
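For reference, the tag comparison behind the report can be modelled in a
few lines. This is a simplified userspace sketch, not the kernel's code:
the sketch_* names are hypothetical, and the constants (tag in the top
byte of the pointer, 0xff as the match-all kernel tag, 0xfe as the
invalid tag) are assumed to match what mm/kasan uses for SW_TAGS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the SW_TAGS constants in mm/kasan/kasan.h. */
#define SKETCH_TAG_KERNEL  0xffu  /* match-all tag of untagged kernel pointers */
#define SKETCH_TAG_INVALID 0xfeu  /* shadow value for inaccessible memory */

/* Hypothetical helper: the tag lives in the top byte of the pointer. */
static uint8_t sketch_ptr_tag(uint64_t addr)
{
	return (uint8_t)(addr >> 56);
}

/* Hypothetical helper: restore the 0xff top byte, as the report prints it. */
static uint64_t sketch_untagged(uint64_t addr)
{
	return addr | (0xffull << 56);
}

/*
 * An access is flagged when the pointer tag is not the match-all tag
 * and differs from the tag stored in shadow memory for that granule.
 */
static bool sketch_access_is_bad(uint64_t addr, uint8_t shadow_tag)
{
	uint8_t tag = sketch_ptr_tag(addr);

	return tag != SKETCH_TAG_KERNEL && tag != shadow_tag;
}
```

With the values from the report (address 03ff800086867e00, memory tag
fe), the pointer tag is 03 and the access is flagged. That would be
consistent with the compiler having tagged the pointer while the shadow
for that stack slot never received the matching tag.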

Regards,
Samuel

>> I am a bit surprised the bug is reported before the
>> "KernelAddressSanitizer initialized" banner is printed - I thought we
>> shouldn't be reporting anything until the tool is fully initialized.
> 
> Especially if this can report false positives...
> 
> Thanks,
> 
> 	M.
> 




* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-09-03 16:43           ` Samuel Holland
@ 2024-09-04 15:31             ` Alexander Potapenko
  0 siblings, 0 replies; 14+ messages in thread
From: Alexander Potapenko @ 2024-09-04 15:31 UTC (permalink / raw)
  To: Samuel Holland
  Cc: Marc Zyngier, Andrey Konovalov, Aleksandr Nogikh, kasan-dev,
	Will Deacon, syzbot, catalin.marinas, linux-arm-kernel,
	linux-kernel, syzkaller-bugs

> >>>> Who knows enough about KASAN to dig into this?
> >>
> >> This looks related to Samuel's "arm64: Fix KASAN random tag seed
> >> initialization" patch that landed in August.
> >
> > f75c235565f9 arm64: Fix KASAN random tag seed initialization
> >
> > $ git describe --contains f75c235565f9 --match=v\*
> > v6.11-rc4~15^2
> >
> > So while this is in -rc4, -rc6 still has the same issue (with GCC --
> > clang is OK).
>
> I wouldn't expect it to be related to my patch. smp_build_mpidr_hash() gets
> called before kasan_init_sw_tags() both before and after applying my patch.

Hm, you are right, this problem indeed dates back to v6.9 or earlier.

> Since the variable in question is a stack variable, the random tag is generated
> by GCC, not the kernel function.
>
> Since smp_build_mpidr_hash() is inlined into setup_arch(), which also calls
> kasan_init(), maybe the issue is that GCC tries to allocate the local variable
> and write the tag to shadow memory before kasan_init() actually sets up the
> shadow memory?

Should it be inlined at all?
setup_arch() is a __no_sanitize_address function, and
smp_build_mpidr_hash() is an instrumented one.
The latter is not supposed to be inlined into the former, unless the
latter is always_inline
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89124).

The report seems to go away if I mark smp_build_mpidr_hash() as noinline.
This doesn't explain, though, why the Clang build doesn't boot at all...
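As an illustration of the attribute combination being discussed, here is
a minimal userspace sketch. It is not the kernel's smp_build_mpidr_hash():
the sketch_* functions and NR_LEVELS are hypothetical stand-ins, and the
plain no_sanitize_address attribute is used only as an assumption for the
demo (the kernel's __no_sanitize_address macro expands differently
depending on the KASAN mode).

```c
#include <stdint.h>

#define NR_LEVELS 4  /* illustrative stand-in for the 4-element arrays */

/*
 * Instrumented callee, kept out of line so that its stack variables are
 * not merged into the frame of the uninstrumented caller.
 */
static __attribute__((noinline)) uint32_t sketch_build_hash(uint32_t mask)
{
	uint32_t fs[NR_LEVELS];
	uint32_t bits = 0;

	for (int i = 0; i < NR_LEVELS; i++) {
		fs[i] = (mask >> (i * 8)) & 0xff;	/* one 8-bit field per level */
		bits += fs[i] != 0;			/* count non-empty levels */
	}
	return bits;
}

/* Uninstrumented caller, standing in for setup_arch(). */
static __attribute__((no_sanitize_address)) uint32_t sketch_setup(uint32_t mask)
{
	return sketch_build_hash(mask);
}
```

The point of the sketch is only the shape: a sanitized function with
on-stack arrays, called from a function that opts out of sanitization,
with noinline keeping the two apart.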

>
> Regards,
> Samuel
>
> >> I am a bit surprised the bug is reported before the
> >> "KernelAddressSanitizer initialized" banner is printed - I thought we
> >> shouldn't be reporting anything until the tool is fully initialized.
> >
> > Especially if this can report false positives...
> >
> > Thanks,
> >
> >       M.
> >
>


-- 
Alexander Potapenko
Software Engineer




* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-08-31 17:52   ` Marc Zyngier
  2024-09-02 10:03     ` Aleksandr Nogikh
@ 2024-09-04 18:26     ` Mark Rutland
  2024-09-05 14:03     ` Mark Rutland
  2024-09-23 10:46     ` Mark Rutland
  3 siblings, 0 replies; 14+ messages in thread
From: Mark Rutland @ 2024-09-04 18:26 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Will Deacon, syzbot, catalin.marinas, linux-arm-kernel,
	linux-kernel, syzkaller-bugs

On Sat, Aug 31, 2024 at 06:52:52PM +0100, Marc Zyngier wrote:
> On Fri, 30 Aug 2024 10:52:54 +0100,
> Will Deacon <will@kernel.org> wrote:
> > 
> > On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
> > > Hello,
> > > 
> > > syzbot found the following issue on:
> > > 
> > > HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
> > > git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
> > 
> > +Marc, as this is his branch.
> >
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
> > > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > > userspace arch: arm64
> 
> As it turns out, this isn't specific to this branch. I can reproduce
> it with this config on a vanilla 6.10 as a KVM guest. Even worse,
> compiling with clang results in an unbootable kernel (without any
> output at all).
> 
> Mind you, the binary is absolutely massive (130MB with gcc, 156MB with
> clang), and I wouldn't be surprised if we were hitting some kind of
> odd limit.
> 
> > > 
> > > Downloadable assets:
> > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/384ffdcca292/non_bootable_disk-33faa93b.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/9093742fcee9/vmlinux-33faa93b.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/b1f599907931/Image-33faa93b.gz.xz
> > > 
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+908886656a02769af987@syzkaller.appspotmail.com
> > > 
> > > Booting Linux on physical CPU 0x0000000000 [0x000f0510]
> > > Linux version 6.11.0-rc5-syzkaller-g33faa93bc856 (syzkaller@syzkaller) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #0 SMP PREEMPT now
> > > random: crng init done
> > > Machine model: linux,dummy-virt
> > > efi: UEFI not found.
> > > NUMA: No NUMA configuration found
> > > NUMA: Faking a node at [mem 0x0000000040000000-0x00000000bfffffff]
> > > NUMA: NODE_DATA [mem 0xbfc1d340-0xbfc20fff]
> > > Zone ranges:
> > >   DMA      [mem 0x0000000040000000-0x00000000bfffffff]
> > >   DMA32    empty
> > >   Normal   empty
> > >   Device   empty
> > > Movable zone start for each node
> > > Early memory node ranges
> > >   node   0: [mem 0x0000000040000000-0x00000000bfffffff]
> > > Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
> > > cma: Reserved 32 MiB at 0x00000000bba00000 on node -1
> > > psci: probing for conduit method from DT.
> > > psci: PSCIv1.1 detected in firmware.
> > > psci: Using standard PSCI v0.2 function IDs
> > > psci: Trusted OS migration not required
> > > psci: SMC Calling Convention v1.0
> > > ==================================================================
> > > BUG: KASAN: invalid-access in smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
> > > BUG: KASAN: invalid-access in setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
> > > Write of size 4 at addr 03ff800086867e00 by task swapper/0
> > > Pointer tag: [03], memory tag: [fe]
> > > 
> > > CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc5-syzkaller-g33faa93bc856 #0
> > > Hardware name: linux,dummy-virt (DT)
> > > Call trace:
> > >  dump_backtrace+0x204/0x3b8 arch/arm64/kernel/stacktrace.c:317
> > >  show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
> > >  __dump_stack lib/dump_stack.c:93 [inline]
> > >  dump_stack_lvl+0x260/0x3b4 lib/dump_stack.c:119
> > >  print_address_description mm/kasan/report.c:377 [inline]
> > >  print_report+0x118/0x5ac mm/kasan/report.c:488
> > >  kasan_report+0xc8/0x108 mm/kasan/report.c:601
> > >  kasan_check_range+0x94/0xb8 mm/kasan/sw_tags.c:84
> > >  __hwasan_store4_noabort+0x20/0x2c mm/kasan/sw_tags.c:149
> > >  smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
> > >  setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
> > >  start_kernel+0xe0/0xff0 init/main.c:926
> > >  __primary_switched+0x84/0x8c arch/arm64/kernel/head.S:243
> > > 
> > > The buggy address belongs to stack of task swapper/0
> > > 
> > > Memory state around the buggy address:
> > >  ffff800086867c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > >  ffff800086867d00: 00 fe fe 00 00 00 fe fe fe fe fe fe fe fe fe fe
> > > >ffff800086867e00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > >                    ^
> > >  ffff800086867f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > >  ffff800086868000: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > > ==================================================================
> > 
> > I can't spot the issue here. We have a couple of fixed-length
> > (4 element) arrays on the stack and they're indexed by a simple loop
> > counter that runs from 0-3.
> 
> Having trimmed the config to the extreme, I can only trigger the
> warning with CONFIG_KASAN_SW_TAGS (CONFIG_KASAN_GENERIC does not
> scream). Same thing if I use gcc 14.2.0.

Likewise.

> However, compiling with clang 14 (Debian clang version 14.0.6) does
> *not* result in a screaming kernel, even with KASAN_SW_TAGS.
> 
> So I can see two possibilities here:
> 
> - either gcc is incompatible with KASAN_SW_TAGS and the generic
>   version is the only one that works
> 
> - or we have a compiler bug on our hands.

Looking at this there seem to be a bunch of problems here. I'll try to
write this up better tomorrow, but as a holding reply for now, there's
definitely a compiler issue and maybe a kernel issue.

Mark.



* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-08-31 17:52   ` Marc Zyngier
  2024-09-02 10:03     ` Aleksandr Nogikh
  2024-09-04 18:26     ` Mark Rutland
@ 2024-09-05 14:03     ` Mark Rutland
  2024-09-05 14:25       ` Ard Biesheuvel
  2024-09-23 10:46     ` Mark Rutland
  3 siblings, 1 reply; 14+ messages in thread
From: Mark Rutland @ 2024-09-05 14:03 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Will Deacon, syzbot, catalin.marinas, linux-arm-kernel,
	linux-kernel, syzkaller-bugs, ardb, Nathan Chancellor,
	Nick Desaulniers, Bill Wendling, Justin Stitt

[adding Ard and LLVM folk; there's a question right at the end after
some context]

On Sat, Aug 31, 2024 at 06:52:52PM +0100, Marc Zyngier wrote:
> On Fri, 30 Aug 2024 10:52:54 +0100,
> Will Deacon <will@kernel.org> wrote:
> > 
> > On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
> > > Hello,
> > > 
> > > syzbot found the following issue on:
> > > 
> > > HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
> > > git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
> > 
> > +Marc, as this is his branch.
> >
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
> > > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > > userspace arch: arm64
> 
> As it turns out, this isn't specific to this branch. I can reproduce
> it with this config on a vanilla 6.10 as a KVM guest. Even worse,
> compiling with clang results in an unbootable kernel (without any
> output at all).
> 
> Mind you, the binary is absolutely massive (130MB with gcc, 156MB with
> clang), and I wouldn't be surprised if we were hitting some kind of
> odd limit.

Putting the KASAN issue aside (which I'll handle in a separate thread),
I think there is a real issue here with LLVM.

What's going on here is that .idmap.text ends up more than 128M away
from .head.text, so the 'b primary_entry' at the start of the Image
isn't in range:

| [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w _text        
| ffff800080000000 g       .head.text     0000000000000000 _text
| [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w primary_entry
| ffff8000889df0e0 g       .rodata.text   000000000000006c primary_entry

... as those are ~138MiB apart.

When building with GCC those end up ~101MiB apart:

| [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w _text        
| ffff800080000000 g       .head.text     0000000000000000 _text
| [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w primary_entry
| ffff8000865ae0e0 g       .rodata.text   000000000000006c primary_entry
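The objdump addresses above can be turned into distances and checked
against the reach of a B instruction, which encodes a signed 26-bit word
offset (roughly +/-128MiB). The following is a small sketch with
hypothetical helper names; the branch is taken to sit at _text + 4, as in
the disassembly of the header.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_1M (1024 * 1024)

/* B encodes a signed 26-bit word offset: [-128MiB, +128MiB - 4]. */
static bool sketch_b_in_range(uint64_t from, uint64_t to)
{
	int64_t off = (int64_t)(to - from);

	return off >= -128ll * SZ_1M && off <= 128ll * SZ_1M - 4;
}

/* Distance between two addresses in whole MiB, rounded down. */
static uint64_t sketch_mib(uint64_t from, uint64_t to)
{
	return (to - from) / SZ_1M;
}
```

For the clang build, sketch_mib(0xffff800080000000, 0xffff8000889df0e0)
gives 137 and the branch from 0xffff800080000004 is out of range. For
the GCC build, the same calculation with 0xffff8000865ae0e0 gives 101
and the branch is in range.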

When that happens, LLD makes the header branch to a veneer/thunk:

| ffff800080000000 <_text>:
| ffff800080000000:       fa405a4d        ccmp    x18, #0x0, #0xd, pl     // pl = nfrst
| ffff800080000004:       14003fff        b       ffff800080010000 <__AArch64AbsLongThunk_primary_entry>

... and unfortunately, that veneer/thunk uses a literal with the
statically-linked TTBR1 address of primary_entry:

| ffff800080010000 <__AArch64AbsLongThunk_primary_entry>:
| ffff800080010000:       58000050        ldr     x16, ffff800080010008 <__AArch64AbsLongThunk_primary_entry+0x8>
| ffff800080010004:       d61f0200        br      x16
| ffff800080010008:       889df0e0        .word   0x889df0e0
| ffff80008001000c:       ffff8000        .word   0xffff8000

... so as soon as the CPU tries to branch there it'll take a synchronous
exception since either:

(a) The MMU is off, and that's larger than the physical address size.

(b) The MMU is on, but there's no TTBR1 mapping.
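To make case (a) concrete, the two .word entries of the thunk can be
joined into the 64-bit value that ends up in x16 and compared against a
physical address width. This is only a sketch: the helper names are
hypothetical, and the 48-bit and 52-bit widths are example values, not
something taken from the report.

```c
#include <stdbool.h>
#include <stdint.h>

/* Join the two little-endian .word entries into the 64-bit literal. */
static uint64_t sketch_thunk_literal(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* True if addr fits within pa_bits of physical address space. */
static bool sketch_fits_pa(uint64_t addr, unsigned int pa_bits)
{
	return pa_bits >= 64 || (addr >> pa_bits) == 0;
}
```

sketch_thunk_literal(0x889df0e0, 0xffff8000) yields 0xffff8000889df0e0,
the link-time virtual address of primary_entry, which fits in neither
the 48-bit nor the 52-bit example width.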

We can bodge around this instance by manually open-coding a veneer with
ADRP+ADD+BR after the header, and having the header branch to that, but
AFAICT we have no guarantee that other early asm or PI C code won't hit
the same problem.

It'd be good if we could convince LLD to use ADRP+ADD, since we already
rely on the entire kernel image falling within 2GiB for data
relocations. I'm not sure if it doesn't support using ADRP+ADD in
veneers or if we're doing something that prevents it from using ADRP+ADD
in the veneer.

By comparison, if I force the branch range to be longer, GCC 14.1.0 and
GNU LD 2.4.20 use ADRP+ADD for the veneer, and the resulting kernel
boots successfully.

I tested that by hacking some .rodata between .head.text and .idmap.text
with:

| char hack_force_veneer[SZ_128M] __ro_after_init;

... which forces a ~230MiB branch range using the config above:

| [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w _text        
| ffff800080000000 g       .head.text     0000000000000000 _text
| [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w primary_entry
| ffff80008e5be0e0 g       .rodata.text   000000000000006c primary_entry

... with the generated code being:

| ffff800080000000 <__efistub__text>:
| ffff800080000000:       fa405a4d        ccmp    x18, #0x0, #0xd, pl     // pl = nfrst
| ffff800080000004:       14004001        b       ffff800080010008 <__primary_entry_veneer>
...
| ffff800080010008 <__primary_entry_veneer>:
| ffff800080010008:       d0072d70        adrp    x16, ffff80008e5be000 <__idmap_text_start>
| ffff80008001000c:       91038210        add     x16, x16, #0xe0
| ffff800080010010:       d61f0200        br      x16
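The reason the ADRP+ADD form keeps working before the MMU is enabled is
that it is PC-relative: ADRP adds a 4KiB-page delta to the page of the
current PC, and ADD supplies the low 12 bits. A minimal sketch of that
arithmetic, with hypothetical helper names and a made-up physical load
address (0x40200000) used purely for illustration:

```c
#include <stdint.h>

/* Page delta that ADRP would encode for a reference from pc to target. */
static int64_t sketch_adrp_imm(uint64_t pc, uint64_t target)
{
	return (int64_t)((target >> 12) - (pc >> 12));
}

/* Address produced by ADRP+ADD when the pair executes at runtime_pc. */
static uint64_t sketch_adrp_add(uint64_t runtime_pc, int64_t imm, uint32_t lo12)
{
	return (runtime_pc & ~0xfffull) + ((uint64_t)imm << 12) + lo12;
}
```

For the veneer above, sketch_adrp_imm(0xffff800080010008,
0xffff80008e5be0e0) is 0xe5ae, which matches the immediate encoded in
d0072d70. Running the same pair at the assumed physical address
0x40210008 gives 0x4e7be0e0, i.e. primary_entry at the same offset from
the load address, whereas the absolute literal used by the LLD thunk
does not move with the image.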

LLVM folk, is there any existing option to ask LLD to use ADRP+ADD for
the veneer/thunk? ... and if not, would it be possible to add an option
for that?

I realise it shouldn't matter for most users, but it'd be nice to avoid
the boobytrap for anyone building test kernels.

Mark.



* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-09-05 14:03     ` Mark Rutland
@ 2024-09-05 14:25       ` Ard Biesheuvel
  2024-09-19  9:14         ` Mark Rutland
  0 siblings, 1 reply; 14+ messages in thread
From: Ard Biesheuvel @ 2024-09-05 14:25 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Marc Zyngier, Will Deacon, syzbot, catalin.marinas,
	linux-arm-kernel, linux-kernel, syzkaller-bugs, Nathan Chancellor,
	Nick Desaulniers, Bill Wendling, Justin Stitt

On Thu, 5 Sept 2024 at 16:03, Mark Rutland <mark.rutland@arm.com> wrote:
>
> [adding Ard and LLVM folk; there's a question right at the end after
> some context]
>
> On Sat, Aug 31, 2024 at 06:52:52PM +0100, Marc Zyngier wrote:
> > On Fri, 30 Aug 2024 10:52:54 +0100,
> > Will Deacon <will@kernel.org> wrote:
> > >
> > > On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
> > > > git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
> > >
> > > +Marc, as this is his branch.
> > >
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
> > > > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > > > userspace arch: arm64
> >
> > As it turns out, this isn't specific to this branch. I can reproduce
> > it with this config on a vanilla 6.10 as a KVM guest. Even worse,
> > compiling with clang results in an unbootable kernel (without any
> > output at all).
> >
> > Mind you, the binary is absolutely massive (130MB with gcc, 156MB with
> > clang), and I wouldn't be surprised if we were hitting some kind of
> > odd limit.
>
> Putting the KASAN issue aside (which I'll handle in a separate thread),
> I think there is a real issue here with LLVM.
>
> What's going on here is that .idmap.text ends up more than 128M away
> from .head.text, so the 'b primary_entry' at the start of the Image
> isn't in range:
>
> | [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w _text
> | ffff800080000000 g       .head.text     0000000000000000 _text
> | [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w primary_entry
> | ffff8000889df0e0 g       .rodata.text   000000000000006c primary_entry
>
> ... as those are ~138MiB apart.
>
> When building with GCC those end up ~101MiB apart:
>
> | [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w _text
> | ffff800080000000 g       .head.text     0000000000000000 _text
> | [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w primary_entry
> | ffff8000865ae0e0 g       .rodata.text   000000000000006c primary_entry
>
> When that happens, LLD makes the header branch to a veneer/thunk:
>
> | ffff800080000000 <_text>:
> | ffff800080000000:       fa405a4d        ccmp    x18, #0x0, #0xd, pl     // pl = nfrst
> | ffff800080000004:       14003fff        b       ffff800080010000 <__AArch64AbsLongThunk_primary_entry>
>
> ... and unfortunately, that veneer/thunk uses a literal with the
> statically-linked TTBR1 address of primary_entry:
>
> | ffff800080010000 <__AArch64AbsLongThunk_primary_entry>:
> | ffff800080010000:       58000050        ldr     x16, ffff800080010008 <__AArch64AbsLongThunk_primary_entry+0x8>
> | ffff800080010004:       d61f0200        br      x16
> | ffff800080010008:       889df0e0        .word   0x889df0e0
> | ffff80008001000c:       ffff8000        .word   0xffff8000
>
...
> LLVM folk, is there any existing option to ask LLD to use ADRP+ADD for
> the veneer/thunk? ... and if not, would it be possible to add an option
> for that?
>

ld.lld takes --pic-veneer, which (from looking at the llvm sources)
appears to do what we need here.
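
To spell out why the veneer style matters, here is a rough model in C. It is
a sketch only: the struct and function names are hypothetical, the run-time
address used in the note below is invented, and it assumes that --pic-veneer
yields a PC-relative sequence (as asked for above) and that the image is
loaded at a 4KiB-aligned offset from its link address.

```c
#include <stdint.h>

/* Link-time view of one veneer: where it was linked and what it targets. */
struct veneer {
        uint64_t link_pc;       /* link-time address of the veneer */
        uint64_t link_target;   /* link-time address of the destination */
};

/*
 * Absolute veneer ("ldr x16, <literal>; br x16"): the literal holds the
 * link-time address, so the destination ignores where the code runs.
 */
uint64_t abs_veneer_dest(const struct veneer *v, uint64_t run_pc)
{
        (void)run_pc;
        return v->link_target;
}

/*
 * PC-relative veneer ("adrp x16, ...; add x16, ...; br x16"): only the
 * distance between veneer and destination is encoded, so the destination
 * moves together with the executing code.
 */
uint64_t pic_veneer_dest(const struct veneer *v, uint64_t run_pc)
{
        return run_pc + (v->link_target - v->link_pc);
}
```

For the thunk quoted above (linked at 0xffff800080010000, targeting
0xffff8000889df0e0), running it at a made-up address of 0x40210000 gives
0x48bdf0e0 for the PC-relative form, while the absolute form still yields
0xffff8000889df0e0.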



* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-09-05 14:25       ` Ard Biesheuvel
@ 2024-09-19  9:14         ` Mark Rutland
  0 siblings, 0 replies; 14+ messages in thread
From: Mark Rutland @ 2024-09-19  9:14 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Marc Zyngier, Will Deacon, syzbot, catalin.marinas,
	linux-arm-kernel, linux-kernel, syzkaller-bugs, Nathan Chancellor,
	Nick Desaulniers, Bill Wendling, Justin Stitt

On Thu, Sep 05, 2024 at 04:25:57PM +0200, Ard Biesheuvel wrote:
> On Thu, 5 Sept 2024 at 16:03, Mark Rutland <mark.rutland@arm.com> wrote:
> > Putting the KASAN issue aside (which I'll handle in a separate thread),
> > I think there is a real issue here with LLVM.
> >
> > What's going on here is that .idmap.text ends up more than 128M away
> > from .head.text, so the 'b primary_entry' at the start of the Image
> > isn't in range:
> >
> > | [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w _text
> > | ffff800080000000 g       .head.text     0000000000000000 _text
> > | [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w primary_entry
> > | ffff8000889df0e0 g       .rodata.text   000000000000006c primary_entry
> >
> > ... as those are ~138MiB apart.
> >
> > When building with GCC those end up ~101MiB apart:
> >
> > | [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w _text
> > | ffff800080000000 g       .head.text     0000000000000000 _text
> > | [mark@lakrids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w primary_entry
> > | ffff8000865ae0e0 g       .rodata.text   000000000000006c primary_entry
> >
> > When that happens, LLD makes the header branch to a veneer/thunk:
> >
> > | ffff800080000000 <_text>:
> > | ffff800080000000:       fa405a4d        ccmp    x18, #0x0, #0xd, pl     // pl = nfrst
> > | ffff800080000004:       14003fff        b       ffff800080010000 <__AArch64AbsLongThunk_primary_entry>
> >
> > ... and unfortunately, that veneer/thunk uses a literal with the
> > statically-linked TTBR1 address of primary_entry:
> >
> > | ffff800080010000 <__AArch64AbsLongThunk_primary_entry>:
> > | ffff800080010000:       58000050        ldr     x16, ffff800080010008 <__AArch64AbsLongThunk_primary_entry+0x8>
> > | ffff800080010004:       d61f0200        br      x16
> > | ffff800080010008:       889df0e0        .word   0x889df0e0
> > | ffff80008001000c:       ffff8000        .word   0xffff8000
> >
> ...
> > LLVM folk, is there any existing option to ask LLD to use ADRP+ADD for
> > the veneer/thunk? ... and if not, would it be possible to add an option
> > for that?
> >
> 
> ld.lld takes --pic-veneer, which (from looking at the llvm sources)
> appears to do what we need here.

Ah; now that I take another look, I see that's in the man page and is
also supported by GNU LD, so I'll spin a patch to use that.

Thanks for the pointer!

Mark.



* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-08-31 17:52   ` Marc Zyngier
                       ` (2 preceding siblings ...)
  2024-09-05 14:03     ` Mark Rutland
@ 2024-09-23 10:46     ` Mark Rutland
  2024-09-23 20:12       ` Andrey Konovalov
  3 siblings, 1 reply; 14+ messages in thread
From: Mark Rutland @ 2024-09-23 10:46 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Will Deacon, syzbot, catalin.marinas, linux-arm-kernel,
	linux-kernel, syzkaller-bugs, kasan-dev, Aleksandr Nogikh,
	Alexander Potapenko, Andrey Ryabinin, Andrey Konovalov

[adding KASAN folk]

There appears to be a GCC bug here; analysis below.

The issues with clang are unrelated, and I will follow up with a
separate mail for those.

On Sat, Aug 31, 2024 at 06:52:52PM +0100, Marc Zyngier wrote:
> On Fri, 30 Aug 2024 10:52:54 +0100,
> Will Deacon <will@kernel.org> wrote:
> > 
> > On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
> > > Hello,
> > > 
> > > syzbot found the following issue on:
> > > 
> > > HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
> > > git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
> > 
> > +Marc, as this is his branch.
> >
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
> > > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > > userspace arch: arm64
> 
> As it turns out, this isn't specific to this branch. I can reproduce
> it with this config on a vanilla 6.10 as a KVM guest. Even worse,
> compiling with clang results in an unbootable kernel (without any
> output at all).
> 
> Mind you, the binary is absolutely massive (130MB with gcc, 156MB with
> clang), and I wouldn't be surprised if we were hitting some kind of
> odd limit.
> 
> > > 
> > > Downloadable assets:
> > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/384ffdcca292/non_bootable_disk-33faa93b.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/9093742fcee9/vmlinux-33faa93b.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/b1f599907931/Image-33faa93b.gz.xz
> > > 
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+908886656a02769af987@syzkaller.appspotmail.com
> > > 
> > > Booting Linux on physical CPU 0x0000000000 [0x000f0510]
> > > Linux version 6.11.0-rc5-syzkaller-g33faa93bc856 (syzkaller@syzkaller) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #0 SMP PREEMPT now
> > > random: crng init done
> > > Machine model: linux,dummy-virt
> > > efi: UEFI not found.
> > > NUMA: No NUMA configuration found
> > > NUMA: Faking a node at [mem 0x0000000040000000-0x00000000bfffffff]
> > > NUMA: NODE_DATA [mem 0xbfc1d340-0xbfc20fff]
> > > Zone ranges:
> > >   DMA      [mem 0x0000000040000000-0x00000000bfffffff]
> > >   DMA32    empty
> > >   Normal   empty
> > >   Device   empty
> > > Movable zone start for each node
> > > Early memory node ranges
> > >   node   0: [mem 0x0000000040000000-0x00000000bfffffff]
> > > Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
> > > cma: Reserved 32 MiB at 0x00000000bba00000 on node -1
> > > psci: probing for conduit method from DT.
> > > psci: PSCIv1.1 detected in firmware.
> > > psci: Using standard PSCI v0.2 function IDs
> > > psci: Trusted OS migration not required
> > > psci: SMC Calling Convention v1.0
> > > ==================================================================
> > > BUG: KASAN: invalid-access in smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
> > > BUG: KASAN: invalid-access in setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
> > > Write of size 4 at addr 03ff800086867e00 by task swapper/0
> > > Pointer tag: [03], memory tag: [fe]
> > > 
> > > CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc5-syzkaller-g33faa93bc856 #0
> > > Hardware name: linux,dummy-virt (DT)
> > > Call trace:
> > >  dump_backtrace+0x204/0x3b8 arch/arm64/kernel/stacktrace.c:317
> > >  show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
> > >  __dump_stack lib/dump_stack.c:93 [inline]
> > >  dump_stack_lvl+0x260/0x3b4 lib/dump_stack.c:119
> > >  print_address_description mm/kasan/report.c:377 [inline]
> > >  print_report+0x118/0x5ac mm/kasan/report.c:488
> > >  kasan_report+0xc8/0x108 mm/kasan/report.c:601
> > >  kasan_check_range+0x94/0xb8 mm/kasan/sw_tags.c:84
> > >  __hwasan_store4_noabort+0x20/0x2c mm/kasan/sw_tags.c:149
> > >  smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
> > >  setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
> > >  start_kernel+0xe0/0xff0 init/main.c:926
> > >  __primary_switched+0x84/0x8c arch/arm64/kernel/head.S:243
> > > 
> > > The buggy address belongs to stack of task swapper/0
> > > 
> > > Memory state around the buggy address:
> > >  ffff800086867c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > >  ffff800086867d00: 00 fe fe 00 00 00 fe fe fe fe fe fe fe fe fe fe
> > > >ffff800086867e00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > >                    ^
> > >  ffff800086867f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > >  ffff800086868000: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > > ==================================================================
> > 
> > I can't spot the issue here. We have a couple of fixed-length
> > (4 element) arrays on the stack and they're indexed by a simple loop
> > counter that runs from 0-3.
> 
> Having trimmed the config to the extreme, I can only trigger the
> warning with CONFIG_KASAN_SW_TAGS (CONFIG_KASAN_GENERIC does not
> scream). Same thing if I use gcc 14.2.0.
> 
> However, compiling with clang 14 (Debian clang version 14.0.6) does
> *not* result in a screaming kernel, even with KASAN_SW_TAGS.
> 
> So I can see two possibilities here:
> 
> - either gcc is incompatible with KASAN_SW_TAGS and the generic
>   version is the only one that works
> 
> - or we have a compiler bug on our hands.
> 
> Frankly, I can't believe the latter, as the code is so daft that I
> can't imagine gcc getting it *that* wrong.

It looks like what's happening here is:

(1) With CONFIG_KASAN_SW_TAGS=y we pass the compiler
    `-fsanitize=kernel-hwaddress`.

(2) When GCC is passed `-fsanitize=hwaddress` or
    `-fsanitize=kernel-hwaddress` it ignores
    `__attribute__((no_sanitize_address))`, and instruments functions that
    we require not to be instrumented.

    I believe this is a compiler bug, as there doesn't seem to be a
    separate attribute to prevent instrumentation in this mode.

(3) In this config, smp_build_mpidr_hash() gets inlined into
    setup_arch(), and as setup_arch() is instrumented, all of the stack
    variables for smp_build_mpidr_hash() are initialized at the start of
    setup_arch(), with calls to __hwasan_tag_memory().

    At this point, we are using the early shadow (where a single page of
    shadow is used for all memory).

(4) In setup_arch(), we call kasan_init() to transition from the early
    shadow to the runtime shadow. This replaces the early shadow memory
    with new shadow memory initialized to KASAN_SHADOW_INIT (0xFE AKA
    KASAN_TAG_INVALID), including the shadow for the stack.

(5) Once the CPU returns back into setup_arch(), it's using the new
    shadow initialized to 0xFE. Subsequent stack accesses which check
    the shadow see 0xFE in the shadow, and fault. Note that in the dump
    of the shadow above, the shadow around ffff800086867d80 and above is
    all 0xFE, while below that functions have managed to clear the
    shadow.
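
Steps (3) to (5) can be illustrated with a toy model in C. This is only a
sketch under simplifying assumptions, not the kernel's implementation: the
early shadow is modelled as a single byte shared by every address, one
shadow byte is assumed to cover a 16-byte granule, and all names below are
hypothetical stand-ins for __hwasan_tag_memory(), kasan_init() and the
__hwasan_store*() checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TAG_INVALID 0xFE        /* KASAN_SHADOW_INIT / KASAN_TAG_INVALID */
#define GRANULE     16          /* assumed bytes of memory per shadow byte */
#define REGION      256         /* bytes of "stack" covered by the model */

struct shadow {
        uint8_t early;                          /* shared early shadow */
        uint8_t runtime[REGION / GRANULE];      /* per-granule runtime shadow */
        bool runtime_active;
};

/* Tag carried in the top byte of a pointer. */
uint8_t pointer_tag(uint64_t addr)
{
        return (uint8_t)(addr >> 56);
}

static uint8_t *shadow_byte(struct shadow *s, unsigned int off)
{
        assert(off < REGION);
        return s->runtime_active ? &s->runtime[off / GRANULE] : &s->early;
}

/* Step (3): tagging done at function entry for a stack variable. */
void tag_memory(struct shadow *s, unsigned int off, uint8_t tag)
{
        *shadow_byte(s, off) = tag;
}

/* Step (4): switch to a fresh runtime shadow filled with 0xFE. */
void switch_to_runtime_shadow(struct shadow *s)
{
        memset(s->runtime, TAG_INVALID, sizeof(s->runtime));
        s->runtime_active = true;
}

/* Step (5): an access passes only if pointer tag and memory tag match. */
bool access_ok(struct shadow *s, unsigned int off, uint8_t ptr_tag)
{
        return *shadow_byte(s, off) == ptr_tag;
}
```

In this model, a variable tagged with pointer_tag(0x03ff800086867e00) == 0x03
before the switch passes access_ok(), fails it after
switch_to_runtime_shadow() (memory tag 0xFE), and passes again only once
something re-tags it, which matches the "Pointer tag: [03], memory tag: [fe]"
line in the report.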

Compiler test case below. Note that this demonstrates the compiler
ignores `__attribute__((no_sanitize_address))` regardless of
KASAN_STACK, so KASAN_SW_TAGS is generally broken with GCC. All versions
I tried were broken, from 11.3.0 to 14.2.0 inclusive.

I think we have to disable KASAN_SW_TAGS with GCC until this is fixed.

| [mark@lakrids:/mnt/data/tests/kasan-tags]% cat test.c
| #define __nsa           __attribute__((no_sanitize_address))
| 
| long __nsa load_long(long *ptr)
| {
|         return *ptr;
| }
| 
| void __nsa store_long(long *ptr, long val)
| {
|         *ptr = val;
| }
| 
| void extern_func(void);
| 
| long __nsa stack_func(void)
| {
|         volatile long val = 0;
|         extern_func();
|         return val;
| }
| [mark@lakrids:/mnt/data/tests/kasan-tags]% usekorg 12.1.0 aarch64-linux-gcc -c test.c -O2  -fsanitize=kernel-hwaddress
| [mark@lakrids:/mnt/data/tests/kasan-tags]% usekorg 14.2.0 aarch64-linux-objdump -d test.o
| 
| test.o:     file format elf64-littleaarch64
| 
| 
| Disassembly of section .text:
| 
| 0000000000000000 <load_long>:
|    0:   a9be7bfd        stp     x29, x30, [sp, #-32]!
|    4:   910003fd        mov     x29, sp
|    8:   f9000bf3        str     x19, [sp, #16]
|    c:   aa0003f3        mov     x19, x0
|   10:   94000000        bl      0 <__hwasan_load8_noabort>
|   14:   f9400260        ldr     x0, [x19]
|   18:   f9400bf3        ldr     x19, [sp, #16]
|   1c:   a8c27bfd        ldp     x29, x30, [sp], #32
|   20:   d65f03c0        ret
| 
| 0000000000000024 <store_long>:
|   24:   a9be7bfd        stp     x29, x30, [sp, #-32]!
|   28:   910003fd        mov     x29, sp
|   2c:   a90153f3        stp     x19, x20, [sp, #16]
|   30:   aa0003f3        mov     x19, x0
|   34:   aa0103f4        mov     x20, x1
|   38:   94000000        bl      0 <__hwasan_store8_noabort>
|   3c:   f9000274        str     x20, [x19]
|   40:   a94153f3        ldp     x19, x20, [sp, #16]
|   44:   a8c27bfd        ldp     x29, x30, [sp], #32
|   48:   d65f03c0        ret
|   4c:   d503201f        nop
| 
| 0000000000000050 <stack_func>:
|   50:   a9be7bfd        stp     x29, x30, [sp, #-32]!
|   54:   910003fd        mov     x29, sp
|   58:   f9000fff        str     xzr, [sp, #24]
|   5c:   94000000        bl      0 <extern_func>
|   60:   f9400fe0        ldr     x0, [sp, #24]
|   64:   a8c27bfd        ldp     x29, x30, [sp], #32
|   68:   d65f03c0        ret
| [mark@lakrids:/mnt/data/tests/kasan-tags]% usekorg 12.1.0 aarch64-linux-gcc -c test.c -O2  -fsanitize=kernel-hwaddress  --param hwasan-instrument-stack=1
| [mark@lakrids:/mnt/data/tests/kasan-tags]% usekorg 14.2.0 aarch64-linux-objdump -d test.o
| 
| test.o:     file format elf64-littleaarch64
| 
| 
| Disassembly of section .text:
| 
| 0000000000000000 <load_long>:
|    0:   a9be7bfd        stp     x29, x30, [sp, #-32]!
|    4:   910003fd        mov     x29, sp
|    8:   f9000bf3        str     x19, [sp, #16]
|    c:   aa0003f3        mov     x19, x0
|   10:   94000000        bl      0 <__hwasan_load8_noabort>
|   14:   f9400260        ldr     x0, [x19]
|   18:   f9400bf3        ldr     x19, [sp, #16]
|   1c:   a8c27bfd        ldp     x29, x30, [sp], #32
|   20:   d65f03c0        ret
| 
| 0000000000000024 <store_long>:
|   24:   a9be7bfd        stp     x29, x30, [sp, #-32]!
|   28:   910003fd        mov     x29, sp
|   2c:   a90153f3        stp     x19, x20, [sp, #16]
|   30:   aa0003f3        mov     x19, x0
|   34:   aa0103f4        mov     x20, x1
|   38:   94000000        bl      0 <__hwasan_store8_noabort>
|   3c:   f9000274        str     x20, [x19]
|   40:   a94153f3        ldp     x19, x20, [sp, #16]
|   44:   a8c27bfd        ldp     x29, x30, [sp], #32
|   48:   d65f03c0        ret
|   4c:   d503201f        nop
| 
| 0000000000000050 <stack_func>:
|   50:   a9bd7bfd        stp     x29, x30, [sp, #-48]!
|   54:   d2800202        mov     x2, #0x10                       // #16
|   58:   9100c3e0        add     x0, sp, #0x30
|   5c:   910003fd        mov     x29, sp
|   60:   d378fc01        lsr     x1, x0, #56
|   64:   910083e0        add     x0, sp, #0x20
|   68:   11000821        add     w1, w1, #0x2
|   6c:   f9000bf3        str     x19, [sp, #16]
|   70:   94000000        bl      0 <__hwasan_tag_memory>
|   74:   d2e04000        mov     x0, #0x200000000000000          // #144115188075855872
|   78:   8b2063e0        add     x0, sp, x0
|   7c:   f900101f        str     xzr, [x0, #32]
|   80:   94000000        bl      0 <extern_func>
|   84:   d2e04000        mov     x0, #0x200000000000000          // #144115188075855872
|   88:   8b2063e0        add     x0, sp, x0
|   8c:   d2800202        mov     x2, #0x10                       // #16
|   90:   52800001        mov     w1, #0x0                        // #0
|   94:   f9401013        ldr     x19, [x0, #32]
|   98:   910083e0        add     x0, sp, #0x20
|   9c:   94000000        bl      0 <__hwasan_tag_memory>
|   a0:   aa1303e0        mov     x0, x19
|   a4:   f9400bf3        ldr     x19, [sp, #16]
|   a8:   a8c37bfd        ldp     x29, x30, [sp], #48
|   ac:   d65f03c0        ret
| [mark@lakrids:/mnt/data/tests/kasan-tags]%
| [mark@lakrids:/mnt/data/tests/kasan-tags]% usekorg 12.1.0 aarch64-linux-gcc -c test.c -O2  -fsanitize=hwaddress
| [mark@lakrids:/mnt/data/tests/kasan-tags]% usekorg 14.2.0 aarch64-linux-objdump -d test.o
| 
| test.o:     file format elf64-littleaarch64
| 
| 
| Disassembly of section .text:
| 
| 0000000000000000 <load_long>:
|    0:   a9be7bfd        stp     x29, x30, [sp, #-32]!
|    4:   910003fd        mov     x29, sp
|    8:   f9000bf3        str     x19, [sp, #16]
|    c:   aa0003f3        mov     x19, x0
|   10:   94000000        bl      0 <__hwasan_load8>
|   14:   f9400260        ldr     x0, [x19]
|   18:   f9400bf3        ldr     x19, [sp, #16]
|   1c:   a8c27bfd        ldp     x29, x30, [sp], #32
|   20:   d65f03c0        ret
| 
| 0000000000000024 <store_long>:
|   24:   a9be7bfd        stp     x29, x30, [sp, #-32]!
|   28:   910003fd        mov     x29, sp
|   2c:   a90153f3        stp     x19, x20, [sp, #16]
|   30:   aa0003f3        mov     x19, x0
|   34:   aa0103f4        mov     x20, x1
|   38:   94000000        bl      0 <__hwasan_store8>
|   3c:   f9000274        str     x20, [x19]
|   40:   a94153f3        ldp     x19, x20, [sp, #16]
|   44:   a8c27bfd        ldp     x29, x30, [sp], #32
|   48:   d65f03c0        ret
|   4c:   d503201f        nop
| 
| 0000000000000050 <stack_func>:
|   50:   a9bd7bfd        stp     x29, x30, [sp, #-48]!
|   54:   910003fd        mov     x29, sp
|   58:   f9000bf3        str     x19, [sp, #16]
|   5c:   94000000        bl      0 <__hwasan_generate_tag>
|   60:   9100c3e1        add     x1, sp, #0x30
|   64:   d2800202        mov     x2, #0x10                       // #16
|   68:   aa00e033        orr     x19, x1, x0, lsl #56
|   6c:   910083e0        add     x0, sp, #0x20
|   70:   d378fe61        lsr     x1, x19, #56
|   74:   94000000        bl      0 <__hwasan_tag_memory>
|   78:   f81f027f        stur    xzr, [x19, #-16]
|   7c:   94000000        bl      0 <extern_func>
|   80:   f85f0273        ldur    x19, [x19, #-16]
|   84:   910083e0        add     x0, sp, #0x20
|   88:   d2800202        mov     x2, #0x10                       // #16
|   8c:   52800001        mov     w1, #0x0                        // #0
|   90:   94000000        bl      0 <__hwasan_tag_memory>
|   94:   aa1303e0        mov     x0, x19
|   98:   f9400bf3        ldr     x19, [sp, #16]
|   9c:   a8c37bfd        ldp     x29, x30, [sp], #48
|   a0:   d65f03c0        ret
| 
| Disassembly of section .text.startup:
| 
| 0000000000000000 <_sub_I_00099_0>:
|    0:   14000000        b       0 <__hwasan_init>
| [mark@lakrids:/mnt/data/tests/kasan-tags]% usekorg 12.1.0 aarch64-linux-gcc -c test.c -O2  -fsanitize=hwaddress  --param hwasan-instrument-stack=1
| [mark@lakrids:/mnt/data/tests/kasan-tags]% usekorg 14.2.0 aarch64-linux-objdump -d test.o
| 
| test.o:     file format elf64-littleaarch64
| 
| 
| Disassembly of section .text:
| 
| 0000000000000000 <load_long>:
|    0:   a9be7bfd        stp     x29, x30, [sp, #-32]!
|    4:   910003fd        mov     x29, sp
|    8:   f9000bf3        str     x19, [sp, #16]
|    c:   aa0003f3        mov     x19, x0
|   10:   94000000        bl      0 <__hwasan_load8>
|   14:   f9400260        ldr     x0, [x19]
|   18:   f9400bf3        ldr     x19, [sp, #16]
|   1c:   a8c27bfd        ldp     x29, x30, [sp], #32
|   20:   d65f03c0        ret
| 
| 0000000000000024 <store_long>:
|   24:   a9be7bfd        stp     x29, x30, [sp, #-32]!
|   28:   910003fd        mov     x29, sp
|   2c:   a90153f3        stp     x19, x20, [sp, #16]
|   30:   aa0003f3        mov     x19, x0
|   34:   aa0103f4        mov     x20, x1
|   38:   94000000        bl      0 <__hwasan_store8>
|   3c:   f9000274        str     x20, [x19]
|   40:   a94153f3        ldp     x19, x20, [sp, #16]
|   44:   a8c27bfd        ldp     x29, x30, [sp], #32
|   48:   d65f03c0        ret
|   4c:   d503201f        nop
| 
| 0000000000000050 <stack_func>:
|   50:   a9bd7bfd        stp     x29, x30, [sp, #-48]!
|   54:   910003fd        mov     x29, sp
|   58:   f9000bf3        str     x19, [sp, #16]
|   5c:   94000000        bl      0 <__hwasan_generate_tag>
|   60:   9100c3e1        add     x1, sp, #0x30
|   64:   d2800202        mov     x2, #0x10                       // #16
|   68:   aa00e033        orr     x19, x1, x0, lsl #56
|   6c:   910083e0        add     x0, sp, #0x20
|   70:   d378fe61        lsr     x1, x19, #56
|   74:   94000000        bl      0 <__hwasan_tag_memory>
|   78:   f81f027f        stur    xzr, [x19, #-16]
|   7c:   94000000        bl      0 <extern_func>
|   80:   f85f0273        ldur    x19, [x19, #-16]
|   84:   910083e0        add     x0, sp, #0x20
|   88:   d2800202        mov     x2, #0x10                       // #16
|   8c:   52800001        mov     w1, #0x0                        // #0
|   90:   94000000        bl      0 <__hwasan_tag_memory>
|   94:   aa1303e0        mov     x0, x19
|   98:   f9400bf3        ldr     x19, [sp, #16]
|   9c:   a8c37bfd        ldp     x29, x30, [sp], #48
|   a0:   d65f03c0        ret
| 
| Disassembly of section .text.startup:
| 
| 0000000000000000 <_sub_I_00099_0>:
|    0:   14000000        b       0 <__hwasan_init>

Mark.



* Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch
  2024-09-23 10:46     ` Mark Rutland
@ 2024-09-23 20:12       ` Andrey Konovalov
  0 siblings, 0 replies; 14+ messages in thread
From: Andrey Konovalov @ 2024-09-23 20:12 UTC (permalink / raw)
  To: Mark Rutland, Alexander Potapenko
  Cc: Marc Zyngier, Will Deacon, syzbot, catalin.marinas,
	linux-arm-kernel, linux-kernel, syzkaller-bugs, kasan-dev,
	Aleksandr Nogikh, Andrey Ryabinin

On Mon, Sep 23, 2024 at 12:46 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> > > > ==================================================================
> > > > BUG: KASAN: invalid-access in smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
> > > > BUG: KASAN: invalid-access in setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
> > > > Write of size 4 at addr 03ff800086867e00 by task swapper/0
> > > > Pointer tag: [03], memory tag: [fe]
> > > >
> > > > CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc5-syzkaller-g33faa93bc856 #0
> > > > Hardware name: linux,dummy-virt (DT)
> > > > Call trace:
> > > >  dump_backtrace+0x204/0x3b8 arch/arm64/kernel/stacktrace.c:317
> > > >  show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
> > > >  __dump_stack lib/dump_stack.c:93 [inline]
> > > >  dump_stack_lvl+0x260/0x3b4 lib/dump_stack.c:119
> > > >  print_address_description mm/kasan/report.c:377 [inline]
> > > >  print_report+0x118/0x5ac mm/kasan/report.c:488
> > > >  kasan_report+0xc8/0x108 mm/kasan/report.c:601
> > > >  kasan_check_range+0x94/0xb8 mm/kasan/sw_tags.c:84
> > > >  __hwasan_store4_noabort+0x20/0x2c mm/kasan/sw_tags.c:149
> > > >  smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
> > > >  setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
> > > >  start_kernel+0xe0/0xff0 init/main.c:926
> > > >  __primary_switched+0x84/0x8c arch/arm64/kernel/head.S:243
> > > >
> > > > The buggy address belongs to stack of task swapper/0
> > > >
> > > > Memory state around the buggy address:
> > > >  ffff800086867c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > >  ffff800086867d00: 00 fe fe 00 00 00 fe fe fe fe fe fe fe fe fe fe
> > > > >ffff800086867e00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > > >                    ^
> > > >  ffff800086867f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > > >  ffff800086868000: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > > > ==================================================================
> > >
> > > I can't spot the issue here. We have a couple of fixed-length
> > > (4 element) arrays on the stack and they're indexed by a simple loop
> > > counter that runs from 0-3.
> >
> > Having trimmed the config to the extreme, I can only trigger the
> > warning with CONFIG_KASAN_SW_TAGS (CONFIG_KASAN_GENERIC does not
> > scream). Same thing if I use gcc 14.2.0.
> >
> > However, compiling with clang 14 (Debian clang version 14.0.6) does
> > *not* result in a screaming kernel, even with KASAN_SW_TAGS.

Yeah, this is #1 from https://bugzilla.kernel.org/show_bug.cgi?id=218854.

> > So I can see two possibilities here:
> >
> > - either gcc is incompatible with KASAN_SW_TAGS and the generic
> >   version is the only one that works
> >
> > - or we have a compiler bug on our hands.
> >
> > Frankly, I can't believe the latter, as the code is so daft that I
> > can't imagine gcc getting it *that* wrong.
>
> It looks like what's happening here is:
>
> (1) With CONFIG_KASAN_SW_TAGS=y we pass the compiler
>     `-fsanitize=kernel-hwaddress`.
>
> (2) When GCC is passed `-fsanitize=hwaddress` or
>     `-fsanitize=kernel-hwaddress` it ignores
>     `__attribute__((no_sanitize_address))`, and instruments functions that
>     we require not to be instrumented.
>
>     I believe this is a compiler bug, as there doesn't seem to be a
>     separate attribute to prevent instrumentation in this mode.
>
> (3) In this config, smp_build_mpidr_hash() gets inlined into
>     setup_arch(), and as setup_arch() is instrumented, all of the stack
>     variables for smp_build_mpidr_hash() are initialized at the start of
>     setup_arch(), with calls to __hwasan_tag_memory().
>
>     At this point, we are using the early shadow (where a single page of
>     shadow is used for all memory).
>
> (4) In setup_arch(), we call kasan_init() to transition from the early
>     shadow to the runtime shadow. This replaces the early shadow memory
>     with new shadow memory initialized to KASAN_SHADOW_INIT (0xFE AKA
>     KASAN_TAG_INVALID), including the shadow for the stack.
>
> (5) Once the CPU returns back into setup_arch(), it's using the new
>     shadow initialized to 0xFE. Subsequent stack accesses which check
>     the shadow see 0xFE in the shadow, and fault. Note that in the dump
>     of the shadow above, the shadow around ffff800086867d80 and above is
>     all 0xFE, while below that functions have managed to clear the
>     shadow.
>
> Compiler test case below. Note that this demonstrates the compiler
> ignores `__attribute__((no_sanitize_address))` regardless of
> KASAN_STACK, so KASAN_SW_TAGS is generally broken with GCC. All versions
> I tried were broken, from 11.3.0 to 14.2.0 inclusive.

Thank you for the detailed investigation report!

> I think we have to disable KASAN_SW_TAGS with GCC until this is fixed.

Sounds good to me.

Please reference https://bugzilla.kernel.org/show_bug.cgi?id=218854 if
you end up sending a patch for this.

Also, syzbot's kvm instance should probably be switched to Clang
(@Alexander).

Thank you!



Thread overview: 14+ messages
2024-08-30  8:35 [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch syzbot
2024-08-30  9:52 ` Will Deacon
2024-08-31 17:52   ` Marc Zyngier
2024-09-02 10:03     ` Aleksandr Nogikh
2024-09-03 15:39       ` Alexander Potapenko
2024-09-03 16:05         ` Marc Zyngier
2024-09-03 16:43           ` Samuel Holland
2024-09-04 15:31             ` Alexander Potapenko
2024-09-04 18:26     ` Mark Rutland
2024-09-05 14:03     ` Mark Rutland
2024-09-05 14:25       ` Ard Biesheuvel
2024-09-19  9:14         ` Mark Rutland
2024-09-23 10:46     ` Mark Rutland
2024-09-23 20:12       ` Andrey Konovalov
