public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* lockdep + kasan bug?
@ 2023-11-20 23:36 Kent Overstreet
  2023-11-21 10:36 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Kent Overstreet @ 2023-11-20 23:36 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng
  Cc: linux-kernel

I've been seeing a lot of reports like the following in a lot of my
lockdep + kasan tests.

Some lockdep patches are in my tree: they don't touch this code path
(except I do have to increase MAX_LOCK_DEPTH from 48 to 63, perhaps that
has unintended side effects?)

https://evilpiepirate.org/git/bcachefs.git/log/?id=2f42f415f7573001b4f4887b785d8a8747b3757f

bcachefs does take a _large_ number of locks for lockdep to track, also
possibly relevant

Have not dug into the lockdep hash table of outstanding locks code yet
but happy to test patches...

04752 ========= TEST   tiering_variable_buckets_replicas
04752 
04752 WATCHDOG 3600
04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): mounting version 1.3: rebalance_work opts=metadata_replicas=2,data_replicas=2,foreground_target=ssd,background_target=hdd,promote_target=ssd,fsck
04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): initializing new filesystem
04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): going read-write
04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): marking superblocks
04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): initializing freespace
04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): done initializing freespace
04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): reading snapshots table
04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): reading snapshots done
04753 WATCHDOG 3600
04753 randrw: (g=0): rw=randrw, bs=(R) 4096B-1024KiB, (W) 4096B-1024KiB, (T) 4096B-1024KiB, ioengine=libaio, iodepth=64
04753 fio-3.33
04753 Starting 1 process
04753 randrw: Laying out IO file (1 file / 3500MiB)
05117 Jobs: 1 (f=1)
05117 BUG: KASAN: global-out-of-bounds in add_chain_block+0x44/0x288
05117 Read of size 4 at addr ffffffc081b7a8bc by task fio/120528
05117 
05117 CPU: 11 PID: 120528 Comm: fio Tainted: G             L     6.6.0-ktest-gc18b7260ddd3 #8209
05117 Hardware name: linux,dummy-virt (DT)
05117 Call trace:
05117  dump_backtrace+0xa8/0xe8
05117  show_stack+0x1c/0x30
05117  dump_stack_lvl+0x5c/0xa0
05117  print_report+0x1e4/0x5a0
05117  kasan_report+0x80/0xc0
05117  __asan_load4+0x90/0xb0
05117  add_chain_block+0x44/0x288
05117  __lock_acquire+0x1104/0x24f8
05117  lock_acquire+0x1e0/0x470
05117  _raw_spin_lock_nested+0x54/0x78
05117  raw_spin_rq_lock_nested+0x30/0x50
05117  try_to_wake_up+0x3b4/0x1050
05117  wake_up_process+0x1c/0x30
05117  kick_pool+0x104/0x1b0
05117  __queue_work+0x350/0xa58
05117  queue_work_on+0x98/0xd0
05117  __bch2_btree_node_write+0xec0/0x10a0
05117  bch2_btree_node_write+0x88/0x138
05117  btree_split+0x744/0x14a0
05117  bch2_btree_split_leaf+0x94/0x258
05117  bch2_trans_commit_error.isra.0+0x234/0x7d0
05117  __bch2_trans_commit+0x1128/0x3010
05117  bch2_extent_update+0x410/0x570
05117  bch2_write_index_default+0x404/0x598
05117  __bch2_write_index+0xb0/0x3b0
05117  __bch2_write+0x6f0/0x928
05117  bch2_write+0x368/0x8e0
05117  bch2_direct_write+0xaa8/0x12c0
05117  bch2_write_iter+0x2e4/0x1050
05117  aio_write.constprop.0+0x19c/0x420
05117  io_submit_one.constprop.0+0xf30/0x17a0
05117  __arm64_sys_io_submit+0x244/0x388
05117  invoke_syscall.constprop.0+0x64/0x138
05117  do_el0_svc+0x7c/0x120
05117  el0_svc+0x34/0x80
05117  el0t_64_sync_handler+0xb8/0xc0
05117  el0t_64_sync+0x14c/0x150
05117 
05117 The buggy address belongs to the variable:
05117  nr_large_chain_blocks+0x3c/0x40
05117 
05117 The buggy address belongs to the virtual mapping at
05117  [ffffffc081710000, ffffffc088861000) created by:
05117  paging_init+0x260/0x820
05117 
05117 The buggy address belongs to the physical page:
05117 page:00000000ce625900 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x41d7a
05117 flags: 0x4000(reserved|zone=0)
05117 page_type: 0xffffffff()
05117 raw: 0000000000004000 fffffffe00075e88 fffffffe00075e88 0000000000000000
05117 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
05117 page dumped because: kasan: bad access detected
05117 
05117 Memory state around the buggy address:
05117  ffffffc081b7a780: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
05117  ffffffc081b7a800: 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
05117 >ffffffc081b7a880: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
05117                                         ^
05117  ffffffc081b7a900: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
05117  ffffffc081b7a980: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
05117 ==================================================================
05117 Kernel panic - not syncing: kasan.fault=panic set ...
05117 CPU: 11 PID: 120528 Comm: fio Tainted: G             L     6.6.0-ktest-gc18b7260ddd3 #8209
05117 Hardware name: linux,dummy-virt (DT)
05117 Call trace:
05117  dump_backtrace+0xa8/0xe8
05117  show_stack+0x1c/0x30
05117  dump_stack_lvl+0x5c/0xa0
05117  dump_stack+0x18/0x20
05117  panic+0x3ac/0x408
05117  kasan_report_invalid_free+0x0/0x90
05117  kasan_report+0x90/0xc0
05117  __asan_load4+0x90/0xb0
05117  add_chain_block+0x44/0x288
05117  __lock_acquire+0x1104/0x24f8
05117  lock_acquire+0x1e0/0x470
05117  _raw_spin_lock_nested+0x54/0x78
05117  raw_spin_rq_lock_nested+0x30/0x50
05117  try_to_wake_up+0x3b4/0x1050
05117  wake_up_process+0x1c/0x30
05117  kick_pool+0x104/0x1b0
05117  __queue_work+0x350/0xa58
05117  queue_work_on+0x98/0xd0
05117  __bch2_btree_node_write+0xec0/0x10a0
05117  bch2_btree_node_write+0x88/0x138
05117  btree_split+0x744/0x14a0
05117  bch2_btree_split_leaf+0x94/0x258
05117  bch2_trans_commit_error.isra.0+0x234/0x7d0
05117  __bch2_trans_commit+0x1128/0x3010
05117  bch2_extent_update+0x410/0x570
05117  bch2_write_index_default+0x404/0x598
05117  __bch2_write_index+0xb0/0x3b0
05117  __bch2_write+0x6f0/0x928
05117  bch2_write+0x368/0x8e0
05117  bch2_direct_write+0xaa8/0x12c0
05117  bch2_write_iter+0x2e4/0x1050
05117  aio_write.constprop.0+0x19c/0x420
05117  io_submit_one.constprop.0+0xf30/0x17a0
05117  __arm64_sys_io_submit+0x244/0x388
05117  invoke_syscall.constprop.0+0x64/0x138
05117  do_el0_svc+0x7c/0x120
05117  el0_svc+0x34/0x80
05117  el0t_64_sync_handler+0xb8/0xc0
05117  el0t_64_sync+0x14c/0x150
05117 SMP: stopping secondary CPUs
05117 Kernel Offset: disabled
05117 CPU features: 0x0,00000000,70000001,1040500b
05117 Memory Limit: none
05117 ---[ end Kernel panic - not syncing: kasan.fault=panic set ... ]---
05122 ========= FAILED TIMEOUT tiering_variable_buckets_replicas in 3600s

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: lockdep + kasan bug?
  2023-11-20 23:36 lockdep + kasan bug? Kent Overstreet
@ 2023-11-21 10:36 ` Peter Zijlstra
  2023-11-21 11:14   ` Mark Rutland
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2023-11-21 10:36 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng, linux-kernel

On Mon, Nov 20, 2023 at 06:36:59PM -0500, Kent Overstreet wrote:
> I've been seeing a lot of reports like the following in a lot of my
> lockdep + kasan tests.

I'm not aware of any such issues, then again, I rarely run with KASAN
enabled myself, I mostly leave that to the robots, who are far more
patient than me with slow kernels.

> Some lockdep patches are in my tree: they don't touch this code path
> (except I do have to increase MAX_LOCK_DEPTH from 48 to 63, perhaps that
> has unintended side effects?)
> 
> https://evilpiepirate.org/git/bcachefs.git/log/?id=2f42f415f7573001b4f4887b785d8a8747b3757f

yeah, don't see anything weird there. I mean, sad about the no-recursion
thing, esp. after you did those custom order bits.

> bcachefs does take a _large_ number of locks for lockdep to track, also
> possibly relevant
> 
> Have not dug into the lockdep hash table of outstanding locks code yet
> but happy to test patches...
> 
> 04752 ========= TEST   tiering_variable_buckets_replicas
> 04752 
> 04752 WATCHDOG 3600
> 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): mounting version 1.3: rebalance_work opts=metadata_replicas=2,data_replicas=2,foreground_target=ssd,background_target=hdd,promote_target=ssd,fsck
> 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): initializing new filesystem
> 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): going read-write
> 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): marking superblocks
> 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): initializing freespace
> 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): done initializing freespace
> 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): reading snapshots table
> 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): reading snapshots done
> 04753 WATCHDOG 3600
> 04753 randrw: (g=0): rw=randrw, bs=(R) 4096B-1024KiB, (W) 4096B-1024KiB, (T) 4096B-1024KiB, ioengine=libaio, iodepth=64
> 04753 fio-3.33
> 04753 Starting 1 process
> 04753 randrw: Laying out IO file (1 file / 3500MiB)
> 05117 Jobs: 1 (f=1)
> 05117 BUG: KASAN: global-out-of-bounds in add_chain_block+0x44/0x288
> 05117 Read of size 4 at addr ffffffc081b7a8bc by task fio/120528
> 05117 
> 05117 CPU: 11 PID: 120528 Comm: fio Tainted: G             L     6.6.0-ktest-gc18b7260ddd3 #8209
> 05117 Hardware name: linux,dummy-virt (DT)
> 05117 Call trace:
> 05117  dump_backtrace+0xa8/0xe8
> 05117  show_stack+0x1c/0x30
> 05117  dump_stack_lvl+0x5c/0xa0
> 05117  print_report+0x1e4/0x5a0
> 05117  kasan_report+0x80/0xc0
> 05117  __asan_load4+0x90/0xb0
> 05117  add_chain_block+0x44/0x288
> 05117  __lock_acquire+0x1104/0x24f8
> 05117  lock_acquire+0x1e0/0x470
> 05117  _raw_spin_lock_nested+0x54/0x78
> 05117  raw_spin_rq_lock_nested+0x30/0x50
> 05117  try_to_wake_up+0x3b4/0x1050
> 05117  wake_up_process+0x1c/0x30
> 05117  kick_pool+0x104/0x1b0
> 05117  __queue_work+0x350/0xa58
> 05117  queue_work_on+0x98/0xd0
> 05117  __bch2_btree_node_write+0xec0/0x10a0
> 05117  bch2_btree_node_write+0x88/0x138
> 05117  btree_split+0x744/0x14a0
> 05117  bch2_btree_split_leaf+0x94/0x258
> 05117  bch2_trans_commit_error.isra.0+0x234/0x7d0
> 05117  __bch2_trans_commit+0x1128/0x3010
> 05117  bch2_extent_update+0x410/0x570
> 05117  bch2_write_index_default+0x404/0x598
> 05117  __bch2_write_index+0xb0/0x3b0
> 05117  __bch2_write+0x6f0/0x928
> 05117  bch2_write+0x368/0x8e0
> 05117  bch2_direct_write+0xaa8/0x12c0
> 05117  bch2_write_iter+0x2e4/0x1050
> 05117  aio_write.constprop.0+0x19c/0x420
> 05117  io_submit_one.constprop.0+0xf30/0x17a0
> 05117  __arm64_sys_io_submit+0x244/0x388
> 05117  invoke_syscall.constprop.0+0x64/0x138
> 05117  do_el0_svc+0x7c/0x120
> 05117  el0_svc+0x34/0x80
> 05117  el0t_64_sync_handler+0xb8/0xc0
> 05117  el0t_64_sync+0x14c/0x150
> 05117 
> 05117 The buggy address belongs to the variable:
> 05117  nr_large_chain_blocks+0x3c/0x40

This is weird, nr_lage_chain_blocks is a single variable, if the
compiler keeps layout according to the source file, this would be
chaing_block_bucket[14] or something weird like that.

Perhaps figure out what it things the @size argument to
add_chain_block() would be?

> 05117 
> 05117 The buggy address belongs to the virtual mapping at
> 05117  [ffffffc081710000, ffffffc088861000) created by:
> 05117  paging_init+0x260/0x820
> 05117 
> 05117 The buggy address belongs to the physical page:
> 05117 page:00000000ce625900 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x41d7a
> 05117 flags: 0x4000(reserved|zone=0)
> 05117 page_type: 0xffffffff()
> 05117 raw: 0000000000004000 fffffffe00075e88 fffffffe00075e88 0000000000000000
> 05117 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
> 05117 page dumped because: kasan: bad access detected
> 05117 
> 05117 Memory state around the buggy address:
> 05117  ffffffc081b7a780: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
> 05117  ffffffc081b7a800: 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
> 05117 >ffffffc081b7a880: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
> 05117                                         ^
> 05117  ffffffc081b7a900: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
> 05117  ffffffc081b7a980: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
> 05117 ==================================================================
> 05117 Kernel panic - not syncing: kasan.fault=panic set ...
> 05117 CPU: 11 PID: 120528 Comm: fio Tainted: G             L     6.6.0-ktest-gc18b7260ddd3 #8209
> 05117 Hardware name: linux,dummy-virt (DT)
> 05117 Call trace:
> 05117  dump_backtrace+0xa8/0xe8
> 05117  show_stack+0x1c/0x30
> 05117  dump_stack_lvl+0x5c/0xa0
> 05117  dump_stack+0x18/0x20
> 05117  panic+0x3ac/0x408
> 05117  kasan_report_invalid_free+0x0/0x90
> 05117  kasan_report+0x90/0xc0
> 05117  __asan_load4+0x90/0xb0
> 05117  add_chain_block+0x44/0x288
> 05117  __lock_acquire+0x1104/0x24f8
> 05117  lock_acquire+0x1e0/0x470
> 05117  _raw_spin_lock_nested+0x54/0x78
> 05117  raw_spin_rq_lock_nested+0x30/0x50
> 05117  try_to_wake_up+0x3b4/0x1050
> 05117  wake_up_process+0x1c/0x30
> 05117  kick_pool+0x104/0x1b0
> 05117  __queue_work+0x350/0xa58
> 05117  queue_work_on+0x98/0xd0
> 05117  __bch2_btree_node_write+0xec0/0x10a0
> 05117  bch2_btree_node_write+0x88/0x138
> 05117  btree_split+0x744/0x14a0
> 05117  bch2_btree_split_leaf+0x94/0x258
> 05117  bch2_trans_commit_error.isra.0+0x234/0x7d0
> 05117  __bch2_trans_commit+0x1128/0x3010
> 05117  bch2_extent_update+0x410/0x570
> 05117  bch2_write_index_default+0x404/0x598
> 05117  __bch2_write_index+0xb0/0x3b0
> 05117  __bch2_write+0x6f0/0x928
> 05117  bch2_write+0x368/0x8e0
> 05117  bch2_direct_write+0xaa8/0x12c0
> 05117  bch2_write_iter+0x2e4/0x1050
> 05117  aio_write.constprop.0+0x19c/0x420
> 05117  io_submit_one.constprop.0+0xf30/0x17a0
> 05117  __arm64_sys_io_submit+0x244/0x388
> 05117  invoke_syscall.constprop.0+0x64/0x138
> 05117  do_el0_svc+0x7c/0x120
> 05117  el0_svc+0x34/0x80
> 05117  el0t_64_sync_handler+0xb8/0xc0
> 05117  el0t_64_sync+0x14c/0x150
> 05117 SMP: stopping secondary CPUs
> 05117 Kernel Offset: disabled
> 05117 CPU features: 0x0,00000000,70000001,1040500b
> 05117 Memory Limit: none
> 05117 ---[ end Kernel panic - not syncing: kasan.fault=panic set ... ]---
> 05122 ========= FAILED TIMEOUT tiering_variable_buckets_replicas in 3600s

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: lockdep + kasan bug?
  2023-11-21 10:36 ` Peter Zijlstra
@ 2023-11-21 11:14   ` Mark Rutland
  2023-11-21 11:41     ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Rutland @ 2023-11-21 11:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Kent Overstreet, Ingo Molnar, Will Deacon, Waiman Long,
	Boqun Feng, linux-kernel

On Tue, Nov 21, 2023 at 11:36:14AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 20, 2023 at 06:36:59PM -0500, Kent Overstreet wrote:
> > I've been seeing a lot of reports like the following in a lot of my
> > lockdep + kasan tests.
> 
> I'm not aware of any such issues, then again, I rarely run with KASAN
> enabled myself, I mostly leave that to the robots, who are far more
> patient than me with slow kernels.
> 
> > Some lockdep patches are in my tree: they don't touch this code path
> > (except I do have to increase MAX_LOCK_DEPTH from 48 to 63, perhaps that
> > has unintended side effects?)
> > 
> > https://evilpiepirate.org/git/bcachefs.git/log/?id=2f42f415f7573001b4f4887b785d8a8747b3757f
> 
> yeah, don't see anything weird there. I mean, sad about the no-recursion
> thing, esp. after you did those custom order bits.
> 
> > bcachefs does take a _large_ number of locks for lockdep to track, also
> > possibly relevant
> > 
> > Have not dug into the lockdep hash table of outstanding locks code yet
> > but happy to test patches...
> > 
> > 04752 ========= TEST   tiering_variable_buckets_replicas
> > 04752 
> > 04752 WATCHDOG 3600
> > 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): mounting version 1.3: rebalance_work opts=metadata_replicas=2,data_replicas=2,foreground_target=ssd,background_target=hdd,promote_target=ssd,fsck
> > 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): initializing new filesystem
> > 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): going read-write
> > 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): marking superblocks
> > 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): initializing freespace
> > 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): done initializing freespace
> > 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): reading snapshots table
> > 04753 bcachefs (ea667958-8bbd-451b-9043-9132a2fd2fa4): reading snapshots done
> > 04753 WATCHDOG 3600
> > 04753 randrw: (g=0): rw=randrw, bs=(R) 4096B-1024KiB, (W) 4096B-1024KiB, (T) 4096B-1024KiB, ioengine=libaio, iodepth=64
> > 04753 fio-3.33
> > 04753 Starting 1 process
> > 04753 randrw: Laying out IO file (1 file / 3500MiB)
> > 05117 Jobs: 1 (f=1)
> > 05117 BUG: KASAN: global-out-of-bounds in add_chain_block+0x44/0x288
> > 05117 Read of size 4 at addr ffffffc081b7a8bc by task fio/120528
> > 05117 
> > 05117 CPU: 11 PID: 120528 Comm: fio Tainted: G             L     6.6.0-ktest-gc18b7260ddd3 #8209
> > 05117 Hardware name: linux,dummy-virt (DT)
> > 05117 Call trace:
> > 05117  dump_backtrace+0xa8/0xe8
> > 05117  show_stack+0x1c/0x30
> > 05117  dump_stack_lvl+0x5c/0xa0
> > 05117  print_report+0x1e4/0x5a0
> > 05117  kasan_report+0x80/0xc0
> > 05117  __asan_load4+0x90/0xb0
> > 05117  add_chain_block+0x44/0x288
> > 05117  __lock_acquire+0x1104/0x24f8
> > 05117  lock_acquire+0x1e0/0x470
> > 05117  _raw_spin_lock_nested+0x54/0x78
> > 05117  raw_spin_rq_lock_nested+0x30/0x50
> > 05117  try_to_wake_up+0x3b4/0x1050
> > 05117  wake_up_process+0x1c/0x30
> > 05117  kick_pool+0x104/0x1b0
> > 05117  __queue_work+0x350/0xa58
> > 05117  queue_work_on+0x98/0xd0
> > 05117  __bch2_btree_node_write+0xec0/0x10a0
> > 05117  bch2_btree_node_write+0x88/0x138
> > 05117  btree_split+0x744/0x14a0
> > 05117  bch2_btree_split_leaf+0x94/0x258
> > 05117  bch2_trans_commit_error.isra.0+0x234/0x7d0
> > 05117  __bch2_trans_commit+0x1128/0x3010
> > 05117  bch2_extent_update+0x410/0x570
> > 05117  bch2_write_index_default+0x404/0x598
> > 05117  __bch2_write_index+0xb0/0x3b0
> > 05117  __bch2_write+0x6f0/0x928
> > 05117  bch2_write+0x368/0x8e0
> > 05117  bch2_direct_write+0xaa8/0x12c0
> > 05117  bch2_write_iter+0x2e4/0x1050
> > 05117  aio_write.constprop.0+0x19c/0x420
> > 05117  io_submit_one.constprop.0+0xf30/0x17a0
> > 05117  __arm64_sys_io_submit+0x244/0x388
> > 05117  invoke_syscall.constprop.0+0x64/0x138
> > 05117  do_el0_svc+0x7c/0x120
> > 05117  el0_svc+0x34/0x80
> > 05117  el0t_64_sync_handler+0xb8/0xc0
> > 05117  el0t_64_sync+0x14c/0x150
> > 05117 
> > 05117 The buggy address belongs to the variable:
> > 05117  nr_large_chain_blocks+0x3c/0x40
> 
> This is weird, nr_lage_chain_blocks is a single variable, if the
> compiler keeps layout according to the source file, this would be
> chaing_block_bucket[14] or something weird like that.

I think the size here is bogus; IIUC that's determined form the start of the
next symbol, which happens to be 64 bytes away from the start of
nr_lage_chain_blocks.

From the memory state dump, there's padding/redzone between two global objects,
and I think we're accessing a negative offset from the next object. More on
that below.

> Perhaps figure out what it things the @size argument to
> add_chain_block() would be?
> 
> > 05117 
> > 05117 The buggy address belongs to the virtual mapping at
> > 05117  [ffffffc081710000, ffffffc088861000) created by:
> > 05117  paging_init+0x260/0x820
> > 05117 
> > 05117 The buggy address belongs to the physical page:
> > 05117 page:00000000ce625900 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x41d7a
> > 05117 flags: 0x4000(reserved|zone=0)
> > 05117 page_type: 0xffffffff()
> > 05117 raw: 0000000000004000 fffffffe00075e88 fffffffe00075e88 0000000000000000
> > 05117 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
> > 05117 page dumped because: kasan: bad access detected
> > 05117 
> > 05117 Memory state around the buggy address:
> > 05117  ffffffc081b7a780: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
> > 05117  ffffffc081b7a800: 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
> > 05117 >ffffffc081b7a880: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
> > 05117                                         ^

In this dump:

* '00' means all 8 bytes of an 8-byte region areaccessible
* '04' means the first 4 bytes on an 8-byte region are accessible
* 'f9' means KASAN_GLOBAL_REDZONE / padding between objects

So at 0xffffffc081b7a880 we have a 4-byte object, 60 bytes of padding, then a
64-byte object.

I think the 4-byte object at 0xffffffc081b7a880 is nr_large_chain_blocks, and
the later 64-byte object is chain_block_buckets[].

I suspect the dodgy access is to chain_block_buckets[-1], which hits the last 4
bytes of the redzone and gets (incorrectly/misleadingly) attributed to
nr_large_chain_blocks.

Mark.

> > 05117  ffffffc081b7a900: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
> > 05117  ffffffc081b7a980: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
> > 05117 ==================================================================
> > 05117 Kernel panic - not syncing: kasan.fault=panic set ...
> > 05117 CPU: 11 PID: 120528 Comm: fio Tainted: G             L     6.6.0-ktest-gc18b7260ddd3 #8209
> > 05117 Hardware name: linux,dummy-virt (DT)
> > 05117 Call trace:
> > 05117  dump_backtrace+0xa8/0xe8
> > 05117  show_stack+0x1c/0x30
> > 05117  dump_stack_lvl+0x5c/0xa0
> > 05117  dump_stack+0x18/0x20
> > 05117  panic+0x3ac/0x408
> > 05117  kasan_report_invalid_free+0x0/0x90
> > 05117  kasan_report+0x90/0xc0
> > 05117  __asan_load4+0x90/0xb0
> > 05117  add_chain_block+0x44/0x288
> > 05117  __lock_acquire+0x1104/0x24f8
> > 05117  lock_acquire+0x1e0/0x470
> > 05117  _raw_spin_lock_nested+0x54/0x78
> > 05117  raw_spin_rq_lock_nested+0x30/0x50
> > 05117  try_to_wake_up+0x3b4/0x1050
> > 05117  wake_up_process+0x1c/0x30
> > 05117  kick_pool+0x104/0x1b0
> > 05117  __queue_work+0x350/0xa58
> > 05117  queue_work_on+0x98/0xd0
> > 05117  __bch2_btree_node_write+0xec0/0x10a0
> > 05117  bch2_btree_node_write+0x88/0x138
> > 05117  btree_split+0x744/0x14a0
> > 05117  bch2_btree_split_leaf+0x94/0x258
> > 05117  bch2_trans_commit_error.isra.0+0x234/0x7d0
> > 05117  __bch2_trans_commit+0x1128/0x3010
> > 05117  bch2_extent_update+0x410/0x570
> > 05117  bch2_write_index_default+0x404/0x598
> > 05117  __bch2_write_index+0xb0/0x3b0
> > 05117  __bch2_write+0x6f0/0x928
> > 05117  bch2_write+0x368/0x8e0
> > 05117  bch2_direct_write+0xaa8/0x12c0
> > 05117  bch2_write_iter+0x2e4/0x1050
> > 05117  aio_write.constprop.0+0x19c/0x420
> > 05117  io_submit_one.constprop.0+0xf30/0x17a0
> > 05117  __arm64_sys_io_submit+0x244/0x388
> > 05117  invoke_syscall.constprop.0+0x64/0x138
> > 05117  do_el0_svc+0x7c/0x120
> > 05117  el0_svc+0x34/0x80
> > 05117  el0t_64_sync_handler+0xb8/0xc0
> > 05117  el0t_64_sync+0x14c/0x150
> > 05117 SMP: stopping secondary CPUs
> > 05117 Kernel Offset: disabled
> > 05117 CPU features: 0x0,00000000,70000001,1040500b
> > 05117 Memory Limit: none
> > 05117 ---[ end Kernel panic - not syncing: kasan.fault=panic set ... ]---
> > 05122 ========= FAILED TIMEOUT tiering_variable_buckets_replicas in 3600s

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: lockdep + kasan bug?
  2023-11-21 11:14   ` Mark Rutland
@ 2023-11-21 11:41     ` Peter Zijlstra
  2023-11-22 23:57       ` Kent Overstreet
  2023-11-24 10:16       ` [tip: locking/urgent] lockdep: Fix block chain corruption tip-bot2 for Peter Zijlstra
  0 siblings, 2 replies; 6+ messages in thread
From: Peter Zijlstra @ 2023-11-21 11:41 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Kent Overstreet, Ingo Molnar, Will Deacon, Waiman Long,
	Boqun Feng, linux-kernel

On Tue, Nov 21, 2023 at 11:14:37AM +0000, Mark Rutland wrote:

> > > 05117 The buggy address belongs to the variable:
> > > 05117  nr_large_chain_blocks+0x3c/0x40
> > 
> > This is weird, nr_lage_chain_blocks is a single variable, if the
> > compiler keeps layout according to the source file, this would be
> > chaing_block_bucket[14] or something weird like that.
> 
> I think the size here is bogus; IIUC that's determined form the start of the
> next symbol, which happens to be 64 bytes away from the start of
> nr_lage_chain_blocks.
> 
> From the memory state dump, there's padding/redzone between two global objects,
> and I think we're accessing a negative offset from the next object. More on
> that below.
> 
> > Perhaps figure out what it things the @size argument to
> > add_chain_block() would be?
> > 
> > > 05117 
> > > 05117 The buggy address belongs to the virtual mapping at
> > > 05117  [ffffffc081710000, ffffffc088861000) created by:
> > > 05117  paging_init+0x260/0x820
> > > 05117 
> > > 05117 The buggy address belongs to the physical page:
> > > 05117 page:00000000ce625900 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x41d7a
> > > 05117 flags: 0x4000(reserved|zone=0)
> > > 05117 page_type: 0xffffffff()
> > > 05117 raw: 0000000000004000 fffffffe00075e88 fffffffe00075e88 0000000000000000
> > > 05117 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
> > > 05117 page dumped because: kasan: bad access detected
> > > 05117 
> > > 05117 Memory state around the buggy address:
> > > 05117  ffffffc081b7a780: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
> > > 05117  ffffffc081b7a800: 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
> > > 05117 >ffffffc081b7a880: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
> > > 05117                                         ^
> 
> In this dump:
> 
> * '00' means all 8 bytes of an 8-byte region areaccessible
> * '04' means the first 4 bytes on an 8-byte region are accessible
> * 'f9' means KASAN_GLOBAL_REDZONE / padding between objects
> 
> So at 0xffffffc081b7a880 we have a 4-byte object, 60 bytes of padding, then a
> 64-byte object.
> 
> I think the 4-byte object at 0xffffffc081b7a880 is nr_large_chain_blocks, and
> the later 64-byte object is chain_block_buckets[].

Oh! That's very helpful, thanks!

> I suspect the dodgy access is to chain_block_buckets[-1], which hits the last 4
> bytes of the redzone and gets (incorrectly/misleadingly) attributed to
> nr_large_chain_blocks.

That would mean @size == 0, at which point size_to_bucket() returns -1
and the above happens.

alloc_chain_hlocks() has 'size - req', for the first with the
precondition 'size >= rq', which allows the 0.

The second is an iteration with the condition size > req, which does not
allow the 0 case.

So the first, thing, IIRC, this is trying to split a block,
del_chain_block() takes what we need, and add_chain_block() puts back
the remainder, except in the above case the remainder is 0 sized and
things go sideways or so.

Does the below help?

---
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index e85b5ad3e206..151bd3de5936 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3497,7 +3497,8 @@ static int alloc_chain_hlocks(int req)
 		size = chain_block_size(curr);
 		if (likely(size >= req)) {
 			del_chain_block(0, size, chain_block_next(curr));
-			add_chain_block(curr + req, size - req);
+			if (size > req)
+				add_chain_block(curr + req, size - req);
 			return curr;
 		}
 	}


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: lockdep + kasan bug?
  2023-11-21 11:41     ` Peter Zijlstra
@ 2023-11-22 23:57       ` Kent Overstreet
  2023-11-24 10:16       ` [tip: locking/urgent] lockdep: Fix block chain corruption tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 6+ messages in thread
From: Kent Overstreet @ 2023-11-22 23:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng,
	linux-kernel

On Tue, Nov 21, 2023 at 12:41:26PM +0100, Peter Zijlstra wrote:
> > I suspect the dodgy access is to chain_block_buckets[-1], which hits the last 4
> > bytes of the redzone and gets (incorrectly/misleadingly) attributed to
> > nr_large_chain_blocks.
> 
> That would mean @size == 0, at which point size_to_bucket() returns -1
> and the above happens.
> 
> alloc_chain_hlocks() has 'size - req', for the first with the
> precondition 'size >= rq', which allows the 0.
> 
> The second is an iteration with the condition size > req, which does not
> allow the 0 case.
> 
> So the first, thing, IIRC, this is trying to split a block,
> del_chain_block() takes what we need, and add_chain_block() puts back
> the remainder, except in the above case the remainder is 0 sized and
> things go sideways or so.
> 
> Does the below help?
> 
> ---
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index e85b5ad3e206..151bd3de5936 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -3497,7 +3497,8 @@ static int alloc_chain_hlocks(int req)
>  		size = chain_block_size(curr);
>  		if (likely(size >= req)) {
>  			del_chain_block(0, size, chain_block_next(curr));
> -			add_chain_block(curr + req, size - req);
> +			if (size > req)
> +				add_chain_block(curr + req, size - req);
>  			return curr;
>  		}
>  	}
> 

Yep, no kasan splats with that patch

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [tip: locking/urgent] lockdep: Fix block chain corruption
  2023-11-21 11:41     ` Peter Zijlstra
  2023-11-22 23:57       ` Kent Overstreet
@ 2023-11-24 10:16       ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 6+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-11-24 10:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kent Overstreet, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the locking/urgent branch of tip:

Commit-ID:     bca4104b00fec60be330cd32818dd5c70db3d469
Gitweb:        https://git.kernel.org/tip/bca4104b00fec60be330cd32818dd5c70db3d469
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Tue, 21 Nov 2023 12:41:26 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 24 Nov 2023 11:04:54 +01:00

lockdep: Fix block chain corruption

Kent reported an occasional KASAN splat in lockdep. Mark then noted:

> I suspect the dodgy access is to chain_block_buckets[-1], which hits the last 4
> bytes of the redzone and gets (incorrectly/misleadingly) attributed to
> nr_large_chain_blocks.

That would mean @size == 0, at which point size_to_bucket() returns -1
and the above happens.

alloc_chain_hlocks() has 'size - req', for the first with the
precondition 'size >= rq', which allows the 0.

This code is trying to split a block, del_chain_block() takes what we
need, and add_chain_block() puts back the remainder, except in the
above case the remainder is 0 sized and things go sideways.

Fixes: 810507fe6fd5 ("locking/lockdep: Reuse freed chain_hlocks entries")
Reported-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Kent Overstreet <kent.overstreet@linux.dev>
Link: https://lkml.kernel.org/r/20231121114126.GH8262@noisy.programming.kicks-ass.net
---
 kernel/locking/lockdep.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index e85b5ad..151bd3d 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3497,7 +3497,8 @@ static int alloc_chain_hlocks(int req)
 		size = chain_block_size(curr);
 		if (likely(size >= req)) {
 			del_chain_block(0, size, chain_block_next(curr));
-			add_chain_block(curr + req, size - req);
+			if (size > req)
+				add_chain_block(curr + req, size - req);
 			return curr;
 		}
 	}

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-11-24 10:16 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-20 23:36 lockdep + kasan bug? Kent Overstreet
2023-11-21 10:36 ` Peter Zijlstra
2023-11-21 11:14   ` Mark Rutland
2023-11-21 11:41     ` Peter Zijlstra
2023-11-22 23:57       ` Kent Overstreet
2023-11-24 10:16       ` [tip: locking/urgent] lockdep: Fix block chain corruption tip-bot2 for Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox