public inbox for linux-m68k@lists.linux-m68k.org
 help / color / mirror / Atom feed
From: Michael Schmitz <schmitzmic@gmail.com>
To: Finn Thain <fthain@linux-m68k.org>
Cc: Guenter Roeck <linux@roeck-us.net>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	linux-m68k@lists.linux-m68k.org
Subject: Re: spinlock recursion when running q800 emulation in qemu
Date: Sat, 9 Mar 2024 09:14:03 +1300	[thread overview]
Message-ID: <bd6eb981-0dfe-a9c9-7777-0cfa0703efc0@gmail.com> (raw)
In-Reply-To: <5076e848-9bd3-3fea-0aca-5f62a8739a73@linux-m68k.org>

Hi Finn,

Am 08.03.2024 um 22:33 schrieb Finn Thain:
> On Fri, 8 Mar 2024, I wrote:
>
>> On Fri, 8 Mar 2024, Michael Schmitz wrote:
>>
>>>
>>>> IMHO, Gueunter would do better to instrument the spinlock tests on the
>>>> assumption that the locking in arch/m68k really is correct.
>>>
>>> I've come to agree - maybe logging any run queue locks taken by the init
>>> task with IRQs enabled might help?
>>>
>>
>> If I understood how the spinlock test algorithm worked, I'd be adding
>> sanity checks to it to test the assumptions underlying that algorithm.
>>
>> I would not be surprised if it uses logic that sees irqs_disabled() and
>> then assumes that all hard irqs are disabled, or something similar.

That is quite likely, given our past experience with core code 
assumptions about interrupts.

>> I don't think I'd be adding printk() because the BUG is intermittent and
>> printk() messes with timing.
>>
>
> Another way to tackle this is first find a reliable reproducer, then raise
> the IPL before before dispatching hard interrupts.
>
> The patch below was a forgotten experiment that I tried 3 years ago to
> deal with a similar failure. At the time I was using stress-ng to
> investigate a different issue, but maybe a workload like that could be a
> reproducer.
>
> ---
>
> After enabling CONFIG_DEBUG_SPINLOCK, I see the following failure on m68k.
>
> [ 4089.070000] BUG: spinlock trylock failure on UP on CPU#0, stress-ng/1341
> [ 4089.070000]  lock: 0x53d4a0, .magic: dead4ead, .owner: stress-ng/1341, .owner_cpu: 0
> [ 4089.070000] CPU: 0 PID: 1341 Comm: stress-ng Not tainted 5.14.0-multi-debug #2
> [ 4089.070000] Stack from 00a7dcc8:
> [ 4089.070000]         00a7dcc8 004da40f 004da40f 00000000 00a7dce4 0043e88e 004da40f 00a7dd10
> [ 4089.070000]         0006316c 004d240e 0053d4a0 dead4ead 00956bba 0000053d 00000000 0053d4a0
> [ 4089.070000]         00a7ddd8 0055e6ac 00a7dd28 000632b2 0053d4a0 004d2488 0005c81b 0053d458
> [ 4089.070000]         00a7dd34 0044780e 0053d4a0 00a7dd58 002b8694 0053d4a0 00000001 0000007f
> [ 4089.070000]         00860000 00000011 00528138 00528198 00a7dd7c 00069f6a 0000000d 00014200
> [ 4089.070000]         00528138 00a7dd78 00a7def0 00528138 00014200 00a7dd98 00069ff0 00528138
> [ 4089.070000] Call Trace: [<0043e88e>] dump_stack+0x10/0x16
> [ 4089.070000]  [<0006316c>] spin_dump+0x6c/0xc0
> [ 4089.070000]  [<000632b2>] do_raw_spin_trylock+0x32/0x80
> [ 4089.070000]  [<0005c81b>] init_dl_bandwidth+0x1b/0x80
> [ 4089.070000]  [<0044780e>] _raw_spin_trylock+0xe/0x40
> [ 4089.070000]  [<002b8694>] add_interrupt_randomness+0x154/0x1c0
> [ 4089.070000]  [<00069f6a>] handle_irq_event_percpu+0x2a/0x80
> [ 4089.070000]  [<00014200>] INV_L2+0x8/0x10
> [ 4089.070000]  [<00014200>] INV_L2+0x8/0x10
> [ 4089.070000]  [<00069ff0>] handle_irq_event+0x30/0x80
> [ 4089.070000]  [<0006d9e4>] handle_simple_irq+0x64/0x80
> [ 4089.070000]  [<000695fe>] generic_handle_irq+0x3e/0x80
> [ 4089.070000]  [<00002ea4>] do_IRQ+0x24/0x40
> [ 4089.070000]  [<00002da8>] user_irqvec_fixup+0xc/0x14
> [ 4089.070000]  [<0005c81a>] init_dl_bandwidth+0x1a/0x80
> [ 4089.070000]  [<000d51c0>] find_vma+0x0/0x80
> [ 4089.070000]  [<00444500>] down_read+0x0/0x200
> [ 4089.070000]  [<002b86b0>] add_interrupt_randomness+0x170/0x1c0
> [ 4089.070000]  [<00069f6a>] handle_irq_event_percpu+0x2a/0x80
> [ 4089.070000]  [<00069ff0>] handle_irq_event+0x30/0x80
> [ 4089.070000]  [<0006d9e4>] handle_simple_irq+0x64/0x80
> [ 4089.070000]  [<000695fe>] generic_handle_irq+0x3e/0x80
> [ 4089.070000]  [<00002ea4>] do_IRQ+0x24/0x40
> [ 4089.070000]  [<00002d74>] auto_irqhandler_fixup+0x4/0xc
> [ 4089.070000]  [<0000653c>] buserr_c+0x17c/0x600
> [ 4089.070000]  [<00002ba8>] buserr+0x20/0x28
> [ 4089.070000]  [<0008800d>] get_next_modinfo+0xcd/0x100
>
> The problem seems to be that add_interrupt_randomness() was re-entered,
> leading to a contended input_pool.lock and "spinlock trylock failure".

We already knew that add_interrupt_randomness() can be reentered, though 
there are now checks in place that avoid taking input_pool.lock in 
hardirq context. But these checks have been added less than three years 
ago.

>
> diff --git a/arch/m68k/kernel/irq.c b/arch/m68k/kernel/irq.c
> index 9ab4f550342e..b46d8a57f4da 100644
> --- a/arch/m68k/kernel/irq.c
> +++ b/arch/m68k/kernel/irq.c
> @@ -20,10 +20,13 @@
>  asmlinkage void do_IRQ(int irq, struct pt_regs *regs)
>  {
>  	struct pt_regs *oldregs = set_irq_regs(regs);
> +	unsigned long flags;
>
> +	local_irq_save(flags);
>  	irq_enter();
>  	generic_handle_irq(irq);
>  	irq_exit();
> +	local_irq_restore(flags);
>
>  	set_irq_regs(oldregs);
>  }
>

I notice legacy_timer_tick() states it must be called with interrupts 
disabled. We do not explicitly do that in via_timer_handler(), but rely 
on the IPL remaining high. Maybe adding local_irq_save(flags) / 
local_irq_restore(flags) around the legacy_timer_tick() call would help 
debug this?

Cheers,

	Michael

  reply	other threads:[~2024-03-08 20:14 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-04 17:58 spinlock recursion when running q800 emulation in qemu Guenter Roeck
2024-03-05  0:33 ` Finn Thain
2024-03-05  0:48   ` Michael Schmitz
     [not found]   ` <fcb506f2-523d-4efc-ae3d-fe3c79c6f09e@gmail.com>
2024-03-05  0:58     ` Guenter Roeck
2024-03-05  1:06       ` Michael Schmitz
2024-03-06  7:14 ` Michael Schmitz
2024-03-06  8:30   ` Brad Boyer
2024-03-06 23:13     ` Finn Thain
2024-03-06 23:46       ` Guenter Roeck
2024-03-07 23:35         ` Finn Thain
2024-03-06 23:42     ` Michael Schmitz
2024-03-06 23:52   ` Finn Thain
2024-03-08  0:20     ` Michael Schmitz
2024-03-08  0:56       ` Finn Thain
2024-03-08  8:06         ` Michael Schmitz
2024-03-08  9:15           ` Finn Thain
2024-03-08  9:33             ` Finn Thain
2024-03-08 20:14               ` Michael Schmitz [this message]
2024-03-09  5:02                 ` Finn Thain
2024-03-09 20:56                   ` Michael Schmitz
2024-03-09 22:18                     ` Finn Thain
2024-03-11  7:06                       ` Michael Schmitz
2024-03-11  8:35                         ` Finn Thain
2024-03-12  0:51                           ` Michael Schmitz
2024-03-12  7:59                             ` Geert Uytterhoeven
2024-03-12 20:14                               ` Michael Schmitz
2024-03-13  0:16                               ` Michael Schmitz
2024-03-13  4:39                                 ` Preemption (was: Re: spinlock recursion when running q800 emulation in qemu) Michael Schmitz
2024-03-13  4:40                                 ` spinlock recursion when running q800 emulation in qemu Finn Thain
2024-03-13  5:34                                   ` Michael Schmitz
2024-03-14  0:59                                   ` Michael Schmitz
2024-03-15  4:32                                     ` Michael Schmitz
2024-03-15  7:24                                       ` Finn Thain
2024-03-18  6:24                                         ` Michael Schmitz
2024-03-18  9:31                                           ` Finn Thain
2024-03-20  1:00                                             ` Michael Schmitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bd6eb981-0dfe-a9c9-7777-0cfa0703efc0@gmail.com \
    --to=schmitzmic@gmail.com \
    --cc=fthain@linux-m68k.org \
    --cc=geert@linux-m68k.org \
    --cc=linux-m68k@lists.linux-m68k.org \
    --cc=linux@roeck-us.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox