public inbox for linux-kernel@vger.kernel.org
From: srinivas pandruvada <srinivas.pandruvada@linux.intel.com>
To: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>,
	LKML <linux-kernel@vger.kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Thorsten Leemhuis <regressions@leemhuis.info>
Subject: Re: INFO: rcu_preempt detected expedited stalls on CPUs/tasks (6.1.0-rc3): in cat /sys/kernel/debug/kmemleak
Date: Fri, 04 Nov 2022 16:00:34 -0700	[thread overview]
Message-ID: <5372edc0b0a45de9286a2d4948e9858dcb9741fc.camel@linux.intel.com> (raw)
In-Reply-To: <e048a5c5-dea5-a31a-1022-1b007e022adb@alu.unizg.hr>


On Fri, 2022-11-04 at 23:01 +0100, Mirsad Goran Todorovac wrote:
> Dear all,
> 
> While investigating the thermald kmemleak, I noticed that the
> "cat /sys/kernel/debug/kmemleak" and "tail -20 /sys/kernel/debug/kmemleak"
> commands take an unusually long time.
> 
This memleak occurred because of a bad element in the BIOS's _OSC
method. Unfortunately, fixing this in the ACPI interpreter could have
other side effects, since such input is not expected there. So the
ACPICA team (https://acpica.org/community) is looking into it further.

Thanks,
Srinivas

> Dmesg output showed expedited stalls: the commands caused NMIs and
> NMI backtraces:
> 
> [ 8123.263464] rcu: INFO: rcu_preempt detected expedited stalls on 
> CPUs/tasks: { 0-.... } 26 jiffies s: 3829 root: 0x1/.
> [ 8123.263500] rcu: blocking rcu_node structures (internal RCU
> debug):
> [ 8123.263508] Sending NMI from CPU 7 to CPUs 0:
> [ 8123.263528] NMI backtrace for cpu 0
> [ 8123.263539] CPU: 0 PID: 27898 Comm: cat Not tainted 6.1.0-rc3 #1
> [ 8123.263552] Hardware name: LENOVO 82H8/LNVNB161216, BIOS GGCN34WW 
> 03/08/2022
> [ 8123.263557] RIP: 0010:kmemleak_seq_start+0x41/0x80
> [ 8123.263579] Code: 55 04 a6 00 4c 63 e0 85 c0 78 40 e8 b9 80 db ff 48 8b 05 92 fb 88 01 4c 8d 60 f8 48 3d 30 62 63 92 75 17 eb 32 49 8b 44 24 08 <48> 83 eb 01 4c 8d 60 f8 48 3d 30 62 63 92 74 1d 48 85 db 7f e6 4c
> [ 8123.263588] RSP: 0018:ffff9968e400fc30 EFLAGS: 00000206
> [ 8123.263598] RAX: ffff8963b7005b58 RBX: 0000000000011cb8 RCX: 
> 0000000000000001
> [ 8123.263604] RDX: ffff8963b09c4000 RSI: ffff8963856de028 RDI: 
> ffffffff926361c0
> [ 8123.263608] RBP: ffff9968e400fc40 R08: 0000000000020000 R09: 
> ffff896380592a80
> [ 8123.263613] R10: 0000000000020000 R11: 0000000000000000 R12: 
> ffff8964114c2390
> [ 8123.263617] R13: ffff89639fa25b00 R14: ffff8963856de000 R15: 
> ffff9968e400fe30
> [ 8123.263622] FS:  00007ff217c15740(0000) GS:ffff896528800000(0000) 
> knlGS:0000000000000000
> [ 8123.263630] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8123.263636] CR2: 00007f19bd6a8000 CR3: 000000028e392001 CR4: 
> 0000000000770ef0
> [ 8123.263642] PKRU: 55555554
> [ 8123.263646] Call Trace:
> [ 8123.263649]  <TASK>
> [ 8123.263656]  seq_read_iter+0x169/0x420
> [ 8123.263671]  seq_read+0xad/0xe0
> [ 8123.263685]  full_proxy_read+0x59/0x90
> [ 8123.263701]  vfs_read+0xb2/0x2e0
> [ 8123.263718]  ksys_read+0x61/0xe0
> [ 8123.263730]  __x64_sys_read+0x1a/0x20
> [ 8123.263741]  do_syscall_64+0x58/0x80
> [ 8123.263754]  ? do_syscall_64+0x67/0x80
> [ 8123.263767]  ? exit_to_user_mode_prepare+0x15d/0x190
> [ 8123.263781]  ? syscall_exit_to_user_mode+0x1b/0x30
> [ 8123.263791]  ? do_syscall_64+0x67/0x80
> [ 8123.263804]  ? syscall_exit_to_user_mode+0x1b/0x30
> [ 8123.263813]  ? do_syscall_64+0x67/0x80
> [ 8123.263823]  ? do_syscall_64+0x67/0x80
> [ 8123.263833]  ? do_syscall_64+0x67/0x80
> [ 8123.263844]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [ 8123.263857] RIP: 0033:0x7ff217914992
> [ 8123.263867] Code: c0 e9 b2 fe ff ff 50 48 8d 3d fa b2 0c 00 e8 c5 1d 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
> [ 8123.263875] RSP: 002b:00007ffdfcbc3e28 EFLAGS: 00000246 ORIG_RAX: 
> 0000000000000000
> [ 8123.263894] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 
> 00007ff217914992
> [ 8123.263899] RDX: 0000000000020000 RSI: 00007ff217b9d000 RDI: 
> 0000000000000003
> [ 8123.263904] RBP: 00007ff217b9d000 R08: 00007ff217b9c010 R09: 
> 00007ff217b9c010
> [ 8123.263909] R10: 0000000000000022 R11: 0000000000000246 R12: 
> 0000000000022000
> [ 8123.263914] R13: 0000000000000003 R14: 0000000000020000 R15: 
> 0000000000020000
> [ 8123.263927]  </TASK>
> 
> To reproduce:
> 
> Enable CONFIG_DEBUG_KMEMLEAK=y in linux_stable 6.1.0-rc3 build.
> 
> Then run a stress test on the service known from the previous report:
> 
> for a in {1..1000}; do
>           echo $a
>           systemctl stop thermald
>           sleep 0.5
>           systemctl start thermald
>           sleep 0.5
> done
> 
> After that, /sys/kernel/debug/kmemleak indicated 1413 unreferenced
> objects.
> 
> However, dmesg showed the stalls on CPUs while executing "cat" or
> "tail -40" on /sys/kernel/debug/kmemleak.
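The kmemleak workflow quoted above can be condensed into a small shell helper. This is a minimal sketch, assuming CONFIG_DEBUG_KMEMLEAK=y, a mounted debugfs, and root; the scan command and report format are per Documentation/dev-tools/kmemleak.rst:

```shell
# Count suspected leaks in a kmemleak report: each suspect starts
# with an "unreferenced object 0x..." header line.
count_leaks() {
    grep -c '^unreferenced object' "$1"
}

# On a live system (as root):
#   echo scan > /sys/kernel/debug/kmemleak   # force an immediate scan
#   count_leaks /sys/kernel/debug/kmemleak   # e.g. 1413 in this report
```

Note that reading the full report is what triggers the slow path the reporter hit, so counting header lines like this still walks the whole object list.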
> 
> I've read https://www.kernel.org/doc/Documentation/RCU/stallwarn.txt,
> but I was unable to work out what is going on, beyond locating the
> stall at "RIP: 0010:kmemleak_seq_start+0x41/0x80" and identifying the
> probable maintainer/developer.
> 
> Please find attached the kernel build config, the output of
> "cat /sys/kernel/debug/kmemleak", and the dmesg output of the expedited
> stalls. The number of jiffies is not large, but IMHO these stalls may
> indicate deeper problems or deadlocks in the RCU code in kmemleak.
> 
> NOTE:
> Please add:
> 
> Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
> 
> if (or when) the bug is fixed.
> 
> Thank you.
> 
> -mt
> 
> --
> Mirsad Goran Todorovac
> System Engineer
> Grafički fakultet | Akademija likovnih umjetnosti
> Sveučilište u Zagrebu



Thread overview: 6+ messages
2022-11-04 22:01 INFO: rcu_preempt detected expedited stalls on CPUs/tasks (6.1.0-rc3): in cat /sys/kernel/debug/kmemleak Mirsad Goran Todorovac
2022-11-04 23:00 ` srinivas pandruvada [this message]
     [not found]   ` <32da7e3f-8608-29a9-3787-1d4c3315d235@alu.unizg.hr>
2022-11-07  9:53     ` Mirsad Goran Todorovac
2022-11-08 14:21     ` INFO: BISECTED: " Mirsad Goran Todorovac
2022-11-09 10:48 ` INFO: rcu_preempt detected expedited stalls on CPUs/tasks (6.1.0-rc3): in cat /sys/kernel/debug/kmemleak #forregzbot Thorsten Leemhuis
2022-11-29  8:53   ` Thorsten Leemhuis
