linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [REGRESSION] bisected: perf: hang when using async-profiler caused by perf: Fix the POLL_HUP delivery breakage
@ 2025-10-11  8:31 Octavia Togami
  2025-10-13  2:34 ` Mi, Dapeng
  0 siblings, 1 reply; 5+ messages in thread
From: Octavia Togami @ 2025-10-11  8:31 UTC (permalink / raw)
  To: stable; +Cc: regressions, peterz, linux-kernel, linux-perf-users

Using async-profiler
(https://github.com/async-profiler/async-profiler/) on Linux
6.17.1-arch1-1 causes a complete hang of the CPU. This has been
reported by many people at https://github.com/lucko/spark/issues/530.
spark is a piece of software that uses async-profiler internally.

As seen in https://github.com/lucko/spark/issues/530#issuecomment-3339974827,
this was bisected to 18dbcbfabfffc4a5d3ea10290c5ad27f22b0d240 perf:
Fix the POLL_HUP delivery breakage. Reverting this commit on 6.17.1
fixed the issue for me.

Steps to reproduce:
1. Get a copy of async-profiler. I tested both v3 (affects older spark
versions) and v4.1 (latest at time of writing). Unarchive it, this is
<async-profiler-dir>.
2. Set kernel parameters kernel.perf_event_paranoid=1 and
kernel.kptr_restrict=0 as instructed by
https://github.com/async-profiler/async-profiler/blob/fb673227c7fb311f872ce9566769b006b357ecbe/docs/GettingStarted.md
3. Install a version of Java that comes with jshell, i.e. Java 9 or
newer. Note: jshell is used for ease of reproduction. Any Java
application that is actively running will work.
4. Run `printf 'int acc; while (true) { acc++; }' | jshell -`. This
will start an infinitely running Java process.
5. Run `jps` and take the PID next to the text RemoteExecutionControl
-- this is the process that was just started.
6. Attach async-profiler to this process by running
`<async-profiler-dir>/bin/asprof -d 1 <PID>`. This will run for one
second, then the system should freeze entirely shortly thereafter.

I triggered a sysrq crash while the system was frozen, and the output
I found in journalctl afterwards is at
https://gist.github.com/octylFractal/76611ee76060051e5efc0c898dd0949e
I'm not sure if that text is actually from the triggered crash, but it
seems relevant. If needed, please tell me how to get the actual crash
report, I'm not sure where it is.

I'm using an AMD Ryzen 9 5900X 12-Core Processor. Given that I've seen
no Intel reports, it may be AMD specific. I don't have an Intel CPU on
hand to test with.

/proc/version: Linux version 6.17.1-arch1-1 (linux@archlinux) (gcc
(GCC) 15.2.1 20250813, GNU ld (GNU Binutils) 2.45.0) #1 SMP
PREEMPT_DYNAMIC Mon, 06 Oct 2025 18:48:29 +0000
Operating System: Arch Linux
uname -mi: x86_64 unknown

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-10-13  8:38 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-11  8:31 [REGRESSION] bisected: perf: hang when using async-profiler caused by perf: Fix the POLL_HUP delivery breakage Octavia Togami
2025-10-13  2:34 ` Mi, Dapeng
2025-10-13  6:55   ` Octavia Togami
2025-10-13  8:05   ` Peter Zijlstra
2025-10-13  8:38     ` Mi, Dapeng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).