* Question about perf sibling_list race problem
@ 2023-11-29 9:33 Zhengyuan Liu
2023-11-29 10:32 ` Peter Zijlstra
0 siblings, 1 reply; 3+ messages in thread
From: Zhengyuan Liu @ 2023-11-29 9:33 UTC (permalink / raw)
To: peterz, mingo
Cc: linux-perf-users, stable, 胡海, 刘云,
huangjinhui, Zhengyuan Liu, acme
Hi, all
We are encountering a perf related soft lockup as shown below:
[25023823.265138] watchdog: BUG: soft lockup - CPU#29 stuck for 45s!
[YD:3284696]
[25023823.275772] net_failover virtio_scsi failover
[25023823.276750] CPU: 29 PID: 3284696 Comm: YD Kdump: loaded Not
tainted 4.19.90-23.18.v2101.ky10.aarch64 #1
[25023823.278257] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[25023823.279475] pstate: 80400005 (Nzcv daif +PAN -UAO)
[25023823.280516] pc : perf_iterate_sb+0x1b8/0x1f0
[25023823.281530] lr : perf_iterate_sb+0x18c/0x1f0
[25023823.282529] sp : ffff801f282efbf0
[25023823.283446] x29: ffff801f282efbf0 x28: ffff801f207a8b80
[25023823.284551] x27: 0000000000000000 x26: ffff801f99b355e8
[25023823.285674] x25: 0000000000000000 x24: ffff8019e2fbd800
[25023823.286770] x23: ffff0000093f0018 x22: ffff801f282efc40
[25023823.287864] x21: ffff000008255f60 x20: ffff801ffdf58e80
[25023823.288964] x19: ffff8019f1c27800 x18: 0000000000000000
[25023823.290060] x17: 0000000000000000 x16: 0000000000000000
[25023823.291164] x15: 0400000000000000 x14: 0000000000000000
[25023823.292266] x13: ffff000008c6e340 x12: 0000000000000002
[25023823.293381] x11: ffff000008c6e318 x10: 00000019e5feff20
[25023823.294486] x9 : ffff8019fb49c000 x8 : 0058e6fd335b260e
[25023823.295597] x7 : 0000000100321ed8 x6 : ffff00003d083780
[25023823.296715] x5 : 00ffffffffffffff x4 : 0000801ff4ae0000
[25023823.297860] x3 : ffff801ffdf64cc0 x2 : ffff000009858758
[25023823.298977] x1 : 0000000000000000 x0 : ffff8019e2fbd800
[25023823.300090] Call trace:
[25023823.300962] perf_iterate_sb+0x1b8/0x1f0
[25023823.301961] perf_event_task+0x78/0x80
[25023823.302946] perf_event_exit_task+0xa4/0xb0
[25023823.303978] do_exit+0x38c/0x5d0
[25023823.304932] do_group_exit+0x3c/0xd8
[25023823.305904] get_signal+0x12c/0x740
[25023823.306859] do_signal+0x158/0x260
[25023823.307795] do_notify_resume+0xd8/0x358
[25023823.308781] work_pending+0x8/0x10
We captured a vmcore by enabling panic_on_soft_lockup. From the vmcore we
found that the perf_event reached through
perf_iterate_sb -> perf_iterate_sb_cpu -> event_filter_match ->
pmu_filter_match -> for_each_sibling_event
had already been removed:
#define for_each_sibling_event(sibling, event)			\
	if ((event)->group_leader == (event))			\
		list_for_each_entry((sibling), &(event)->sibling_list, sibling_list)

#define list_for_each_entry(pos, head, member)				\
	for (pos = __container_of((head)->next, pos, member);		\
	     &pos->member != (head);					\
	     pos = __container_of(pos->member.next, pos, member))
crash> struct perf_event ffff8019e2fbd800
struct perf_event {
event_entry = {
next = 0xffff8019f1c27800,
prev = 0xdead000000000200
},
...
state = PERF_EVENT_STATE_DEAD,
...
}
By the way, we also found another process that was deleting the sibling_list at the time:
crash> bt 3284533
PID: 3284533 TASK: ffff801f901ae880 CPU: 16 COMMAND: "YD"
#0 [ffff801f8cd977f0] __switch_to at ffff000008088ba4
#1 [ffff801f8cd97810] __schedule at ffff000008bf10c4
#2 [ffff801f8cd97890] schedule at ffff000008bf17b0
#3 [ffff801f8cd978a0] schedule_timeout at ffff000008bf5b10
#4 [ffff801f8cd97960] wait_for_common at ffff000008bf2530
#5 [ffff801f8cd979f0] wait_for_completion at ffff000008bf2644
#6 [ffff801f8cd97a10] __wait_rcu_gp at ffff000008171c00
#7 [ffff801f8cd97a80] synchronize_sched at ffff000008179da8
#8 [ffff801f8cd97ad0] perf_trace_event_unreg at ffff000008216d50
#9 [ffff801f8cd97b00] perf_trace_destroy at ffff000008217148
#10 [ffff801f8cd97b20] tp_perf_event_destroy at ffff000008256ae0
#11 [ffff801f8cd97b30] _free_event at ffff00000825f21c
#12 [ffff801f8cd97b70] put_event at ffff00000825faf0
#13 [ffff801f8cd97b80] perf_event_release_kernel at ffff00000825fcb8
#14 [ffff801f8cd97be0] perf_release at ffff00000825fdbc
#15 [ffff801f8cd97bf0] __fput at ffff00000832f0b8
#16 [ffff801f8cd97c30] ____fput at ffff00000832f28c
#17 [ffff801f8cd97c50] task_work_run at ffff00000810f8c8
#18 [ffff801f8cd97c90] do_exit at ffff0000080ef458
#19 [ffff801f8cd97cf0] do_group_exit at ffff0000080ef738
#20 [ffff801f8cd97d20] get_signal at ffff0000080fdde0
#21 [ffff801f8cd97d90] do_signal at ffff00000808e488
#22 [ffff801f8cd97e80] do_notify_resume at ffff00000808e7f4
#23 [ffff801f8cd97ff0] work_pending at ffff000008083f60
So it is reasonable to suspect that perf_iterate_sb was traversing the
sibling_list while another process was deleting entries from it, which
sent for_each_sibling_event into an endless loop and thus triggered the
soft lockup.
The race scenario could thus be this:

CPU 29:                          CPU 16:
                                 perf_event_release_kernel
                                   --> mutex_lock(&ctx->mutex)
                                   --> perf_remove_from_context
                                     --> perf_group_detach(event)
for_each_sibling_event() -->
                                         list_del_init(&event->sibling_list)
As commit f3c0eba287049 ("perf: Add a few assertions") said:
"Notable for_each_sibling_event() relies on exclusion from
modification. This would normally be holding either ctx->lock or
ctx->mutex, however due to how things are constructed disabling IRQs
is a valid and sufficient substitute for ctx->lock."
We therefore think it is necessary to hold ctx->mutex (or an equivalent)
around the traversal, but current LTS branches such as 4.19, 5.4, 5.10,
and 6.1 all reach it without any of these:
perf_event_task
--> perf_iterate_sb
--> perf_iterate_sb_cpu
--> event_filter_match
--> pmu_filter_match
--> for_each_sibling_event
Commit bd27568117664 ("perf: Rewrite core context handling") removed the
pmu_filter_match operation, so backporting that change may serve as a
temporary workaround for this issue.
Still, we would like to confirm whether there really is a race on
sibling_list and, if so, how to fix the current LTS branches.
Thanks in advance.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Question about perf sibling_list race problem
2023-11-29 9:33 Question about perf sibling_list race problem Zhengyuan Liu
@ 2023-11-29 10:32 ` Peter Zijlstra
2023-12-05 11:57 ` Zhengyuan Liu
0 siblings, 1 reply; 3+ messages in thread
From: Peter Zijlstra @ 2023-11-29 10:32 UTC (permalink / raw)
To: Zhengyuan Liu
Cc: mingo, linux-perf-users, stable, 胡海,
刘云, huangjinhui, Zhengyuan Liu, acme
On Wed, Nov 29, 2023 at 05:33:33PM +0800, Zhengyuan Liu wrote:
> Hi, all
>
> We are encountering a perf related soft lockup as shown below:
>
> [25023823.265138] watchdog: BUG: soft lockup - CPU#29 stuck for 45s!
> [YD:3284696]
> [25023823.275772] net_failover virtio_scsi failover
> [25023823.276750] CPU: 29 PID: 3284696 Comm: YD Kdump: loaded Not
> tainted 4.19.90-23.18.v2101.ky10.aarch64 #1
^^^^^^^^^^^^^^^^^^^
That is some unholy ancient kernel. Please see if you can reproduce on
something recent.
* Re: Question about perf sibling_list race problem
2023-11-29 10:32 ` Peter Zijlstra
@ 2023-12-05 11:57 ` Zhengyuan Liu
0 siblings, 0 replies; 3+ messages in thread
From: Zhengyuan Liu @ 2023-12-05 11:57 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, linux-perf-users, stable, 胡海,
刘云, huangjinhui, Zhengyuan Liu, acme
On Wed, Nov 29, 2023 at 6:32 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Nov 29, 2023 at 05:33:33PM +0800, Zhengyuan Liu wrote:
> > Hi, all
> >
> > We are encountering a perf related soft lockup as shown below:
> >
> > [25023823.265138] watchdog: BUG: soft lockup - CPU#29 stuck for 45s!
> > [YD:3284696]
> > [25023823.275772] net_failover virtio_scsi failover
> > [25023823.276750] CPU: 29 PID: 3284696 Comm: YD Kdump: loaded Not
> > tainted 4.19.90-23.18.v2101.ky10.aarch64 #1
> ^^^^^^^^^^^^^^^^^^^
>
> That is some unholy ancient kernel. Please see if you can reproduce on
> something recent.
Sorry for the late reply; my company mail server was having some trouble.
I don't have a reproducer: it's a production server, and the lockup
happens only once every few months.
From our analysis, recent kernels shouldn't have this problem after commit
bd27568117664 ("perf: Rewrite core context handling"). But LTS branches
such as v4.19 and v5.4 will remain in use for a long time, so I think the
problem is worth fixing there.
Thanks,