public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
* BUG:  ib_mad ftrace event unsupported migration
@ 2022-11-02 10:49 Leonid Ravich
  2022-11-02 11:44 ` Steven Rostedt
  0 siblings, 1 reply; 14+ messages in thread
From: Leonid Ravich @ 2022-11-02 10:49 UTC (permalink / raw)
  To: linux-trace-kernel@vger.kernel.org, mingo@redhat.com,
	rostedt@goodmis.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: Yigal Korman

after enabling ib_mad events hit this warning 

echo 1 > /sys/kernel/debug/tracing/events/ib_mad/enable

WARNING: CPU: 0 PID: 1888000 at kernel/trace/ring_buffer.c:2492 rb_commit+0xc1/0x220
CPU: 0 PID: 1888000 Comm: kworker/u9:0 Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-305.3.1.el8.x86_64 #1
l: Hardware name: Red Hat KVM, BIOS 1.13.0-2.module_el8.3.0+555+a55c8938 04/01/2014
l: Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
l: RIP: 0010:rb_commit+0xc1/0x220
l: Code: 00 00 80 fa 1f 0f 84 ec 00 00 00 48 01 87 b8 00 00 00 48 8b 87 98 00 00 00 48 85 c0 0f 85 72 ff ff ff 48 8b 47 08 f0 ff 40 08 <0f> 0b c3 48 8b 00 48 89 87 b8 00 00 00 e9 48 ff ff ff 48 8b 47 48
l: RSP: 0000:ffffa8ac80f9bca0 EFLAGS: 00010202
l: RAX: ffff8951c7c01300 RBX: ffff8951c7c14a00 RCX: 0000000000000246
l: RDX: ffff8951c707c000 RSI: ffff8951c707c57c RDI: ffff8951c7c14a00
l: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
l: R10: ffff8951c7c01300 R11: 0000000000000001 R12: 0000000000000246
l: R13: 0000000000000000 R14: ffffffff964c70c0 R15: 0000000000000000
l: FS:  0000000000000000(0000) GS:ffff8951fbc00000(0000) knlGS:0000000000000000
l: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
l: CR2: 00007f20e8f39010 CR3: 000000002ca10005 CR4: 0000000000170ef0
l: Call Trace:
l:  ring_buffer_unlock_commit+0x1d/0xa0
l:  trace_buffer_unlock_commit_regs+0x3b/0x1b0
l:  trace_event_buffer_commit+0x67/0x1d0
l:  trace_event_raw_event_ib_mad_recv_done_handler+0x11c/0x160 [ib_core]
l:  ib_mad_recv_done+0x48b/0xc10 [ib_core]
l:  ? trace_event_raw_event_cq_poll+0x6f/0xb0 [ib_core]
l:  __ib_process_cq+0x91/0x1c0 [ib_core]
l:  ib_cq_poll_work+0x26/0x80 [ib_core]
l:  process_one_work+0x1a7/0x360
l:  ? create_worker+0x1a0/0x1a0
l:  worker_thread+0x30/0x390
l:  ? create_worker+0x1a0/0x1a0
l:  kthread+0x116/0x130
l:  ? kthread_flush_work_fn+0x10/0x10
l:  ret_from_fork+0x35/0x40
l: ---[ end trace 78ba8509d3830a16 ]---

warning originated from below code (kernel/trace/ring_buffer.c)

rb_end_commit()
{
...
if (RB_WARN_ON(cpu_buffer,
                !local_read(&cpu_buffer->committing)))
...
}

after investigation  looks like there is a broken assumption of rb_end_commit and rb_start_commit are on same CPU (no migration) during trace 
looking in ib_mad trace  in include/trace/events/ib_mad.h there is a call to create_mad_addr_info during  TP assign which sleeps (at least under mlx5 ib driver)

so the scenario looks :

rb_start_commit - buffer 0xffffa0984777e400  CPU 1 pid 1368
switch from 1368 to 1605
thread 1368 () is migrating from 1 to 0
rb_end_commit - buffer 0xffffa09847c14a00  CPU 0 pid 1368

before starting throwing some patch into the the air  I would like to align with you the approach we should take here. 

my suggestion here : 
- ftrace infra should verify no migration happen  (end and start happens on same CPU)  in case not we will  throw warning for the issue  .
- ftrace users will be responsible to avoid migration during TP assign (RDMA in my case)

please let me know what do you think 

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-11-03 16:33 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-11-02 10:49 BUG: ib_mad ftrace event unsupported migration Leonid Ravich
2022-11-02 11:44 ` Steven Rostedt
2022-11-02 13:02   ` Jason Gunthorpe
2022-11-02 14:02     ` Leonid Ravich
2022-11-02 14:04       ` Jason Gunthorpe
2022-11-02 14:17         ` Steven Rostedt
2022-11-02 14:24           ` Jason Gunthorpe
2022-11-02 15:59             ` Steven Rostedt
2022-11-02 16:01               ` Jason Gunthorpe
2022-11-02 20:01               ` Leonid Ravich
2022-11-02 22:19                 ` Steven Rostedt
2022-11-03 12:22                   ` Leonid Ravich
2022-11-03 16:32                     ` Steven Rostedt
2022-11-02 14:20       ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox