public inbox for dmaengine@vger.kernel.org
* [ptdma] pt_core_execute_cmd() from interrupt context results in panic
From: Eric Pilmore @ 2022-12-28  9:41 UTC (permalink / raw)
  To: Mehta, Sanju, Vinod, dmaengine; +Cc: Eric Pilmore

I am wondering whether this is a known issue in the ptdma DMA driver.
I did not see anything obvious in bugzilla.

I am testing the ntb_netdev module in conjunction with the ptdma
module, which provides the supporting DMA engines, on an AMD Rome
CPU-based platform. The ptdma driver being used is the latest code in
the Linux (6.2) repository.

There are no issues in doing simple ping operations across the
ntb_netdev (TCP/IP) interface, including sending large packets which
we know will cause the respective DMA engines to be utilized. However,
while doing iperf testing across the ntb_netdev interface, we have
encountered a panic:

[ 1626.776583] RIP: 0010:mutex_spin_on_owner+0x3b/0xa0
....
[ 1626.776588] Call Trace:
[ 1626.776588]  <IRQ>
[ 1626.776589]  __mutex_lock.isra.7+0xad/0x4c0
[ 1626.776589]  ? ntb_transport_rx_enqueue+0x127/0x200 [ntb_transport]
[ 1626.776589]  __mutex_lock_slowpath+0x13/0x20
[ 1626.776590]  ? __mutex_lock_slowpath+0x13/0x20
[ 1626.776590]  mutex_lock+0x2f/0x40
[ 1626.776590]  pt_core_perform_passthru+0xc5/0x160 [ptdma]
[ 1626.776591]  pt_cmd_callback.part.7+0x262/0x2d0 [ptdma]
[ 1626.776591]  pt_cmd_callback+0x13/0x20 [ptdma]
[ 1626.776591]  pt_check_status_trans+0xc3/0x120 [ptdma]
[ 1626.776592]  pt_core_irq_handler+0x36/0x60 [ptdma]
[ 1626.776592]  __handle_irq_event_percpu+0x44/0x1a0
[ 1626.776592]  handle_irq_event_percpu+0x32/0x80
[ 1626.776593]  handle_irq_event+0x3b/0x60
[ 1626.776593]  handle_edge_irq+0x83/0x1a0
[ 1626.776593]  handle_irq+0x20/0x30
[ 1626.776593]  do_IRQ+0x50/0xe0
[ 1626.776594]  common_interrupt+0xf/0xf

The issue is that the ptdma handlers are called in interrupt context,
and the flow ultimately leads to pt_core_execute_cmd(), which attempts
to take a mutex. A mutex can sleep, so taking one in interrupt context
is not valid. I have temporarily changed the lock in question to a
spinlock, which seems to have resolved the issue. However, I don't
know enough about the ptdma driver to know whether this is the desired
repair.
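
To illustrate the idea behind the temporary change, here is a minimal
userspace sketch in C11. It is not the ptdma code: every name in it
(model_cmd_queue, model_execute_cmd(), MODEL_QUEUE_DEPTH, and so on) is
hypothetical, and the busy-wait lock built on atomic_flag only stands in
for a kernel spinlock. The point it shows is that the submit path
protects the queue with a lock that spins rather than sleeps, so the
same function can be called from a context that must not block. In the
kernel, the analogous primitives would presumably be a spinlock_t taken
with spin_lock_irqsave() and released with spin_unlock_irqrestore(),
since the lock would be shared between process and interrupt context.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QUEUE_DEPTH 4 /* hypothetical ring size, for illustration */

/* Hypothetical stand-in for a hardware command queue. */
struct model_cmd_queue {
    atomic_flag lock;                      /* busy-wait lock, never sleeps */
    unsigned int slots[MODEL_QUEUE_DEPTH]; /* queued descriptors */
    size_t head;                           /* next slot to fill */
    size_t count;                          /* descriptors currently queued */
};

static void model_queue_init(struct model_cmd_queue *q)
{
    assert(q != NULL);
    atomic_flag_clear(&q->lock);
    q->head = 0;
    q->count = 0;
}

/* Spin until the flag is acquired; the caller is never put to sleep. */
static void model_lock(struct model_cmd_queue *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ; /* busy-wait */
}

static void model_unlock(struct model_cmd_queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/*
 * Queue one descriptor. Returns true if it was queued, false if the
 * ring is full. The critical section is short and does not block.
 */
static bool model_execute_cmd(struct model_cmd_queue *q, unsigned int desc)
{
    bool queued = false;

    assert(q != NULL);
    model_lock(q);
    if (q->count < MODEL_QUEUE_DEPTH) {
        q->slots[q->head] = desc;
        q->head = (q->head + 1) % MODEL_QUEUE_DEPTH;
        q->count++;
        queued = true;
    }
    model_unlock(q);
    return queued;
}
```

A caller would run model_queue_init() once and then call
model_execute_cmd() from any path, including a completion callback.
The trade-off is that a spinning lock burns CPU while it waits, so this
only makes sense if the protected section stays short.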

Hoping that others with more knowledge in this driver might be able to
comment as to the validity of this bug and whether a spinlock is the
correct approach here. If it is, I would be happy to submit a patch,
otherwise I can just file a bugzilla for the module owner to make a
more appropriate fix.

Thanks for any advice.

Eric Pilmore

Thread overview: 3+ messages
2022-12-28  9:41 [ptdma] pt_core_execute_cmd() from interrupt context results in panic Eric Pilmore
2023-01-17 18:00 ` Eric Pilmore
2023-01-18  6:50   ` Vinod Koul