public inbox for linux-kernel@vger.kernel.org
* [PATCH v5 0/3] spi: tegra210-quad: Improve timeout handling under high system load
@ 2025-10-28 15:57 Vishwaroop A
From: Vishwaroop A @ 2025-10-28 15:57 UTC (permalink / raw)
  To: Mark Brown, Thierry Reding, Jonathan Hunter, Sowjanya Komatineni,
	Laxman Dewangan, smangipudi, kyarlagadda
  Cc: Vishwaroop A, linux-spi, linux-tegra, linux-kernel

Hi,

This patch series addresses timeout handling issues in the Tegra QSPI driver
that occur under high system load conditions. We've observed that when CPUs
are saturated (due to error injection, RAS firmware activity, or general CPU
contention), QSPI interrupt handlers can be delayed, causing spurious transfer
failures even though the hardware completed the operation successfully.

Patch 1 fixes a stale pointer issue by ensuring curr_xfer is cleared on timeout
and checked when the IRQ thread finally runs. It also ensures interrupts are
properly cleared on failure paths.
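
The stale-pointer guard described for patch 1 can be illustrated with a small
userspace model. This is only a sketch and not the driver code: the sim_*
structures, functions, and the SIM_ETIMEDOUT value are hypothetical stand-ins,
and the locking a real driver would need around curr_xfer is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for driver state; not the real structures. */
#define SIM_ETIMEDOUT 110

struct sim_xfer {
	int status;			/* 0 = ok, negative = failed */
};

struct sim_qspi {
	struct sim_xfer *curr_xfer;	/* transfer the IRQ thread should complete */
	unsigned int irq_status;	/* simulated pending interrupt bits */
	int completions;		/* transfers completed by the IRQ thread */
};

/* Timeout path: fail the transfer, clear interrupts, drop the pointer. */
static void sim_handle_timeout(struct sim_qspi *q)
{
	if (q->curr_xfer)
		q->curr_xfer->status = -SIM_ETIMEDOUT;
	q->irq_status = 0;
	q->curr_xfer = NULL;
}

/*
 * Delayed IRQ thread: if the timeout path already cleared curr_xfer, there is
 * nothing left to complete, so only acknowledge the interrupt and return.
 * Returns true only when a live transfer was completed.
 */
static bool sim_irq_thread(struct sim_qspi *q)
{
	q->irq_status = 0;
	if (!q->curr_xfer)
		return false;

	q->curr_xfer->status = 0;
	q->completions++;
	q->curr_xfer = NULL;
	return true;
}
```

In this model, an IRQ thread that runs after sim_handle_timeout() sees a NULL
curr_xfer and leaves the already-failed transfer alone rather than touching
stale state.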

Patch 2 refactors the timeout cleanup code into dedicated helper functions
(tegra_qspi_reset, tegra_qspi_dma_stop, tegra_qspi_pio_stop) to improve code
readability and maintainability. This is purely a code reorganization with no
functional changes.

Patch 3 adds hardware status checking on timeout. Before failing a transfer,
the driver now reads QSPI_TRANS_STATUS to check whether the hardware actually
completed the operation. If it did, the driver invokes the completion handler
directly instead of failing the transfer. This distinguishes genuine hardware
timeouts from delayed or lost interrupts.
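
The decision made on timeout in patch 3 can be sketched as a userspace model.
This is not the driver code: the sim_* names are hypothetical, the "ready" bit
position is an assumption chosen for illustration, and the real driver reads
QSPI_TRANS_STATUS through its register accessors.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position for a "transfer done" flag; illustration only. */
#define SIM_TRANS_STATUS_RDY	(1u << 30)

enum sim_result { SIM_XFER_OK = 0, SIM_XFER_TIMEOUT = -1 };

struct sim_hw {
	uint32_t trans_status;	/* simulated transfer status register */
	int completions;	/* times the completion handler ran */
	int resets;		/* times the controller was reset */
};

/* Stand-in for the completion handler: count it and clear the flag. */
static void sim_complete(struct sim_hw *hw)
{
	hw->completions++;
	hw->trans_status &= ~SIM_TRANS_STATUS_RDY;
}

/*
 * Called once the wait for the interrupt has ended. irq_arrived is false when
 * the wait timed out. On timeout, consult the hardware status before failing.
 */
static enum sim_result sim_wait_done(struct sim_hw *hw, bool irq_arrived)
{
	if (irq_arrived) {
		sim_complete(hw);
		return SIM_XFER_OK;
	}

	/* Hardware finished, only the interrupt was delayed or lost. */
	if (hw->trans_status & SIM_TRANS_STATUS_RDY) {
		sim_complete(hw);
		return SIM_XFER_OK;
	}

	/* Genuine hardware timeout: reset and report the failure. */
	hw->resets++;
	return SIM_XFER_TIMEOUT;
}
```

With the flag set, a timed-out wait still ends in a completed transfer and no
reset; with the flag clear, the model resets the controller and reports the
timeout.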

These changes have been tested in production environments under various
high-load scenarios, including RAS testing and CPU saturation workloads.

Changes in v5:
- No code changes, rebased to resolve conflicts

Changes in v4:
- Removed Change-Id from commit messages

Changes in v3:
- Added missing tqspi->curr_xfer = NULL assignment in handle_cpu_based_xfer()
- Split the previous patch 2/2 into two separate patches (now 2/3 and 3/3)
- Patch 2/3: New patch - refactoring only, no functional changes
- Patch 3/3: Functional changes to add hardware timeout checking

Changes in v2:
- Fixed indentation in patch 1/2: The "Reset controller if timeout happens"
  block now has correct indentation (inside the WARN_ON_ONCE block)
- No functional changes

Thierry Reding (1):
  spi: tegra210-quad: Fix timeout handling

Vishwaroop A (2):
  spi: tegra210-quad: Refactor error handling into helper functions
  spi: tegra210-quad: Check hardware status on timeout

 drivers/spi/spi-tegra210-quad.c | 174 +++++++++++++++++++++++---------
 1 file changed, 128 insertions(+), 46 deletions(-)

-- 
2.17.1



Thread overview: 9+ messages
2025-10-28 15:57 [PATCH v5 0/3] spi: tegra210-quad: Improve timeout handling under high system load Vishwaroop A
2025-10-28 15:57 ` [PATCH v5 1/3] spi: tegra210-quad: Fix timeout handling Vishwaroop A
2025-11-12 14:39   ` Breno Leitao
2025-10-28 15:57 ` [PATCH v5 2/3] spi: tegra210-quad: Refactor error handling into helper functions Vishwaroop A
2025-11-03 14:15   ` Thierry Reding
2025-10-28 15:57 ` [PATCH v5 3/3] spi: tegra210-quad: Check hardware status on timeout Vishwaroop A
2025-11-03 14:16   ` Thierry Reding
2025-11-06 10:06 ` [PATCH v5 0/3] spi: tegra210-quad: Improve timeout handling under high system load Jon Hunter
2025-11-06 11:34 ` Mark Brown
