Linux Tegra architecture development
 help / color / mirror / Atom feed
From: Vishwaroop A <va@nvidia.com>
To: Thierry Reding <thierry.reding@kernel.org>,
	Jon Hunter <jonathanh@nvidia.com>,
	Mark Brown <broonie@kernel.org>
Cc: Vishwaroop A <va@nvidia.com>,
	Laxman Dewangan <ldewangan@nvidia.com>,
	Sowjanya Komatineni <skomatineni@nvidia.com>,
	Breno Leitao <leitao@debian.org>,
	Suresh Mangipudi <smangipudi@nvidia.com>,
	"Krishna Yarlagadda" <kyarlagadda@nvidia.com>,
	<linux-tegra@vger.kernel.org>, <linux-spi@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: [PATCH v4 0/3] spi: tegra210-quad: Improve interrupt handling for loaded systems
Date: Wed, 10 Jun 2026 06:23:59 +0000	[thread overview]
Message-ID: <cover.1781037385.git.va@nvidia.com> (raw)

The current threaded IRQ implementation in spi-tegra210-quad suffers from
scheduler-induced latency on heavily loaded systems. Because threaded IRQ
handlers are subject to CFS scheduling, they can be delayed long enough to
trigger transfer timeouts even though hardware completes in microseconds.
This results in false timeout errors and WARN_ON splats during normal
operation.

This series addresses the problem in three steps:

1. Convert the threaded IRQ handler to a hard IRQ + high-priority unbound
   workqueue model. The hard IRQ does the minimum: capture FIFO status,
   mask and clear the controller IRQ, then schedule the bottom half. The
   workqueue handler runs in process context (can sleep for DMA
   completion) and can execute on any CPU, avoiding the CPU0 bottleneck
   inherent in threaded IRQs.

2. Cache QSPI_TRANS_STATUS in the ISR before clearing it. This lets the
   timeout handler distinguish between a real hardware timeout (QSPI_RDY
   not set) and a delayed workqueue (QSPI_RDY set), preventing false
   timeout errors when hardware has already completed. Pair the cache
   publication with smp_store_release()/smp_load_acquire() so the timeout
   handler observes a coherent set of cached fields on weakly-ordered
   architectures.

3. Process small PIO transfers (those that complete the whole spi_transfer
   in a single chunk) directly in hard IRQ context, eliminating workqueue
   scheduling latency for TPM-style short reads.

Changes since v3:

  Patch 1 ("Convert to hard IRQ with high-priority workqueue"):
    - Dropped IRQF_SHARED. Tegra QSPI uses a dedicated GIC SPI line on
      every SoC that uses this driver, so the ISR does not need a
      runtime PM reference. Addresses Mark Brown's "Since we now have
      IRQF_SHARED we need to take a runtime PM reference here" comment.
    - Switched from devm_request_irq() to plain request_irq() in
      probe() and added explicit free_irq() in remove(), in the order:
      spi_unregister_controller -> free_irq -> destroy_workqueue ->
      pm_runtime_dont_use_autosuspend -> pm_runtime_force_suspend ->
      tegra_qspi_deinit_dma. Addresses Mark's "devm + non-devm mix
      seems likely to be racy" comments on probe() and remove().
    - Removed the tegra_qspi_unmask_irq() call from the work_handler
      NULL-bail path. Addresses Mark's "unmask after dropping the lock
      feels like it opens up races" comment. The next transfer's setup
      path re-arms the IRQ.
    - Snapshot tqspi->curr_xfer under tqspi->lock at the top of
      handle_dma_based_xfer() so the (potentially long) DMA waits
      operate on a stable transfer pointer even if the timeout path
      clears curr_xfer concurrently. Integrates cleanly with Breno
      Leitao's recently merged protect-curr_xfer series.

  Patch 2 ("Cache TRANS_STATUS in ISR for timeout handler"):
    - Cached status_reg / tx_status / rx_status / trans_status are now
      published with WRITE_ONCE() and smp_store_release() in the ISR
      and consumed with smp_load_acquire() in
      tegra_qspi_handle_timeout(). This keeps KCSAN clean across the
      hard-IRQ / process-context boundary.
    - Reset the cached trans_status to zero in
      tegra_qspi_setup_transfer_one() (under tqspi->lock) before
      unmasking the IRQ for the new transfer, so a stale RDY bit from
      the previous transfer cannot mislead the timeout handler.
    - Dropped the WARN_ON() / WARN_ON_ONCE() wrappers around the
      wait_for_completion_timeout() calls in
      tegra_qspi_{combined,non_combined}_seq_xfer(). With the new
      recovery path in place a timeout is no longer a programming
      error; the existing dev_warn_ratelimited("QSPI interrupt
      timeout, but transfer complete") and dev_err("transfer timeout")
      diagnostics are preserved.

  Patch 3 ("Process small PIO transfers in hard IRQ context"):
    - Replaced the previous "curr_dma_words <= QSPI_FIFO_DEPTH" check
      with a tqspi->is_last_pio_chunk scalar computed in
      tegra_qspi_start_cpu_based_transfer() before it unmasks the IRQ.
      Addresses Mark Brown's "Is cur_dma_words always in the same
      units as QSPI_FIFO_DEPTH - I see there's packed transfer support
      in the driver?" comment: the scalar comparison is units-correct
      (cur_pos + curr_dma_words * bytes_per_word >= t->len) and the
      hard-IRQ fastpath no longer dereferences the spi_transfer
      object, so it stays safe against any teardown race that could
      clear curr_xfer concurrently.
    - Fastpath additionally gates on tx_status == 0 && rx_status == 0
      because handle_cpu_based_xfer()'s error path calls
      tegra_qspi_handle_error() -> device_reset(), which can sleep and
      must not run from hard IRQ context.
    - is_curr_dma_xfer and is_last_pio_chunk are written from process
      context (the transfer-start functions) and read lock-free from
      the hard IRQ handler and the workqueue handler, so the writers
      use WRITE_ONCE() and the readers use READ_ONCE() to prevent
      compiler tearing and silence KCSAN.

Changes since v2:
  - Added cancel_work_sync() in remove to flush pending work before
    devm tore down the workqueue (Jon Hunter). v4 has replaced devm
    altogether per Mark Brown's comment, so the explicit teardown now
    relies on free_irq() preventing new work being queued, followed by
    destroy_workqueue() draining what is in-flight.
  - Rewrote patch 2 commit message to describe the race in terms of
    the workqueue model rather than referencing the old threaded IRQ
    (Jon).
  - s/NULLed/cleared/ in code comment (Jon).

Changes since v1:
  - Switched to devm_alloc_workqueue() and devm_request_irq() for
    resource management (Jon Hunter). v4 has since reverted to
    non-devm for the IRQ and workqueue per Mark Brown's review, so
    teardown order can be made explicit.
  - Improved patch 2 commit message to explain the timeout race
    scenario and clarify that the issue pre-exists the workqueue
    conversion (Jon).
  - Removed unnecessary local variable in tegra_qspi_handle_timeout
    (Jon).
  - Moved "workqueue was delayed" comment updates from patch 2 to
    patch 1, since patch 1 introduces the workqueue (Jon).

Tested on TH500 with TPM and QSPI flash devices under sustained CPU
stress (stress-ng), and on Tegra234 (P3960 / P3970) under similar
load. With the v4 series applied, transfers complete correctly without
spurious timeout errors or WARN_ON splats.

Vishwaroop A (3):
  spi: tegra210-quad: Convert to hard IRQ with high-priority workqueue
  spi: tegra210-quad: Cache TRANS_STATUS in ISR for timeout handler
  spi: tegra210-quad: Process small PIO transfers in hard IRQ context

 drivers/spi/spi-tegra210-quad.c | 280 +++++++++++++++++++++++++-------
 1 file changed, 221 insertions(+), 59 deletions(-)

-- 
2.17.1


             reply	other threads:[~2026-06-10  6:24 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-10  6:23 Vishwaroop A [this message]
2026-06-10  6:24 ` [PATCH v4 1/3] spi: tegra210-quad: Convert to hard IRQ with high-priority workqueue Vishwaroop A
2026-06-10  6:24 ` [PATCH v4 2/3] spi: tegra210-quad: Cache TRANS_STATUS in ISR for timeout handler Vishwaroop A
2026-06-10  6:24 ` [PATCH v4 3/3] spi: tegra210-quad: Process small PIO transfers in hard IRQ context Vishwaroop A

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1781037385.git.va@nvidia.com \
    --to=va@nvidia.com \
    --cc=broonie@kernel.org \
    --cc=jonathanh@nvidia.com \
    --cc=kyarlagadda@nvidia.com \
    --cc=ldewangan@nvidia.com \
    --cc=leitao@debian.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-spi@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=skomatineni@nvidia.com \
    --cc=smangipudi@nvidia.com \
    --cc=thierry.reding@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox