All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vishwaroop A <va@nvidia.com>
To: <leitao@debian.org>
Cc: <thierry.reding@gmail.com>, <treding@nvidia.com>,
	<jonathanh@nvidia.com>, <skomatineni@nvidia.com>,
	<ldewangan@nvidia.com>, <broonie@kernel.org>,
	<linux-spi@vger.kernel.org>, <linux-tegra@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <kernel-team@meta.com>,
	<puranjay@kernel.org>, <usamaarif642@gmail.com>,
	Vishwaroop A <va@nvidia.com>
Subject: Re: [PATCH 0/6] spi: tegra-qspi: Fix race condition causing NULL pointer dereference and spurious IRQ
Date: Wed, 21 Jan 2026 17:56:17 +0000	[thread overview]
Message-ID: <20260121000000.0000000-1-va@nvidia.com> (raw)
In-Reply-To: <aW-seUXIJv4Lz7bK@gmail.com>

Hi Breno,

After reviewing Mark Brown's feedback and the code carefully, let me clarify the 
correct logic. This is important to get right.

**IRQ Handler Semantics (per Mark Brown):**
- IRQ_NONE = interrupt was NOT from this device
- IRQ_HANDLED = interrupt WAS from this device (regardless of whether we fully processed it)

**The QSPI_RDY Bit:**
This bit in QSPI_TRANS_STATUS is set by hardware when a transfer completes and triggers 
the interrupt. Software clears it by writing 1.

**Why Your Original v1 Logic is Correct:**

Your "[PATCH 1/6] spi: tegra210-quad: Return IRQ_HANDLED when timeout already processed 
transfer" reads QSPI_TRANS_STATUS at the start of tegra_qspi_isr_thread():

    if (!tqspi->curr_xfer) {
        if (!(status & QSPI_RDY))
            return IRQ_NONE;        // HW never set RDY → spurious interrupt
        return IRQ_HANDLED;         // HW did set RDY → real interrupt, timeout processed it
    }

**Scenario 1 - Delayed ISR (the race you're fixing):**
1. HW completes transfer, sets QSPI_RDY, interrupt fires
2. ISR thread delayed (CPU busy)
3. Timeout handler runs, processes transfer, clears curr_xfer
4. Delayed ISR finally wakes up
5. Reads QSPI_RDY (may still be set)
6. curr_xfer is NULL
7. Return IRQ_HANDLED → this WAS our interrupt, just processed by timeout

**Scenario 2 - Truly Spurious:**
1. Spurious interrupt fires
2. QSPI_RDY = 0 (no transfer completed)
3. curr_xfer is NULL
4. Return IRQ_NONE → not our interrupt

**Your Latest Proposal Has It Backwards:**

The version in your last email returns IRQ_HANDLED when QSPI_RDY is NOT set, which is 
incorrect per Mark's feedback.

**For v2:**

Patches 1-5: Keep as-is from v1 (all correct)

Patch 6 ("[PATCH 6/6] spi: tegra210-quad: Protect curr_xfer check in IRQ handler"): 
The TODO comment you added asks about keeping the lock held across the handler call. I'd 
suggest removing the TODO and replacing it with a comment explaining why the current 
approach is safe:

    spin_unlock_irqrestore(&tqspi->lock, flags);

    /*
     * Lock is released here but handlers safely re-check curr_xfer under lock
     * before dereferencing. DMA handler also needs to sleep in 
     * wait_for_completion_*(), which cannot be done while holding spinlock.
     */
    if (!tqspi->is_curr_dma_xfer)
        return handle_cpu_based_xfer(tqspi);

This documents the design decision and closes the TODO.

The device_reset_optional() was from your March 2025 series - keep that separate.

**Testing:**

Carol Soto will validate v2 with your test methodology and provide feedback.

**Follow-on:**

I'll implement hard IRQ handler support separately after your fix merges.

Best,
Vishwaroop


  parent reply	other threads:[~2026-01-21 17:56 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-16 10:41 [PATCH 0/6] spi: tegra-qspi: Fix race condition causing NULL pointer dereference and spurious IRQ Breno Leitao
2026-01-16 10:41 ` [PATCH 1/6] spi: tegra210-quad: Return IRQ_HANDLED when timeout already processed transfer Breno Leitao
2026-01-16 11:33   ` Thierry Reding
2026-01-16 11:48   ` Usama Arif
2026-01-16 12:06     ` Thierry Reding
2026-01-16 10:41 ` [PATCH 2/6] spi: tegra210-quad: Move curr_xfer read inside spinlock Breno Leitao
2026-01-16 11:38   ` Thierry Reding
2026-01-16 10:41 ` [PATCH 3/6] spi: tegra210-quad: Protect curr_xfer assignment in tegra_qspi_setup_transfer_one Breno Leitao
2026-01-16 10:41 ` [PATCH 4/6] spi: tegra210-quad: Protect curr_xfer in tegra_qspi_combined_seq_xfer Breno Leitao
2026-01-16 10:41 ` [PATCH 5/6] spi: tegra210-quad: Protect curr_xfer clearing in tegra_qspi_non_combined_seq_xfer Breno Leitao
2026-01-16 10:41 ` [PATCH 6/6] spi: tegra210-quad: Protect curr_xfer check in IRQ handler Breno Leitao
2026-01-20 11:22 ` [PATCH 0/6] spi: tegra-qspi: Fix race condition causing NULL pointer dereference and spurious IRQ Vishwaroop A
2026-01-20 16:49   ` Breno Leitao
2026-01-20 16:52     ` Mark Brown
2026-01-21 17:56     ` Vishwaroop A [this message]
2026-01-22 17:04       ` Breno Leitao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260121000000.0000000-1-va@nvidia.com \
    --to=va@nvidia.com \
    --cc=broonie@kernel.org \
    --cc=jonathanh@nvidia.com \
    --cc=kernel-team@meta.com \
    --cc=ldewangan@nvidia.com \
    --cc=leitao@debian.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-spi@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=puranjay@kernel.org \
    --cc=skomatineni@nvidia.com \
    --cc=thierry.reding@gmail.com \
    --cc=treding@nvidia.com \
    --cc=usamaarif642@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.