From: James Clark <james.clark@linaro.org>
To: Vladimir Oltean <vladimir.oltean@nxp.com>
Cc: Vladimir Oltean <olteanv@gmail.com>,
	Mark Brown <broonie@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Larisa Grigore <larisa.grigore@nxp.com>,
	Frank Li <Frank.li@nxp.com>, Christoph Hellwig <hch@lst.de>,
	linux-spi@vger.kernel.org, imx@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status
Date: Tue, 1 Jul 2025 11:02:18 +0100
Message-ID: <8eedf34e-b870-4a73-b966-e9745809dff3@linaro.org>
In-Reply-To: <20250630204135.gzffv33j3pk3bgx6@skbuf>



On 30/06/2025 9:41 pm, Vladimir Oltean wrote:
> On Mon, Jun 30, 2025 at 01:54:11PM +0100, James Clark wrote:
>> On 27/06/2025 10:30 pm, Vladimir Oltean wrote:
>>> On Fri, Jun 27, 2025 at 11:21:38AM +0100, James Clark wrote:
>>>> This will allow us to return a status from the interrupt handler in a
>>>> later commit and avoids copying it at the end of
>>>> dspi_transfer_one_message(). For consistency make polling and DMA modes
>>>> use the same mechanism.
>>>>
>>>> Refactor dspi_rxtx() and dspi_poll() to not return -EINPROGRESS because
>>>> this isn't actually a status that was ever returned to the core layer
>>>> but some internal state. Wherever that was used we can look at dspi->len
>>>> instead.
>>>>
>>>> No functional changes intended.
>>>>
>>>> Signed-off-by: James Clark <james.clark@linaro.org>
>>>> ---
>>>
>>> This commit doesn't work, please do not merge this patch.
>>>
>>> You are changing the logic in DMA mode, interrupt-based FIFO and PIO all
>>> in one go, in a commit whose title and primary purpose is unrelated to
>>> that. Just a mention of the type "while at it, also do that". And in
>>> that process, that bundled refactoring introduces a subtle, but severe bug.
>>>
>>> No, that is discouraged. Make one patch per logical change, where only
>>> one thing is happening and which is obviously correct. It helps you and
>>> it helps the reviewer.
>>>
>>> Please find attached a set of 3 patches that represent a broken down and
>>> corrected variant of this one. First 2 should be squashed together in
>>> your next submission, they are just to illustrate the bug that you've
>>> introduced (which can be reproduced on any SoC in XSPI mode).
>>>
>>
>> Thanks for the debugging, yes it looks like the patches could be broken down
>> a bit.
>>
>> Just for clarity, is this bug affecting host+polling mode? I can see the
>> logic bug in dspi_poll() which I must have tested less thoroughly, but I
>> can't actually see any difference in dspi_interrupt().
> 
> It should affect both, I tested your patches unmodified, i.e. interrupt
> based XSPI FIFO mode (in master mode).
> 
> Assume (not real numbers, just for explanation's sake) dspi->len is 2
> (2 FIFO sizes worth of 32-bit words, but let's assume for simplicity
> that each dspi_pop_tx() call simply decrements the len by 1).
> 
> The correct behavior would be this:
> 
> dspi_transfer_one_message()
> -> dspi->len = 2
> -> dspi_fifo_write()
>     -> dspi_xspi_fifo_write()
>        -> dspi_pop_tx()
>           -> dspi->len = 1
> -> wait_for_completion(&dspi->xfer_done)
>                                             <IRQ>
>                                             dspi_interrupt()
>                                             -> dspi_rxtx()
>                                                -> dspi_fifo_read()
>                                                -> dspi_fifo_write()
>                                                   -> dspi_xspi_fifo_write()
>                                                      -> dspi_pop_tx()
>                                                         -> dspi->len = 0
>                                             <IRQ>
>                                             dspi_interrupt()
>                                             -> dspi_rxtx()
>                                                -> dspi_fifo_read()
>                                             -> complete(&dspi->xfer_done)
> -> reinit_completion(&dspi->xfer_done)
> 
> but the behavior with your proposed logic is this:
> 
> dspi_transfer_one_message()
> -> dspi->len = 2
> -> dspi_fifo_write()
>     -> dspi_xspi_fifo_write()
>        -> dspi_pop_tx()
>           -> dspi->len = 1
> -> wait_for_completion(&dspi->xfer_done)
>                                             <IRQ>
>                                             dspi_interrupt()
>                                             -> dspi_rxtx()
>                                                -> dspi_fifo_read()
>                                                -> dspi_fifo_write()
>                                                   -> dspi_xspi_fifo_write()
>                                                      -> dspi_pop_tx()
>                                                         -> dspi->len = 0
>                                             -> complete(&dspi->xfer_done)
> -> reinit_completion(&dspi->xfer_done)
>                                             <IRQ>
>                                             dspi_interrupt()
>                                             -> Second interrupt is spurious at
>                                                this point, since the process
>                                                context may have proceeded
>                                                to change pointers in
>                                                dspi->cur_transfer, etc.
> 
> Clearer now? Essentially the complete() call is premature, it needs to
> be not after the dspi_fifo_write() call, but after its subsequent
> dspi_fifo_read(), which comes after yet another IRQ, in the IRQ-triggered
> path.
> 

Much clearer, thanks. Not sure how I missed that; maybe I confused
whether it was dspi_fifo_read() or dspi_fifo_write() that modifies
dspi->len.
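
For anyone following along, the ordering problem described above can be
reproduced in a small user-space model. This is only a sketch, not driver
code: struct sim and every sim_*() name are made up for illustration, and,
as in the simplified explanation, each FIFO write is assumed to decrement
len by exactly 1. The "correct" handler signals completion only on the IRQ
that finds nothing left to write after its read; the "premature" one
signals as soon as len reaches 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real struct fsl_dspi. */
struct sim {
	int len;       /* words not yet pushed into the TX FIFO */
	int in_flight; /* words pushed whose RX data has not been read yet */
};

/* Models dspi_fifo_write() -> dspi_pop_tx(): consumes one unit of len. */
static void sim_fifo_write(struct sim *s)
{
	s->len--;
	s->in_flight++;
}

/* Models dspi_fifo_read(): drains whatever the previous write produced. */
static void sim_fifo_read(struct sim *s)
{
	s->in_flight = 0;
}

/* Complete only once the read for the final write has happened. */
static bool sim_irq_correct(struct sim *s)
{
	sim_fifo_read(s);
	if (s->len == 0)
		return true;	/* would call complete() here */
	sim_fifo_write(s);
	return false;
}

/* Complete as soon as len hits 0, i.e. right after the last write. */
static bool sim_irq_premature(struct sim *s)
{
	sim_fifo_read(s);
	if (s->len > 0)
		sim_fifo_write(s);
	return s->len == 0;	/* may be true with data still in flight */
}

/*
 * Runs one simulated transfer of 'len' units. Stores the number of IRQs
 * handled before completion in *irqs and returns the number of words
 * still in flight at the moment completion is signalled.
 */
static int sim_run(int len, bool (*irq)(struct sim *), int *irqs)
{
	struct sim s = { .len = len, .in_flight = 0 };

	assert(len >= 1);	/* the model assumes a non-empty transfer */
	*irqs = 0;
	sim_fifo_write(&s);	/* initial write from process context */
	for (;;) {
		(*irqs)++;
		if (irq(&s))
			return s.in_flight;
	}
}
```

With len = 2, sim_run() with sim_irq_correct handles two IRQs and
completes with nothing in flight, whereas sim_irq_premature completes
after one IRQ with a word still in flight, so the IRQ for that word
arrives after the process context has already moved on.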

> Not sure why you are not able to reproduce this, maybe luck had it that
> the complete() call never woke up the process context earlier than the
> second IRQ in the above case triggered?
> 
> I'm not doing anything special in particular, just booted a board with a
> SPI device driver (sja1105). This transfers some sequences of relatively
> large buffers (256 bytes) at probe time, maybe that exercises the
> controller driver more than the average peripheral driver.

It's strange because I was stressing it quite a lot, especially with the 
performance testing.


