Re: [PATCH 0/1] spi: fsl_espi: fix din offset

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Michael Walle" <mwalle@kernel.org>
To: "Tomas Alvarez Vanoli" <tomas.alvarez-vanoli@hitachienergy.com>,
	"u-boot@lists.denx.de" <u-boot@lists.denx.de>
Cc: "trini@konsulko.com" <trini@konsulko.com>,
	"chuanhua.han@nxp.com" <chuanhua.han@nxp.com>
Subject: Re: [PATCH 0/1] spi: fsl_espi: fix din offset
Date: Tue, 28 Apr 2026 11:59:03 +0200	[thread overview]
Message-ID: <DI4P7BNPXKYW.37JP7FIR13OW2@kernel.org> (raw)
In-Reply-To: <DB9PR06MB7849FF9D7AFDD1CD3C3304B7C2362@DB9PR06MB7849.eurprd06.prod.outlook.com>

Hi,

On Mon Apr 27, 2026 at 5:00 PM CEST, Tomas Alvarez Vanoli wrote:
>>What kind of issues do you see? And how does your patch resolves
>>this? I guess we can assume the code was working correctly once.
>
> The problem was that when calling spi_xfer like so:
>
> `spi_xfer(spi_slv, bitlen, &dout, &din, SPI_XFER_ONCE);`
>
> To read a read only part of chip with known values, I was getting garbage.
>
> Digging deeper into the driver and adding print statements I was able to
> determine that:
>
> - fsl_espi_rx was getting the expected data and putting it in the correct
> address (local din)
>
> - The memcpy later in the code was taking data from a different address
> (buffer + 2 * cmd_len) and putting it in the user pointer (the &din from my
> call)
>
>>Honestly, I don't understand that code either. ...
>
> The switch at the start of the function determines the buffer and offset setup.
> To be fair I only focused on my issue, SPI_XFER_ONCE which is the same as
> SPI_XFER_BEGIN | SPI_XFER_END, and in a small transaction.
>
> Inside the ONCE case, the driver allocates a buffer of double the transaction,
> because it puts the tx data in the first half, and then will write the rx data
> in the second half.

I mean:

Why is there an split between cmd and data (cmd_buf, cmd_len etc.)
although there is not such distinction in the SPI context. Why is
the first xfer (i.e. only SPI_XFER_BEGIN is set), delayed? What is
that mysterious "(*buffer == 0x0b)".
Why is the max transfer length 0xfff0. Presumably, because there is
an assumption that the "command" is <= 16 bytes. Then that matches
the 16bit length field in the ESPI peripheral ("TRANLEN"). Although
there is no check that the "command" is really less than 16 bytes..

> The case of only "SPI_XFER_END" is a bit more "cryptic" to put it nicely, and I
> did not properly debug it if I am being honest.

Yeah but that's the crux here, isn't it? Esp. because there the
buffer length isn't multiplied by two. Only the "cmd_len" is. Hmm.
Also, either data_in or data_out is handled. Ok. I *guess* that
driver supports only half duplex operations, that is probably also
the reason of that command/data split. Command is always output,
after that, there is either data in *or* data out. Doh. In your
case, i.e. SPI_XFER_ONCE, it's actually a full duplex operation, or
at least it can be, if data_in != NULL and data_out != NULL.

Funny enough, SPI_XFER_ONCE with data_out == NULL is not handled.
Maybe that can't happen in u-boot. Dunno.

Also, the CS handling is probably wrong, as it will change on every
transfer, though that is what SPI_XFER_{BEGIN,END} is for.

> However, for my case of the ONCE transfer, if you look at the start of the while
> loop:
>
> ```
> 	while (num_chunks--) {
> 		if (data_in)
> 			din = buffer + rx_offset;
> ```
>
> This means that if the user passed a pointer to get the read data back, then
> we set a local "din" to the buffer we allocated at the start plus rx_offset.
> In the "ONCE" case, this is the second half of the buffer. In the END case, this
> "cmd_len".
>
> This din pointer is what is passed to "fsl_espi_rx", where I had corroborated
> that the chip data read was the expected.
>
> When we get to later in the while loop, there's this:
>
> ```
> 		if (data_in) {
> 			memcpy(data_in, buffer + 2 * cmd_len, tran_len);
> 			if (*buffer == 0x0b) {
> 				data_in += tran_len;
> 				data_len -= tran_len;
> 				*(int *)buffer += tran_len;
> 			}
> 		}
> ```
>
> where I made my change, in the memcpy source pointer. This is the code in charge
> of copying the data back to the caller/user.

Yeah but why was there the "2 * cmd_len" in the first place. I mean
you've probably fixed the ONCE case, but broke the usual case (as
MTD/spi-mem is using it). rx_offset "1*cmd_len" and the actual
data is at 2*cmd_len because the second cmd spot is where the data
is received that you'll get when sending the "cmd".

Also, isn't the tx path sending garbage in the non-ONCE case, when
reading data? It seems it will send the data it was receiving
beforehand because dout is always pointing to the allocated buffer.
Thus after cmd_len, it will send the data fsl_espi_rx() was
receiving before.

> As you can see, for the ONCE case, cmd_len is 0, so it is taking the data at the
> start of buffer (my tx data, not the rx data). Hopefully the bug I was
> experiencing is clearer with this explanation.
>
> As I mentally traced this code I realized of a better fix, hopefully you can
> review this and/or test in on your board too.
>
> Given that for the ONCE transfer cmd_len is 0, what could be changed is also:
>
> diff --git a/drivers/spi/fsl_espi.c b/drivers/spi/fsl_espi.c
> index b1586d12912..ef2ebd5840b 100644
> --- a/drivers/spi/fsl_espi.c
> +++ b/drivers/spi/fsl_espi.c
> @@ -287,13 +287,13 @@ int spi_xfer(struct spi_slave *slave, unsigned int bitlen, const void *data_out,
>                 break;
>         case SPI_XFER_BEGIN | SPI_XFER_END:
>                 len = data_len;
> -               buffer = (unsigned char *)malloc(len * 2);
> +               buffer = (unsigned char *)malloc(len);
>                 if (!buffer) {
>                         debug("SF: Failed to malloc memory.\n");
>                         return 1;
>                 }
>                 memcpy(buffer, data_out, len);
> -               rx_offset = len;
> +               rx_offset = 0;
>                 cmd_len = 0;
>                 break;
>         }
>
> This theoretically should work and should also not affect the "END" case.
>
> The two things that support this are that:
>
> - The loop always writes out first, so the rx data can safely overwrite the tx
> - cmd_len is set to 0 for the ONCE case, so data_in memcpy would take the rx
> data from the right rx_offset (0)

Yeah, I get to the same conclusion. Another possible solution would
be to check for cmd_len == 0 and use rx_offset, otherwise cmd_len *
2.

Will you prepare a patch or should I include one in my larger "fix
NXP PPC boards" series I'm about to send today or tomorrow?

> Hopefully this explanation and my proposed alternative fix are helpful, I'm glad
> someone else was able to test this, I had cc'ed someone from nxp hoping they
> could shine some light on the code but their email does not exist anymore.
>
> It would be good to know what flags the flash probe is passing to the driver,
> to know if the problem actualyl arises from the switch statement at the start
> or handling multiple chunks/blks. I can say for certain that for a single
> chunk and signle block, the ONCE transfer was not working correctly for me.

I don't have the board at hand right now. But I'm pretty sure, it's
split between SPI_XFER_BEGIN and SPI_XFER_END, see
https://elixir.bootlin.com/u-boot/v2026.04/source/drivers/spi/spi-mem.c#L418

-michael

     prev parent reply	other threads:[~2026-04-28  9:59 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-24 17:02 [PATCH 0/1] spi: fsl_espi: fix din offset Tomas Alvarez Vanoli
2026-03-24 17:02 ` [PATCH 1/1] " Tomas Alvarez Vanoli
2026-04-27 12:33 ` [PATCH 0/1] " Michael Walle
2026-04-27 15:00   ` Tomas Alvarez Vanoli
2026-04-28  9:59     ` Michael Walle [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DI4P7BNPXKYW.37JP7FIR13OW2@kernel.org \
    --to=mwalle@kernel.org \
    --cc=chuanhua.han@nxp.com \
    --cc=tomas.alvarez-vanoli@hitachienergy.com \
    --cc=trini@konsulko.com \
    --cc=u-boot@lists.denx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.