* [PATCH 0/4] spi: axi-spi-engine: offload instruction optimization
@ 2025-04-28 20:58 David Lechner
2025-04-28 20:58 ` [PATCH 1/4] spi: axi-spi-engine: wait for completion in setup David Lechner
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: David Lechner @ 2025-04-28 20:58 UTC (permalink / raw)
To: Michael Hennerich, Nuno Sá, Mark Brown
Cc: linux-spi, linux-kernel, David Lechner
In order to achieve a 4 MSPS rate on a 16-bit ADC with a 80 MHz SCLK
using the SPI offload feature of the AXI SPI Engine, we need to shave
off some time that is spent executing unnecessary instructions. There
are a few one-time setup instructions that can be moved so that they
execute only once when the SPI offload trigger is enabled rather than
repeating each time the offload is triggered. Additionally, a recent
change to the IP block allows dropping the SYNC instruction completely.
With these changes, we are left with only the 3 instructions that are
needed to to assert CS, transfer the data, and deassert CS. This makes
3 + 16 * 12.5 ns = 237.5 ns < 250 ns which is comfortably within the
available time period.
Signed-off-by: David Lechner <dlechner@baylibre.com>
---
David Lechner (4):
spi: axi-spi-engine: wait for completion in setup
spi: axi-spi-engine: don't repeat mode config for offload
spi: axi-spi-engine: optimize bits_per_word for offload
spi: axi-spi-engine: omit SYNC from offload instructions
drivers/spi/spi-axi-spi-engine.c | 91 +++++++++++++++++++++++++++++++++++++---
1 file changed, 85 insertions(+), 6 deletions(-)
---
base-commit: aba9c2fee9598d797034ffd289b0da770d9119e8
change-id: 20250331-adi-main-46863acfc540
Best regards,
--
David Lechner <dlechner@baylibre.com>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/4] spi: axi-spi-engine: wait for completion in setup
2025-04-28 20:58 [PATCH 0/4] spi: axi-spi-engine: offload instruction optimization David Lechner
@ 2025-04-28 20:58 ` David Lechner
2025-04-28 20:58 ` [PATCH 2/4] spi: axi-spi-engine: don't repeat mode config for offload David Lechner
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: David Lechner @ 2025-04-28 20:58 UTC (permalink / raw)
To: Michael Hennerich, Nuno Sá, Mark Brown
Cc: linux-spi, linux-kernel, David Lechner
Add a polling wait for SPI instruction execution to complete in the
spi_engine_setup() function. In practice, these instructions complete
in a few 10s of nanoseconds, so we never ran into any race conditions,
but it is good practice to wait for the completion of the SPI engine
instructions before returning from the setup function.
Signed-off-by: David Lechner <dlechner@baylibre.com>
---
drivers/spi/spi-axi-spi-engine.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-axi-spi-engine.c b/drivers/spi/spi-axi-spi-engine.c
index da9840957778579dad3286f493abad87ad8a3bfc..d040deffa9bb9bdcb67bcc8af0a1cfad2e4f6041 100644
--- a/drivers/spi/spi-axi-spi-engine.c
+++ b/drivers/spi/spi-axi-spi-engine.c
@@ -14,6 +14,7 @@
#include <linux/fpga/adi-axi-common.h>
#include <linux/interrupt.h>
#include <linux/io.h>
+#include <linux/iopoll.h>
#include <linux/of.h>
#include <linux/module.h>
#include <linux/overflow.h>
@@ -739,12 +740,16 @@ static int spi_engine_setup(struct spi_device *device)
{
struct spi_controller *host = device->controller;
struct spi_engine *spi_engine = spi_controller_get_devdata(host);
+ unsigned int reg;
if (device->mode & SPI_CS_HIGH)
spi_engine->cs_inv |= BIT(spi_get_chipselect(device, 0));
else
spi_engine->cs_inv &= ~BIT(spi_get_chipselect(device, 0));
+ writel_relaxed(SPI_ENGINE_CMD_SYNC(0),
+ spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
+
writel_relaxed(SPI_ENGINE_CMD_CS_INV(spi_engine->cs_inv),
spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
@@ -755,7 +760,11 @@ static int spi_engine_setup(struct spi_device *device)
writel_relaxed(SPI_ENGINE_CMD_ASSERT(0, 0xff),
spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
- return 0;
+ writel_relaxed(SPI_ENGINE_CMD_SYNC(1),
+ spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
+
+ return readl_relaxed_poll_timeout(spi_engine->base + SPI_ENGINE_REG_SYNC_ID,
+ reg, reg == 1, 1, 1000);
}
static int spi_engine_transfer_one_message(struct spi_controller *host,
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/4] spi: axi-spi-engine: don't repeat mode config for offload
2025-04-28 20:58 [PATCH 0/4] spi: axi-spi-engine: offload instruction optimization David Lechner
2025-04-28 20:58 ` [PATCH 1/4] spi: axi-spi-engine: wait for completion in setup David Lechner
@ 2025-04-28 20:58 ` David Lechner
2025-04-29 9:08 ` Nuno Sá
2025-04-28 20:58 ` [PATCH 3/4] spi: axi-spi-engine: optimize bits_per_word " David Lechner
` (2 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: David Lechner @ 2025-04-28 20:58 UTC (permalink / raw)
To: Michael Hennerich, Nuno Sá, Mark Brown
Cc: linux-spi, linux-kernel, David Lechner
Add an optimization to avoid repeating the config instruction in each
SPI message when using SPI offloading. Instead, the instruction is
run once when the SPI offload trigger is enabled.
This is done to allow higher sample rates for ADCs using this SPI
controller.
Signed-off-by: David Lechner <dlechner@baylibre.com>
---
drivers/spi/spi-axi-spi-engine.c | 33 ++++++++++++++++++++++++++++++---
1 file changed, 30 insertions(+), 3 deletions(-)
diff --git a/drivers/spi/spi-axi-spi-engine.c b/drivers/spi/spi-axi-spi-engine.c
index d040deffa9bb9bdcb67bcc8af0a1cfad2e4f6041..05ef2589f8dc0bdaa1b3bb3a459670d174f821a2 100644
--- a/drivers/spi/spi-axi-spi-engine.c
+++ b/drivers/spi/spi-axi-spi-engine.c
@@ -141,6 +141,7 @@ struct spi_engine_offload {
struct spi_engine *spi_engine;
unsigned long flags;
unsigned int offload_num;
+ unsigned int spi_mode_config;
};
struct spi_engine {
@@ -284,6 +285,7 @@ static void spi_engine_compile_message(struct spi_message *msg, bool dry,
{
struct spi_device *spi = msg->spi;
struct spi_controller *host = spi->controller;
+ struct spi_engine_offload *priv;
struct spi_transfer *xfer;
int clk_div, new_clk_div, inst_ns;
bool keep_cs = false;
@@ -297,9 +299,18 @@ static void spi_engine_compile_message(struct spi_message *msg, bool dry,
clk_div = 1;
- spi_engine_program_add_cmd(p, dry,
- SPI_ENGINE_CMD_WRITE(SPI_ENGINE_CMD_REG_CONFIG,
- spi_engine_get_config(spi)));
+ /*
+ * As an optimization, SPI offload sets once this when the offload is
+ * enabled instead of repeating the instruction in each message.
+ */
+ if (msg->offload) {
+ priv = msg->offload->priv;
+ priv->spi_mode_config = spi_engine_get_config(spi);
+ } else {
+ spi_engine_program_add_cmd(p, dry,
+ SPI_ENGINE_CMD_WRITE(SPI_ENGINE_CMD_REG_CONFIG,
+ spi_engine_get_config(spi)));
+ }
xfer = list_first_entry(&msg->transfers, struct spi_transfer, transfer_list);
spi_engine_gen_cs(p, dry, spi, !xfer->cs_off);
@@ -842,6 +853,22 @@ static int spi_engine_trigger_enable(struct spi_offload *offload)
struct spi_engine_offload *priv = offload->priv;
struct spi_engine *spi_engine = priv->spi_engine;
unsigned int reg;
+ int ret;
+
+ writel_relaxed(SPI_ENGINE_CMD_SYNC(0),
+ spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
+
+ writel_relaxed(SPI_ENGINE_CMD_WRITE(SPI_ENGINE_CMD_REG_CONFIG,
+ priv->spi_mode_config),
+ spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
+
+ writel_relaxed(SPI_ENGINE_CMD_SYNC(1),
+ spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
+
+ ret = readl_relaxed_poll_timeout(spi_engine->base + SPI_ENGINE_REG_SYNC_ID,
+ reg, reg == 1, 1, 1000);
+ if (ret)
+ return ret;
reg = readl_relaxed(spi_engine->base +
SPI_ENGINE_REG_OFFLOAD_CTRL(priv->offload_num));
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/4] spi: axi-spi-engine: optimize bits_per_word for offload
2025-04-28 20:58 [PATCH 0/4] spi: axi-spi-engine: offload instruction optimization David Lechner
2025-04-28 20:58 ` [PATCH 1/4] spi: axi-spi-engine: wait for completion in setup David Lechner
2025-04-28 20:58 ` [PATCH 2/4] spi: axi-spi-engine: don't repeat mode config for offload David Lechner
@ 2025-04-28 20:58 ` David Lechner
2025-04-28 20:58 ` [PATCH 4/4] spi: axi-spi-engine: omit SYNC from offload instructions David Lechner
2025-04-30 22:52 ` [PATCH 0/4] spi: axi-spi-engine: offload instruction optimization Mark Brown
4 siblings, 0 replies; 9+ messages in thread
From: David Lechner @ 2025-04-28 20:58 UTC (permalink / raw)
To: Michael Hennerich, Nuno Sá, Mark Brown
Cc: linux-spi, linux-kernel, David Lechner
Add an optimization to avoid repeating bits_per_word instructions in
each message when using SPI offload. This only applies when all data
xfers in a message have the same bits_per_word. In this case, we can
execute the instruction that sets bits_per_word when the offload trigger
is enabled. This is useful e.g. for obtaining higher sample rates on
ADCs since each message takes less time to execute.
Signed-off-by: David Lechner <dlechner@baylibre.com>
---
drivers/spi/spi-axi-spi-engine.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/drivers/spi/spi-axi-spi-engine.c b/drivers/spi/spi-axi-spi-engine.c
index 05ef2589f8dc0bdaa1b3bb3a459670d174f821a2..b54d2e1437c9993d251aa2842d9040ec0949a78d 100644
--- a/drivers/spi/spi-axi-spi-engine.c
+++ b/drivers/spi/spi-axi-spi-engine.c
@@ -142,6 +142,7 @@ struct spi_engine_offload {
unsigned long flags;
unsigned int offload_num;
unsigned int spi_mode_config;
+ u8 bits_per_word;
};
struct spi_engine {
@@ -267,6 +268,8 @@ static int spi_engine_precompile_message(struct spi_message *msg)
{
unsigned int clk_div, max_hz = msg->spi->controller->max_speed_hz;
struct spi_transfer *xfer;
+ u8 min_bits_per_word = U8_MAX;
+ u8 max_bits_per_word = 0;
list_for_each_entry(xfer, &msg->transfers, transfer_list) {
/* If we have an offload transfer, we can't rx to buffer */
@@ -275,6 +278,24 @@ static int spi_engine_precompile_message(struct spi_message *msg)
clk_div = DIV_ROUND_UP(max_hz, xfer->speed_hz);
xfer->effective_speed_hz = max_hz / min(clk_div, 256U);
+
+ if (xfer->len) {
+ min_bits_per_word = min(min_bits_per_word, xfer->bits_per_word);
+ max_bits_per_word = max(max_bits_per_word, xfer->bits_per_word);
+ }
+ }
+
+ /*
+ * If all xfers in the message use the same bits_per_word, we can
+ * provide some optimization when using SPI offload.
+ */
+ if (msg->offload) {
+ struct spi_engine_offload *priv = msg->offload->priv;
+
+ if (min_bits_per_word == max_bits_per_word)
+ priv->bits_per_word = min_bits_per_word;
+ else
+ priv->bits_per_word = 0;
}
return 0;
@@ -306,6 +327,12 @@ static void spi_engine_compile_message(struct spi_message *msg, bool dry,
if (msg->offload) {
priv = msg->offload->priv;
priv->spi_mode_config = spi_engine_get_config(spi);
+
+ /*
+ * If all xfers use the same bits_per_word, it can be optimized
+ * in the same way.
+ */
+ bits_per_word = priv->bits_per_word;
} else {
spi_engine_program_add_cmd(p, dry,
SPI_ENGINE_CMD_WRITE(SPI_ENGINE_CMD_REG_CONFIG,
@@ -862,6 +889,11 @@ static int spi_engine_trigger_enable(struct spi_offload *offload)
priv->spi_mode_config),
spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
+ if (priv->bits_per_word)
+ writel_relaxed(SPI_ENGINE_CMD_WRITE(SPI_ENGINE_CMD_REG_XFER_BITS,
+ priv->bits_per_word),
+ spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
+
writel_relaxed(SPI_ENGINE_CMD_SYNC(1),
spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4/4] spi: axi-spi-engine: omit SYNC from offload instructions
2025-04-28 20:58 [PATCH 0/4] spi: axi-spi-engine: offload instruction optimization David Lechner
` (2 preceding siblings ...)
2025-04-28 20:58 ` [PATCH 3/4] spi: axi-spi-engine: optimize bits_per_word " David Lechner
@ 2025-04-28 20:58 ` David Lechner
2025-04-30 22:52 ` [PATCH 0/4] spi: axi-spi-engine: offload instruction optimization Mark Brown
4 siblings, 0 replies; 9+ messages in thread
From: David Lechner @ 2025-04-28 20:58 UTC (permalink / raw)
To: Michael Hennerich, Nuno Sá, Mark Brown
Cc: linux-spi, linux-kernel, David Lechner
Add optimization to omit SYNC instructions from offload messages.
Starting with IP core v1.5.0, the SYNC instruction is no longer required
for proper operation when using the offload feature. Omitting the SYNC
instruction saves a few clock cycles needed to executed which can e.g.
allow achieving higher sample rates on ADCs.
Signed-off-by: David Lechner <dlechner@baylibre.com>
---
drivers/spi/spi-axi-spi-engine.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/spi-axi-spi-engine.c b/drivers/spi/spi-axi-spi-engine.c
index b54d2e1437c9993d251aa2842d9040ec0949a78d..8cc19934b48b5276f49c4049dcb2dbbeb4112871 100644
--- a/drivers/spi/spi-axi-spi-engine.c
+++ b/drivers/spi/spi-axi-spi-engine.c
@@ -162,6 +162,7 @@ struct spi_engine {
unsigned int offload_sdo_mem_size;
struct spi_offload *offload;
u32 offload_caps;
+ bool offload_requires_sync;
};
static void spi_engine_program_add_cmd(struct spi_engine_program *p,
@@ -702,6 +703,8 @@ static void spi_engine_offload_unprepare(struct spi_offload *offload)
static int spi_engine_optimize_message(struct spi_message *msg)
{
+ struct spi_controller *host = msg->spi->controller;
+ struct spi_engine *spi_engine = spi_controller_get_devdata(host);
struct spi_engine_program p_dry, *p;
int ret;
@@ -718,8 +721,13 @@ static int spi_engine_optimize_message(struct spi_message *msg)
spi_engine_compile_message(msg, false, p);
- spi_engine_program_add_cmd(p, false, SPI_ENGINE_CMD_SYNC(
- msg->offload ? 0 : AXI_SPI_ENGINE_CUR_MSG_SYNC_ID));
+ /*
+ * Non-offload needs SYNC for completion interrupt. Older versions of
+ * the IP core also need SYNC for offload to work properly.
+ */
+ if (!msg->offload || spi_engine->offload_requires_sync)
+ spi_engine_program_add_cmd(p, false, SPI_ENGINE_CMD_SYNC(
+ msg->offload ? 0 : AXI_SPI_ENGINE_CUR_MSG_SYNC_ID));
msg->opt_state = p;
@@ -1055,6 +1063,9 @@ static int spi_engine_probe(struct platform_device *pdev)
spi_engine->offload_sdo_mem_size = SPI_ENGINE_OFFLOAD_SDO_FIFO_SIZE;
}
+ /* IP v1.5 dropped the requirement for SYNC in offload messages. */
+ spi_engine->offload_requires_sync = ADI_AXI_PCORE_VER_MINOR(version) < 5;
+
writel_relaxed(0x00, spi_engine->base + SPI_ENGINE_REG_RESET);
writel_relaxed(0xff, spi_engine->base + SPI_ENGINE_REG_INT_PENDING);
writel_relaxed(0x00, spi_engine->base + SPI_ENGINE_REG_INT_ENABLE);
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/4] spi: axi-spi-engine: don't repeat mode config for offload
2025-04-28 20:58 ` [PATCH 2/4] spi: axi-spi-engine: don't repeat mode config for offload David Lechner
@ 2025-04-29 9:08 ` Nuno Sá
2025-04-29 15:24 ` David Lechner
0 siblings, 1 reply; 9+ messages in thread
From: Nuno Sá @ 2025-04-29 9:08 UTC (permalink / raw)
To: David Lechner, Michael Hennerich, Nuno Sá, Mark Brown
Cc: linux-spi, linux-kernel
Hi David,
The whole series LGTM but I do have a small concern (maybe an hypothetical one)
in this one...
On Mon, 2025-04-28 at 15:58 -0500, David Lechner wrote:
> Add an optimization to avoid repeating the config instruction in each
> SPI message when using SPI offloading. Instead, the instruction is
> run once when the SPI offload trigger is enabled.
>
> This is done to allow higher sample rates for ADCs using this SPI
> controller.
>
> Signed-off-by: David Lechner <dlechner@baylibre.com>
> ---
> drivers/spi/spi-axi-spi-engine.c | 33 ++++++++++++++++++++++++++++++---
> 1 file changed, 30 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/spi/spi-axi-spi-engine.c b/drivers/spi/spi-axi-spi-
> engine.c
> index
> d040deffa9bb9bdcb67bcc8af0a1cfad2e4f6041..05ef2589f8dc0bdaa1b3bb3a459670d174f8
> 21a2 100644
> --- a/drivers/spi/spi-axi-spi-engine.c
> +++ b/drivers/spi/spi-axi-spi-engine.c
> @@ -141,6 +141,7 @@ struct spi_engine_offload {
> struct spi_engine *spi_engine;
> unsigned long flags;
> unsigned int offload_num;
> + unsigned int spi_mode_config;
> };
>
> struct spi_engine {
> @@ -284,6 +285,7 @@ static void spi_engine_compile_message(struct spi_message
> *msg, bool dry,
> {
> struct spi_device *spi = msg->spi;
> struct spi_controller *host = spi->controller;
> + struct spi_engine_offload *priv;
> struct spi_transfer *xfer;
> int clk_div, new_clk_div, inst_ns;
> bool keep_cs = false;
> @@ -297,9 +299,18 @@ static void spi_engine_compile_message(struct spi_message
> *msg, bool dry,
>
> clk_div = 1;
>
> - spi_engine_program_add_cmd(p, dry,
> - SPI_ENGINE_CMD_WRITE(SPI_ENGINE_CMD_REG_CONFIG,
> - spi_engine_get_config(spi)));
> + /*
> + * As an optimization, SPI offload sets once this when the offload is
> + * enabled instead of repeating the instruction in each message.
> + */
> + if (msg->offload) {
> + priv = msg->offload->priv;
> + priv->spi_mode_config = spi_engine_get_config(spi);
> + } else {
> + spi_engine_program_add_cmd(p, dry,
> + SPI_ENGINE_CMD_WRITE(SPI_ENGINE_CMD_REG_CONFIG,
> + spi_engine_get_config(spi)));
> + }
>
> xfer = list_first_entry(&msg->transfers, struct spi_transfer,
> transfer_list);
> spi_engine_gen_cs(p, dry, spi, !xfer->cs_off);
> @@ -842,6 +853,22 @@ static int spi_engine_trigger_enable(struct spi_offload
> *offload)
> struct spi_engine_offload *priv = offload->priv;
> struct spi_engine *spi_engine = priv->spi_engine;
> unsigned int reg;
> + int ret;
> +
> + writel_relaxed(SPI_ENGINE_CMD_SYNC(0),
> + spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
> +
> + writel_relaxed(SPI_ENGINE_CMD_WRITE(SPI_ENGINE_CMD_REG_CONFIG,
> + priv->spi_mode_config),
> + spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
> +
> + writel_relaxed(SPI_ENGINE_CMD_SYNC(1),
> + spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
> +
I would assume that SPI_ENGINE_CMD_SYNC(0) + SPI_ENGINE_CMD_SYNC(1) should be
executed in order by the core? I think all this relaxed API's don't give any
guarantee about store completion so, in theory, we could have SYNC(0) after
SYNC(1). Even the full barrier variants don't guarantee this I believe [1].
There's ioremap_np() variant but likely not supported in many platforms. Doing a
read() before SYNC(1) should be all we need to guarantee proper ordering here.
Or maybe this is all theoretical and not an issue on the platforms this driver
is supported but meh, just raising the possibility.
[1]: https://elixir.bootlin.com/linux/v6.12-rc6/source/Documentation/driver-api/device-io.rst#L299
- Nuno Sá
> + ret = readl_relaxed_poll_timeout(spi_engine->base +
> SPI_ENGINE_REG_SYNC_ID,
> + reg, reg == 1, 1, 1000);
> + if (ret)
> + return ret;
>
> reg = readl_relaxed(spi_engine->base +
> SPI_ENGINE_REG_OFFLOAD_CTRL(priv->offload_num));
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/4] spi: axi-spi-engine: don't repeat mode config for offload
2025-04-29 9:08 ` Nuno Sá
@ 2025-04-29 15:24 ` David Lechner
2025-04-29 15:40 ` Nuno Sá
0 siblings, 1 reply; 9+ messages in thread
From: David Lechner @ 2025-04-29 15:24 UTC (permalink / raw)
To: Nuno Sá, Michael Hennerich, Nuno Sá, Mark Brown
Cc: linux-spi, linux-kernel
On 4/29/25 4:08 AM, Nuno Sá wrote:
> Hi David,
>
> The whole series LGTM but I do have a small concern (maybe an hypothetical one)
> in this one...
>
>
> On Mon, 2025-04-28 at 15:58 -0500, David Lechner wrote:
...
>> +
>> + writel_relaxed(SPI_ENGINE_CMD_SYNC(0),
>> + spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
>> +
>> + writel_relaxed(SPI_ENGINE_CMD_WRITE(SPI_ENGINE_CMD_REG_CONFIG,
>> + priv->spi_mode_config),
>> + spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
>> +
>> + writel_relaxed(SPI_ENGINE_CMD_SYNC(1),
>> + spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
>> +
>
> I would assume that SPI_ENGINE_CMD_SYNC(0) + SPI_ENGINE_CMD_SYNC(1) should be
> executed in order by the core? I think all this relaxed API's don't give any
> guarantee about store completion so, in theory, we could have SYNC(0) after
> SYNC(1). Even the full barrier variants don't guarantee this I believe [1].
> There's ioremap_np() variant but likely not supported in many platforms. Doing a
> read() before SYNC(1) should be all we need to guarantee proper ordering here.
>
> Or maybe this is all theoretical and not an issue on the platforms this driver
> is supported but meh, just raising the possibility.
>
>
> [1]: https://elixir.bootlin.com/linux/v6.12-rc6/source/Documentation/driver-api/device-io.rst#L299
>
The way I am reading this, relaxed isn't "no barriers", just "weaker barriers".
Quoting from the linked doc:
On architectures that require an expensive barrier for serializing
against DMA, these “relaxed” versions of the MMIO accessors only
serialize against each other, but contain a less expensive barrier
operation.
So that sounds like we should not have a problem to me. (And if there is a
problem, then it would affect other parts of the driver, like when we load
the fifos with tx data in a loop.)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/4] spi: axi-spi-engine: don't repeat mode config for offload
2025-04-29 15:24 ` David Lechner
@ 2025-04-29 15:40 ` Nuno Sá
0 siblings, 0 replies; 9+ messages in thread
From: Nuno Sá @ 2025-04-29 15:40 UTC (permalink / raw)
To: David Lechner, Michael Hennerich, Nuno Sá, Mark Brown
Cc: linux-spi, linux-kernel
On Tue, 2025-04-29 at 10:24 -0500, David Lechner wrote:
> On 4/29/25 4:08 AM, Nuno Sá wrote:
> > Hi David,
> >
> > The whole series LGTM but I do have a small concern (maybe an hypothetical
> > one)
> > in this one...
> >
> >
> > On Mon, 2025-04-28 at 15:58 -0500, David Lechner wrote:
>
>
> ...
>
> > > +
> > > + writel_relaxed(SPI_ENGINE_CMD_SYNC(0),
> > > + spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
> > > +
> > > + writel_relaxed(SPI_ENGINE_CMD_WRITE(SPI_ENGINE_CMD_REG_CONFIG,
> > > + priv->spi_mode_config),
> > > + spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
> > > +
> > > + writel_relaxed(SPI_ENGINE_CMD_SYNC(1),
> > > + spi_engine->base + SPI_ENGINE_REG_CMD_FIFO);
> > > +
> >
> > I would assume that SPI_ENGINE_CMD_SYNC(0) + SPI_ENGINE_CMD_SYNC(1) should
> > be
> > executed in order by the core? I think all this relaxed API's don't give any
> > guarantee about store completion so, in theory, we could have SYNC(0) after
> > SYNC(1). Even the full barrier variants don't guarantee this I believe [1].
> > There's ioremap_np() variant but likely not supported in many platforms.
> > Doing a
> > read() before SYNC(1) should be all we need to guarantee proper ordering
> > here.
> >
> > Or maybe this is all theoretical and not an issue on the platforms this
> > driver
> > is supported but meh, just raising the possibility.
> >
> >
> > [1]:
> > https://elixir.bootlin.com/linux/v6.12-rc6/source/Documentation/driver-api/device-io.rst#L299
> >
>
> The way I am reading this, relaxed isn't "no barriers", just "weaker
> barriers".
Yes, sometimes just compiler ones. Bad phrasing from me...
> Quoting from the linked doc:
>
> On architectures that require an expensive barrier for serializing
> against DMA, these “relaxed” versions of the MMIO accessors only
> serialize against each other, but contain a less expensive barrier
> operation.
>
> So that sounds like we should not have a problem to me. (And if there is a
> problem, then it would affect other parts of the driver, like when we load
> the fifos with tx data in a loop.)
Well that just ensures that they are issued by the order you wrote them but it
does not guarantee that they end up in the same order on the peripheral itself.
Anyways, likely not an issue on the arches we care and as you mentioned it,
there are other places of the driver where this could matter. So, I guess not
material for this series.
- Nuno Sá
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/4] spi: axi-spi-engine: offload instruction optimization
2025-04-28 20:58 [PATCH 0/4] spi: axi-spi-engine: offload instruction optimization David Lechner
` (3 preceding siblings ...)
2025-04-28 20:58 ` [PATCH 4/4] spi: axi-spi-engine: omit SYNC from offload instructions David Lechner
@ 2025-04-30 22:52 ` Mark Brown
4 siblings, 0 replies; 9+ messages in thread
From: Mark Brown @ 2025-04-30 22:52 UTC (permalink / raw)
To: Michael Hennerich, Nuno Sá, David Lechner; +Cc: linux-spi, linux-kernel
On Mon, 28 Apr 2025 15:58:55 -0500, David Lechner wrote:
> In order to achieve a 4 MSPS rate on a 16-bit ADC with a 80 MHz SCLK
> using the SPI offload feature of the AXI SPI Engine, we need to shave
> off some time that is spent executing unnecessary instructions. There
> are a few one-time setup instructions that can be moved so that they
> execute only once when the SPI offload trigger is enabled rather than
> repeating each time the offload is triggered. Additionally, a recent
> change to the IP block allows dropping the SYNC instruction completely.
> With these changes, we are left with only the 3 instructions that are
> needed to to assert CS, transfer the data, and deassert CS. This makes
> 3 + 16 * 12.5 ns = 237.5 ns < 250 ns which is comfortably within the
> available time period.
>
> [...]
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
Thanks!
[1/4] spi: axi-spi-engine: wait for completion in setup
commit: 1d0ee0c9df31c9fd1e4f8d7e2464e36fbf6e3f75
[2/4] spi: axi-spi-engine: don't repeat mode config for offload
commit: 8fc13b822c74a46587c0d8aae4ea0820b6bdb933
[3/4] spi: axi-spi-engine: optimize bits_per_word for offload
commit: 087591c9e4fde86fe2971c34a2745c208103248e
[4/4] spi: axi-spi-engine: omit SYNC from offload instructions
commit: e6702c44c2adb28b62f81de498e9b1e4562ce660
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-04-30 22:52 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-28 20:58 [PATCH 0/4] spi: axi-spi-engine: offload instruction optimization David Lechner
2025-04-28 20:58 ` [PATCH 1/4] spi: axi-spi-engine: wait for completion in setup David Lechner
2025-04-28 20:58 ` [PATCH 2/4] spi: axi-spi-engine: don't repeat mode config for offload David Lechner
2025-04-29 9:08 ` Nuno Sá
2025-04-29 15:24 ` David Lechner
2025-04-29 15:40 ` Nuno Sá
2025-04-28 20:58 ` [PATCH 3/4] spi: axi-spi-engine: optimize bits_per_word " David Lechner
2025-04-28 20:58 ` [PATCH 4/4] spi: axi-spi-engine: omit SYNC from offload instructions David Lechner
2025-04-30 22:52 ` [PATCH 0/4] spi: axi-spi-engine: offload instruction optimization Mark Brown
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).