* [PATCH v3 01/13] spi: dt-bindings: allow spi-max-frequency to specify a frequency pair
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-28 8:32 ` Miquel Raynal
2026-05-28 17:36 ` Conor Dooley
2026-05-27 17:55 ` [PATCH v3 02/13] spi: dt-bindings: cdns,qspi-nor: add PHY tuning pattern partition property Santhosh Kumar K
` (12 subsequent siblings)
13 siblings, 2 replies; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Some SPI controllers support high-speed operating modes that require
controller-side configuration before the device can be driven at its
rated maximum frequency. In these cases two frequencies are relevant:
a conservative speed usable without any such configuration, and the
maximum speed achievable once the controller is set up accordingly.
The existing spi-max-frequency property accepts only a single u32,
which cannot express this distinction. Extend it to accept either a
single value (retaining full backward compatibility) or a two-element
array [base-frequency, max-frequency], where base-frequency is the
conservative operating speed and max-frequency is the highest speed
the device supports after controller-side configuration.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
.../devicetree/bindings/spi/spi-peripheral-props.yaml | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
index 880a9f624566..c88f6f3a1801 100644
--- a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
+++ b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
@@ -41,9 +41,15 @@ properties:
       The device requires the LSB first mode.

   spi-max-frequency:
-    $ref: /schemas/types.yaml#/definitions/uint32
+    $ref: /schemas/types.yaml#/definitions/uint32-array
+    minItems: 1
+    maxItems: 2
     description:
-      Maximum SPI clocking speed of the device in Hz.
+      SPI clocking speed of the device in Hz. Either a single maximum
+      frequency, or two values [base-frequency, max-frequency] where
+      base-frequency is the conservative speed and max-frequency is the
+      highest speed the device supports after controller-side configurations
+      such as data training.

   spi-cs-setup-delay-ns:
     description:
--
2.34.1
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v3 01/13] spi: dt-bindings: allow spi-max-frequency to specify a frequency pair
2026-05-27 17:55 ` [PATCH v3 01/13] spi: dt-bindings: allow spi-max-frequency to specify a frequency pair Santhosh Kumar K
@ 2026-05-28 8:32 ` Miquel Raynal
2026-05-28 17:36 ` Conor Dooley
1 sibling, 0 replies; 25+ messages in thread
From: Miquel Raynal @ 2026-05-28 8:32 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, richard, vigneshr, pratyush,
mwalle, takahiro.kuwano, linux-spi, devicetree, linux-kernel,
linux-mtd, praneeth, u-kumar1, a-dutta
On 27/05/2026 at 23:25:15 +0530, Santhosh Kumar K <s-k6@ti.com> wrote:
> Some SPI controllers support high-speed operating modes that require
> controller-side configuration before the device can be driven at its
> rated maximum frequency. In these cases two frequencies are relevant:
> a conservative speed usable without any such configuration, and the
> maximum speed achievable once the controller is set up accordingly.
>
> The existing spi-max-frequency property accepts only a single u32,
> which cannot express this distinction. Extend it to accept either a
> single value (retaining full backward compatibility) or a two-element
> array [base-frequency, max-frequency], where base-frequency is the
> conservative operating speed and max-frequency is the highest speed
> the device supports after controller-side configuration.
>
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 01/13] spi: dt-bindings: allow spi-max-frequency to specify a frequency pair
2026-05-27 17:55 ` [PATCH v3 01/13] spi: dt-bindings: allow spi-max-frequency to specify a frequency pair Santhosh Kumar K
2026-05-28 8:32 ` Miquel Raynal
@ 2026-05-28 17:36 ` Conor Dooley
1 sibling, 0 replies; 25+ messages in thread
From: Conor Dooley @ 2026-05-28 17:36 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano, linux-spi,
devicetree, linux-kernel, linux-mtd, praneeth, u-kumar1, a-dutta
On Wed, May 27, 2026 at 11:25:15PM +0530, Santhosh Kumar K wrote:
> Some SPI controllers support high-speed operating modes that require
> controller-side configuration before the device can be driven at its
> rated maximum frequency. In these cases two frequencies are relevant:
> a conservative speed usable without any such configuration, and the
> maximum speed achievable once the controller is set up accordingly.
>
> The existing spi-max-frequency property accepts only a single u32,
> which cannot express this distinction. Extend it to accept either a
> single value (retaining full backward compatibility) or a two-element
> array [base-frequency, max-frequency], where base-frequency is the
> conservative operating speed and max-frequency is the highest speed
> the device supports after controller-side configuration.
>
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
> ---
> .../devicetree/bindings/spi/spi-peripheral-props.yaml | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
Pretty sure this hasn't been tested: dt_binding_check cannot even build
processed-schema.json with this applied, because it results in multiple
definitions of spi-max-frequency.
The sashiko makes the point that this breaks every binding that uses
minimum/maximum to set constraints too, because these properties do not
apply to arrays unless applied per item.
I also don't get the point of this property. Why can't you just set the
max that the device can do? If the controller can configure itself to be
fast enough it will do so, and if it can't then it'll pick the fastest
it can actually do instead.
Seems like you're abusing a peripheral property to encode information
about the controller.
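
A minimal userspace sketch of the alternative described above, in which the
peripheral advertises only its own maximum and the controller side decides
what it can reach. The struct, field, and function names are hypothetical and
are not kernel API; the split into "untuned" and "tuned" limits is an
assumption made only for illustration.

```c
#include <stdint.h>

/* Hypothetical controller capability record (not a kernel structure). */
struct ctlr_caps {
	uint32_t untuned_max_hz; /* fastest rate usable without any tuning */
	uint32_t tuned_max_hz;   /* fastest rate after tuning; 0 if tuning is unsupported */
};

/*
 * Pick the operating speed from a single device maximum: the controller
 * limit depends on whether tuning has succeeded, and the result is the
 * smaller of the device limit and the controller limit.
 */
static uint32_t pick_speed_hz(uint32_t dev_max_hz,
			      const struct ctlr_caps *caps, int tuned)
{
	uint32_t ctlr_max = (tuned && caps->tuned_max_hz) ?
			    caps->tuned_max_hz : caps->untuned_max_hz;

	return dev_max_hz < ctlr_max ? dev_max_hz : ctlr_max;
}
```

With this shape the device node carries one value and the conservative limit
lives with the controller, which is the separation the question above is
asking about.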
pw-bot: changes-requested
Thanks,
Conor.
>
> diff --git a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> index 880a9f624566..c88f6f3a1801 100644
> --- a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> +++ b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> @@ -41,9 +41,15 @@ properties:
>        The device requires the LSB first mode.
>
>    spi-max-frequency:
> -    $ref: /schemas/types.yaml#/definitions/uint32
> +    $ref: /schemas/types.yaml#/definitions/uint32-array
> +    minItems: 1
> +    maxItems: 2
>      description:
> -      Maximum SPI clocking speed of the device in Hz.
> +      SPI clocking speed of the device in Hz. Either a single maximum
> +      frequency, or two values [base-frequency, max-frequency] where
> +      base-frequency is the conservative speed and max-frequency is the
> +      highest speed the device supports after controller-side configurations
> +      such as data training.
>
>    spi-cs-setup-delay-ns:
>      description:
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 02/13] spi: dt-bindings: cdns,qspi-nor: add PHY tuning pattern partition property
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
2026-05-27 17:55 ` [PATCH v3 01/13] spi: dt-bindings: allow spi-max-frequency to specify a frequency pair Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-28 8:34 ` Miquel Raynal
2026-05-27 17:55 ` [PATCH v3 03/13] spi: parse two-element spi-max-frequency property Santhosh Kumar K
` (11 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
PHY tuning requires a known data pattern to be readable from flash.
When no partition is explicitly identified, the controller must search
all available partitions to locate the pattern by label, which adds
overhead and relies on label naming conventions outside the
controller's control.
Add cdns,phy-pattern-partition, a phandle property that allows the DT
author to directly reference the flash partition holding the PHY tuning
pattern. The controller uses this partition during calibration, avoiding
the partition search entirely.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
.../bindings/spi/cdns,qspi-nor-peripheral-props.yaml | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/Documentation/devicetree/bindings/spi/cdns,qspi-nor-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/cdns,qspi-nor-peripheral-props.yaml
index 510b82c177c0..0ffcdf5b00d0 100644
--- a/Documentation/devicetree/bindings/spi/cdns,qspi-nor-peripheral-props.yaml
+++ b/Documentation/devicetree/bindings/spi/cdns,qspi-nor-peripheral-props.yaml
@@ -39,4 +39,12 @@ properties:
       Delay in nanoseconds between setting qspi_n_ss_out low and
       first bit transfer.

+  cdns,phy-pattern-partition:
+    $ref: /schemas/types.yaml#/definitions/phandle
+    description:
+      Phandle to the flash partition containing the PHY tuning pattern.
+      When present, the controller uses this partition to locate the
+      pattern data during PHY tuning instead of searching all partitions
+      by label.
+
 additionalProperties: true
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v3 02/13] spi: dt-bindings: cdns,qspi-nor: add PHY tuning pattern partition property
2026-05-27 17:55 ` [PATCH v3 02/13] spi: dt-bindings: cdns,qspi-nor: add PHY tuning pattern partition property Santhosh Kumar K
@ 2026-05-28 8:34 ` Miquel Raynal
0 siblings, 0 replies; 25+ messages in thread
From: Miquel Raynal @ 2026-05-28 8:34 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, richard, vigneshr, pratyush,
mwalle, takahiro.kuwano, linux-spi, devicetree, linux-kernel,
linux-mtd, praneeth, u-kumar1, a-dutta
On 27/05/2026 at 23:25:16 +0530, Santhosh Kumar K <s-k6@ti.com> wrote:
> PHY tuning requires a known data pattern to be readable from flash.
> When no partition is explicitly identified, the controller must search
> all available partitions to locate the pattern by label, which adds
> overhead and relies on label naming conventions outside the
> controller's control.
>
> Add cdns,phy-pattern-partition, a phandle property that allows the DT
> author to directly reference the flash partition holding the PHY tuning
> pattern. The controller uses this partition during calibration, avoiding
> the partition search entirely.
I would remove the "avoiding the partition search entirely". While
truthful, this is related to the history of the feature (how it was
implemented before) but doesn't give any useful hint to the reader of
the mainline Linux kernel repository. I would just drop that mention,
here and below.
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
> ---
> .../bindings/spi/cdns,qspi-nor-peripheral-props.yaml | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/spi/cdns,qspi-nor-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/cdns,qspi-nor-peripheral-props.yaml
> index 510b82c177c0..0ffcdf5b00d0 100644
> --- a/Documentation/devicetree/bindings/spi/cdns,qspi-nor-peripheral-props.yaml
> +++ b/Documentation/devicetree/bindings/spi/cdns,qspi-nor-peripheral-props.yaml
> @@ -39,4 +39,12 @@ properties:
>        Delay in nanoseconds between setting qspi_n_ss_out low and
>        first bit transfer.
>
> +  cdns,phy-pattern-partition:
> +    $ref: /schemas/types.yaml#/definitions/phandle
> +    description:
> +      Phandle to the flash partition containing the PHY tuning pattern.
> +      When present, the controller uses this partition to locate the
> +      pattern data during PHY tuning

                                Period ^

> +      instead of searching all partitions by label.

This sentence can be dropped.

> +
>  additionalProperties: true
With these comments addressed, you can add my
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Thanks,
Miquèl
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 03/13] spi: parse two-element spi-max-frequency property
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
2026-05-27 17:55 ` [PATCH v3 01/13] spi: dt-bindings: allow spi-max-frequency to specify a frequency pair Santhosh Kumar K
2026-05-27 17:55 ` [PATCH v3 02/13] spi: dt-bindings: cdns,qspi-nor: add PHY tuning pattern partition property Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-28 8:37 ` Miquel Raynal
2026-05-27 17:55 ` [PATCH v3 04/13] spi: spi-mem: add spi_mem_apply_base_freq_cap() Santhosh Kumar K
` (10 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Some SPI controllers support high-speed operating modes that require
controller-side configuration before the device can be driven at its
rated maximum. In such cases the device has two relevant speeds: a
conservative rate for baseline operation and a maximum rate achievable
once the controller is fully configured.
Extend struct spi_device with a base_speed_hz field. Update
of_spi_parse_dt() to populate base_speed_hz from the first element and
max_speed_hz from the second when spi-max-frequency carries two values.
A single-element property continues to populate only max_speed_hz,
preserving existing behaviour. base_speed_hz is zero when not set.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi.c | 17 ++++++++++++++---
include/linux/spi/spi.h | 2 ++
2 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 76e3563c523f..a1050b1a1d4c 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -2364,7 +2364,7 @@ static int of_spi_parse_dt(struct spi_controller *ctlr, struct spi_device *spi,
 			   struct device_node *nc)
 {
 	u32 value, cs[SPI_DEVICE_CS_CNT_MAX], map[SPI_DEVICE_DATA_LANE_CNT_MAX];
-	int rc, idx, max_num_data_lanes;
+	int rc, idx, max_num_data_lanes, nfreq;

 	/* Mode (clock phase/polarity/etc.) */
 	if (of_property_read_bool(nc, "spi-cpha"))
@@ -2596,9 +2596,20 @@ static int of_spi_parse_dt(struct spi_controller *ctlr, struct spi_device *spi,
 	 */
 	spi->cs_index_mask = BIT(0);

-	/* Device speed */
-	if (!of_property_read_u32(nc, "spi-max-frequency", &value))
+	/*
+	 * Device speed: a single value sets max_speed_hz; two values set
+	 * base_speed_hz (conservative) and max_speed_hz (maximum after
+	 * controller-side configuration).
+	 */
+	nfreq = of_property_count_u32_elems(nc, "spi-max-frequency");
+	if (nfreq == 2) {
+		of_property_read_u32_index(nc, "spi-max-frequency", 0,
+					   &spi->base_speed_hz);
+		of_property_read_u32_index(nc, "spi-max-frequency", 1,
+					   &spi->max_speed_hz);
+	} else if (!of_property_read_u32(nc, "spi-max-frequency", &value)) {
 		spi->max_speed_hz = value;
+	}

 	/* Device CS delays */
 	of_spi_parse_dt_cs_delay(nc, &spi->cs_setup, "spi-cs-setup-delay-ns");
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index f6ed93eff00b..e4fc8ac1c889 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -139,6 +139,7 @@ extern void spi_transfer_cs_change_delay_exec(struct spi_message *msg,
  * @max_speed_hz: Maximum clock rate to be used with this chip
  *	(on this board); may be changed by the device's driver.
  *	The spi_transfer.speed_hz can override this for each transfer.
+ * @base_speed_hz: Conservative clock rate for the device; zero when not set.
  * @bits_per_word: Data transfers involve one or more words; word sizes
  *	like eight or 12 bits are common. In-memory wordsizes are
  *	powers of two bytes (e.g. 20 bit samples use 32 bits).
@@ -191,6 +192,7 @@ struct spi_device {
 	struct device		dev;
 	struct spi_controller	*controller;
 	u32			max_speed_hz;
+	u32			base_speed_hz;
 	u8			bits_per_word;
 	bool			rt;
 #define SPI_NO_TX		BIT(31)		/* No transmit wire */
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v3 03/13] spi: parse two-element spi-max-frequency property
2026-05-27 17:55 ` [PATCH v3 03/13] spi: parse two-element spi-max-frequency property Santhosh Kumar K
@ 2026-05-28 8:37 ` Miquel Raynal
0 siblings, 0 replies; 25+ messages in thread
From: Miquel Raynal @ 2026-05-28 8:37 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, richard, vigneshr, pratyush,
mwalle, takahiro.kuwano, linux-spi, devicetree, linux-kernel,
linux-mtd, praneeth, u-kumar1, a-dutta
>
> -	/* Device speed */
> -	if (!of_property_read_u32(nc, "spi-max-frequency", &value))
> +	/*
> +	 * Device speed: a single value sets max_speed_hz; two values set
> +	 * base_speed_hz (conservative) and max_speed_hz (maximum after
> +	 * controller-side configuration).
> +	 */
> +	nfreq = of_property_count_u32_elems(nc, "spi-max-frequency");
> +	if (nfreq == 2) {
> +		of_property_read_u32_index(nc, "spi-max-frequency", 0,
> +					   &spi->base_speed_hz);
> +		of_property_read_u32_index(nc, "spi-max-frequency", 1,
> +					   &spi->max_speed_hz);
I don't know how useful that is, but I would use an intermediate
variable and check the return value of the of_property_* helper before
filling spi->max|base_speed_hz.
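
A minimal userspace sketch of that suggestion: read both elements into
locals, check each return value, and only then commit to the device
structure. Everything here is a stand-in, not kernel code: struct fake_prop,
fake_read_u32_index() and struct fake_spi_device are hypothetical and only
mimic the "0 on success, negative errno on failure" shape of the of_property_*
helpers.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a devicetree property holding u32 cells. */
struct fake_prop {
	const uint32_t *vals;
	size_t n;
};

/* Hypothetical stand-in for the two speed fields of struct spi_device. */
struct fake_spi_device {
	uint32_t max_speed_hz;
	uint32_t base_speed_hz;
};

/* Mimics an indexed u32 read: 0 on success, negative errno otherwise. */
static int fake_read_u32_index(const struct fake_prop *p, size_t idx,
			       uint32_t *out)
{
	if (!p || !p->vals)
		return -EINVAL;
	if (idx >= p->n)
		return -EOVERFLOW;
	*out = p->vals[idx];
	return 0;
}

/* Fill the device fields only after both reads have succeeded. */
static int parse_freq_pair(const struct fake_prop *p,
			   struct fake_spi_device *spi)
{
	uint32_t base, max;
	int rc;

	rc = fake_read_u32_index(p, 0, &base);
	if (rc)
		return rc;

	rc = fake_read_u32_index(p, 1, &max);
	if (rc)
		return rc;

	spi->base_speed_hz = base;
	spi->max_speed_hz = max;
	return 0;
}
```

On a failed read the device fields keep whatever they held before, so a
half-parsed pair never leaks into the speed limits.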
With this fixed,
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Thanks,
Miquèl
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 04/13] spi: spi-mem: add spi_mem_apply_base_freq_cap()
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (2 preceding siblings ...)
2026-05-27 17:55 ` [PATCH v3 03/13] spi: parse two-element spi-max-frequency property Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-28 8:43 ` Miquel Raynal
2026-05-27 17:55 ` [PATCH v3 05/13] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning() Santhosh Kumar K
` (9 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
When a device exposes both a conservative speed and a maximum speed,
operations that have not been configured for max-speed use must be
prevented from running at the device maximum. Without this, any op with
max_freq == 0 would be silently raised to max_speed_hz by
spi_mem_adjust_op_freq(), bypassing the intended conservative limit.
Add spi_mem_apply_base_freq_cap(). When base_speed_hz is set it caps
op->max_freq to that value, unless the op already has max_freq set to
max_speed_hz, which signals it has been configured for max-speed use.
Call it in spi_mem_exec_op() before spi_mem_adjust_op_freq() so the
final frequency is within both the base-speed constraint and the device
maximum.
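
The resulting frequency selection can be modelled in userspace as follows.
This is a sketch, not the kernel code: struct model_spi and struct model_op
are hypothetical reductions of struct spi_device and struct spi_mem_op, and
the body of adjust_op_freq() is assumed from the behaviour described above
(an unset or too-high max_freq becomes max_speed_hz).

```c
#include <stdint.h>

/* Hypothetical reductions of the kernel structures to the fields used here. */
struct model_spi {
	uint32_t max_speed_hz;
	uint32_t base_speed_hz; /* 0 when no base speed is set */
};

struct model_op {
	uint32_t max_freq; /* 0 when the op carries no limit of its own */
};

/* Same conditions as spi_mem_apply_base_freq_cap() in the diff below. */
static void apply_base_freq_cap(const struct model_spi *spi,
				struct model_op *op)
{
	if (!spi->base_speed_hz || op->max_freq == spi->max_speed_hz)
		return;

	if (!op->max_freq || op->max_freq > spi->base_speed_hz)
		op->max_freq = spi->base_speed_hz;
}

/* Assumed clamp to the device maximum, as described in the text above. */
static void adjust_op_freq(const struct model_spi *spi, struct model_op *op)
{
	if (!op->max_freq || op->max_freq > spi->max_speed_hz)
		op->max_freq = spi->max_speed_hz;
}

/* Frequency an op ends up with when both steps run in the stated order. */
static uint32_t effective_freq(const struct model_spi *spi, uint32_t requested)
{
	struct model_op op = { requested };

	apply_base_freq_cap(spi, &op);
	adjust_op_freq(spi, &op);
	return op.max_freq;
}
```

For a device with base 25 MHz and maximum 166 MHz, an op with max_freq == 0
or 50 MHz ends at 25 MHz, an op already set to 166 MHz stays at 166 MHz, and
an op asking for 10 MHz keeps 10 MHz. With base_speed_hz == 0 only the
existing clamp applies.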
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-mem.c | 26 +++++++++++++++++++++++++-
include/linux/spi/spi-mem.h | 1 +
2 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
index a88b9f038356..d16986274cbc 100644
--- a/drivers/spi/spi-mem.c
+++ b/drivers/spi/spi-mem.c
@@ -398,7 +398,11 @@ int spi_mem_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
 	u8 *tmpbuf;
 	int ret;

-	/* Make sure the operation frequency is correct before going futher */
+	/*
+	 * Ops not configured for maximum speed are limited to the conservative
+	 * base speed; spi_mem_adjust_op_freq() then caps to the device maximum.
+	 */
+	spi_mem_apply_base_freq_cap(mem, (struct spi_mem_op *)op);
 	spi_mem_adjust_op_freq(mem, (struct spi_mem_op *)op);

 	dev_vdbg(&mem->spi->dev, "[cmd: 0x%02x][%dB addr: %#8llx][%2dB dummy][%4dB data %s] %d%c-%d%c-%d%c-%d%c @ %uHz\n",
@@ -599,6 +603,26 @@ void spi_mem_adjust_op_freq(struct spi_mem *mem, struct spi_mem_op *op)
 }
 EXPORT_SYMBOL_GPL(spi_mem_adjust_op_freq);

+/**
+ * spi_mem_apply_base_freq_cap() - Enforce the conservative base speed for
+ *				   operations that are not explicitly validated
+ * @mem: the SPI memory
+ * @op: the operation to adjust
+ *
+ * When @mem->spi->base_speed_hz is non-zero, caps @op->max_freq to that
+ * value unless @op->max_freq is already set to @mem->spi->max_speed_hz,
+ * which signals the operation has been configured for max-speed use.
+ */
+void spi_mem_apply_base_freq_cap(struct spi_mem *mem, struct spi_mem_op *op)
+{
+	if (!mem->spi->base_speed_hz || op->max_freq == mem->spi->max_speed_hz)
+		return;
+
+	if (!op->max_freq || op->max_freq > mem->spi->base_speed_hz)
+		op->max_freq = mem->spi->base_speed_hz;
+}
+EXPORT_SYMBOL_GPL(spi_mem_apply_base_freq_cap);
+
 /**
  * spi_mem_calc_op_duration() - Derives the theoretical length (in ns) of an
  *			       operation. This helps finding the best variant
diff --git a/include/linux/spi/spi-mem.h b/include/linux/spi/spi-mem.h
index 722abd9aee3c..98125cb4cc6b 100644
--- a/include/linux/spi/spi-mem.h
+++ b/include/linux/spi/spi-mem.h
@@ -462,6 +462,7 @@ bool spi_mem_default_supports_op(struct spi_mem *mem,
int spi_mem_adjust_op_size(struct spi_mem *mem, struct spi_mem_op *op);
void spi_mem_adjust_op_freq(struct spi_mem *mem, struct spi_mem_op *op);
+void spi_mem_apply_base_freq_cap(struct spi_mem *mem, struct spi_mem_op *op);
u64 spi_mem_calc_op_duration(struct spi_mem *mem, struct spi_mem_op *op);
bool spi_mem_supports_op(struct spi_mem *mem,
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v3 04/13] spi: spi-mem: add spi_mem_apply_base_freq_cap()
2026-05-27 17:55 ` [PATCH v3 04/13] spi: spi-mem: add spi_mem_apply_base_freq_cap() Santhosh Kumar K
@ 2026-05-28 8:43 ` Miquel Raynal
0 siblings, 0 replies; 25+ messages in thread
From: Miquel Raynal @ 2026-05-28 8:43 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, richard, vigneshr, pratyush,
mwalle, takahiro.kuwano, linux-spi, devicetree, linux-kernel,
linux-mtd, praneeth, u-kumar1, a-dutta
Hi Santhosh,
> --- a/drivers/spi/spi-mem.c
> +++ b/drivers/spi/spi-mem.c
> @@ -398,7 +398,11 @@ int spi_mem_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
>  	u8 *tmpbuf;
>  	int ret;
>
> -	/* Make sure the operation frequency is correct before going futher */
> +	/*
> +	 * Ops not configured for maximum speed are limited to the conservative
> +	 * base speed; spi_mem_adjust_op_freq() then caps to the device maximum.
> +	 */
> +	spi_mem_apply_base_freq_cap(mem, (struct spi_mem_op *)op);
>  	spi_mem_adjust_op_freq(mem, (struct spi_mem_op *)op);
There are many more spi_mem_adjust_op_freq() calls in the core where we would
not apply the base frequency. Aren't we missing these places? Wouldn't it
be more appropriate to call spi_mem_apply_base_freq_cap() at the beginning
of spi_mem_adjust_op_freq() ?
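
A minimal userspace sketch of that restructuring, with the base-speed cap
folded into the start of the adjust helper so that every caller gets it. The
types and the function name are hypothetical reductions, not the kernel
definitions, and the final clamp is assumed to match what
spi_mem_adjust_op_freq() does today.

```c
#include <stdint.h>

/* Hypothetical reductions of struct spi_device and struct spi_mem_op. */
struct model_spi {
	uint32_t max_speed_hz;
	uint32_t base_speed_hz; /* 0 when no base speed is set */
};

struct model_op {
	uint32_t max_freq;
};

/*
 * Single entry point: apply the base-speed cap first (skipped when the op
 * is already marked for max-speed use, i.e. max_freq == max_speed_hz), then
 * clamp to the device maximum as before.
 */
static void adjust_op_freq_folded(const struct model_spi *spi,
				  struct model_op *op)
{
	if (spi->base_speed_hz && op->max_freq != spi->max_speed_hz &&
	    (!op->max_freq || op->max_freq > spi->base_speed_hz))
		op->max_freq = spi->base_speed_hz;

	if (!op->max_freq || op->max_freq > spi->max_speed_hz)
		op->max_freq = spi->max_speed_hz;
}
```

Any path that only calls the adjust helper, rather than going through
spi_mem_exec_op(), then sees the same limit as the execution path.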
Thanks,
Miquèl
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 05/13] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning()
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (3 preceding siblings ...)
2026-05-27 17:55 ` [PATCH v3 04/13] spi: spi-mem: add spi_mem_apply_base_freq_cap() Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-28 8:44 ` Miquel Raynal
2026-05-27 17:55 ` [PATCH v3 06/13] spi: cadence-quadspi: move cqspi_readdata_capture earlier Santhosh Kumar K
` (8 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
SPI memory controllers that support high-speed operating modes often
require a tuning procedure to calibrate internal timing before operating
at maximum frequency. There is currently no standard spi-mem interface
for drivers to trigger this procedure.
Add an execute_tuning callback to struct spi_controller_mem_ops. The
callback receives a mandatory read op template and an optional write op
template. On success the controller sets op->max_freq in each provided
template to the validated clock rate.
Add the corresponding spi_mem_execute_tuning() wrapper that validates
inputs and returns -EOPNOTSUPP when the controller has not implemented
the callback, allowing callers to handle controllers that do not support
tuning gracefully.
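
A minimal userspace sketch of how a caller might use such a wrapper. All
types and functions are hypothetical models rather than the kernel ones, and
the fallback policy in setup_read_speed() (keep the base speed when tuning is
not supported) is an assumption about caller behaviour, not something this
patch defines.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct model_op {
	uint32_t max_freq;
};

struct model_mem;

typedef int (*tuning_fn)(struct model_mem *mem, struct model_op *read_op,
			 struct model_op *write_op);

/* Hypothetical reduction of spi_mem plus the controller's mem_ops. */
struct model_mem {
	uint32_t base_speed_hz;
	uint32_t max_speed_hz;
	tuning_fn execute_tuning; /* NULL when the controller cannot tune */
};

/* Same checks as the wrapper described above. */
static int model_execute_tuning(struct model_mem *mem,
				struct model_op *read_op,
				struct model_op *write_op)
{
	if (!mem || !read_op)
		return -EINVAL;

	if (!mem->execute_tuning)
		return -EOPNOTSUPP;

	return mem->execute_tuning(mem, read_op, write_op);
}

/* Hypothetical controller callback that validates the maximum speed. */
static int fake_tuning_ok(struct model_mem *mem, struct model_op *read_op,
			  struct model_op *write_op)
{
	read_op->max_freq = mem->max_speed_hz;
	if (write_op)
		write_op->max_freq = mem->max_speed_hz;
	return 0;
}

/* Caller: try tuning; stay at the base speed if it is not supported. */
static int setup_read_speed(struct model_mem *mem, struct model_op *read_op)
{
	int ret = model_execute_tuning(mem, read_op, NULL);

	if (ret == -EOPNOTSUPP) {
		read_op->max_freq = mem->base_speed_hz;
		return 0;
	}

	return ret;
}
```

Treating -EOPNOTSUPP as "no tuning available" rather than as a failure lets
the same caller run unchanged on controllers without the callback.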
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-mem.c | 31 +++++++++++++++++++++++++++++++
include/linux/spi/spi-mem.h | 9 +++++++++
2 files changed, 40 insertions(+)
diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
index d16986274cbc..5a4bf4b17fc1 100644
--- a/drivers/spi/spi-mem.c
+++ b/drivers/spi/spi-mem.c
@@ -675,6 +675,37 @@ u64 spi_mem_calc_op_duration(struct spi_mem *mem, struct spi_mem_op *op)
 }
 EXPORT_SYMBOL_GPL(spi_mem_calc_op_duration);

+/**
+ * spi_mem_execute_tuning() - Execute controller tuning procedure
+ * @mem: the SPI memory device
+ * @read_op: read operation template (mandatory)
+ * @write_op: write operation template (optional, may be NULL)
+ *
+ * Requests the controller to perform tuning for high-speed operation
+ * using the provided op templates. On success the controller callback
+ * sets @read_op->max_freq (and @write_op->max_freq when non-NULL) to
+ * the validated clock rate.
+ *
+ * Return: 0 on success, -EINVAL if @mem or @read_op is NULL,
+ *         -EOPNOTSUPP if controller doesn't support tuning,
+ *         or a controller-specific error code on failure.
+ */
+int spi_mem_execute_tuning(struct spi_mem *mem, struct spi_mem_op *read_op,
+			   struct spi_mem_op *write_op)
+{
+	struct spi_controller *ctlr;
+
+	if (!mem || !read_op)
+		return -EINVAL;
+
+	ctlr = mem->spi->controller;
+	if (!ctlr->mem_ops || !ctlr->mem_ops->execute_tuning)
+		return -EOPNOTSUPP;
+
+	return ctlr->mem_ops->execute_tuning(mem, read_op, write_op);
+}
+EXPORT_SYMBOL_GPL(spi_mem_execute_tuning);
+
 static ssize_t spi_mem_no_dirmap_read(struct spi_mem_dirmap_desc *desc,
 				      u64 offs, size_t len, void *buf)
 {
diff --git a/include/linux/spi/spi-mem.h b/include/linux/spi/spi-mem.h
index 98125cb4cc6b..2457ec6f63d6 100644
--- a/include/linux/spi/spi-mem.h
+++ b/include/linux/spi/spi-mem.h
@@ -346,6 +346,10 @@ static inline void *spi_mem_get_drvdata(struct spi_mem *mem)
* @poll_status: poll memory device status until (status & mask) == match or
* when the timeout has expired. It fills the data buffer with
* the last status value.
+ * @execute_tuning: run the controller tuning procedure using the provided
+ * read and optional write op templates. On success, set
+ * @read_op->max_freq (and @write_op->max_freq when non-NULL)
+ * to the validated clock rate.
*
* This interface should be implemented by SPI controllers providing an
* high-level interface to execute SPI memory operation, which is usually the
@@ -376,6 +380,8 @@ struct spi_controller_mem_ops {
 			   unsigned long initial_delay_us,
 			   unsigned long polling_rate_us,
 			   unsigned long timeout_ms);
+	int (*execute_tuning)(struct spi_mem *mem, struct spi_mem_op *read_op,
+			      struct spi_mem_op *write_op);
 };

 /**
@@ -465,6 +471,9 @@ void spi_mem_adjust_op_freq(struct spi_mem *mem, struct spi_mem_op *op);
void spi_mem_apply_base_freq_cap(struct spi_mem *mem, struct spi_mem_op *op);
u64 spi_mem_calc_op_duration(struct spi_mem *mem, struct spi_mem_op *op);
+int spi_mem_execute_tuning(struct spi_mem *mem, struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op);
+
bool spi_mem_supports_op(struct spi_mem *mem,
const struct spi_mem_op *op);
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v3 05/13] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning()
2026-05-27 17:55 ` [PATCH v3 05/13] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning() Santhosh Kumar K
@ 2026-05-28 8:44 ` Miquel Raynal
0 siblings, 0 replies; 25+ messages in thread
From: Miquel Raynal @ 2026-05-28 8:44 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, richard, vigneshr, pratyush,
mwalle, takahiro.kuwano, linux-spi, devicetree, linux-kernel,
linux-mtd, praneeth, u-kumar1, a-dutta
On 27/05/2026 at 23:25:19 +0530, Santhosh Kumar K <s-k6@ti.com> wrote:
> SPI memory controllers that support high-speed operating modes often
> require a tuning procedure to calibrate internal timing before operating
> at maximum frequency. There is currently no standard spi-mem interface
> for drivers to trigger this procedure.
>
> Add an execute_tuning callback to struct spi_controller_mem_ops. The
> callback receives a mandatory read op template and an optional write op
> template. On success the controller sets op->max_freq in each provided
> template to the validated clock rate.
>
> Add the corresponding spi_mem_execute_tuning() wrapper that validates
> inputs and returns -EOPNOTSUPP when the controller has not implemented
> the callback, allowing callers to handle controllers that do not support
> tuning gracefully.
>
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
This approach looks so much better.
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Thanks,
Miquèl
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 06/13] spi: cadence-quadspi: move cqspi_readdata_capture earlier
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (4 preceding siblings ...)
2026-05-27 17:55 ` [PATCH v3 05/13] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning() Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-27 17:55 ` [PATCH v3 07/13] spi: cadence-quadspi: add DQS support to read data capture Santhosh Kumar K
` (7 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Move the cqspi_readdata_capture() function earlier in the file. This is
preparatory refactoring for the upcoming PHY tuning support for read and
write operations.
No functional changes.
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 45 +++++++++++++++----------------
1 file changed, 22 insertions(+), 23 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index aaba1a3ad577..54fd7b591e06 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -453,6 +453,28 @@ static int cqspi_wait_idle(struct cqspi_st *cqspi)
}
}
+static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
+ const unsigned int delay)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+
+ reg = readl(reg_base + CQSPI_REG_READCAPTURE);
+
+ if (bypass)
+ reg |= BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
+ else
+ reg &= ~BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
+
+ reg &= ~(CQSPI_REG_READCAPTURE_DELAY_MASK
+ << CQSPI_REG_READCAPTURE_DELAY_LSB);
+
+ reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
+ << CQSPI_REG_READCAPTURE_DELAY_LSB;
+
+ writel(reg, reg_base + CQSPI_REG_READCAPTURE);
+}
+
static int cqspi_exec_flash_cmd(struct cqspi_st *cqspi, unsigned int reg)
{
void __iomem *reg_base = cqspi->iobase;
@@ -1270,29 +1292,6 @@ static void cqspi_config_baudrate_div(struct cqspi_st *cqspi)
writel(reg, reg_base + CQSPI_REG_CONFIG);
}
-static void cqspi_readdata_capture(struct cqspi_st *cqspi,
- const bool bypass,
- const unsigned int delay)
-{
- void __iomem *reg_base = cqspi->iobase;
- unsigned int reg;
-
- reg = readl(reg_base + CQSPI_REG_READCAPTURE);
-
- if (bypass)
- reg |= BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
- else
- reg &= ~BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
-
- reg &= ~(CQSPI_REG_READCAPTURE_DELAY_MASK
- << CQSPI_REG_READCAPTURE_DELAY_LSB);
-
- reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
- << CQSPI_REG_READCAPTURE_DELAY_LSB;
-
- writel(reg, reg_base + CQSPI_REG_READCAPTURE);
-}
-
static void cqspi_configure(struct cqspi_flash_pdata *f_pdata,
unsigned long sclk)
{
--
2.34.1
* [PATCH v3 07/13] spi: cadence-quadspi: add DQS support to read data capture
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (5 preceding siblings ...)
2026-05-27 17:55 ` [PATCH v3 06/13] spi: cadence-quadspi: move cqspi_readdata_capture earlier Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-27 17:55 ` [PATCH v3 08/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (6 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Add a DQS (data strobe) parameter to cqspi_readdata_capture() to control
data capture timing. DQS mode uses a dedicated strobe signal for
improved timing margins in high-speed SPI modes.
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
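
For readers who want to check the resulting register layout, the update
performed by cqspi_readdata_capture() after this patch can be modelled
in user space as a pure function. The field positions are copied from
the patch; readcapture_compose() and the shortened macro names are
hypothetical and exist only for this sketch, which does no MMIO.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field positions as defined in the patch (names shortened here). */
#define READCAPTURE_BYPASS_LSB	0
#define READCAPTURE_DELAY_LSB	1
#define READCAPTURE_DELAY_MASK	0xFu
#define READCAPTURE_DQS_LSB	8

#define RC_BIT(n)	((uint32_t)1 << (n))

/* Return the new register value for the given bypass/dqs/delay inputs. */
static uint32_t readcapture_compose(uint32_t reg, bool bypass, bool dqs,
				    unsigned int delay)
{
	if (bypass)
		reg |= RC_BIT(READCAPTURE_BYPASS_LSB);
	else
		reg &= ~RC_BIT(READCAPTURE_BYPASS_LSB);

	/* Clear the 4-bit delay field, then insert the masked delay. */
	reg &= ~((uint32_t)READCAPTURE_DELAY_MASK << READCAPTURE_DELAY_LSB);
	reg |= ((uint32_t)delay & READCAPTURE_DELAY_MASK)
		<< READCAPTURE_DELAY_LSB;

	if (dqs)
		reg |= RC_BIT(READCAPTURE_DQS_LSB);
	else
		reg &= ~RC_BIT(READCAPTURE_DQS_LSB);

	return reg;
}
```

Note that a delay wider than four bits is silently truncated by the
mask, exactly as in the driver function.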
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 54fd7b591e06..201d69c64c49 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -192,6 +192,7 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_READCAPTURE_BYPASS_LSB 0
#define CQSPI_REG_READCAPTURE_DELAY_LSB 1
#define CQSPI_REG_READCAPTURE_DELAY_MASK 0xF
+#define CQSPI_REG_READCAPTURE_DQS_LSB 8
#define CQSPI_REG_SIZE 0x14
#define CQSPI_REG_SIZE_ADDRESS_LSB 0
@@ -454,7 +455,7 @@ static int cqspi_wait_idle(struct cqspi_st *cqspi)
}
static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
- const unsigned int delay)
+ const bool dqs, const unsigned int delay)
{
void __iomem *reg_base = cqspi->iobase;
unsigned int reg;
@@ -472,6 +473,11 @@ static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
<< CQSPI_REG_READCAPTURE_DELAY_LSB;
+ if (dqs)
+ reg |= BIT(CQSPI_REG_READCAPTURE_DQS_LSB);
+ else
+ reg &= ~BIT(CQSPI_REG_READCAPTURE_DQS_LSB);
+
writel(reg, reg_base + CQSPI_REG_READCAPTURE);
}
@@ -1313,7 +1319,7 @@ static void cqspi_configure(struct cqspi_flash_pdata *f_pdata,
cqspi->sclk = sclk;
cqspi_config_baudrate_div(cqspi);
cqspi_delay(f_pdata);
- cqspi_readdata_capture(cqspi, !cqspi->rclk_en,
+ cqspi_readdata_capture(cqspi, !cqspi->rclk_en, false,
f_pdata->read_delay);
}
--
2.34.1
* [PATCH v3 08/13] spi: cadence-quadspi: add PHY tuning support
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (6 preceding siblings ...)
2026-05-27 17:55 ` [PATCH v3 07/13] spi: cadence-quadspi: add DQS support to read data capture Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-28 8:54 ` Miquel Raynal
2026-05-27 17:55 ` [PATCH v3 09/13] spi: cadence-quadspi: reject 2-byte-address DDR ops on PHY-tunable hardware Santhosh Kumar K
` (5 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
The Cadence QSPI controller supports a delay-line PHY for high-speed
operation. Without calibration the PHY is unused and read capture relies
on a fixed delay, limiting throughput at frequencies above the base
operating speed.
Add an execute_tuning callback that performs delay-line calibration using
a known data pattern written to a dedicated flash region. The pattern is
either read from a NOR partition identified by the DT property
cdns,phy-pattern-partition, or written to the NAND page cache before
each calibration read.
For DDR protocols (8D-8D-8D) a 2D sweep of (rx_delay, tx_delay) pairs
is performed to find the widest passing region in the combined RX/TX
space. Binary search locates the gap boundary between passing regions
when two separate windows exist; the final operating point is placed at
the centre of the larger region with a small temperature-dependent
offset.
For SDR protocols a 1D sweep of the RX delay is sufficient. Two windows
at adjacent read_delay values are measured; the wider one's midpoint is
selected.
The tuning infrastructure is platform-specific: only am654-based OSPI
controllers populate the execute_tuning hook. On all other platforms the
hook is left unset, tuning requests return -EOPNOTSUPP, and behaviour is
unchanged.
spi-max-frequency may carry two values in DT; the second (higher) value
is the tuned target rate stored in max_clk_rate. When only one value is
present max_clk_rate is zero and tuning is skipped.
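
The one-or-two-value rule above can be illustrated with a small
user-space sketch. It is not the driver's parsing code: struct freq_cfg,
parse_max_frequency() and tuning_wanted() are hypothetical names, and
the plain -1 error return is an assumption made for brevity.

```c
#include <stddef.h>
#include <stdint.h>

struct freq_cfg {
	uint32_t base_rate;	/* usable without controller-side tuning */
	uint32_t max_clk_rate;	/* tuned target rate; 0 means skip tuning */
};

/* vals/n model the cells read from spi-max-frequency (one or two). */
static int parse_max_frequency(const uint32_t *vals, size_t n,
			       struct freq_cfg *cfg)
{
	if (!vals || !cfg || n < 1 || n > 2)
		return -1;	/* hypothetical error value */

	cfg->base_rate = vals[0];
	cfg->max_clk_rate = (n == 2) ? vals[1] : 0;
	return 0;
}

/* Tuning is attempted only when a second, higher rate was given. */
static int tuning_wanted(const struct freq_cfg *cfg)
{
	return cfg->max_clk_rate != 0;
}
```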
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
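
The gap-boundary search mentioned in the commit message can be pictured
as a bisection along the diagonal between two corner points. The sketch
below is a simplified user-space variant, not the patch code: it assumes
the lower corner passes, the upper corner fails, and there is a single
pass-to-fail transition between them, and it returns the last passing
point rather than the final midpoint. struct point, check_fn,
find_gap_low() and tx_below_limit() are hypothetical names.

```c
#include <stdbool.h>

struct point {
	int rx;
	int tx;
};

/* Stand-in for "apply the setting and compare the read-back pattern". */
typedef bool (*check_fn)(struct point p, void *ctx);

/*
 * Bisect between lo (assumed passing) and hi (assumed failing) until
 * the interval closes in either dimension; return the last passing point.
 */
static struct point find_gap_low(struct point lo, struct point hi,
				 check_fn passes, void *ctx)
{
	while (hi.tx - lo.tx >= 2 && hi.rx - lo.rx >= 2) {
		struct point mid = {
			.rx = lo.rx + (hi.rx - lo.rx) / 2,
			.tx = lo.tx + (hi.tx - lo.tx) / 2,
		};

		if (passes(mid, ctx))
			lo = mid;	/* boundary lies above mid */
		else
			hi = mid;	/* boundary lies at or below mid */
	}

	return lo;
}

/* Demo predicate: pattern reads back correctly while tx < *limit. */
static bool tx_below_limit(struct point p, void *ctx)
{
	return p.tx < *(const int *)ctx;
}
```

Each probe in the real driver costs a DLL resync plus a pattern read, so
halving the interval keeps the number of probes logarithmic in the
search range.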
drivers/spi/spi-cadence-quadspi.c | 1798 ++++++++++++++++++++++++++++-
1 file changed, 1789 insertions(+), 9 deletions(-)
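
The SDR selection rule from the commit message (measure two RX windows
at adjacent read_delay values and take the midpoint of the wider one)
can be sketched as follows. This is an illustration under stated
assumptions, not the driver's SDR routine: struct sdr_window,
struct sdr_choice and pick_sdr_setting() are hypothetical, and breaking
ties in favour of the first window is an arbitrary choice.

```c
struct sdr_window {
	unsigned char read_delay;
	unsigned char rx_low;	/* lowest passing RX delay */
	unsigned char rx_high;	/* highest passing RX delay */
};

struct sdr_choice {
	unsigned char rx;
	unsigned char read_delay;
};

/* Pick the wider of two windows and centre the RX delay inside it. */
static struct sdr_choice pick_sdr_setting(struct sdr_window a,
					  struct sdr_window b)
{
	int width_a = a.rx_high - a.rx_low;
	int width_b = b.rx_high - b.rx_low;
	const struct sdr_window *w = (width_b > width_a) ? &b : &a;
	struct sdr_choice c = {
		.rx = (unsigned char)(w->rx_low + (w->rx_high - w->rx_low) / 2),
		.read_delay = w->read_delay,
	};

	return c;
}
```

Centring in the wider window leaves the largest margin on both sides of
the operating point.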
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 201d69c64c49..508bc5bc4ab5 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -65,15 +65,26 @@ enum {
struct cqspi_st;
+struct phy_setting {
+ u8 rx;
+ u8 tx;
+ u8 read_delay;
+};
+
struct cqspi_flash_pdata {
- struct cqspi_st *cqspi;
- u32 clk_rate;
- u32 read_delay;
- u32 tshsl_ns;
- u32 tsd2d_ns;
- u32 tchsh_ns;
- u32 tslch_ns;
- u8 cs;
+ struct cqspi_st *cqspi;
+ u32 max_clk_rate;
+ u32 read_delay;
+ u32 tshsl_ns;
+ u32 tsd2d_ns;
+ u32 tchsh_ns;
+ u32 tslch_ns;
+ bool use_dqs;
+ bool use_phy;
+ u8 cs;
+ struct phy_setting phy_setting;
+ struct spi_mem_op phy_read_op;
+ struct spi_mem_op phy_write_op;
};
static const struct clk_bulk_data cqspi_clks[CLK_QSPI_NUM] = {
@@ -129,12 +140,15 @@ struct cqspi_driver_platdata {
int (*indirect_read_dma)(struct cqspi_flash_pdata *f_pdata,
u_char *rxbuf, loff_t from_addr, size_t n_rx);
u32 (*get_dma_status)(struct cqspi_st *cqspi);
+ int (*execute_tuning)(struct spi_mem *mem, struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op);
};
/* Operation timeout value */
#define CQSPI_TIMEOUT_MS 500
#define CQSPI_READ_TIMEOUT_MS 10
#define CQSPI_BUSYWAIT_TIMEOUT_US 500
+#define CQSPI_DLL_TIMEOUT_US 300
/* Runtime_pm autosuspend delay */
#define CQSPI_AUTOSUSPEND_TIMEOUT 2000
@@ -148,12 +162,14 @@ struct cqspi_driver_platdata {
/* Register map */
#define CQSPI_REG_CONFIG 0x00
#define CQSPI_REG_CONFIG_ENABLE_MASK BIT(0)
+#define CQSPI_REG_CONFIG_PHY_EN BIT(3)
#define CQSPI_REG_CONFIG_ENB_DIR_ACC_CTRL BIT(7)
#define CQSPI_REG_CONFIG_DECODE_MASK BIT(9)
#define CQSPI_REG_CONFIG_CHIPSELECT_LSB 10
#define CQSPI_REG_CONFIG_DMA_MASK BIT(15)
#define CQSPI_REG_CONFIG_BAUD_LSB 19
#define CQSPI_REG_CONFIG_DTR_PROTO BIT(24)
+#define CQSPI_REG_CONFIG_PHY_PIPELINE BIT(25)
#define CQSPI_REG_CONFIG_DUAL_OPCODE BIT(30)
#define CQSPI_REG_CONFIG_IDLE_LSB 31
#define CQSPI_REG_CONFIG_CHIPSELECT_MASK 0xF
@@ -192,6 +208,7 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_READCAPTURE_BYPASS_LSB 0
#define CQSPI_REG_READCAPTURE_DELAY_LSB 1
#define CQSPI_REG_READCAPTURE_DELAY_MASK 0xF
+#define CQSPI_REG_READCAPTURE_EDGE_LSB 5
#define CQSPI_REG_READCAPTURE_DQS_LSB 8
#define CQSPI_REG_SIZE 0x14
@@ -273,6 +290,27 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_POLLING_STATUS 0xB0
#define CQSPI_REG_POLLING_STATUS_DUMMY_LSB 16
+#define CQSPI_REG_PHY_CONFIG 0xB4
+#define CQSPI_REG_PHY_CONFIG_RX_DEL_LSB 0
+#define CQSPI_REG_PHY_CONFIG_RX_DEL_MASK 0x7F
+#define CQSPI_REG_PHY_CONFIG_TX_DEL_LSB 16
+#define CQSPI_REG_PHY_CONFIG_TX_DEL_MASK 0x7F
+#define CQSPI_REG_PHY_CONFIG_DLL_RESET BIT(30)
+#define CQSPI_REG_PHY_CONFIG_RESYNC BIT(31)
+
+#define CQSPI_REG_PHY_DLL_MASTER 0xB8
+#define CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_LSB 0
+#define CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_VAL 16
+#define CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LEN 0x7
+#define CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LSB 20
+#define CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_3 0x2
+#define CQSPI_REG_PHY_DLL_MASTER_BYPASS BIT(23)
+#define CQSPI_REG_PHY_DLL_MASTER_CYCLE BIT(24)
+
+#define CQSPI_REG_DLL_OBS_LOW 0xBC
+#define CQSPI_REG_DLL_OBS_LOW_DLL_LOCK BIT(0)
+#define CQSPI_REG_DLL_OBS_LOW_LOOPBACK_LOCK BIT(15)
+
#define CQSPI_REG_OP_EXT_LOWER 0xE0
#define CQSPI_REG_OP_EXT_READ_LSB 24
#define CQSPI_REG_OP_EXT_WRITE_LSB 16
@@ -321,6 +359,50 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_VERSAL_DMA_VAL 0x602
+#define CQSPI_PHY_INIT_RD 1
+#define CQSPI_PHY_MAX_RD 4
+#define CQSPI_PHY_MAX_DELAY 127
+#define CQSPI_PHY_DDR_SEARCH_STEP 4
+#define CQSPI_PHY_TX_LOOKUP_LOW_START 28
+#define CQSPI_PHY_TX_LOOKUP_LOW_END 48
+#define CQSPI_PHY_TX_LOOKUP_HIGH_START 60
+#define CQSPI_PHY_TX_LOOKUP_HIGH_END 96
+#define CQSPI_PHY_RX_LOW_SEARCH_START 0
+#define CQSPI_PHY_RX_LOW_SEARCH_END 40
+#define CQSPI_PHY_RX_HIGH_SEARCH_START 24
+#define CQSPI_PHY_RX_HIGH_SEARCH_END 127
+#define CQSPI_PHY_TX_LOW_SEARCH_START 0
+#define CQSPI_PHY_TX_LOW_SEARCH_END 64
+#define CQSPI_PHY_TX_HIGH_SEARCH_START 78
+#define CQSPI_PHY_TX_HIGH_SEARCH_END 127
+#define CQSPI_PHY_SEARCH_OFFSET 8
+
+#define CQSPI_PHY_DEFAULT_TEMP 45
+#define CQSPI_PHY_MIN_TEMP -45
+#define CQSPI_PHY_MAX_TEMP 130
+#define CQSPI_PHY_MID_TEMP (CQSPI_PHY_MIN_TEMP + \
+ ((CQSPI_PHY_MAX_TEMP - \
+ CQSPI_PHY_MIN_TEMP) / 2))
+
+/*
+ * PHY tuning pattern for calibrating read data capture delay. This 128-byte
+ * pattern provides sufficient bit transitions across all byte lanes to
+ * reliably detect timing windows at high frequencies.
+ */
+static const u8 phy_tuning_pattern[] __aligned(64) = {
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0x00, 0x00, 0xFE, 0xFE, 0x01, 0xFF, 0xFF, 0xFF, 0xFF,
+ 0xFF, 0x00, 0x00, 0xFE, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0xFE,
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0xFE, 0x00, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0xFE, 0x00, 0xFE, 0xFE, 0x01, 0xFF, 0xFF, 0xFF, 0xFF,
+ 0xFF, 0xFE, 0x00, 0xFE, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE, 0x00, 0xFE,
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0xFE, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0x00, 0xFE, 0xFE, 0xFE, 0x01, 0xFF, 0xFF, 0xFF, 0xFF,
+ 0xFF, 0x00, 0xFE, 0xFE, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0xFE, 0xFE,
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0xFE, 0xFE, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0xFE, 0xFE, 0xFE, 0xFE, 0x01,
+};
+
static int cqspi_wait_for_bit(const struct cqspi_driver_platdata *ddata,
void __iomem *reg, const u32 mask, bool clr,
bool busywait)
@@ -1555,10 +1637,1687 @@ static bool cqspi_supports_mem_op(struct spi_mem *mem,
return spi_mem_default_supports_op(mem, op);
}
+static int cqspi_write_pattern_to_cache(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem,
+ struct spi_mem_op *write_op)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ write_op->data.nbytes = sizeof(phy_tuning_pattern);
+ write_op->data.buf.out = phy_tuning_pattern;
+
+ ret = spi_mem_exec_op(mem, write_op);
+ if (ret) {
+ dev_err(dev, "Failed to write PHY pattern to cache: %d\n", ret);
+ return ret;
+ }
+ dev_dbg(dev, "PHY pattern (%zu bytes) written to cache\n",
+ sizeof(phy_tuning_pattern));
+
+ return 0;
+}
+
+static int cqspi_get_phy_pattern_offset(struct device *dev, u32 *offset)
+{
+ struct device_node *np, *flash_np = NULL, *part_np;
+ const __be32 *reg;
+ int len;
+
+ if (!dev || !dev->of_node)
+ return -EINVAL;
+
+ for_each_child_of_node(dev->of_node, np) {
+ if (of_node_name_prefix(np, "flash")) {
+ flash_np = np;
+ break;
+ }
+ }
+
+ if (!flash_np)
+ return -ENODEV;
+
+ part_np = of_parse_phandle(flash_np, "cdns,phy-pattern-partition", 0);
+ of_node_put(flash_np);
+ if (!part_np)
+ return -ENODEV;
+
+ reg = of_get_property(part_np, "reg", &len);
+ if (reg && len >= sizeof(__be32)) {
+ *offset = be32_to_cpu(reg[0]);
+ of_node_put(part_np);
+ return 0;
+ }
+
+ of_node_put(part_np);
+ return -ENOENT;
+}
+
+static int cqspi_phy_check_pattern(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem)
+{
+ struct spi_mem_op op;
+ u8 *read_data;
+ int ret;
+
+ read_data = kmalloc_array(ARRAY_SIZE(phy_tuning_pattern),
+ sizeof(phy_tuning_pattern[0]), GFP_KERNEL);
+ if (!read_data)
+ return -ENOMEM;
+
+ op = f_pdata->phy_read_op;
+ op.data.buf.in = read_data;
+ op.data.nbytes = sizeof(phy_tuning_pattern);
+
+ ret = spi_mem_exec_op(mem, &op);
+ if (ret)
+ goto out;
+
+ if (memcmp(read_data, phy_tuning_pattern, sizeof(phy_tuning_pattern)))
+ ret = -EAGAIN;
+
+out:
+ kfree(read_data);
+ return ret;
+}
+
+static void cqspi_phy_set_dll_master(struct cqspi_st *cqspi)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+
+ reg = readl(reg_base + CQSPI_REG_PHY_DLL_MASTER);
+ reg &= ~((CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LEN
+ << CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LSB) |
+ CQSPI_REG_PHY_DLL_MASTER_BYPASS |
+ CQSPI_REG_PHY_DLL_MASTER_CYCLE);
+ reg |= ((CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_3
+ << CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LSB) |
+ CQSPI_REG_PHY_DLL_MASTER_CYCLE);
+
+ writel(reg, reg_base + CQSPI_REG_PHY_DLL_MASTER);
+}
+
+static void cqspi_phy_pre_config(struct cqspi_st *cqspi,
+ struct cqspi_flash_pdata *f_pdata,
+ const bool bypass)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+ u8 dummy;
+
+ cqspi_readdata_capture(cqspi, bypass, f_pdata->use_dqs,
+ f_pdata->phy_setting.read_delay);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~(CQSPI_REG_CONFIG_PHY_EN | CQSPI_REG_CONFIG_PHY_PIPELINE);
+ reg |= CQSPI_REG_CONFIG_PHY_EN;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy--;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+
+ cqspi_phy_set_dll_master(cqspi);
+}
+
+static void cqspi_phy_post_config(struct cqspi_st *cqspi,
+ const unsigned int delay)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+ u8 dummy;
+
+ reg = readl(reg_base + CQSPI_REG_READCAPTURE);
+ reg &= ~(CQSPI_REG_READCAPTURE_DELAY_MASK
+ << CQSPI_REG_READCAPTURE_DELAY_LSB);
+
+ reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
+ << CQSPI_REG_READCAPTURE_DELAY_LSB;
+ writel(reg, reg_base + CQSPI_REG_READCAPTURE);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~(CQSPI_REG_CONFIG_PHY_EN | CQSPI_REG_CONFIG_PHY_PIPELINE);
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy++;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+}
+
+static void cqspi_set_dll(void __iomem *reg_base, u8 rx_dll, u8 tx_dll)
+{
+ unsigned int reg;
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg &= ~((CQSPI_REG_PHY_CONFIG_RX_DEL_MASK
+ << CQSPI_REG_PHY_CONFIG_RX_DEL_LSB) |
+ (CQSPI_REG_PHY_CONFIG_TX_DEL_MASK
+ << CQSPI_REG_PHY_CONFIG_TX_DEL_LSB));
+ reg |= ((rx_dll & CQSPI_REG_PHY_CONFIG_RX_DEL_MASK)
+ << CQSPI_REG_PHY_CONFIG_RX_DEL_LSB) |
+ ((tx_dll & CQSPI_REG_PHY_CONFIG_TX_DEL_MASK)
+ << CQSPI_REG_PHY_CONFIG_TX_DEL_LSB) |
+ CQSPI_REG_PHY_CONFIG_RESYNC;
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+}
+
+static int cqspi_resync_dll(struct cqspi_st *cqspi)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+ int ret;
+
+ ret = cqspi_wait_idle(cqspi);
+ if (ret)
+ return ret;
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~CQSPI_REG_CONFIG_ENABLE_MASK;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg &= ~(CQSPI_REG_PHY_CONFIG_DLL_RESET | CQSPI_REG_PHY_CONFIG_RESYNC);
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_PHY_DLL_MASTER);
+ reg |= (CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_VAL
+ << CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_LSB);
+ writel(reg, reg_base + CQSPI_REG_PHY_DLL_MASTER);
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg |= CQSPI_REG_PHY_CONFIG_DLL_RESET;
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+
+ ret = readl_poll_timeout(reg_base + CQSPI_REG_DLL_OBS_LOW, reg,
+ (reg & CQSPI_REG_DLL_OBS_LOW_DLL_LOCK), 0,
+ CQSPI_DLL_TIMEOUT_US);
+ if (ret)
+ goto re_enable;
+
+ ret = readl_poll_timeout(reg_base + CQSPI_REG_DLL_OBS_LOW, reg,
+ (reg & CQSPI_REG_DLL_OBS_LOW_LOOPBACK_LOCK), 0,
+ CQSPI_DLL_TIMEOUT_US);
+ if (ret)
+ goto re_enable;
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg |= CQSPI_REG_PHY_CONFIG_RESYNC;
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+
+re_enable:
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg |= CQSPI_REG_CONFIG_ENABLE_MASK;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ return ret;
+}
+
+static int cqspi_phy_apply_setting(struct cqspi_flash_pdata *f_pdata,
+ struct phy_setting *phy)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ unsigned int reg;
+ int ret;
+
+ reg = readl(cqspi->iobase + CQSPI_REG_READCAPTURE);
+ reg |= BIT(CQSPI_REG_READCAPTURE_EDGE_LSB);
+ writel(reg, cqspi->iobase + CQSPI_REG_READCAPTURE);
+
+ cqspi_set_dll(cqspi->iobase, phy->rx, phy->tx);
+
+ ret = cqspi_resync_dll(cqspi);
+ if (ret)
+ return ret;
+
+ f_pdata->phy_setting.read_delay = phy->read_delay;
+ return 0;
+}
+
+static int cqspi_find_rx_low_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->rx = CQSPI_PHY_RX_LOW_SEARCH_START;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->rx += CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->rx <= CQSPI_PHY_RX_LOW_SEARCH_END);
+
+ phy->read_delay++;
+ } while (phy->read_delay <= CQSPI_PHY_MAX_RD);
+
+ dev_dbg(dev, "Unable to find RX low\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_rx_low_sdr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ phy->rx = 0;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+ phy->rx++;
+ } while (phy->rx <= CQSPI_PHY_MAX_DELAY);
+
+ dev_dbg(dev, "Unable to find RX low\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_rx_high_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->rx = CQSPI_PHY_RX_HIGH_SEARCH_END;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->rx -= CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->rx >= CQSPI_PHY_RX_HIGH_SEARCH_START);
+
+ phy->read_delay--;
+ } while (phy->read_delay >= CQSPI_PHY_INIT_RD);
+
+ dev_dbg(dev, "Unable to find RX high\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_rx_high_sdr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy,
+ u8 lowerbound)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ phy->rx = CQSPI_PHY_MAX_DELAY;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+ phy->rx--;
+ } while (phy->rx > lowerbound);
+
+ dev_dbg(dev, "Unable to find RX high\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_tx_low_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->tx = CQSPI_PHY_TX_LOW_SEARCH_START;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->tx += CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->tx <= CQSPI_PHY_TX_LOW_SEARCH_END);
+
+ phy->read_delay++;
+ } while (phy->read_delay <= CQSPI_PHY_MAX_RD);
+
+ dev_dbg(dev, "Unable to find TX low\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_tx_high_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->tx = CQSPI_PHY_TX_HIGH_SEARCH_END;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->tx -= CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->tx >= CQSPI_PHY_TX_HIGH_SEARCH_START);
+
+ phy->read_delay--;
+ } while (phy->read_delay >= CQSPI_PHY_INIT_RD);
+
+ dev_dbg(dev, "Unable to find TX high\n");
+ return -ENOENT;
+}
+
+static void cqspi_phy_find_gaplow_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem,
+ struct phy_setting *bottomleft,
+ struct phy_setting *topright,
+ struct phy_setting *gaplow)
+{
+ struct phy_setting left, right, mid;
+ int ret;
+
+ left = *bottomleft;
+ right = *topright;
+
+ mid.tx = left.tx + ((right.tx - left.tx) / 2);
+ mid.rx = left.rx + ((right.rx - left.rx) / 2);
+ mid.read_delay = left.read_delay;
+
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, &mid);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ /* The pattern was not found. Go to the lower half. */
+ right.tx = mid.tx;
+ right.rx = mid.rx;
+
+ mid.tx = left.tx + ((mid.tx - left.tx) / 2);
+ mid.rx = left.rx + ((mid.rx - left.rx) / 2);
+ } else {
+ /* The pattern was found. Go to the upper half. */
+ left.tx = mid.tx;
+ left.rx = mid.rx;
+
+ mid.tx = mid.tx + ((right.tx - mid.tx) / 2);
+ mid.rx = mid.rx + ((right.rx - mid.rx) / 2);
+ }
+
+ /* Break the loop if the window has closed. */
+ } while ((right.tx - left.tx >= 2) && (right.rx - left.rx >= 2));
+
+ *gaplow = mid;
+}
+
+static void cqspi_phy_find_gaphigh_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem,
+ struct phy_setting *bottomleft,
+ struct phy_setting *topright,
+ struct phy_setting *gaphigh)
+{
+ struct phy_setting left, right, mid;
+ int ret;
+
+ left = *bottomleft;
+ right = *topright;
+
+ mid.tx = left.tx + ((right.tx - left.tx) / 2);
+ mid.rx = left.rx + ((right.rx - left.rx) / 2);
+ mid.read_delay = right.read_delay;
+
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, &mid);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ /* The pattern was not found. Go to the upper half. */
+ left.tx = mid.tx;
+ left.rx = mid.rx;
+
+ mid.tx = mid.tx + ((right.tx - mid.tx) / 2);
+ mid.rx = mid.rx + ((right.rx - mid.rx) / 2);
+ } else {
+ /* The pattern was found. Go to the lower half. */
+ right.tx = mid.tx;
+ right.rx = mid.rx;
+
+ mid.tx = left.tx + ((mid.tx - left.tx) / 2);
+ mid.rx = left.rx + ((mid.rx - left.rx) / 2);
+ }
+
+ /* Break the loop if the window has closed. */
+ } while ((right.tx - left.tx >= 2) && (right.rx - left.rx >= 2));
+
+ *gaphigh = mid;
+}
+
+static int cqspi_get_temp(int *temp)
+{
+ /* TODO: read SoC thermal sensor; caller falls back to room temperature */
+ return -EOPNOTSUPP;
+}
+
+static inline void cqspi_phy_reset_setting(struct phy_setting *phy)
+{
+ *phy = (struct phy_setting){ .rx = 0, .tx = 127, .read_delay = 0 };
+}
+
+static int cqspi_phy_tuning_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ struct device *dev = &cqspi->pdev->dev;
+ struct phy_setting rxlow, rxhigh, txlow, txhigh;
+ struct phy_setting srxlow, srxhigh;
+ struct phy_setting bottomleft, topright, searchpoint;
+ struct phy_setting gaplow, gaphigh;
+ struct phy_setting backuppoint, backupcornerpoint;
+ int ret, rx_window, temp;
+ bool primary = true, secondary = true;
+
+ /*
+ * DDR tuning: 2D search across RX and TX delays for optimal timing.
+ *
+ * Algorithm: Find RX boundaries (rxlow/rxhigh) using TX window search,
+ * find TX boundaries (txlow/txhigh) at fixed RX, define valid region,
+ * locate gaps via binary search, select final point with temperature
+ * compensation.
+ *
+ * rx
+ * 127 ^
+ * | topright
+ * | *
+ * | xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * | xxxxxxxxxxxxxxxxxx +++++++
+ * | *
+ * | bottomleft
+ * -----------------------------------------> tx
+ * 0 127
+ */
+
+ f_pdata->use_phy = true;
+
+ /* Golden rxlow search: Find lower RX boundary using TX window sweep */
+
+ /*
+ * rx
+ * 127 ^
+ * | xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | |xxxxx|xxxxx +++++++++++++
+ * | |xxxxx|xxxxxx ++++++++++++
+ * search | |xxxxx|xxxxxxx +++++++++++
+ * rxlow --------->|xxxxx|xxxxxxxx ++++++++++
+ * | |xxxxx|xxxxxxxxx +++++++++
+ * | |xxxxx|xxxxxxxxxx ++++++++
+ * | |xxxxx|xxxxxxxxxxx +++++++
+ * | | |
+ * --------|-----|----------------------------> tx
+ * 0 | | 127
+ * txlow txlow
+ * start end
+ *
+ * |----------------------------------------------------------|
+ * | Primary | Secondary | Final |
+ * | Search | Search | Point |
+ * |---------|-----------|------------------------------------|
+ * | Fail | Fail | Return Fail |
+ * |---------|-----------|------------------------------------|
+ * | Fail | Pass | Return Fail |
+ * |---------|-----------|------------------------------------|
+ * | Pass | Fail | Return Fail |
+ * |---------|-----------|------------------------------------|
+ * | Pass | Pass | rx = min(primary.rx, secondary.rx) |
+ * | | | tx = primary.tx |
+ * | | | read_delay = |
+ * | | | min(primary.read_delay, |
+ * | | | secondary.read_delay) |
+ * |----------------------------------------------------------|
+ */
+
+ /* Primary rxlow: Sweep TX window to find valid RX lower bound */
+
+ rxlow.tx = CQSPI_PHY_TX_LOOKUP_LOW_START;
+ do {
+ dev_dbg(dev, "Searching for Golden Primary rxlow on TX = %d\n",
+ rxlow.tx);
+ rxlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &rxlow);
+ rxlow.tx += CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (ret && rxlow.tx <= CQSPI_PHY_TX_LOOKUP_LOW_END);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden Primary rxlow: RX: %d TX: %d RD: %d\n", rxlow.rx,
+ rxlow.tx, rxlow.read_delay);
+
+ /* Secondary rxlow: Verify at offset TX for robustness */
+
+ if (rxlow.tx <= (CQSPI_PHY_TX_LOOKUP_LOW_END - CQSPI_PHY_SEARCH_OFFSET))
+ srxlow.tx = rxlow.tx + CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxlow.tx = CQSPI_PHY_TX_LOOKUP_LOW_END;
+ dev_dbg(dev, "Searching for Golden Secondary rxlow on TX = %d\n",
+ srxlow.tx);
+ srxlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &srxlow);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden Secondary rxlow: RX: %d TX: %d RD: %d\n",
+ srxlow.rx, srxlow.tx, srxlow.read_delay);
+
+ rxlow.rx = min(rxlow.rx, srxlow.rx);
+ rxlow.read_delay = min(rxlow.read_delay, srxlow.read_delay);
+ dev_dbg(dev, "Golden Final rxlow: RX: %d TX: %d RD: %d\n", rxlow.rx,
+ rxlow.tx, rxlow.read_delay);
+
+ /* Golden rxhigh search: Find upper RX boundary at fixed TX */
+
+ /*
+ * rx
+ * 127 ^
+ * | |xxxx ++++++++++++++++++++
+ * | |xxxxx +++++++++++++++++++
+ * search | |xxxxxx ++++++++++++++++++
+ * rxhigh --------->|xxxxxxx +++++++++++++++++
+ * on fixed | |xxxxxxxx ++++++++++++++++
+ * tx | |xxxxxxxxx +++++++++++++++
+ * | |xxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * | xxxxxxxxxxxxxxxxxx +++++++
+ * |
+ * -------------------------------------------> tx
+ * 0 127
+ *
+ * |----------------------------------------------------------|
+ * | Primary | Secondary | Final                              |
+ * | Search  | Search    | Point                              |
+ * |---------|-----------|------------------------------------|
+ * | Fail    | Fail      | Return Fail                        |
+ * |---------|-----------|------------------------------------|
+ * | Fail    | Pass      | Choose Secondary                   |
+ * |---------|-----------|------------------------------------|
+ * | Pass    | Fail      | Choose Primary                     |
+ * |---------|-----------|------------------------------------|
+ * | Pass    | Pass      | if (secondary.rx > primary.rx)     |
+ * |         |           |     Choose Secondary               |
+ * |         |           | else                               |
+ * |         |           |     Choose Primary                 |
+ * |----------------------------------------------------------|
+ */
+
+ /* Primary rxhigh: Search at rxlow's TX, decrement from max read_delay */
+
+ rxhigh.tx = rxlow.tx;
+ dev_dbg(dev, "Searching for Golden Primary rxhigh on TX = %d\n",
+ rxhigh.tx);
+ rxhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &rxhigh);
+ if (ret)
+ primary = false;
+ dev_dbg(dev, "Golden Primary rxhigh: RX: %d TX: %d RD: %d\n", rxhigh.rx,
+ rxhigh.tx, rxhigh.read_delay);
+
+ /* Secondary rxhigh: Verify at offset TX */
+
+ if (rxhigh.tx <=
+ (CQSPI_PHY_TX_LOOKUP_LOW_END - CQSPI_PHY_SEARCH_OFFSET))
+ srxhigh.tx = rxhigh.tx + CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxhigh.tx = CQSPI_PHY_TX_LOOKUP_LOW_END;
+ dev_dbg(dev, "Searching for Golden Secondary rxhigh on TX = %d\n",
+ srxhigh.tx);
+ srxhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &srxhigh);
+ if (ret)
+ secondary = false;
+ dev_dbg(dev, "Golden Secondary rxhigh: RX: %d TX: %d RD: %d\n",
+ srxhigh.rx, srxhigh.tx, srxhigh.read_delay);
+
+ if (!primary && !secondary)
+ goto out;
+ else if (!primary)
+ rxhigh = srxhigh;
+ else if (secondary && srxhigh.rx > rxhigh.rx)
+ rxhigh = srxhigh;
+ dev_dbg(dev, "Golden Final rxhigh: RX: %d TX: %d RD: %d\n", rxhigh.rx,
+ rxhigh.tx, rxhigh.read_delay);
+
+ primary = true;
+ secondary = true;
+
+ /* If rxlow/rxhigh at same read_delay, search backup at upper TX range */
+
+ if (rxlow.read_delay == rxhigh.read_delay) {
+ dev_dbg(dev, "rxlow and rxhigh at the same read delay.\n");
+
+ /* Backup rxlow: Search at high TX window */
+
+ /*
+ * rx
+ * 127 ^
+ * | xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++|++++|
+ * | xxxxxxxxxxxxx ++++++|++++|
+ * search | xxxxxxxxxxxxxx +++++|++++|
+ * rxlow --------------------------------->|++++|
+ * | xxxxxxxxxxxxxxxx +++|++++|
+ * | xxxxxxxxxxxxxxxxx ++|++++|
+ * | xxxxxxxxxxxxxxxxxx +|++++|
+ * | | |
+ * --------------------------------|----|-----> tx
+ * 0 | | 127
+ * txhigh txhigh
+ * start end
+ *
+ * |-----------------------------------------------------|
+ * | Primary | Secondary | Final                         |
+ * | Search  | Search    | Point                         |
+ * |---------|-----------|-------------------------------|
+ * | Fail    | Fail      | Return Fail                   |
+ * |---------|-----------|-------------------------------|
+ * | Fail    | Pass      | Return Fail                   |
+ * |---------|-----------|-------------------------------|
+ * | Pass    | Fail      | Return Fail                   |
+ * |---------|-----------|-------------------------------|
+ * | Pass    | Pass      | rx =                          |
+ * |         |           |  min(primary.rx, secondary.rx)|
+ * |         |           | tx = primary.tx               |
+ * |         |           | read_delay =                  |
+ * |         |           |  min(primary.read_delay,      |
+ * |         |           |      secondary.read_delay)    |
+ * |-----------------------------------------------------|
+ */
+
+ /* Primary backup: Decrement TX from high window end */
+
+ backuppoint.tx = CQSPI_PHY_TX_LOOKUP_HIGH_END;
+ do {
+ dev_dbg(dev,
+ "Searching for Backup Primary rxlow on TX = %d\n",
+ backuppoint.tx);
+ backuppoint.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &backuppoint);
+ backuppoint.tx -= CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (ret &&
+ backuppoint.tx >= CQSPI_PHY_TX_LOOKUP_HIGH_START);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup Primary rxlow: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ /* Secondary backup: Verify at offset TX */
+
+ if (backuppoint.tx >=
+ (CQSPI_PHY_TX_LOOKUP_HIGH_START + CQSPI_PHY_SEARCH_OFFSET))
+ srxlow.tx = backuppoint.tx - CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxlow.tx = CQSPI_PHY_TX_LOOKUP_HIGH_START;
+ dev_dbg(dev,
+ "Searching for Backup Secondary rxlow on TX = %d\n",
+ srxlow.tx);
+ srxlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &srxlow);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup Secondary rxlow: RX: %d TX: %d RD: %d\n",
+ srxlow.rx, srxlow.tx, srxlow.read_delay);
+
+ backuppoint.rx = min(backuppoint.rx, srxlow.rx);
+ backuppoint.read_delay =
+ min(backuppoint.read_delay, srxlow.read_delay);
+ dev_dbg(dev, "Backup Final rxlow: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.rx < rxlow.rx) {
+ rxlow = backuppoint;
+ dev_dbg(dev, "Updating rxlow to the one at TX = %d\n",
+ backuppoint.tx);
+ }
+ dev_dbg(dev, "Final rxlow: RX: %d TX: %d RD: %d\n", rxlow.rx,
+ rxlow.tx, rxlow.read_delay);
+
+ /* Backup rxhigh: Search at fixed backup TX */
+
+ /*
+ * rx
+ * 127 ^
+ * | xxxxx +++++++++++++++++++|
+ * | xxxxxx ++++++++++++++++++|
+ * search | xxxxxxx +++++++++++++++++|
+ * rxhigh -------------------------------------->|
+ * on fixed | xxxxxxxxx +++++++++++++++|
+ * tx | xxxxxxxxxx ++++++++++++++|
+ * | xxxxxxxxxxx +++++++++++++|
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * | xxxxxxxxxxxxxxxxxx +++++++
+ * |
+ * -------------------------------------------> tx
+ * 0 127
+ *
+ * |-----------------------------------------------------|
+ * | Primary | Secondary | Final                         |
+ * | Search  | Search    | Point                         |
+ * |---------|-----------|-------------------------------|
+ * | Fail    | Fail      | Return Fail                   |
+ * |---------|-----------|-------------------------------|
+ * | Fail    | Pass      | Choose Secondary              |
+ * |---------|-----------|-------------------------------|
+ * | Pass    | Fail      | Choose Primary                |
+ * |---------|-----------|-------------------------------|
+ * | Pass    | Pass      | if (secondary.rx > primary.rx)|
+ * |         |           |     Choose Secondary          |
+ * |         |           | else                          |
+ * |         |           |     Choose Primary            |
+ * |-----------------------------------------------------|
+ */
+
+ /* Primary backup rxhigh: Use backup TX, decrement from max read_delay */
+
+ dev_dbg(dev, "Searching for Backup Primary rxhigh on TX = %d\n",
+ backuppoint.tx);
+ backuppoint.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &backuppoint);
+ if (ret)
+ primary = false;
+ dev_dbg(dev, "Backup Primary rxhigh: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ /* Secondary backup rxhigh: Verify at offset TX */
+
+ if (backuppoint.tx >=
+ (CQSPI_PHY_TX_LOOKUP_HIGH_START + CQSPI_PHY_SEARCH_OFFSET))
+ srxhigh.tx = backuppoint.tx - CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxhigh.tx = CQSPI_PHY_TX_LOOKUP_HIGH_START;
+ dev_dbg(dev,
+ "Searching for Backup Secondary rxhigh on TX = %d\n",
+ srxhigh.tx);
+ srxhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &srxhigh);
+ if (ret)
+ secondary = false;
+ dev_dbg(dev, "Backup Secondary rxhigh: RX: %d TX: %d RD: %d\n",
+ srxhigh.rx, srxhigh.tx, srxhigh.read_delay);
+
+ if (!primary && !secondary)
+ goto out;
+ else if (!primary)
+ backuppoint = srxhigh;
+ else if (secondary && srxhigh.rx > backuppoint.rx)
+ backuppoint = srxhigh;
+ dev_dbg(dev, "Backup Final rxhigh: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.rx > rxhigh.rx) {
+ rxhigh = backuppoint;
+ dev_dbg(dev, "Updating rxhigh to the one at TX = %d\n",
+ backuppoint.tx);
+ }
+ dev_dbg(dev, "Final rxhigh: RX: %d TX: %d RD: %d\n", rxhigh.rx,
+ rxhigh.tx, rxhigh.read_delay);
+ }
+
+ /* Golden txlow: Fix RX at 1/4 of RX window, search TX lower bound */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * fix rx | xxxxxxxxxxxxx ++++++++++++
+ * 1/4 b/w ---------><------->xxxxx +++++++++++
+ * rxlow and | xxxx|xxxxxxxxxx ++++++++++
+ * rxhigh | xxxx|xxxxxxxxxxx +++++++++
+ * | xxxx|xxxxxxxxxxxx ++++++++
+ * rxlow --------->xxxx|xxxxxxxxxxxxx +++++++
+ * | |
+ * ------------|------------------------------> tx
+ * 0 | 127
+ * search
+ * txlow
+ */
+
+ rx_window = rxhigh.rx - rxlow.rx;
+ txlow.rx = rxlow.rx + (rx_window / 4);
+ dev_dbg(dev, "Searching for Golden txlow on RX = %d\n", txlow.rx);
+ txlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_tx_low_ddr(f_pdata, mem, &txlow);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden txlow: RX: %d TX: %d RD: %d\n", txlow.rx, txlow.tx,
+ txlow.read_delay);
+
+ /* Golden txhigh: Same RX as txlow, decrement from max read_delay */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * fix rx | xxxxxxxxxxxxx ++++++++++++
+ * 1/4 b/w --------------------------------><----->
+ * rxlow and | xxxxxxxxxxxxxxx ++++++|+++
+ * rxhigh | xxxxxxxxxxxxxxxx +++++|+++
+ * | xxxxxxxxxxxxxxxxx ++++|+++
+ * rxlow --------->xxxxxxxxxxxxxxxxxx +++|+++
+ * | |
+ * ----------------------------------|--------> tx
+ * 0 | 127
+ * search
+ * txhigh
+ */
+
+ txhigh.rx = txlow.rx;
+ dev_dbg(dev, "Searching for Golden txhigh on RX = %d\n", txhigh.rx);
+ txhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_tx_high_ddr(f_pdata, mem, &txhigh);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden txhigh: RX: %d TX: %d RD: %d\n", txhigh.rx,
+ txhigh.tx, txhigh.read_delay);
+
+ /* If txlow/txhigh at same read_delay, search backup at 3/4 RX window */
+
+ if (txlow.read_delay == txhigh.read_delay) {
+ /* Backup txlow: Fix RX at 3/4 of RX window */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * fix rx | xxxxxxx ++++++++++++++++++
+ * 3/4 b/w ---------><----->x +++++++++++++++++
+ * rxlow and | xxxx|xxxx ++++++++++++++++
+ * rxhigh | xxxx|xxxxx +++++++++++++++
+ * | xxxx|xxxxxx ++++++++++++++
+ * | xxxx|xxxxxxx +++++++++++++
+ * | xxxx|xxxxxxxx ++++++++++++
+ * | xxxx|xxxxxxxxx +++++++++++
+ * | xxxx|xxxxxxxxxx ++++++++++
+ * | xxxx|xxxxxxxxxxx +++++++++
+ * | xxxx|xxxxxxxxxxxx ++++++++
+ * rxlow --------->xxxx|xxxxxxxxxxxxx +++++++
+ * | |
+ * ------------|------------------------------> tx
+ * 0 | 127
+ * search
+ * txlow
+ */
+
+ dev_dbg(dev, "txlow and txhigh at the same read delay.\n");
+ backuppoint.rx = rxlow.rx + ((rx_window * 3) / 4);
+ dev_dbg(dev, "Searching for Backup txlow on RX = %d\n",
+ backuppoint.rx);
+ backuppoint.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_tx_low_ddr(f_pdata, mem, &backuppoint);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup txlow: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.tx < txlow.tx) {
+ txlow = backuppoint;
+ dev_dbg(dev, "Updating txlow with the one at RX = %d\n",
+ backuppoint.rx);
+ }
+ dev_dbg(dev, "Final txlow: RX: %d TX: %d RD: %d\n", txlow.rx,
+ txlow.tx, txlow.read_delay);
+
+ /* Backup txhigh: Same RX as backup txlow, decrement from max */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * fix rx | xxxxxxx ++++++++++++++++++
+ * 3/4 b/w ------------------------------><------->
+ * rxlow and | xxxxxxxxx +++++++++++|++++
+ * rxhigh | xxxxxxxxxx ++++++++++|++++
+ * | xxxxxxxxxxx +++++++++|++++
+ * | xxxxxxxxxxxx ++++++++|++++
+ * | xxxxxxxxxxxxx +++++++|++++
+ * | xxxxxxxxxxxxxx ++++++|++++
+ * | xxxxxxxxxxxxxxx +++++|++++
+ * | xxxxxxxxxxxxxxxx ++++|++++
+ * | xxxxxxxxxxxxxxxxx +++|++++
+ * rxlow --------->xxxxxxxxxxxxxxxxxx ++|++++
+ * | |
+ * ---------------------------------|---------> tx
+ * 0 | 127
+ * search
+ * txhigh
+ */
+
+ dev_dbg(dev, "Searching for Backup txhigh on RX = %d\n",
+ backuppoint.rx);
+ backuppoint.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_tx_high_ddr(f_pdata, mem, &backuppoint);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup txhigh: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.tx > txhigh.tx) {
+ txhigh = backuppoint;
+ dev_dbg(dev,
+ "Updating txhigh with the one at RX = %d\n",
+ backuppoint.rx);
+ }
+ dev_dbg(dev, "Final txhigh: RX: %d TX: %d RD: %d\n", txhigh.rx,
+ txhigh.tx, txhigh.read_delay);
+ }
+
+ /* Corner points: Define and verify bottomleft and topright boundaries */
+
+ /*
+ * rx
+ * 127 ^
+ * | topright
+ * | *
+ * rxhigh -----------xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * rxlow -----------xxxxxxxxxxxxxxxxxx +++++++
+ * | * |
+ * | bottom|left |
+ * --------|----------------------------|---> tx
+ * 0 | | 127
+ * | |
+ * txlow txhigh
+ *
+ * Verification: Test point 4 taps inside each corner, adjust
+ * read_delay ±1 if needed to ensure valid corners for gap search.
+ */
+
+ bottomleft.tx = txlow.tx;
+ bottomleft.rx = rxlow.rx;
+ if (txlow.read_delay <= rxlow.read_delay)
+ bottomleft.read_delay = txlow.read_delay;
+ else
+ bottomleft.read_delay = rxlow.read_delay;
+
+ /* Verify bottomleft: Test 4 taps inside, adjust read_delay if needed */
+ backupcornerpoint = bottomleft;
+ backupcornerpoint.tx += 4;
+ backupcornerpoint.rx += 4;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ backupcornerpoint.read_delay--;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ }
+
+ if (ret)
+ goto out;
+
+ bottomleft.read_delay = backupcornerpoint.read_delay;
+
+ topright.tx = txhigh.tx;
+ topright.rx = rxhigh.rx;
+ if (txhigh.read_delay >= rxhigh.read_delay)
+ topright.read_delay = txhigh.read_delay;
+ else
+ topright.read_delay = rxhigh.read_delay;
+
+ /* Verify topright: Test 4 taps inside, adjust read_delay if needed */
+ backupcornerpoint = topright;
+ backupcornerpoint.tx -= 4;
+ backupcornerpoint.rx -= 4;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ backupcornerpoint.read_delay++;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ }
+
+ if (ret)
+ goto out;
+
+ topright.read_delay = backupcornerpoint.read_delay;
+
+ dev_dbg(dev, "topright: RX: %d TX: %d RD: %d\n", topright.rx,
+ topright.tx, topright.read_delay);
+ dev_dbg(dev, "bottomleft: RX: %d TX: %d RD: %d\n", bottomleft.rx,
+ bottomleft.tx, bottomleft.read_delay);
+ cqspi_phy_find_gaplow_ddr(f_pdata, mem, &bottomleft, &topright, &gaplow);
+ dev_dbg(dev, "gaplow: RX: %d TX: %d RD: %d\n", gaplow.rx, gaplow.tx,
+ gaplow.read_delay);
+
+ /* Final point selection: Handle single vs dual passing regions */
+
+ if (bottomleft.read_delay == topright.read_delay) {
+ /*
+ * Single region: Use midpoint with temperature compensation.
+ * Gaplow approximates upper boundary of valid region.
+ *
+ * rx
+ * 127 ^
+ * | gaplow (approx. topright)
+ * | |
+ * rxhigh -----------xxxxxxx| failing
+ * | xxxxxxx| region
+ * | xxxxxxx| <--------------->
+ * | xxxxxxx| +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * rxlow -----------xxxxxxxxxxxxxxxxxx +++++++
+ * | * |
+ * | bottom|left |
+ * --------|----------------------------|---> tx
+ * 0 | | 127
+ * | |
+ * txlow txhigh
+ * (same read_delay)
+ *
+ * Temperature compensation: Valid region shifts with temp.
+ * Offset = region_size / (330 / (temp - 42°C))
+ * Factor 330 is empirically determined for this hardware.
+ */
+
+ dev_dbg(dev,
+ "bottomleft and topright at the same read delay.\n");
+
+ topright = gaplow;
+ searchpoint.read_delay = bottomleft.read_delay;
+ searchpoint.tx =
+ bottomleft.tx + ((topright.tx - bottomleft.tx) / 2);
+ searchpoint.rx =
+ bottomleft.rx + ((topright.rx - bottomleft.rx) / 2);
+
+ ret = cqspi_get_temp(&temp);
+ if (ret) {
+ /* Assume room temperature if sensor unavailable */
+ dev_dbg(dev,
+ "Unable to get temperature. Assuming room temperature\n");
+ temp = CQSPI_PHY_DEFAULT_TEMP;
+ }
+
+ if (temp < CQSPI_PHY_MIN_TEMP || temp > CQSPI_PHY_MAX_TEMP) {
+ dev_err(dev,
+ "Temperature outside operating range: %dC\n",
+ temp);
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (temp == CQSPI_PHY_MID_TEMP)
+ temp++; /* Avoid divide-by-zero */
+ dev_dbg(dev, "Temperature: %dC\n", temp);
+
+ /*
+ * Apply temperature offset: positive at high temp, negative at low.
+ * Compute the divisor once and apply to both TX and RX. Use int
+ * arithmetic throughout to avoid u8 wrapping on negative offsets.
+ */
+ temp = 330 / (temp - CQSPI_PHY_MID_TEMP);
+ searchpoint.tx = clamp((int)searchpoint.tx +
+ (topright.tx - bottomleft.tx) / temp,
+ 0, CQSPI_PHY_MAX_DELAY);
+ searchpoint.rx = clamp((int)searchpoint.rx +
+ (topright.rx - bottomleft.rx) / temp,
+ 0, CQSPI_PHY_MAX_DELAY);
+ } else {
+ /*
+ * Dual regions: Gap separates two valid regions, choose larger.
+ *
+ * rx
+ * 127 ^
+ * | topright
+ * | *
+ * rxhigh -----------xxxxx +++++++++++++++++++|
+ * | xxxxxx <region 2> ++++++++|
+ * | xxxxxxx +++++++++++++++++|
+ * | xxxxxxxx ++++++++++++++++|
+ * | xxxxxxxxx +++++++++++++++|
+ * | xxxxxxxxxx ++++++++++++++|
+ * | failing |
+ * | region |
+ * | xxxxxxxxxxxxx +++++++++++|
+ * | xxxxxxxxxxxxxx ++++++++++|
+ * | xxxxxxxxxxxxxxx +++++++++|
+ * | xxxxxxxxxxxxxxxx ++++++++|
+ * | xxxxxxxxx <region 1> +++++++|
+ * rxlow -----------xxxxxxxxxxxxxxxxxx ++++++|
+ * | * |
+ * | bottom|left |
+ * --------|----------------------------|---> tx
+ * 0 | | 127
+ * | |
+ * txlow txhigh
+ *
+ * Strategy: Compare Manhattan distances from gap boundaries to
+ * corners. Choose corner furthest from gap (larger region).
+ * Apply 16-tap margin inward, scale RX proportionally.
+ */
+
+ cqspi_phy_find_gaphigh_ddr(f_pdata, mem, &bottomleft,
+ &topright, &gaphigh);
+ dev_dbg(dev, "gaphigh: RX: %d TX: %d RD: %d\n", gaphigh.rx,
+ gaphigh.tx, gaphigh.read_delay);
+
+ if (topright.tx == bottomleft.tx) {
+ dev_err(dev, "zero TX span in dual-region: cannot compute search point\n");
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /* Compare Manhattan distances: choose corner furthest from gap */
+ if ((abs(gaplow.tx - bottomleft.tx) +
+ abs(gaplow.rx - bottomleft.rx)) <
+ (abs(gaphigh.tx - topright.tx) +
+ abs(gaphigh.rx - topright.rx))) {
+ /* Topright further: Use Region 2, 16 taps inward */
+ searchpoint = topright;
+ searchpoint.tx -= 16;
+ searchpoint.rx -= (16 * (topright.rx - bottomleft.rx)) /
+ (topright.tx - bottomleft.tx);
+ } else {
+ /* Bottomleft further: Use Region 1, 16 taps inward */
+ searchpoint = bottomleft;
+ searchpoint.tx += 16;
+ searchpoint.rx += (16 * (topright.rx - bottomleft.rx)) /
+ (topright.tx - bottomleft.tx);
+ }
+ }
+
+ /* Apply and verify final tuning point */
+ dev_dbg(dev, "Final tuning point: RX: %d TX: %d RD: %d\n",
+ searchpoint.rx, searchpoint.tx, searchpoint.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &searchpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ dev_err(dev,
+ "Failed to find pattern at final calibration point\n");
+ ret = -EINVAL;
+ goto out;
+ }
+
+ f_pdata->phy_setting.read_delay = searchpoint.read_delay;
+ f_pdata->phy_setting.rx = searchpoint.rx;
+ f_pdata->phy_setting.tx = searchpoint.tx;
+out:
+ if (ret)
+ f_pdata->use_phy = false;
+
+ return ret;
+}
+
+static int cqspi_phy_tuning_sdr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ struct device *dev = &cqspi->pdev->dev;
+ struct phy_setting rxlow, rxhigh, first, second, final;
+ u8 window1 = 0;
+ u8 window2 = 0;
+ int ret;
+
+ /*
+ * SDR tuning: 1D search for optimal RX delay (TX less critical).
+ * Find two consecutive windows, choose larger, use midpoint.
+ *
+ * rx
+ * 127 ^
+ * | |-----window at----------|
+ * | |-----read_delay = n+1---|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow(n+1) midpoint rxhigh(n+1)
+ * |
+ * | |---window at--------|
+ * | |---read_delay = n---|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | rxlow(n) midpoint rxhigh(n)
+ * |
+ * -----------------------------------------> tx
+ * 0 127
+ * read_delay=n read_delay=n+1
+ */
+
+ f_pdata->use_phy = true;
+ cqspi_phy_reset_setting(&rxlow);
+ cqspi_phy_reset_setting(&rxhigh);
+ cqspi_phy_reset_setting(&first);
+
+ /* First window: Find rxlow by incrementing read_delay from 0 */
+
+ /*
+ * rx
+ * 127 ^
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * search | |xxxxxxxxxxxxxxxxxxxx|
+ * rxlow | |xxxxxxxxxxxxxxxxxxxx|
+ * increasing | |xxxxxxxxxxxxxxxxxxxx|
+ * --------->|xxxxxxxxxxxxxxxxxxxx|
+ * read_delay | |xxxxxxxxxxxxxxxxxxx|
+ * until found | |xxxxxxxxxxxxxxxxxxx|
+ * | rxlow
+ * -----------------------------------------> tx
+ * 0 tx fixed at 127
+ */
+
+ do {
+ ret = cqspi_find_rx_low_sdr(f_pdata, mem, &rxlow);
+
+ if (ret)
+ rxlow.read_delay++;
+ } while (ret && rxlow.read_delay <= CQSPI_PHY_MAX_RD);
+
+ /* Find rxhigh: Decrement from RX=127 at same read_delay */
+
+ /*
+ * rx
+ * 127 ^ search rxhigh
+ * | (decrement from
+ * | 127 until found)
+ * | |
+ * | |
+ * | v
+ * | |------------------------|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow rxhigh
+ * -----------------------------------------> tx
+ * 0 tx fixed at 127
+ */
+
+ rxhigh.read_delay = rxlow.read_delay;
+ ret = cqspi_find_rx_high_sdr(f_pdata, mem, &rxhigh, rxlow.rx);
+ if (ret)
+ goto out;
+
+ /* Calculate first window midpoint for max margin */
+
+ /*
+ * rx
+ * 127 ^
+ * | |--------window1---------|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxx * xxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow ^ rxhigh
+ * ----------------------|------------------> tx
+ * 0 | tx fixed at 127
+ * window1/2
+ */
+
+ first.read_delay = rxlow.read_delay;
+ window1 = rxhigh.rx - rxlow.rx;
+ first.rx = rxlow.rx + (window1 / 2);
+
+ dev_dbg(dev, "First tuning point: RX: %d TX: %d RD: %d\n", first.rx,
+ first.tx, first.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &first);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret || first.read_delay > CQSPI_PHY_MAX_RD)
+ goto out;
+
+ /* Second window: Search at read_delay+1, may differ in size */
+
+ /*
+ * rx
+ * 127 ^
+ * | |-------|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | rxlow rxhigh
+ * -----------------------------------------> tx
+ * 0
+ * read_delay = n (smaller window)
+ *
+ * rx
+ * 127 ^
+ * | |-----------------|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | rxlow rxhigh
+ * -----------------------------------------> tx
+ * 0
+ * read_delay = n+1 (larger window - better)
+ */
+
+ cqspi_phy_reset_setting(&rxlow);
+ cqspi_phy_reset_setting(&rxhigh);
+ cqspi_phy_reset_setting(&second);
+
+ rxlow.read_delay = first.read_delay + 1;
+ if (rxlow.read_delay > CQSPI_PHY_MAX_RD)
+ goto compare;
+
+ ret = cqspi_find_rx_low_sdr(f_pdata, mem, &rxlow);
+ if (ret)
+ goto compare;
+
+ rxhigh.read_delay = rxlow.read_delay;
+ ret = cqspi_find_rx_high_sdr(f_pdata, mem, &rxhigh, rxlow.rx);
+ if (ret)
+ goto compare;
+
+ /* Calculate second window midpoint */
+
+ /*
+ * rx
+ * 127 ^
+ * | |--------window2---------|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxx * xxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow ^ rxhigh
+ * ----------------------|------------------> tx
+ * 0 | tx fixed at 127
+ * window2/2
+ * read_delay = n+1
+ */
+
+ window2 = rxhigh.rx - rxlow.rx;
+ second.rx = rxlow.rx + (window2 / 2);
+ second.read_delay = rxlow.read_delay;
+
+ dev_dbg(dev, "Second tuning point: RX: %d TX: %d RD: %d\n", second.rx,
+ second.tx, second.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &second);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret || second.read_delay > CQSPI_PHY_MAX_RD)
+ window2 = 0;
+
+ /* Window comparison: Choose larger window for better margin */
+
+compare:
+ cqspi_phy_reset_setting(&final);
+ if (window2 > window1) {
+ final.rx = second.rx;
+ final.read_delay = second.read_delay;
+ } else {
+ final.rx = first.rx;
+ final.read_delay = first.read_delay;
+ }
+
+ /* Apply and verify final tuning point */
+
+ dev_dbg(dev, "Final tuning point: RX: %d TX: %d RD: %d\n", final.rx,
+ final.tx, final.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &final);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ f_pdata->phy_setting.read_delay = final.read_delay;
+ f_pdata->phy_setting.rx = final.rx;
+ f_pdata->phy_setting.tx = final.tx;
+
+out:
+ if (ret)
+ f_pdata->use_phy = false;
+
+ return ret;
+}
+
+static int cqspi_am654_ospi_execute_tuning(struct spi_mem *mem,
+ struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op)
+{
+ struct cqspi_st *cqspi =
+ spi_controller_get_devdata(mem->spi->controller);
+ struct cqspi_flash_pdata *f_pdata;
+ struct device *dev = &cqspi->pdev->dev;
+ u32 base_speed;
+ u32 phy_offset = 0;
+ int ret;
+
+ f_pdata = &cqspi->f_pdata[spi_get_chipselect(mem->spi, 0)];
+
+ /*
+ * A second spi-max-frequency value (the higher clock rate) must be
+ * present for tiered speed support. Without it there is nothing to
+ * calibrate towards, so skip tuning gracefully.
+ */
+ if (!f_pdata->max_clk_rate) {
+ dev_dbg(dev, "No higher clock rate configured, skipping tuning\n");
+ return 0;
+ }
+
+ base_speed = mem->spi->base_speed_hz;
+
+ if (write_op) {
+ /*
+ * For NAND: write the calibration pattern to the page cache.
+ * This uses write_op at the safe base speed (base_speed_hz is
+ * still active) so the write itself is reliable.
+ */
+ ret = cqspi_write_pattern_to_cache(f_pdata, mem, write_op);
+ if (ret) {
+ dev_warn(dev,
+ "failed to write pattern to cache: %d, skipping tuning\n",
+ ret);
+ goto out;
+ }
+
+ f_pdata->phy_write_op = *write_op;
+ } else {
+ ret = cqspi_get_phy_pattern_offset(dev, &phy_offset);
+ if (ret) {
+ dev_warn(dev,
+ "pattern partition not found: %d, skipping tuning\n",
+ ret);
+ goto out;
+ }
+
+ read_op->addr.val = phy_offset;
+ }
+
+ /*
+ * Verify the calibration pattern exists using the conservative base
+ * speed. At high clock rates the DLL is not yet trained, so DTR
+ * data capture is unreliable and the read would return garbage.
+ * Setting max_freq to 0 here causes apply_base_freq_cap() to cap the
+ * read to base_speed_hz, which is well within reliable DTR margins.
+ * max_freq is restored to max_speed_hz for the tuning-loop reads
+ * after base_speed_hz is cleared below.
+ */
+ f_pdata->phy_read_op = *read_op;
+ f_pdata->phy_read_op.max_freq = 0;
+
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (ret) {
+ dev_err(dev, "pattern not found: %d, skipping tuning\n", ret);
+ goto out;
+ }
+
+ /*
+ * Pattern confirmed. Now clear base_speed_hz so that tuning-loop
+ * exec_op calls run at max_speed_hz, and restore phy_read_op.max_freq
+ * so those reads also use the full speed.
+ */
+ mem->spi->base_speed_hz = 0;
+ f_pdata->phy_read_op.max_freq = mem->spi->max_speed_hz;
+
+ if (read_op->cmd.dtr || read_op->addr.dtr || read_op->dummy.dtr ||
+ read_op->data.dtr) {
+ f_pdata->use_dqs = true;
+ cqspi_phy_pre_config(cqspi, f_pdata, false);
+ ret = cqspi_phy_tuning_ddr(f_pdata, mem);
+ } else {
+ f_pdata->use_dqs = false;
+ cqspi_phy_pre_config(cqspi, f_pdata, true);
+ ret = cqspi_phy_tuning_sdr(f_pdata, mem);
+ }
+
+ if (ret)
+ dev_warn(dev, "tuning failed: %d\n", ret);
+
+ cqspi_phy_post_config(cqspi, f_pdata->read_delay);
+
+out:
+ /*
+ * Always restore the conservative base speed cap. On success, write
+ * back the validated maximum speed into the caller's op templates so
+ * that those specific ops bypass the cap in subsequent exec_op calls.
+ */
+ mem->spi->base_speed_hz = base_speed;
+ if (!ret) {
+ read_op->max_freq = mem->spi->max_speed_hz;
+ if (write_op)
+ write_op->max_freq = mem->spi->max_speed_hz;
+ }
+
+ return ret;
+}
+
+static int cqspi_mem_op_execute_tuning(struct spi_mem *mem,
+ struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op)
+{
+ struct cqspi_st *cqspi =
+ spi_controller_get_devdata(mem->spi->controller);
+
+ if (!cqspi->ddata->execute_tuning)
+ return -EOPNOTSUPP;
+
+ return cqspi->ddata->execute_tuning(mem, read_op, write_op);
+}
+
static int cqspi_of_get_flash_pdata(struct platform_device *pdev,
struct cqspi_flash_pdata *f_pdata,
struct device_node *np)
{
+ int nfreq, ret;
+
if (of_property_read_u32(np, "cdns,read-delay", &f_pdata->read_delay)) {
dev_err(&pdev->dev, "couldn't determine read-delay\n");
return -ENXIO;
@@ -1584,7 +3343,26 @@ static int cqspi_of_get_flash_pdata(struct platform_device *pdev,
return -ENXIO;
}
- if (of_property_read_u32(np, "spi-max-frequency", &f_pdata->clk_rate)) {
+ /*
+ * spi-max-frequency accepts one or two values:
+ * <max-freq> - single rate; no tiered speed support
+ * <base-freq max-freq> - conservative default and higher maximum
+ *
+ * With two values the SPI core sets spi->base_speed_hz = base-freq and
+ * spi->max_speed_hz = max-freq. Store the second value here as the
+ * controller's higher rate target for calibration.
+ */
+ nfreq = of_property_count_u32_elems(np, "spi-max-frequency");
+ if (nfreq == 2) {
+ ret = of_property_read_u32_index(np, "spi-max-frequency", 1,
+ &f_pdata->max_clk_rate);
+ if (ret) {
+ dev_err(&pdev->dev, "couldn't read spi-max-frequency[1]\n");
+ return ret;
+ }
+ } else if (nfreq == 1) {
+ f_pdata->max_clk_rate = 0;
+ } else {
dev_err(&pdev->dev, "couldn't determine spi-max-frequency\n");
return -ENXIO;
}
@@ -1736,6 +3514,7 @@ static const struct spi_controller_mem_ops cqspi_mem_ops = {
.exec_op = cqspi_exec_mem_op,
.get_name = cqspi_get_name,
.supports_op = cqspi_supports_mem_op,
+ .execute_tuning = cqspi_mem_op_execute_tuning,
};
static const struct spi_controller_mem_caps cqspi_mem_caps = {
@@ -2104,6 +3883,7 @@ static const struct cqspi_driver_platdata k2g_qspi = {
static const struct cqspi_driver_platdata am654_ospi = {
.hwcaps_mask = CQSPI_SUPPORTS_OCTAL | CQSPI_SUPPORTS_QUAD,
.quirks = CQSPI_NEEDS_WR_DELAY,
+ .execute_tuning = cqspi_am654_ospi_execute_tuning,
};
static const struct cqspi_driver_platdata intel_lgm_qspi = {
--
2.34.1
* Re: [PATCH v3 08/13] spi: cadence-quadspi: add PHY tuning support
2026-05-27 17:55 ` [PATCH v3 08/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
@ 2026-05-28 8:54 ` Miquel Raynal
0 siblings, 0 replies; 25+ messages in thread
From: Miquel Raynal @ 2026-05-28 8:54 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, richard, vigneshr, pratyush,
mwalle, takahiro.kuwano, linux-spi, devicetree, linux-kernel,
linux-mtd, praneeth, u-kumar1, a-dutta
On 27/05/2026 at 23:25:22 +0530, Santhosh Kumar K <s-k6@ti.com> wrote:
> The Cadence QSPI controller supports a delay-line PHY for high-speed
> operation. Without calibration the PHY is unused and read capture relies
> on a fixed delay, limiting throughput at frequencies above the base
> operating speed.
>
> Add an execute_tuning callback that performs delay-line calibration using
> a known data pattern written to a dedicated flash region. The pattern is
> either read from a NOR partition identified by the DT property
> cdns,phy-pattern-partition, or written to the NAND page cache before
> each calibration read.
>
> For DDR protocols (8D-8D-8D) a 2D sweep of (rx_delay, tx_delay) pairs
> is performed to find the widest passing region in the combined RX/TX
> space. Binary search locates the gap boundary between passing regions
> when two separate windows exist; the final operating point is placed at
> the centre of the larger region with a small temperature-dependent
> offset.
>
> For SDR protocols a 1D sweep of the RX delay is sufficient. Two windows
> at adjacent read_delay values are measured; the wider one's midpoint is
> selected.
>
> The tuning infrastructure is platform-specific: only am654-based OSPI
> controllers populate the execute_tuning hook. All other platform data
> entries return -EOPNOTSUPP and are unaffected.
>
> spi-max-frequency may carry two values in DT; the second (higher) value
> is the tuned target rate stored in max_clk_rate. When only one value is
> present max_clk_rate is zero and tuning is skipped.
>
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
> ---
There are more than 1800 new lines for the PHY tuning procedure. I leave
that decision to Mark of course, but maybe we should move that into
another .c file.
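As a reading aid for the SDR description quoted above ("the wider one's
midpoint is selected"), here is a minimal standalone sketch in plain C, not
kernel code. It only illustrates the idea of picking the midpoint of the
widest passing window in a single pass/fail sweep; the driver itself compares
two windows measured at adjacent read_delay values. The function name and the
pass[] representation are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the midpoint index of the widest run of passing entries in
 * pass[], or -1 if no delay value passed. On a tie the first window
 * found is kept. Hypothetical helper, not part of the patch.
 */
static int widest_window_midpoint(const bool *pass, size_t n)
{
	size_t best_start = 0, best_len = 0;
	size_t run_start = 0, run_len = 0;

	for (size_t i = 0; i < n; i++) {
		if (pass[i]) {
			if (run_len == 0)
				run_start = i;
			run_len++;
			if (run_len > best_len) {
				best_len = run_len;
				best_start = run_start;
			}
		} else {
			run_len = 0;
		}
	}

	if (best_len == 0)
		return -1;

	/* For an even-length window this picks the lower of the two centres. */
	return (int)(best_start + (best_len - 1) / 2);
}
```

Centring inside the widest window leaves the most margin on both sides, which
is the usual reason for preferring it over the first passing delay.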
> +static int cqspi_am654_ospi_execute_tuning(struct spi_mem *mem,
> + struct spi_mem_op *read_op,
> + struct spi_mem_op *write_op)
> +{
> + struct cqspi_st *cqspi =
> + spi_controller_get_devdata(mem->spi->controller);
> + struct cqspi_flash_pdata *f_pdata;
> + struct device *dev = &cqspi->pdev->dev;
> + u32 base_speed;
> + u32 phy_offset = 0;
> + int ret;
> +
> + f_pdata = &cqspi->f_pdata[spi_get_chipselect(mem->spi, 0)];
> +
> + /*
> + * A second spi-max-frequency value (the higher clock rate) must be
> + * present for tiered speed support. Without it there is nothing to
> + * calibrate towards, so skip tuning gracefully.
> + */
> + if (!f_pdata->max_clk_rate) {
> + dev_dbg(dev, "No higher clock rate configured, skipping tuning\n");
> + return 0;
> + }
> +
> + base_speed = mem->spi->base_speed_hz;
> +
> + if (write_op) {
> + /*
> + * For NAND: write the calibration pattern to the page cache.
> + * This uses write_op at the safe base speed (base_speed_hz is
> + * still active) so the write itself is reliable.
> + */
> + ret = cqspi_write_pattern_to_cache(f_pdata, mem, write_op);
> + if (ret) {
> + dev_warn(dev,
> + "failed to write pattern to cache: %d, skipping tuning\n",
> + ret);
> + goto out;
> + }
> +
> + f_pdata->phy_write_op = *write_op;
> + } else {
> + ret = cqspi_get_phy_pattern_offset(dev, &phy_offset);
> + if (ret) {
> + dev_warn(dev,
> + "pattern partition not found: %d, skipping tuning\n",
> + ret);
> + goto out;
> + }
> +
> + read_op->addr.val = phy_offset;
> + }
> +
> + /*
> + * Verify the calibration pattern exists using the conservative base
> + * speed. At high clock rates the DLL is not yet trained, so DTR
> + * data capture is unreliable and the read would return garbage.
> + * Setting max_freq to 0 here causes apply_base_freq_cap() to cap the
> + * read to base_speed_hz, which is well within reliable DTR margins.
> + * max_freq is restored to max_speed_hz for the tuning-loop reads
> + * after base_speed_hz is cleared below.
> + */
> + f_pdata->phy_read_op = *read_op;
> + f_pdata->phy_read_op.max_freq = 0;
> +
> + ret = cqspi_phy_check_pattern(f_pdata, mem);
> + if (ret) {
> + dev_err(dev, "pattern not found: %d, skipping tuning\n", ret);
> + goto out;
> + }
> +
> + /*
> + * Pattern confirmed. Now clear base_speed_hz so that tuning-loop
> + * exec_op calls run at max_speed_hz, and restore phy_read_op.max_freq
> + * so those reads also use the full speed.
> + */
> + mem->spi->base_speed_hz = 0;
If there is a way to avoid touching the core parameters, I would be for
using it, but maybe it is simpler to do it this way.
> + f_pdata->phy_read_op.max_freq = mem->spi->max_speed_hz;
> +
> + if (read_op->cmd.dtr || read_op->addr.dtr || read_op->dummy.dtr ||
> + read_op->data.dtr) {
> + f_pdata->use_dqs = true;
> + cqspi_phy_pre_config(cqspi, f_pdata, false);
> + ret = cqspi_phy_tuning_ddr(f_pdata, mem);
> + } else {
> + f_pdata->use_dqs = false;
> + cqspi_phy_pre_config(cqspi, f_pdata, true);
> + ret = cqspi_phy_tuning_sdr(f_pdata, mem);
> + }
> +
> + if (ret)
> + dev_warn(dev, "tuning failed: %d\n", ret);
> +
> + cqspi_phy_post_config(cqspi, f_pdata->read_delay);
> +
> +out:
> + /*
> + * Always restore the conservative base speed cap. On success, write
> + * back the validated maximum speed into the caller's op templates so
> + * that those specific ops bypass the cap in subsequent exec_op calls.
> + */
> + mem->spi->base_speed_hz = base_speed;
> + if (!ret) {
> + read_op->max_freq = mem->spi->max_speed_hz;
> + if (write_op)
> + write_op->max_freq = mem->spi->max_speed_hz;
> + }
Neat.
> +
> + return ret;
> +}
> +
> +static int cqspi_mem_op_execute_tuning(struct spi_mem *mem,
> + struct spi_mem_op *read_op,
> + struct spi_mem_op *write_op)
> +{
> + struct cqspi_st *cqspi =
> + spi_controller_get_devdata(mem->spi->controller);
> +
> + if (!cqspi->ddata->execute_tuning)
> + return -EOPNOTSUPP;
> +
> + return cqspi->ddata->execute_tuning(mem, read_op, write_op);
> +}
> +
> static int cqspi_of_get_flash_pdata(struct platform_device *pdev,
> struct cqspi_flash_pdata *f_pdata,
> struct device_node *np)
> {
> + int nfreq, ret;
> +
> if (of_property_read_u32(np, "cdns,read-delay", &f_pdata->read_delay)) {
> dev_err(&pdev->dev, "couldn't determine read-delay\n");
> return -ENXIO;
> @@ -1584,7 +3343,26 @@ static int cqspi_of_get_flash_pdata(struct platform_device *pdev,
> return -ENXIO;
> }
>
> - if (of_property_read_u32(np, "spi-max-frequency", &f_pdata->clk_rate)) {
> + /*
> + * spi-max-frequency accepts one or two values:
> + * <max-freq> - single rate; no tiered speed support
> + * <base-freq max-freq> - conservative default and higher maximum
> + *
> + * With two values the SPI core sets spi->base_speed_hz = base-freq and
> + * spi->max_speed_hz = max-freq. Store the second value here as the
> + * controller's higher rate target for calibration.
> + */
> + nfreq = of_property_count_u32_elems(np, "spi-max-frequency");
> + if (nfreq == 2) {
> + ret = of_property_read_u32_index(np, "spi-max-frequency", 1,
> + &f_pdata->max_clk_rate);
> + if (ret) {
> + dev_err(&pdev->dev, "couldn't read spi-max-frequency[1]\n");
> + return ret;
> + }
> + } else if (nfreq == 1) {
> + f_pdata->max_clk_rate = 0;
> + } else {
> dev_err(&pdev->dev, "couldn't determine spi-max-frequency\n");
> return -ENXIO;
> }
Why do we repeat that operation in the driver? Can't we just use what
the core has already done for us? Seems like we are parsing the same
data twice (even before this patchset).
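To make the question concrete, here is a minimal standalone sketch (plain C,
not kernel code) of the one-or-two-value interpretation being done once, so
that a consumer can use the result instead of re-reading the property.
struct freq_pair and parse_max_frequency() are hypothetical names, and
rejecting base > max is an assumption, not something the binding states.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type; not a kernel structure. */
struct freq_pair {
	uint32_t base_hz; /* conservative rate, 0 when no tiering is described */
	uint32_t max_hz;  /* highest rate the device supports */
};

/*
 * Interpret an already-decoded spi-max-frequency cell array.
 * Returns 0 on success, -1 when the property is absent or malformed.
 */
static int parse_max_frequency(const uint32_t *cells, size_t count,
			       struct freq_pair *out)
{
	if (!cells || !out)
		return -1;

	switch (count) {
	case 1:
		/* <max-freq>: single rate, no tiered speed support */
		out->base_hz = 0;
		out->max_hz = cells[0];
		return 0;
	case 2:
		/* <base-freq max-freq>; assumption: base must not exceed max */
		if (cells[0] > cells[1])
			return -1;
		out->base_hz = cells[0];
		out->max_hz = cells[1];
		return 0;
	default:
		return -1;
	}
}
```

With a result like this available from one place, the driver would only need
to check whether base_hz is non-zero to know that tuning is worthwhile.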
Thanks,
Miquèl
* [PATCH v3 09/13] spi: cadence-quadspi: reject 2-byte-address DDR ops on PHY-tunable hardware
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (7 preceding siblings ...)
2026-05-27 17:55 ` [PATCH v3 08/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-28 9:01 ` Miquel Raynal
2026-05-27 17:55 ` [PATCH v3 10/13] spi: cadence-quadspi: enable PHY for direct reads and indirect writes Santhosh Kumar K
` (4 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Erratum i2383 affects the AM654 OSPI controller: in PHY DDR mode,
operations with a 2-byte address cause an internal state machine to
mis-compare the transmitted address byte count against 1 instead of 2,
locking up the address phase. [0]
Add a CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk flag and set it on the am654_ospi
platform data. In cqspi_supports_mem_op(), when a controller carries this
quirk and has PHY tuning support, reject DDR operations that use 2-byte
addressing.
[0] https://www.ti.com/lit/er/sprz544c/sprz544c.pdf
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 508bc5bc4ab5..72208d376305 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -49,6 +49,7 @@ static_assert(CQSPI_MAX_CHIPSELECT <= SPI_DEVICE_CS_CNT_MAX);
#define CQSPI_DISABLE_RUNTIME_PM BIT(10)
#define CQSPI_NO_INDIRECT_MODE BIT(11)
#define CQSPI_HAS_WR_PROTECT BIT(12)
+#define CQSPI_NO_2BYTE_ADDR_PHY_DDR BIT(13)
/* Capabilities */
#define CQSPI_SUPPORTS_OCTAL BIT(0)
@@ -1627,6 +1628,18 @@ static bool cqspi_supports_mem_op(struct spi_mem *mem,
if (op->data.nbytes && op->data.buswidth != 8)
return false;
+ /*
+ * Erratum i2383: In PHY DDR mode, 2-byte addressing causes an
+ * internal state machine to mis-compare the transmitted
+ * address byte count against 1 instead of 2, locking up the
+ * address phase. Reject such ops on controllers that need it.
+ */
+ if (cqspi->ddata &&
+ (cqspi->ddata->quirks & CQSPI_NO_2BYTE_ADDR_PHY_DDR)) {
+ if (op->addr.nbytes == 2 && cqspi->ddata->execute_tuning)
+ return false;
+ }
+
if (cqspi->is_rzn1)
return false;
} else if (!all_false) {
@@ -3882,7 +3895,7 @@ static const struct cqspi_driver_platdata k2g_qspi = {
static const struct cqspi_driver_platdata am654_ospi = {
.hwcaps_mask = CQSPI_SUPPORTS_OCTAL | CQSPI_SUPPORTS_QUAD,
- .quirks = CQSPI_NEEDS_WR_DELAY,
+ .quirks = CQSPI_NEEDS_WR_DELAY | CQSPI_NO_2BYTE_ADDR_PHY_DDR,
.execute_tuning = cqspi_am654_ospi_execute_tuning,
};
--
2.34.1
* Re: [PATCH v3 09/13] spi: cadence-quadspi: reject 2-byte-address DDR ops on PHY-tunable hardware
2026-05-27 17:55 ` [PATCH v3 09/13] spi: cadence-quadspi: reject 2-byte-address DDR ops on PHY-tunable hardware Santhosh Kumar K
@ 2026-05-28 9:01 ` Miquel Raynal
0 siblings, 0 replies; 25+ messages in thread
From: Miquel Raynal @ 2026-05-28 9:01 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, richard, vigneshr, pratyush,
mwalle, takahiro.kuwano, linux-spi, devicetree, linux-kernel,
linux-mtd, praneeth, u-kumar1, a-dutta
On 27/05/2026 at 23:25:23 +0530, Santhosh Kumar K <s-k6@ti.com> wrote:
> Erratum i2383 affects the AM654 OSPI controller: in PHY DDR mode,
> operations with a 2-byte address cause an internal state machine to
> mis-compare the transmitted address byte count against 1 instead of 2,
> locking up the address phase. [0]
>
> Add a CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk flag and set it on the am654_ospi
> platform data. In cqspi_supports_mem_op(), when a controller carries this
> quirk and has PHY tuning support, reject DDR operations that use 2-byte
> addressing.
>
> [0] https://www.ti.com/lit/er/sprz544c/sprz544c.pdf
>
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
> ---
> drivers/spi/spi-cadence-quadspi.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index 508bc5bc4ab5..72208d376305 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
> @@ -49,6 +49,7 @@ static_assert(CQSPI_MAX_CHIPSELECT <= SPI_DEVICE_CS_CNT_MAX);
> #define CQSPI_DISABLE_RUNTIME_PM BIT(10)
> #define CQSPI_NO_INDIRECT_MODE BIT(11)
> #define CQSPI_HAS_WR_PROTECT BIT(12)
> +#define CQSPI_NO_2BYTE_ADDR_PHY_DDR BIT(13)
>
> /* Capabilities */
> #define CQSPI_SUPPORTS_OCTAL BIT(0)
> @@ -1627,6 +1628,18 @@ static bool cqspi_supports_mem_op(struct spi_mem *mem,
> if (op->data.nbytes && op->data.buswidth != 8)
> return false;
>
> + /*
> + * Erratum i2383: In PHY DDR mode, 2-byte addressing causes an
> + * internal state machine to mis-compare the transmitted
> + * address byte count against 1 instead of 2, locking up the
> + * address phase. Reject such ops on controllers that need it.
> + */
> + if (cqspi->ddata &&
> + (cqspi->ddata->quirks & CQSPI_NO_2BYTE_ADDR_PHY_DDR)) {
> + if (op->addr.nbytes == 2 && cqspi->ddata->execute_tuning)
> + return false;
> + }
I don't think this is a valid approach. What we want is to prevent
tuning in octal DTR mode with 2-byte addressing, instead of preventing
reads/writes in octal DTR mode after tuning. Have you tried on an AM62A LP
SK? I bet probe fails.
The quirk should be handled at the beginning of the tuning procedure, so
we skip tuning entirely in this case.
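A minimal sketch of that suggestion, in standalone C rather than driver code:
the quirk is evaluated once against the read op before any tuning starts, and
the op itself stays supported at base speed. struct op_view and
tuning_allowed() are hypothetical names; only the quirk bit position is taken
from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors CQSPI_NO_2BYTE_ADDR_PHY_DDR (BIT(13)) from the patch. */
#define QUIRK_NO_2BYTE_ADDR_PHY_DDR (1u << 13)

/* Hypothetical, reduced view of the op fields that matter here. */
struct op_view {
	uint8_t addr_nbytes;
	bool dtr;
};

/*
 * Decide whether PHY tuning may run for this read op. Returning false
 * means "skip tuning and keep operating at base speed", not "reject
 * the op".
 */
static bool tuning_allowed(uint32_t quirks, const struct op_view *read_op)
{
	if ((quirks & QUIRK_NO_2BYTE_ADDR_PHY_DDR) &&
	    read_op->dtr && read_op->addr_nbytes == 2)
		return false;

	return true;
}
```

A tuning callback would call this first and return early when it yields
false, leaving cqspi_supports_mem_op() untouched.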
Thanks,
Miquèl
* [PATCH v3 10/13] spi: cadence-quadspi: enable PHY for direct reads and indirect writes
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (8 preceding siblings ...)
2026-05-27 17:55 ` [PATCH v3 09/13] spi: cadence-quadspi: reject 2-byte-address DDR ops on PHY-tunable hardware Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-28 9:09 ` Miquel Raynal
2026-05-27 17:55 ` [PATCH v3 11/13] mtd: spinand: run PHY tuning after init and update dirmap frequencies Santhosh Kumar K
` (3 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
After PHY tuning completes, data transfers still use the default
read-capture path. The PHY pipeline must be activated around each
eligible transfer to benefit from the calibrated delay settings.
Add cqspi_phy_enable() to toggle PHY mode. Enabling sets the calibrated
read-capture delay, asserts PHY_EN and PHY_PIPELINE, and decrements the
dummy cycle count by one since the PHY pipeline absorbs that latency.
Disabling reverses all three. Returns cqspi_wait_idle() so callers can
abort if the controller stalls on enable; disable is best-effort.
Split cqspi_direct_read_execute() so PHY-eligible reads run DMA over the
16-byte-aligned middle section with PHY active, while unaligned head and
tail bytes are transferred without PHY. PHY is used when use_phy is set,
the transfer exceeds 16 bytes, and the frequency matches the tuned rate.
cqspi_memcpy_fromio() handles small and non-DMA-able transfers, with
special handling for 8D-8D-8D to ensure 2-byte-aligned I/O accesses.
For indirect writes, PHY is enabled for transfers of at least 1 KB
where the setup overhead is amortized.
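The head/middle/tail split described above can be sketched in standalone C as
follows. This is an illustration of the arithmetic only, not the driver
function: struct read_split and split_read() are hypothetical names, and the
handling of requests too short to contain an aligned span is an assumption
(the patch never reaches the split for transfers of 16 bytes or fewer).

```c
#include <stddef.h>
#include <stdint.h>

#define PHY_ALIGN 16u

/* Hypothetical description of the three sub-transfers. */
struct read_split {
	uint64_t head_off;
	size_t head_len;  /* unaligned start, read without PHY */
	uint64_t mid_off;
	size_t mid_len;   /* 16-byte-aligned span, read with PHY */
	uint64_t tail_off;
	size_t tail_len;  /* unaligned end, read without PHY */
};

static struct read_split split_read(uint64_t from, size_t len)
{
	const uint64_t mask = ~(uint64_t)(PHY_ALIGN - 1);
	uint64_t end = from + len;
	uint64_t mid_start = (from + PHY_ALIGN - 1) & mask; /* round up */
	uint64_t mid_end = end & mask;                      /* round down */
	struct read_split s = { 0 };

	if (mid_end < mid_start) {
		/* Assumption: no aligned boundary inside, all of it is head. */
		s.head_off = from;
		s.head_len = len;
		s.mid_off = end;
		s.tail_off = end;
		return s;
	}

	s.head_off = from;
	s.head_len = (size_t)(mid_start - from);
	s.mid_off = mid_start;
	s.mid_len = (size_t)(mid_end - mid_start);
	s.tail_off = mid_end;
	s.tail_len = (size_t)(end - mid_end);
	return s;
}
```

The three lengths always sum to the requested length, so the destination
buffer pointer can simply be advanced by each sub-transfer in turn.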
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 181 ++++++++++++++++++++++++++++--
1 file changed, 171 insertions(+), 10 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 72208d376305..80e7c572ab80 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -564,6 +564,61 @@ static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
writel(reg, reg_base + CQSPI_REG_READCAPTURE);
}
+static int cqspi_phy_enable(struct cqspi_flash_pdata *f_pdata, bool enable)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ void __iomem *reg_base = cqspi->iobase;
+ u32 reg;
+ u8 dummy;
+
+ if (enable) {
+ cqspi_readdata_capture(cqspi, true, f_pdata->use_dqs,
+ f_pdata->phy_setting.read_delay);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg |= CQSPI_REG_CONFIG_PHY_EN | CQSPI_REG_CONFIG_PHY_PIPELINE;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ /*
+ * The PHY data-capture pipeline absorbs one dummy cycle's
+ * worth of latency; reduce the count to avoid over-compensation.
+ */
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy--;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+ } else {
+ cqspi_readdata_capture(cqspi, !cqspi->rclk_en, false,
+ f_pdata->read_delay);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~(CQSPI_REG_CONFIG_PHY_EN |
+ CQSPI_REG_CONFIG_PHY_PIPELINE);
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy++;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+ }
+
+ return cqspi_wait_idle(cqspi);
+}
+
static int cqspi_exec_flash_cmd(struct cqspi_st *cqspi, unsigned int reg)
{
void __iomem *reg_base = cqspi->iobase;
@@ -1191,6 +1246,7 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
void __iomem *reg_base = cqspi->iobase;
unsigned int remaining = n_tx;
unsigned int write_bytes;
+ bool use_phy_write;
int ret;
if (!refcount_read(&cqspi->refcount))
@@ -1226,6 +1282,15 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
if (cqspi->apb_ahb_hazard)
readl(reg_base + CQSPI_REG_INDIRECTWR);
+ /* Use PHY only for large writes where setup overhead is amortized */
+ use_phy_write = n_tx >= SZ_1K && f_pdata->use_phy;
+
+ if (use_phy_write) {
+ ret = cqspi_phy_enable(f_pdata, true);
+ if (ret)
+ goto failwr;
+ }
+
while (remaining > 0) {
size_t write_words, mod_bytes;
@@ -1266,6 +1331,9 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
goto failwr;
}
+ if (use_phy_write)
+ cqspi_phy_enable(f_pdata, false);
+
/* Disable interrupt. */
writel(0, reg_base + CQSPI_REG_IRQMASK);
@@ -1277,6 +1345,9 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
return 0;
failwr:
+ if (use_phy_write)
+ cqspi_phy_enable(f_pdata, false);
+
/* Disable interrupt. */
writel(0, reg_base + CQSPI_REG_IRQMASK);
@@ -1448,8 +1519,15 @@ static void cqspi_rx_dma_callback(void *param)
complete(&cqspi->rx_dma_complete);
}
-static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
- u_char *buf, loff_t from, size_t len)
+static bool cqspi_use_phy(struct cqspi_flash_pdata *f_pdata,
+ const struct spi_mem_op *op)
+{
+ return f_pdata->use_phy && op->data.nbytes > 16 &&
+ op->max_freq == f_pdata->max_clk_rate;
+}
+
+static int cqspi_direct_read_dma(struct cqspi_flash_pdata *f_pdata, u_char *buf,
+ loff_t from, size_t len)
{
struct cqspi_st *cqspi = f_pdata->cqspi;
struct device *dev = &cqspi->pdev->dev;
@@ -1461,19 +1539,14 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
dma_addr_t dma_dst;
struct device *ddev;
- if (!cqspi->rx_chan || !virt_addr_valid(buf)) {
- memcpy_fromio(buf, cqspi->ahb_base + from, len);
- return 0;
- }
-
ddev = cqspi->rx_chan->device->dev;
dma_dst = dma_map_single(ddev, buf, len, DMA_FROM_DEVICE);
if (dma_mapping_error(ddev, dma_dst)) {
dev_err(dev, "dma mapping failed\n");
return -ENOMEM;
}
- tx = dmaengine_prep_dma_memcpy(cqspi->rx_chan, dma_dst, dma_src,
- len, flags);
+ tx = dmaengine_prep_dma_memcpy(cqspi->rx_chan, dma_dst, dma_src, len,
+ flags);
if (!tx) {
dev_err(dev, "device_prep_dma_memcpy error\n");
ret = -EIO;
@@ -1507,6 +1580,94 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
return ret;
}
+static void cqspi_memcpy_fromio(const struct spi_mem_op *op, void *to,
+ const void __iomem *from, size_t count)
+{
+ if (op->data.buswidth == 8 && op->data.dtr) {
+ unsigned long from_addr = (unsigned long)from;
+
+ /* Handle unaligned start with 2-byte read */
+ if (count && !IS_ALIGNED(from_addr, 4)) {
+ *(u16 *)to = __raw_readw(from);
+ from += 2;
+ to += 2;
+ count -= 2;
+ }
+
+ /* Use 4-byte reads for aligned bulk (no readq for 32-bit) */
+ if (count >= 4) {
+ size_t len = round_down(count, 4);
+
+ memcpy_fromio(to, from, len);
+ from += len;
+ to += len;
+ count -= len;
+ }
+
+ /* Handle remaining 2 bytes */
+ if (count)
+ *(u16 *)to = __raw_readw(from);
+
+ return;
+ }
+
+ memcpy_fromio(to, from, count);
+}
+
+static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
+ const struct spi_mem_op *op)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ loff_t from = op->addr.val;
+ loff_t from_aligned, to_aligned;
+ size_t len = op->data.nbytes;
+ size_t len_aligned;
+ u_char *buf = op->data.buf.in;
+ int ret;
+
+ if (!cqspi->rx_chan || !virt_addr_valid(buf) || len <= 16) {
+ cqspi_memcpy_fromio(op, buf, cqspi->ahb_base + from, len);
+ return 0;
+ }
+
+ if (!cqspi_use_phy(f_pdata, op))
+ return cqspi_direct_read_dma(f_pdata, buf, from, len);
+
+ /* Split into unaligned head, aligned middle, unaligned tail */
+ from_aligned = ALIGN(from, 16);
+ to_aligned = ALIGN_DOWN(from + len, 16);
+ len_aligned = to_aligned - from_aligned;
+
+ if (from != from_aligned) {
+ ret = cqspi_direct_read_dma(f_pdata, buf, from,
+ from_aligned - from);
+ if (ret)
+ return ret;
+ buf += from_aligned - from;
+ }
+
+ if (len_aligned) {
+ ret = cqspi_phy_enable(f_pdata, true);
+ if (ret)
+ return ret;
+ ret = cqspi_direct_read_dma(f_pdata, buf, from_aligned,
+ len_aligned);
+ cqspi_phy_enable(f_pdata, false);
+ if (ret)
+ return ret;
+ buf += len_aligned;
+ }
+
+ if (to_aligned != (from + len)) {
+ ret = cqspi_direct_read_dma(f_pdata, buf, to_aligned,
+ (from + len) - to_aligned);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
const struct spi_mem_op *op)
{
@@ -1524,7 +1685,7 @@ static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
if ((cqspi->use_direct_mode && ((from + len) <= cqspi->ahb_size)) ||
(cqspi->ddata && cqspi->ddata->quirks & CQSPI_NO_INDIRECT_MODE))
- return cqspi_direct_read_execute(f_pdata, buf, from, len);
+ return cqspi_direct_read_execute(f_pdata, op);
if (cqspi->use_dma_read && ddata && ddata->indirect_read_dma &&
virt_addr_valid(buf) && ((dma_align & CQSPI_DMA_UNALIGN) == 0))
--
2.34.1
* Re: [PATCH v3 10/13] spi: cadence-quadspi: enable PHY for direct reads and indirect writes
2026-05-27 17:55 ` [PATCH v3 10/13] spi: cadence-quadspi: enable PHY for direct reads and indirect writes Santhosh Kumar K
@ 2026-05-28 9:09 ` Miquel Raynal
0 siblings, 0 replies; 25+ messages in thread
From: Miquel Raynal @ 2026-05-28 9:09 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, richard, vigneshr, pratyush,
mwalle, takahiro.kuwano, linux-spi, devicetree, linux-kernel,
linux-mtd, praneeth, u-kumar1, a-dutta
On 27/05/2026 at 23:25:24 +0530, Santhosh Kumar K <s-k6@ti.com> wrote:
> After PHY tuning completes, data transfers still use the default
> read-capture path. The PHY pipeline must be activated around each
> eligible transfer to benefit from the calibrated delay settings.
>
> Add cqspi_phy_enable() to toggle PHY mode. Enabling sets the calibrated
> read-capture delay, asserts PHY_EN and PHY_PIPELINE, and decrements the
> dummy cycle count by one since the PHY pipeline absorbs that latency.
> Disabling reverses all three. Returns cqspi_wait_idle() so callers can
> abort if the controller stalls on enable; disable is best-effort.
>
> Split cqspi_direct_read_execute() so PHY-eligible reads run DMA over the
> 16-byte-aligned middle section with PHY active, while unaligned head and
> tail bytes are transferred without PHY. PHY is used when use_phy is set,
> the transfer exceeds 16 bytes, and the frequency matches the tuned rate.
> cqspi_memcpy_fromio() handles small and non-DMA-able transfers, with
> special handling for 8D-8D-8D to ensure 2-byte-aligned I/O accesses.
>
> For indirect writes, PHY is enabled for transfers of at least 1 KB
KiB :-)
> where the setup overhead is amortized.
>
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
> ---
> drivers/spi/spi-cadence-quadspi.c | 181 ++++++++++++++++++++++++++++--
> 1 file changed, 171 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index 72208d376305..80e7c572ab80 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
> @@ -564,6 +564,61 @@ static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
> writel(reg, reg_base + CQSPI_REG_READCAPTURE);
> }
>
> +static int cqspi_phy_enable(struct cqspi_flash_pdata *f_pdata, bool enable)
I'm fine with the logic, just the naming is very "TI" specific here. Can
we name the helper "cqspi_tune_phy(f_pdata, enable)"?
[...]
> static int cqspi_exec_flash_cmd(struct cqspi_st *cqspi, unsigned int reg)
> {
> void __iomem *reg_base = cqspi->iobase;
> @@ -1191,6 +1246,7 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
> void __iomem *reg_base = cqspi->iobase;
> unsigned int remaining = n_tx;
> unsigned int write_bytes;
> + bool use_phy_write;
> int ret;
>
> if (!refcount_read(&cqspi->refcount))
> @@ -1226,6 +1282,15 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
> if (cqspi->apb_ahb_hazard)
> readl(reg_base + CQSPI_REG_INDIRECTWR);
>
> + /* Use PHY only for large writes where setup overhead is amortized */
> + use_phy_write = n_tx >= SZ_1K && f_pdata->use_phy;
Maybe also "f_pdata->use_tuned_phy"?
> + if (use_phy_write) {
> + ret = cqspi_phy_enable(f_pdata, true);
> + if (ret)
> + goto failwr;
> + }
> +
> while (remaining > 0) {
> size_t write_words, mod_bytes;
>
> @@ -1266,6 +1331,9 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
> goto failwr;
> }
>
> + if (use_phy_write)
> + cqspi_phy_enable(f_pdata, false);
> +
> /* Disable interrupt. */
> writel(0, reg_base + CQSPI_REG_IRQMASK);
>
> @@ -1277,6 +1345,9 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
> return 0;
>
> failwr:
> + if (use_phy_write)
> + cqspi_phy_enable(f_pdata, false);
> +
> /* Disable interrupt. */
> writel(0, reg_base + CQSPI_REG_IRQMASK);
>
> @@ -1448,8 +1519,15 @@ static void cqspi_rx_dma_callback(void *param)
> complete(&cqspi->rx_dma_complete);
> }
>
> -static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
> - u_char *buf, loff_t from, size_t len)
> +static bool cqspi_use_phy(struct cqspi_flash_pdata *f_pdata,
> + const struct spi_mem_op *op)
> +{
> + return f_pdata->use_phy && op->data.nbytes > 16 &&
Why is the check looking for 16 here, and 1 KiB above?
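For reference, the two eligibility checks from the patch can be put side by
side. The sketch below restates them in standalone C; struct phy_state and
both function names are hypothetical, while the thresholds are the values
used in the patch (reads must exceed 16 bytes and match the tuned frequency,
writes must be at least SZ_1K).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PHY_READ_MIN_BYTES  16u   /* reads must be strictly larger */
#define PHY_WRITE_MIN_BYTES 1024u /* SZ_1K; writes must be at least this */

/* Hypothetical stand-in for the per-flash tuning state. */
struct phy_state {
	bool tuned;        /* corresponds to f_pdata->use_phy */
	uint32_t tuned_hz; /* corresponds to f_pdata->max_clk_rate */
};

static bool use_phy_for_read(const struct phy_state *st, size_t nbytes,
			     uint32_t op_max_hz)
{
	return st->tuned && nbytes > PHY_READ_MIN_BYTES &&
	       op_max_hz == st->tuned_hz;
}

static bool use_phy_for_write(const struct phy_state *st, size_t nbytes)
{
	return st->tuned && nbytes >= PHY_WRITE_MIN_BYTES;
}
```

Written this way the asymmetry is easy to see: besides the different size
thresholds, only the read path also compares the op frequency with the tuned
rate.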
> + op->max_freq == f_pdata->max_clk_rate;
> +}
> +
> +static int cqspi_direct_read_dma(struct cqspi_flash_pdata *f_pdata, u_char *buf,
> + loff_t from, size_t len)
> {
> struct cqspi_st *cqspi = f_pdata->cqspi;
> struct device *dev = &cqspi->pdev->dev;
> @@ -1461,19 +1539,14 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
> dma_addr_t dma_dst;
> struct device *ddev;
>
> - if (!cqspi->rx_chan || !virt_addr_valid(buf)) {
> - memcpy_fromio(buf, cqspi->ahb_base + from, len);
> - return 0;
> - }
This (and the changes below) doesn't seem to be directly related to the PHY
addition. Could we have those changes done in a separate patch, before
introducing PHY tuning use?
> -
> ddev = cqspi->rx_chan->device->dev;
> dma_dst = dma_map_single(ddev, buf, len, DMA_FROM_DEVICE);
> if (dma_mapping_error(ddev, dma_dst)) {
> dev_err(dev, "dma mapping failed\n");
> return -ENOMEM;
> }
> - tx = dmaengine_prep_dma_memcpy(cqspi->rx_chan, dma_dst, dma_src,
> - len, flags);
> + tx = dmaengine_prep_dma_memcpy(cqspi->rx_chan, dma_dst, dma_src, len,
> + flags);
Not related to the change, is it?
> if (!tx) {
> dev_err(dev, "device_prep_dma_memcpy error\n");
> ret = -EIO;
> @@ -1507,6 +1580,94 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
> return ret;
> }
>
[...]
> static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
> const struct spi_mem_op *op)
> {
> @@ -1524,7 +1685,7 @@ static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
>
> if ((cqspi->use_direct_mode && ((from + len) <= cqspi->ahb_size)) ||
> (cqspi->ddata && cqspi->ddata->quirks & CQSPI_NO_INDIRECT_MODE))
> - return cqspi_direct_read_execute(f_pdata, buf, from, len);
> + return cqspi_direct_read_execute(f_pdata, op);
This change could also be done in a different commit.
>
> if (cqspi->use_dma_read && ddata && ddata->indirect_read_dma &&
> virt_addr_valid(buf) && ((dma_align & CQSPI_DMA_UNALIGN) == 0))
Thanks,
Miquèl
* [PATCH v3 11/13] mtd: spinand: run PHY tuning after init and update dirmap frequencies
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (9 preceding siblings ...)
2026-05-27 17:55 ` [PATCH v3 10/13] spi: cadence-quadspi: enable PHY for direct reads and indirect writes Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-28 9:27 ` Miquel Raynal
2026-05-27 17:55 ` [PATCH v3 12/13] mtd: spi-nor: extract read op template construction into helper Santhosh Kumar K
` (2 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Run spi_mem_execute_tuning() in spinand_probe() after spinand_init()
completes. The read and write op templates are copied into persistent
fields in spinand_device so the controller can write the validated
frequency directly back into them. On success, propagate that frequency
to every dirmap's primary and secondary op templates. Updating the
secondary template ensures continuous-read dirmaps also benefit from
the validated speed, not just the primary read path.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/mtd/nand/spi/core.c | 35 +++++++++++++++++++++++++++++++++++
include/linux/mtd/spinand.h | 4 ++++
2 files changed, 39 insertions(+)
diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
index f1084d5e04b9..9b54e4607cfe 100644
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -2030,6 +2030,41 @@ static int spinand_probe(struct spi_mem *mem)
if (ret)
return ret;
+ /*
+ * Copy the read and write op templates into persistent fields so
+ * execute_tuning can write the validated frequency back into them.
+ * Tuning failure is non-fatal; the device operates at base speed.
+ */
+ spinand->max_read_op = *spinand->op_templates->read_cache;
+ spinand->max_write_op = *spinand->op_templates->write_cache;
+
+ ret = spi_mem_execute_tuning(mem, &spinand->max_read_op,
+ &spinand->max_write_op);
+ if (ret && ret != -EOPNOTSUPP)
+ dev_warn(&mem->spi->dev, "Failed to execute PHY tuning: %d\n",
+ ret);
+
+ /*
+ * Dirmaps were set up in spinand_init() before tuning ran; update
+ * their op templates to use the validated frequency.
+ */
+ if (!ret) {
+ struct nand_device *nand = spinand_to_nand(spinand);
+ int i;
+
+ for (i = 0; i < nand->memorg.planes_per_lun; i++) {
+ if (spinand->dirmaps[i].rdesc) {
+ spinand->dirmaps[i].rdesc->info.primary_op_tmpl.max_freq =
+ spinand->max_read_op.max_freq;
+ spinand->dirmaps[i].rdesc->info.secondary_op_tmpl.max_freq =
+ spinand->max_read_op.max_freq;
+ }
+ if (spinand->dirmaps[i].wdesc)
+ spinand->dirmaps[i].wdesc->info.primary_op_tmpl.max_freq =
+ spinand->max_write_op.max_freq;
+ }
+ }
+
ret = mtd_device_register(mtd, NULL, 0);
if (ret)
goto err_spinand_cleanup;
diff --git a/include/linux/mtd/spinand.h b/include/linux/mtd/spinand.h
index 44f4347104d6..e5af90281762 100644
--- a/include/linux/mtd/spinand.h
+++ b/include/linux/mtd/spinand.h
@@ -786,6 +786,10 @@ struct spinand_device {
struct spinand_dirmap *dirmaps;
+ /* Persistent op templates updated by execute_tuning with validated speed. */
+ struct spi_mem_op max_read_op;
+ struct spi_mem_op max_write_op;
+
int (*select_target)(struct spinand_device *spinand,
unsigned int target);
unsigned int cur_target;
--
2.34.1
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v3 11/13] mtd: spinand: run PHY tuning after init and update dirmap frequencies
2026-05-27 17:55 ` [PATCH v3 11/13] mtd: spinand: run PHY tuning after init and update dirmap frequencies Santhosh Kumar K
@ 2026-05-28 9:27 ` Miquel Raynal
0 siblings, 0 replies; 25+ messages in thread
From: Miquel Raynal @ 2026-05-28 9:27 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, richard, vigneshr, pratyush,
mwalle, takahiro.kuwano, linux-spi, devicetree, linux-kernel,
linux-mtd, praneeth, u-kumar1, a-dutta
Hi Santhosh,
On 27/05/2026 at 23:25:25 +0530, Santhosh Kumar K <s-k6@ti.com> wrote:
> Run spi_mem_execute_tuning() in spinand_probe() after spinand_init()
> completes. The read and write op templates are copied into persistent
> fields in spinand_device so the controller can write the validated
> frequency directly back into them. On success, propagate that frequency
> to every dirmap's primary and secondary op templates. Updating the
> secondary template ensures continuous-read dirmaps also benefit from
> the validated speed, not just the primary read path.
>
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
> ---
> drivers/mtd/nand/spi/core.c | 35 +++++++++++++++++++++++++++++++++++
> include/linux/mtd/spinand.h | 4 ++++
> 2 files changed, 39 insertions(+)
>
> diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> index f1084d5e04b9..9b54e4607cfe 100644
> --- a/drivers/mtd/nand/spi/core.c
> +++ b/drivers/mtd/nand/spi/core.c
> @@ -2030,6 +2030,41 @@ static int spinand_probe(struct spi_mem *mem)
> if (ret)
> return ret;
>
> + /*
> + * Copy the read and write op templates into persistent fields so
> + * execute_tuning can write the validated frequency back into them.
> + * Tuning failure is non-fatal; the device operates at base speed.
> + */
> + spinand->max_read_op = *spinand->op_templates->read_cache;
> + spinand->max_write_op = *spinand->op_templates->write_cache;
> +
> + ret = spi_mem_execute_tuning(mem, &spinand->max_read_op,
> + &spinand->max_write_op);
> + if (ret && ret != -EOPNOTSUPP)
> + dev_warn(&mem->spi->dev, "Failed to execute PHY tuning: %d\n",
> + ret);
> +
> + /*
> + * Dirmaps were set up in spinand_init() before tuning ran; update
> + * their op templates to use the validated frequency.
> + */
> + if (!ret) {
> + struct nand_device *nand = spinand_to_nand(spinand);
> + int i;
> +
> + for (i = 0; i < nand->memorg.planes_per_lun; i++) {
> + if (spinand->dirmaps[i].rdesc) {
> + spinand->dirmaps[i].rdesc->info.primary_op_tmpl.max_freq =
> + spinand->max_read_op.max_freq;
> + spinand->dirmaps[i].rdesc->info.secondary_op_tmpl.max_freq =
> + spinand->max_read_op.max_freq;
> + }
> + if (spinand->dirmaps[i].wdesc)
> + spinand->dirmaps[i].wdesc->info.primary_op_tmpl.max_freq =
> + spinand->max_write_op.max_freq;
> + }
> + }
Unfortunately, hot fixing the dirmaps is invalid. When we take the best
variant, we select a maximum speed that may be lower than the tuned PHY
speed. We cannot just overwrite that value without consequence, because
depending on the boundaries we cross, extra dummy cycles may be
required.
I believe spinand_select_op_variant() should be aware of the different
possible speeds. It should select against the max_speed_hz capability
rather than base_speed_hz, and fall back to base_speed_hz in case of a
problem.
Alternatively, we could go through the whole I/O variant selection
again after tuning, with the actual maximum speed set.
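To illustrate the concern, here is a minimal standalone sketch of a frequency-aware selection. It is not the spinand core code: struct read_variant, select_variant(), select_after_tuning() and the demo table are hypothetical names and values made up for the example, and the duration estimate only counts dummy and data cycles.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a read-from-cache op variant. */
struct read_variant {
	const char *name;
	uint32_t max_freq_hz;      /* 0 means no variant-specific limit */
	unsigned int dummy_cycles; /* dummy clock cycles before data */
	unsigned int data_lines;   /* number of data lines (1, 2, 4, 8) */
};

/* Illustrative table: the short-dummy variant is only valid up to 84 MHz. */
static const struct read_variant demo_variants[] = {
	{ "read-8-dummy",  84000000,  8, 8 },
	{ "read-16-dummy", 0,        16, 8 },
};

/* Estimated duration in ns of the dummy and data phases at bus_hz. */
static uint64_t variant_duration_ns(const struct read_variant *v,
				    uint32_t bus_hz, size_t nbytes)
{
	uint32_t hz = bus_hz;
	uint64_t cycles;

	/* A variant never runs faster than its own limit. */
	if (v->max_freq_hz && v->max_freq_hz < hz)
		hz = v->max_freq_hz;

	cycles = v->dummy_cycles + (uint64_t)nbytes * 8 / v->data_lines;
	return cycles * 1000000000ULL / hz;
}

/* Return the index of the variant with the shortest estimated duration. */
static int select_variant(const struct read_variant *v, size_t n,
			  uint32_t bus_hz, size_t nbytes)
{
	int best = 0;
	size_t i;

	for (i = 1; i < n; i++)
		if (variant_duration_ns(&v[i], bus_hz, nbytes) <
		    variant_duration_ns(&v[best], bus_hz, nbytes))
			best = (int)i;
	return best;
}

/* Select for the maximum speed when tuning passed, else for the base speed. */
static int select_after_tuning(const struct read_variant *v, size_t n,
			       uint32_t base_hz, uint32_t max_hz,
			       int tuning_ok, size_t nbytes)
{
	return select_variant(v, n, tuning_ok ? max_hz : base_hz, nbytes);
}
```

With these made-up numbers, a 2048-byte read at a 50 MHz base speed picks the 8-dummy variant, while the same read at 166 MHz picks the 16-dummy one. Overwriting max_freq on the variant chosen at base speed would therefore run it beyond its 84 MHz limit.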
Thanks,
Miquèl
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 12/13] mtd: spi-nor: extract read op template construction into helper
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (10 preceding siblings ...)
2026-05-27 17:55 ` [PATCH v3 11/13] mtd: spinand: run PHY tuning after init and update dirmap frequencies Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-27 17:55 ` [PATCH v3 13/13] mtd: spi-nor: run PHY tuning after init and update dirmap frequency Santhosh Kumar K
2026-05-28 8:30 ` [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Miquel Raynal
13 siblings, 0 replies; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
From: Pratyush Yadav <pratyush@kernel.org>
spi_nor_spimem_read_data() and spi_nor_create_read_dirmap() both
open-coded the same sequence: build the read spi_mem_op, call
spi_nor_spimem_setup_op(), convert dummy cycles to bytes, and—in the
dirmap case—explicitly patch data.buswidth because setup_op skips it
when data.nbytes is zero.
Introduce spi_nor_spimem_get_read_op() to centralise this logic.
Initialising the op with data.nbytes set to a non-zero value ensures
spi_nor_spimem_setup_op() populates data.buswidth, removing the need
for the manual override in the dirmap path. The dirmap template is
initialised directly from the helper; direct-read callers overwrite
addr.val, data.nbytes, and data.buf.in before submitting.
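The dummy-cycle conversion that the helper centralises can be checked in isolation. The sketch below mirrors the arithmetic in the diff; dummy_cycles_to_bytes() is a hypothetical standalone name, not a function in the SPI NOR core.

```c
#include <stdbool.h>

/*
 * Convert dummy clock cycles into the dummy byte count stored in the op.
 * One cycle moves 'buswidth' bits; in DTR mode two transfers happen per
 * clock, so the same number of cycles corresponds to twice as many bytes.
 */
static unsigned int dummy_cycles_to_bytes(unsigned int cycles,
					  unsigned int buswidth, bool dtr)
{
	unsigned int nbytes = (cycles * buswidth) / 8;

	if (dtr)
		nbytes *= 2;

	return nbytes;
}
```

For example, 8 dummy cycles on a single line give 1 dummy byte, 8 cycles on eight lines give 8 bytes, and 20 cycles on eight lines in DTR mode give 40 bytes.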
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/mtd/spi-nor/core.c | 66 +++++++++++++++++++++-----------------
1 file changed, 36 insertions(+), 30 deletions(-)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index a7bc458edc5c..2c9859fb0794 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -188,6 +188,37 @@ static int spi_nor_controller_ops_erase(struct spi_nor *nor, loff_t offs)
return nor->controller_ops->erase(nor, offs);
}
+/**
+ * spi_nor_spimem_get_read_op() - build a configured read op template
+ * @nor: the spi-nor device
+ *
+ * Returns a spi_mem_op with the command, address format, dummy cycles,
+ * and data buswidth configured for @nor. For direct reads, the caller
+ * must fill in addr.val, data.nbytes, and data.buf.in before use.
+ */
+static struct spi_mem_op spi_nor_spimem_get_read_op(struct spi_nor *nor)
+{
+ /*
+ * data.nbytes must be non-zero so spi_nor_spimem_setup_op()
+ * configures the data buswidth; callers replace it with the
+ * actual transfer length.
+ */
+ struct spi_mem_op op =
+ SPI_MEM_OP(SPI_MEM_OP_CMD(nor->read_opcode, 0),
+ SPI_MEM_OP_ADDR(nor->addr_nbytes, 0, 0),
+ SPI_MEM_OP_DUMMY(nor->read_dummy, 0),
+ SPI_MEM_OP_DATA_IN(2, NULL, 0));
+
+ spi_nor_spimem_setup_op(nor, &op, nor->read_proto);
+
+ /* convert the dummy cycles to the number of bytes */
+ op.dummy.nbytes = (nor->read_dummy * op.dummy.buswidth) / 8;
+ if (spi_nor_protocol_is_dtr(nor->read_proto))
+ op.dummy.nbytes *= 2;
+
+ return op;
+}
+
/**
* spi_nor_spimem_read_data() - read data from flash's memory region via
* spi-mem
@@ -201,21 +232,14 @@ static int spi_nor_controller_ops_erase(struct spi_nor *nor, loff_t offs)
static ssize_t spi_nor_spimem_read_data(struct spi_nor *nor, loff_t from,
size_t len, u8 *buf)
{
- struct spi_mem_op op =
- SPI_MEM_OP(SPI_MEM_OP_CMD(nor->read_opcode, 0),
- SPI_MEM_OP_ADDR(nor->addr_nbytes, from, 0),
- SPI_MEM_OP_DUMMY(nor->read_dummy, 0),
- SPI_MEM_OP_DATA_IN(len, buf, 0));
+ struct spi_mem_op op = spi_nor_spimem_get_read_op(nor);
bool usebouncebuf;
ssize_t nbytes;
int error;
- spi_nor_spimem_setup_op(nor, &op, nor->read_proto);
-
- /* convert the dummy cycles to the number of bytes */
- op.dummy.nbytes = (nor->read_dummy * op.dummy.buswidth) / 8;
- if (spi_nor_protocol_is_dtr(nor->read_proto))
- op.dummy.nbytes *= 2;
+ op.addr.val = from;
+ op.data.nbytes = len;
+ op.data.buf.in = buf;
usebouncebuf = spi_nor_spimem_bounce(nor, &op);
@@ -3642,28 +3666,10 @@ static int spi_nor_create_read_dirmap(struct spi_nor *nor)
{
struct spi_mem_dirmap_info info = {
.op_tmpl = &info.primary_op_tmpl,
- .primary_op_tmpl = SPI_MEM_OP(SPI_MEM_OP_CMD(nor->read_opcode, 0),
- SPI_MEM_OP_ADDR(nor->addr_nbytes, 0, 0),
- SPI_MEM_OP_DUMMY(nor->read_dummy, 0),
- SPI_MEM_OP_DATA_IN(0, NULL, 0)),
+ .primary_op_tmpl = spi_nor_spimem_get_read_op(nor),
.offset = 0,
.length = nor->params->size,
};
- struct spi_mem_op *op = info.op_tmpl;
-
- spi_nor_spimem_setup_op(nor, op, nor->read_proto);
-
- /* convert the dummy cycles to the number of bytes */
- op->dummy.nbytes = (nor->read_dummy * op->dummy.buswidth) / 8;
- if (spi_nor_protocol_is_dtr(nor->read_proto))
- op->dummy.nbytes *= 2;
-
- /*
- * Since spi_nor_spimem_setup_op() only sets buswidth when the number
- * of data bytes is non-zero, the data buswidth won't be set here. So,
- * do it explicitly.
- */
- op->data.buswidth = spi_nor_get_protocol_data_nbits(nor->read_proto);
nor->dirmap.rdesc = devm_spi_mem_dirmap_create(nor->dev, nor->spimem,
&info);
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v3 13/13] mtd: spi-nor: run PHY tuning after init and update dirmap frequency
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (11 preceding siblings ...)
2026-05-27 17:55 ` [PATCH v3 12/13] mtd: spi-nor: extract read op template construction into helper Santhosh Kumar K
@ 2026-05-27 17:55 ` Santhosh Kumar K
2026-05-28 8:30 ` [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Miquel Raynal
13 siblings, 0 replies; 25+ messages in thread
From: Santhosh Kumar K @ 2026-05-27 17:55 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Introduce a persistent max_read_op field in struct spi_nor. Populate it
with the correct op layout before creating the read dirmap so
execute_tuning() receives a fully configured op. After both dirmaps are
set up, run spi_mem_execute_tuning() to let the controller validate the
read frequency and write it back into max_read_op.max_freq. Patch the
dirmap's op template with this value so subsequent dirmap reads use
the validated speed.
spi_nor_spimem_get_read_op() is updated to propagate max_read_op.max_freq
into every returned op, so non-dirmap reads via spi_nor_spimem_read_data()
also benefit from the validated frequency automatically.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/mtd/spi-nor/core.c | 19 +++++++++++++++++++
include/linux/mtd/spi-nor.h | 3 +++
2 files changed, 22 insertions(+)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index 2c9859fb0794..207e0679549e 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -216,6 +216,9 @@ static struct spi_mem_op spi_nor_spimem_get_read_op(struct spi_nor *nor)
if (spi_nor_protocol_is_dtr(nor->read_proto))
op.dummy.nbytes *= 2;
+ /* Propagate the validated frequency; zero before tuning. */
+ op.max_freq = nor->max_read_op.max_freq;
+
return op;
}
@@ -3773,6 +3776,9 @@ static int spi_nor_probe(struct spi_mem *spimem)
return -ENOMEM;
}
+ /* Populate the persistent template with the correct op layout for tuning. */
+ nor->max_read_op = spi_nor_spimem_get_read_op(nor);
+
ret = spi_nor_create_read_dirmap(nor);
if (ret)
return ret;
@@ -3781,6 +3787,19 @@ static int spi_nor_probe(struct spi_mem *spimem)
if (ret)
return ret;
+ /* Tuning failure is non-fatal; the device operates at base speed. */
+ ret = spi_mem_execute_tuning(spimem, &nor->max_read_op, NULL);
+ if (ret && ret != -EOPNOTSUPP)
+ dev_warn(dev, "Failed to execute PHY tuning: %d\n", ret);
+
+ /*
+ * The dirmap was created before tuning ran; update its op template
+ * to use the validated frequency.
+ */
+ if (!ret && nor->dirmap.rdesc)
+ nor->dirmap.rdesc->info.primary_op_tmpl.max_freq =
+ nor->max_read_op.max_freq;
+
return mtd_device_register(&nor->mtd, data ? data->parts : NULL,
data ? data->nr_parts : 0);
}
diff --git a/include/linux/mtd/spi-nor.h b/include/linux/mtd/spi-nor.h
index cdcfe0fd2e7d..6a11625f7b2d 100644
--- a/include/linux/mtd/spi-nor.h
+++ b/include/linux/mtd/spi-nor.h
@@ -419,6 +419,9 @@ struct spi_nor {
struct spi_mem_dirmap_desc *wdesc;
} dirmap;
+ /* Persistent op template updated by execute_tuning with validated speed. */
+ struct spi_mem_op max_read_op;
+
void *priv;
};
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support
2026-05-27 17:55 [PATCH v3 00/13] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (12 preceding siblings ...)
2026-05-27 17:55 ` [PATCH v3 13/13] mtd: spi-nor: run PHY tuning after init and update dirmap frequency Santhosh Kumar K
@ 2026-05-28 8:30 ` Miquel Raynal
13 siblings, 0 replies; 25+ messages in thread
From: Miquel Raynal @ 2026-05-28 8:30 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, richard, vigneshr, pratyush,
mwalle, takahiro.kuwano, linux-spi, devicetree, linux-kernel,
linux-mtd, praneeth, u-kumar1, a-dutta
Hi Santhosh,
Very happy to see this v3! Looks pretty neat overall.
On 27/05/2026 at 23:25:14 +0530, Santhosh Kumar K <s-k6@ti.com> wrote:
> This series implements PHY tuning support for the Cadence QSPI controller
> to enable reliable high-speed operations. Without PHY tuning, controllers
> use conservative timing that limits performance. PHY tuning calibrates
> RX/TX delay lines to find optimal data capture timing windows, enabling
> operation up to the controller's maximum frequency.
>
> Background:
> High-speed SPI memory controllers require precise timing calibration for
> reliable operation. At higher frequencies, board-to-board variations make
> fixed timing parameters inadequate. The Cadence QSPI controller includes
> a PHY interface with programmable delay lines (0-127 taps) for RX and TX
> paths, but these require runtime calibration to find the valid timing
> window.
>
> Approach:
> Add SDR/DDR PHY tuning algorithms for the Cadence controller:
>
> SDR Mode Tuning (1D search):
> - Searches for two consecutive valid RX delay windows
> - Selects the larger window and uses its midpoint for maximum margin
> - TX delay fixed at maximum (127) as it's less critical in SDR
>
> DDR Mode Tuning (2D search):
> - Finds RX boundaries (rxlow/rxhigh) using TX window sweeps
> - Finds TX boundaries (txlow/txhigh) at fixed RX positions
> - Defines valid region corners and detects gaps via binary search
> - Applies temperature compensation for optimal point selection
> - Handles single or dual passing regions with different strategies
>
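The SDR search described above can be sketched as follows. This is only an illustration of the "two windows, pick the larger, use its midpoint" idea, not the driver code from patch 8: the pass[] array stands in for the real pattern read at each RX delay tap, and NUM_TAPS, struct tap_window, find_windows() and pick_rx_delay() are hypothetical names.

```c
#include <stdbool.h>

#define NUM_TAPS 128 /* delay line taps 0..127, as in the cover letter */

/* A contiguous run of passing RX delay taps, bounds inclusive. */
struct tap_window {
	int start;
	int end;
};

static int window_len(struct tap_window w)
{
	return w.end - w.start + 1;
}

/* Scan the taps in order and record up to two passing windows. */
static int find_windows(const bool pass[NUM_TAPS], struct tap_window out[2])
{
	int found = 0;
	int tap = 0;

	while (tap < NUM_TAPS && found < 2) {
		/* Skip failing taps. */
		while (tap < NUM_TAPS && !pass[tap])
			tap++;
		if (tap == NUM_TAPS)
			break;

		out[found].start = tap;
		/* Extend the window while taps keep passing. */
		while (tap < NUM_TAPS && pass[tap])
			tap++;
		out[found].end = tap - 1;
		found++;
	}

	return found;
}

/* Return the midpoint of the larger window, or -1 if no tap passes. */
static int pick_rx_delay(const bool pass[NUM_TAPS])
{
	struct tap_window w[2];
	const struct tap_window *best;
	int n = find_windows(pass, w);

	if (n == 0)
		return -1;

	best = &w[0];
	if (n == 2 && window_len(w[1]) > window_len(w[0]))
		best = &w[1];

	return best->start + (best->end - best->start) / 2;
}
```

Choosing the midpoint of the wider window leaves the most margin on both sides of the sampling point, which is the stated goal of the SDR algorithm.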
> Patch description:
> Infrastructure (1-5):
> - Patch 1: Extend spi-max-frequency DT binding to accept an optional
> second value forming a [base-freq, max-freq] pair
> - Patch 2: Add cadence-specific cdns,phy-pattern-partition phandle for
> NOR flash PHY tuning pattern location
> - Patch 3: Parse two-element spi-max-frequency in spi.c; adds
> spi_device.base_speed_hz (0 when a single value is used,
> keeping all existing DT fully compatible)
> - Patch 4: Add spi_mem_apply_base_freq_cap(), called from
> spi_mem_exec_op() to cap non-PHY ops to base_speed_hz;
> tuned ops bypass the cap because execute_tuning() marks
> them with op->max_freq = max_speed_hz
> - Patch 5: Add execute_tuning callback to spi_controller_mem_ops and
> spi_mem_execute_tuning() wrapper in SPI-MEM core
>
> Cadence QSPI Implementation (6-10):
> - Patch 6: Move cqspi_readdata_capture() earlier (preparatory)
> - Patch 7: Add DQS bit to cqspi_readdata_capture() (preparatory)
> - Patch 8: Add complete PHY tuning support: DLL management, pattern
> verification (NOR via cdns,phy-pattern-partition phandle,
> NAND via write-to-cache), SDR 1D and DDR 2D search
> algorithms with temperature compensation, AM654-specific
> execute_tuning entry point; base_speed_hz is cleared during
> the tuning loop and restored unconditionally on return
> - Patch 9: Reject 2-byte-address DDR operations via a new
> CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk flag to work around
> AM654 OSPI erratum i2383
> - Patch 10: Enable PHY for direct memory-mapped reads (aligned body
> region only; unaligned head and tail run without PHY) and
> for indirect writes >= 1 KB
>
> MTD core (11-13):
> - Patch 11: Integrate tuning in SPI-NAND probe; propagate the validated
> frequency to all plane dirmaps (primary and secondary op
> templates) and to the persistent write dirmap template
> - Patch 12: Extract spi_nor_spimem_get_read_op() helper (preparatory)
> - Patch 13: Integrate tuning in SPI-NOR probe; patch the dirmap op
> template with the validated frequency; store the result in
> nor->max_read_op so all subsequent reads (dirmap and direct)
> pick up the tuned speed automatically
>
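One possible reading of the base-frequency cap from patches 3 and 4, as a standalone sketch. This is an assumption based only on the description above, not the code of spi_mem_apply_base_freq_cap(): effective_freq() is a hypothetical name, and the rule that a tuned op is recognised by max_freq == max_speed_hz is inferred from the patch 4 summary.

```c
#include <stdint.h>

/*
 * Model of the frequency an op would run at.
 *
 * op_max_freq:   per-op limit, 0 when the op sets none
 * base_speed_hz: conservative speed, 0 when DT gives a single value
 * max_speed_hz:  highest speed the device supports
 */
static uint32_t effective_freq(uint32_t op_max_freq, uint32_t base_speed_hz,
			       uint32_t max_speed_hz)
{
	/* Single-value DT: no base speed, clamp to max_speed_hz only. */
	if (!base_speed_hz) {
		if (!op_max_freq || op_max_freq > max_speed_hz)
			return max_speed_hz;
		return op_max_freq;
	}

	/* Assumed marker for a tuned op: it bypasses the base cap. */
	if (op_max_freq == max_speed_hz)
		return max_speed_hz;

	/* Every other op is capped to the conservative base speed. */
	if (!op_max_freq || op_max_freq > base_speed_hz)
		return base_speed_hz;

	return op_max_freq;
}
```

With an illustrative 50 MHz / 166 MHz pair, an untuned op with no limit of its own would run at 50 MHz, an op marked with 166 MHz by tuning would run at 166 MHz, and a device described by a single value would behave as it does today.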
> Series dependency:
> Merge after:
> https://lore.kernel.org/linux-spi/20260527173736.2243004-1-s-k6@ti.com/T/#u
Isn't the DQS series a prerequisite as well? I sent it as an RFC; we can
definitely consider merging it together with this series once it is
ready.
Link: https://lore.kernel.org/linux-mtd/20260205-winbond-nand-next-phy-tuning-v1-0-5e7d3976f0f1@bootlin.com/
Can you confirm that you have "[PATCH DO NOT MERGE RFC 4/4] spi:
cadence-qspi: Retrieve DQS capability using the core helper" in your
branch for the PHY tuning series to work?
> Testing:
> This series was tested on TI's
> AM62Ax SK with OSPI NAND flash and
> AM62Px SK with OSPI NOR flash:
>
> Read throughput:
> |-------------------------------------|
> | | without PHY | with PHY |
> |-------------------------------------|
> | OSPI NOR | 37.5 MB/s | 216 MB/s |
I am impressed by the SPI NOR improvement o_O
> |-------------------------------------|
> | OSPI NAND | 9.2 MB/s | 35.1 MB/s |
> |-------------------------------------|
Was this tested in 8D-8D-8D mode?
> Write throughput:
> |-------------------------------------|
> | | without PHY | with PHY |
> |-------------------------------------|
> | OSPI NAND | 6 MB/s | 9.2 MB/s |
> |-------------------------------------|
Thanks,
Miquèl
^ permalink raw reply [flat|nested] 25+ messages in thread